## Tags
- Part of: [[Psychology]] [[Reinforcement learning]] [[AI safety]] [[Mathematical theory of artificial intelligence]]
- Related:
- Includes:
- Additional:
## Definitions
- Shard theory is a research program that aims to explain the systematic relationships between the reinforcement schedules of reinforcement-learning agents and the values those agents learn
## Main resources
- [Shard Theory: An Overview — LessWrong](https://www.lesswrong.com/posts/xqkGmfikqapbJ2YMj/shard-theory-an-overview)
- <iframe src="https://www.lesswrong.com/posts/xqkGmfikqapbJ2YMj/shard-theory-an-overview" allow="fullscreen" allowfullscreen="" style="height:100%;width:100%; aspect-ratio: 16 / 5; "></iframe>
## Landscapes
- [Shard Theory - LessWrong](https://www.lesswrong.com/s/nyEFg3AuJpdAozmoX)
## Contents
- [Understanding and controlling a maze-solving policy network — LessWrong](https://www.lesswrong.com/posts/cAC4AXiNC5ig6jQnc/understanding-and-controlling-a-maze-solving-policy-network)
- [Steering GPT-2-XL by adding an activation vector — LessWrong](https://www.lesswrong.com/posts/5spBue2z2tw4JuDCx/steering-gpt-2-xl-by-adding-an-activation-vector)