## Tags
- Part of: [[Mathematical theory of artificial intelligence]]
- Related:
- Includes:
- Additional:
## Technical summaries
- Noncomputable idealization of intelligence
- AIXI is a theoretical model of artificial intelligence that combines [[decision theory]] and [[algorithmic information theory]] to make optimal decisions by considering all possible computable models of the environment, updating them based on past experiences, and selecting actions that maximize expected future rewards.
## Main resources
- [\[cs/0004001\] A Theory of Universal Artificial Intelligence based on Algorithmic Complexity](https://arxiv.org/abs/cs/0004001)
- [AIXI - LessWrong](https://www.lesswrong.com/tag/aixi)
<iframe src="https://en.wikipedia.org/wiki/AIXI" allow="fullscreen" allowfullscreen="" style="height:100%;width:100%; aspect-ratio: 16 / 5; "></iframe>
## Deep dive
- AIXI is a [[reinforcement learning]] (RL) agent: it maximizes the expected total reward received from the environment. Intuitively, it simultaneously considers every computable hypothesis (environment). At each time step, it examines every possible program and evaluates how much reward that program would generate for each candidate next action. The promised rewards are then weighted by the agent's subjective belief that the program constitutes the true environment. This belief is computed from the length of the program: longer programs are considered less likely, in line with [[Occam's razor]]. AIXI then selects the action with the highest expected total reward in the weighted sum over all these programs.
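The weighting scheme above can be illustrated with a minimal sketch. The candidate "programs" here are a hand-picked, hypothetical set of deterministic action-to-reward tables, each assigned an assumed description length in bits; real AIXI sums over *all* computable environments under a universal Turing machine and is incomputable, so this is only a finite caricature of the idea:

```python
# Toy sketch of AIXI-style action selection.
# Assumption: each candidate environment is a deterministic reward table
# paired with a made-up description length in bits (shorter = more plausible).
from math import inf

# (action -> reward, description length in bits) — illustrative values only
candidate_programs = [
    ({"left": 1.0, "right": 0.0}, 3),   # short program: rewards "left"
    ({"left": 0.0, "right": 1.0}, 5),   # longer program: rewards "right"
    ({"left": 0.5, "right": 0.5}, 8),   # longest program: indifferent
]

def aixi_style_action(actions, programs):
    """Pick the action maximizing reward weighted by the 2^-length prior."""
    best_action, best_value = None, -inf
    for a in actions:
        # Occam weighting: each program contributes reward * 2^(-length),
        # so shorter programs dominate the expected value.
        value = sum(env[a] * 2.0 ** (-length) for env, length in programs)
        if value > best_value:
            best_action, best_value = a, value
    return best_action

print(aixi_style_action(["left", "right"], candidate_programs))  # → left
```

Here the 3-bit program's preference for "left" outweighs the 5-bit program's preference for "right" (0.125 vs. 0.03125 before the indifferent program's tie-breaking contribution), which is exactly the Occam's-razor effect described above.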
- It combines [[Solomonoff induction]] from [[Probability theory]] and [[Computer science]] with [[sequential decision theory]].
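The combination of Solomonoff induction with sequential decision theory is usually written as an expectimax expression over all programs consistent with the interaction history (this follows Hutter's standard formulation; here $m$ is the horizon, $U$ a universal Turing machine, and $\ell(q)$ the length of program $q$):

```latex
a_t \;=\; \arg\max_{a_t} \sum_{o_t r_t} \cdots \max_{a_m} \sum_{o_m r_m}
  \bigl[\, r_t + \cdots + r_m \,\bigr]
  \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

The inner sum over programs $q$ is the Solomonoff-style prior (Occam weighting by $2^{-\ell(q)}$), while the alternating max/sum structure is the sequential decision-theoretic part: maximize over the agent's future actions, take expectations over the environment's observations and rewards.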