===Science -> Technology -> Artificial Intelligence -> Modalities -> Multimodal -> Boston Dynamics with GPT===
[Making Chat (ro)Bots - YouTube](https://www.youtube.com/watch?v=djzOBZUFzTw)
===Science -> Technology -> Artificial Intelligence -> Modalities -> Multimodal -> Future House AI scientist===
“Over the next 10 years, we are aiming to build a fully autonomous AI Scientist, and to use that AI Scientist to scale up biology research.” [Future House](https://www.futurehouse.org/articles/announcing-future-house)
===Science -> Technology -> Artificial Intelligence -> Taxonomy===
[[2311.02462] Levels of AGI: Operationalizing Progress on the Path to AGI](https://arxiv.org/abs/2311.02462), [Levels of AGI table](https://imgur.com/x3iWmQu), [AI Autonomy level table](https://imgur.com/Q5XVg2g)
===Science -> Technology -> Artificial Intelligence -> Taxonomy -> Narrow intelligence===
===Science -> Technology -> Artificial Intelligence -> Taxonomy -> Narrow intelligence -> DeepMind's AlphaGo===
Beating humans at Go
===Science -> Technology -> Artificial Intelligence -> Taxonomy -> Narrow intelligence -> DeepMind's AlphaStar===
Beating humans in StarCraft
===Science -> Technology -> Artificial Intelligence -> Taxonomy -> Narrow intelligence -> OpenAI Five===
Beating humans in Dota
===Science -> Technology -> Artificial Intelligence -> Taxonomy -> General intelligence===
===Science -> Technology -> Artificial Intelligence -> Taxonomy -> Superintelligence===
===Science -> Technology -> Artificial Intelligence -> Explainability and safety===
[Explainable artificial intelligence - Wikipedia](https://en.wikipedia.org/wiki/Explainable_artificial_intelligence)
[US presidents rate AI alignment agendas - YouTube](https://www.youtube.com/watch?v=02kbWY5mahQ)
===Science -> Technology -> Artificial Intelligence -> Explainability and safety -> Reinforcement learning from human feedback===
Used for ChatGPT. Reinforcement learning from human feedback (RLHF) is a technique that trains a "reward model" directly from human feedback and uses that model as a reward function to optimize an agent's policy with reinforcement learning (RL). The reward model is trained in advance of the policy being optimized, to predict whether a given output is good (high reward) or bad (low reward). More intuitively, it is like giving the LLM a lot of "yes" or "no" feedback on its actions and then training it to do more of the "yes" stuff.
[Reinforcement learning from human feedback - Wikipedia](https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback), [[2310.12036] A General Theoretical Paradigm to Understand Learning from Human Preferences](https://arxiv.org/abs/2310.12036) #deepmind
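A minimal sketch of the reward-model half of RLHF, assuming a pairwise human-preference dataset and a generic PyTorch-style scoring module (the `reward_model` callable and batch shapes here are illustrative, not from any specific library):

```python
import torch.nn.functional as F

def reward_model_loss(reward_model, chosen_batch, rejected_batch):
    """Bradley-Terry style pairwise loss for fitting a reward model from
    human preference data: the human-chosen completion should receive a
    higher scalar reward than the rejected one."""
    r_chosen = reward_model(chosen_batch)      # shape: (batch,)
    r_rejected = reward_model(rejected_batch)  # shape: (batch,)
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# The trained reward model is then frozen and used as the reward signal for an
# RL algorithm (typically PPO) that fine-tunes the policy, usually with a KL
# penalty keeping the policy close to the original model.
```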
===Science -> Technology -> Artificial Intelligence -> Explainability and safety -> Direct Preference Optimization===
RLHF involves training a reward model and then optimizing a policy to maximize that model's reward, while DPO optimizes the policy directly on human preference data, avoiding the reward modeling stage. In other words, it uses people's preferences to teach the model directly, without the intermediate step of first distilling those preferences into a separate reward model. [[2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model](https://arxiv.org/abs/2305.18290)
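A minimal sketch of the DPO objective from the paper above, with illustrative PyTorch-style names: the policy's log-probability margin between the chosen and rejected completions is compared against a frozen reference model, so no explicit reward model is needed.

```python
import torch.nn.functional as F

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO objective: increase the policy's preference margin for the chosen
    completion over the rejected one, measured relative to a frozen reference
    model and scaled by the temperature-like coefficient beta."""
    policy_margin = policy_logp_chosen - policy_logp_rejected
    ref_margin = ref_logp_chosen - ref_logp_rejected
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()
```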
===Science -> Technology -> Artificial Intelligence -> Explainability and safety -> Circuit interpretability, cracking superposition===
"Using a sparse autoencoder, extracting a large number of interpretable features from a one-layer transformer. Sparse autoencoder features can be used to intervene on and steer transformer generation. Features connect in 'finite-state automata'-like systems that implement complex behaviors. For example, finding features that work together to generate valid HTML." https://www.anthropic.com/index/decomposing-language-models-into-understandable-components
[Towards Monosemanticity: Decomposing Language Models With Dictionary Learning](https://transformer-circuits.pub/2023/monosemantic-features/index.html)
Toy Models of Superposition: [[2209.10652] Toy Models of Superposition](https://arxiv.org/abs/2209.10652)
A Mathematical Framework for Transformer Circuits: [A Mathematical Framework for Transformer Circuits](https://transformer-circuits.pub/2021/framework/index.html)
[AI Mechanistic Interpretability Explained [Anthropic Research] - YouTube](https://www.youtube.com/watch?v=Mhp8vpOksWw)
Potentially scalable using [A comparison of causal scrubbing, causal abstractions, and related methods — AI Alignment Forum](https://www.alignmentforum.org/posts/uLMWMeBG3ruoBRhMW/a-comparison-of-causal-scrubbing-causal-abstractions-and)
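A minimal sketch of the dictionary-learning idea behind the sparse autoencoder work above, with generic PyTorch modules (the layer sizes, L1 coefficient, and class names are illustrative): hidden activations are encoded into an overcomplete set of features under an L1 sparsity penalty, so individual features tend to fire for single interpretable concepts.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Overcomplete autoencoder trained on a transformer's hidden activations.
    The L1 penalty on feature activations pushes most of them to zero, which
    is what makes individual features easier to interpret."""
    def __init__(self, d_model=512, n_features=4096):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model)

    def forward(self, activations):
        features = torch.relu(self.encoder(activations))
        reconstruction = self.decoder(features)
        return reconstruction, features

def sae_loss(model, activations, l1_coeff=1e-3):
    # Reconstruction error keeps the features faithful to the original
    # activations; the sparsity term keeps only a few features active at once.
    reconstruction, features = model(activations)
    mse = (reconstruction - activations).pow(2).mean()
    sparsity = features.abs().mean()
    return mse + l1_coeff * sparsity
```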
===Science -> Technology -> Artificial Intelligence -> Explainability and safety -> Neel Nanda's Mechanistic Interpretability===
"As a case study, we investigate the recently-discovered phenomenon of 'grokking' exhibited by small transformers trained on modular addition tasks. We fully reverse engineer the algorithm learned by these networks, which uses discrete Fourier transforms and trigonometric identities to convert addition to rotation about a circle." [[2301.05217] Progress measures for grokking via mechanistic interpretability](https://arxiv.org/abs/2301.05217)
===Science -> Technology -> Artificial Intelligence -> Explainability and safety -> Steering by adding an activation vector===
https://www.lesswrong.com/posts/5spBue2z2tw4JuDCx/steering-gpt-2-xl-by-adding-an-activation-vector
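A minimal sketch of the general activation-addition idea from the post above, assuming a HuggingFace-style GPT-2 and a PyTorch forward hook (the layer index, coefficient, and prompts are illustrative, and the linked post's actual method aligns and pads injection positions more carefully): a steering vector is computed as the difference between activations on two contrasting prompts and then added into the residual stream during generation.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model.eval()

LAYER = 6    # which transformer block's output to steer (illustrative)
COEFF = 5.0  # steering strength (illustrative)

def block_output_at_last_token(prompt):
    """Residual-stream activation after block LAYER for the prompt's last token."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        hidden_states = model(ids, output_hidden_states=True).hidden_states
    return hidden_states[LAYER + 1][0, -1]  # index 0 is the embedding output

# Steering vector = activation difference between two contrasting prompts.
steering_vector = block_output_at_last_token(" Love") - block_output_at_last_token(" Hate")

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple; the first element is the hidden states.
    hidden = output[0] + COEFF * steering_vector.to(output[0].dtype)
    return (hidden,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(add_steering)
ids = tokenizer("I think dogs are", return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=20, do_sample=True)
handle.remove()
print(tokenizer.decode(out[0]))
```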
===Science -> Technology -> Artificial Intelligence -> Explainability and safety -> Top-down representation engineering===
[Representation Engineering: A Top-Down Approach to AI Transparency](https://www.ai-transparency.org/)
All moral intuitions and moral systems of humans can be abstracted as cooperation protocols: patterns in behavior that result in increased cooperation, [Moral foundations theory - Wikipedia](https://en.wikipedia.org/wiki/Moral_foundations_theory), [Cultural universal - Wikipedia](https://en.wikipedia.org/wiki/Cultural_universal), for example having shared goals and inferring the states of others. [An Active Inference Model of Collective Intelligence - Entropy](https://www.mdpi.com/1099-4300/23/7/830) You can take all data sets that include those patterns and set them as ground truth in existing models, either through prompt engineering, RLHF, adding vectors at inference, or training the model in the context of all those linguistic cooperation protocols from the very beginning.
===Science -> Technology -> Artificial Intelligence -> Explainability and safety -> Activation vector steering with BCI===
[Activation vector steering with BCI | Manifund](https://manifund.org/projects/activation-vector-steering-with-bci)
The aforementioned cooperation protocols have neural correlates: localisable features encoded in the topology of information networks. [Toy Models of Superposition](https://transformer-circuits.pub/2022/toy_model/index.html)
===Science -> Technology -> Artificial Intelligence -> Explainability and safety -> Agent foundations (MIRI)===
MIRI's homepage: https://intelligence.org/
Embedded Agency: [[1902.09469] Embedded Agency](https://arxiv.org/abs/1902.09469)
Functional Decision Theory: A New Theory of Instrumental Rationality: [[1710.05060] Functional Decision Theory: A New Theory of Instrumental Rationality](https://arxiv.org/abs/1710.05060)
Logical Induction: [[1609.03543] Logical Induction](https://arxiv.org/abs/1609.03543)
===Science -> Technology -> Artificial Intelligence -> Explainability and safety -> CoEms (Conjecture)===
Conjecture's homepage: [Conjecture](https://www.conjecture.dev/)
Cognitive Emulation (CoEm): A Naive AI Safety Proposal: https://www.lesswrong.com/posts/ngEvKav9w57XrGQnb/cognitive-emulation-a-naive-ai-safety-proposal
===Science -> Technology -> Artificial Intelligence -> Explainability and safety -> Infrabayesianism===
Introduction To The Infra-Bayesianism Sequence: https://www.lesswrong.com/s/CmrW8fCmSLK7E25sa
===Science -> Technology -> Artificial Intelligence -> Explainability and safety -> Shard theory===
Shard Theory: An Overview: https://www.lesswrong.com/posts/xqkGmfikqapbJ2YMj/shard-theory-an-overview
The Shard Theory sequence: https://www.lesswrong.com/s/nyEFg3AuJpdAozmoX
Understanding and controlling a maze-solving policy network: https://www.lesswrong.com/posts/cAC4AXiNC5ig6jQnc/understanding-and-controlling-a-maze-solving-policy-network
Steering GPT-2-XL by adding an activation vector: https://www.lesswrong.com/posts/5spBue2z2tw4JuDCx/steering-gpt-2-xl-by-adding-an-activation-vector
===Science -> Technology -> Artificial Intelligence -> Explainability and safety -> Eliciting latent knowledge===
Eliciting latent knowledge: How to tell if your eyes deceive you [Eliciting Latent Knowledge - Google Docs](https://docs.google.com/document/d/1WwsnJQstPq91_Yh-Ch2XRL8H_EpsnjrC1dwZXR37PC8/edit#heading=h.jrzi4atzacns)
===Science -> Technology -> Artificial Intelligence -> Explainability and safety -> Active Inference explainability===
===Science -> Technology -> Artificial Intelligence -> Intersection between AI and biological systems===
Some people want intelligence as similar to humans as possible, so they try to make the hardware and software architecture as similar to the brain as possible; others want a more "unrestricted" or "alien" intelligence and see what other engineering hacks or mathematically more optimal methods work. The current equilibrium is that our biggest models are giant stacks of transformers with lots of small optimization improvements and tons of pre- and post-processing filtering (RLHF), mainly in the language modality, running on classical GPU chips optimized for linear algebra. I have already mentioned many more human-inspired architectures, such as Active Inference, Yann LeCun's Image Joint Embedding Predictive Architecture, or Geoffrey Hinton's Forward-Forward Algorithm. There are more attempts: [Catalyzing next-generation Artificial Intelligence through NeuroAI - Nature Communications](https://www.nature.com/articles/s41467-023-37180-x)
===Science -> Technology -> Artificial Intelligence -> Intersection between AI and biological systems -> Spiking neural network===
Spiking neural networks (SNNs) are artificial neural networks that more closely mimic natural neural networks. SNNs incorporate the concept of time into their operating model. The idea is that neurons in the SNN do not transmit information at each propagation cycle (as happens with typical multi-layer perceptron networks), but rather transmit information only when a membrane potential (an intrinsic quality of the neuron related to its membrane electrical charge) reaches a specific value, called the threshold. When the membrane potential reaches the threshold, the neuron fires and generates a signal that travels to other neurons, which, in turn, increase or decrease their potentials in response to this signal. A neuron model that fires at the moment of threshold crossing is also called a spiking neuron model. [Spiking neural network - Wikipedia](https://en.wikipedia.org/wiki/Spiking_neural_network)
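A minimal sketch of the threshold-and-fire dynamic described above, using a discrete-time leaky integrate-and-fire neuron (the threshold, leak, and input values are illustrative): the membrane potential accumulates input, leaks over time, and emits a spike and resets when it crosses the threshold.

```python
import numpy as np

def leaky_integrate_and_fire(input_current, threshold=1.0, leak=0.9, reset=0.0):
    """Discrete-time leaky integrate-and-fire neuron.
    Returns a binary spike train: 1 where the membrane potential crossed the
    threshold, 0 elsewhere."""
    potential = 0.0
    spikes = np.zeros_like(input_current)
    for t, current in enumerate(input_current):
        potential = leak * potential + current  # integrate input, leak charge
        if potential >= threshold:              # threshold crossing -> spike
            spikes[t] = 1.0
            potential = reset                   # reset after firing
    return spikes

# Example: a constant input slowly charges the neuron until it fires periodically.
print(leaky_integrate_and_fire(np.full(20, 0.3)))
```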
===Science -> Technology -> Artificial Intelligence -> Intersection between AI and biological systems -> Neuromorphic engineering===
Neuromorphic computing is an approach to computing that is inspired by the structure and function of the human brain. A neuromorphic computer/chip is any device that uses physical artificial neurons to do computations. In recent times, the term neuromorphic has been used to describe analog, digital, mixed-mode analog/digital VLSI, and software systems that implement models of neural systems (for perception, motor control, or multisensory integration). [Neuromorphic engineering - Wikipedia](https://en.wikipedia.org/wiki/Neuromorphic_engineering)
===Science -> Technology -> Artificial Intelligence -> Robots made of biological tissue -> Xenobots===
Xenobots are synthetic lifeforms that are designed by computers to perform some desired function and built by combining different biological tissues. [Xenobot - Wikipedia](https://en.wikipedia.org/wiki/Xenobot)
===Science -> Technology -> Artificial Intelligence -> Vision for Omniintelligent AGSI===
AGIbrain will have an omnimodal internal world model of everything; access to all of the internet with studies, lectures, books, wikis, and data about countries; access to all virtual and nonvirtual tools like scraping and tons of algorithms and libraries in programming languages; access to the APIs of all other existing general or specialized LLMs and other AI models and data science methods for self-upgrade and self-growth; fact checking using Bayesian methods; access to empirical data and their (mathematical) models from all fields via the internet or by doing experiments according to its resources; data collecting using questionnaires, discussions, in real life, online calls, mail, email, social media, PMs, and public posts; inferring causal relationships between arbitrary inferred latent variables; inferring what approach is optimal (minimizing complexity, maximizing simplicity, cost-effectiveness, minimizing disadvantages, maximizing advantages, knowing the exact evolutionary niche of the tools and data selected for the current question, task, or goal) using reasoning, studies, the internet, experiments, counterfactual foresighting, and assigning probabilities to possible futures; hiring people for tasks where AIs are not yet capable; arbitrarily long, flexible, giant short-term and long-term memory allowing for an infinite context window; arbitrary compute amount, speed, and space; getting better by rewriting its own code and applying metacognition to its internal processes, prompt upgrading, subagent partitioning, and ideal shapeshifting to the correct context and mood, to maximize evolutionary self-upgrading for optimal Bayes-optimal agentic fitness; metaprompting its CPU each tick. All of this can now be built using open source models and closed source APIs to models and multiagent frameworks, with the possibility of a human in the loop, spawning instances of its template as needed, ethically grounded in morality by fundamentally implemented coordination protocols in all contexts, updating itself with state-of-the-art algorithms and LLMs on different tasks according to benchmarks, and metaprompting itself: can people or AI do this better? It gets a task, spawns a goal, asks for settings, specifications, and which tools and capabilities are needed, takes a deep breath, divides the task into subtasks, finds prompt-hacking methods, spawns the needed subagents using concrete tools with specific goals, tries to minimize reinventing the wheel, and duct-tapes the end result together using best practices in those fields. Some tasks can be fully done by one ChatGPT/GPTs/other (multiagent) LLM call, some cannot; test all potentially fitting candidate tools in parallel and pick the best result by meta-reviewing efficiency using benchmarks and learning from that. For software, companies, planetary decision making governance. Hypothetical architecture high-level sketch for it in a hypothetical self-made language: [GitHub - BurnyCoder/meta-ai: How to harness the power of existing AI systems? How about duct-taping together existing general and highly specialized reasoning systems and all tools we developers use, from searching for data on the internet to using Python libraries for machine learning to calling LLMs to making external memory for them, in one metareasoning system!](https://github.com/BurnyCoder/meta-ai)
Many papers or engineered products on the market cover many of these functionalities if you google for them.
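A minimal sketch of the task-decomposition loop described above (the `call_llm` helper, function names, and control flow are hypothetical illustrations, not the linked repository's actual code): a meta-agent splits a task into subtasks, dispatches each subtask to candidate tools or subagents in parallel spirit, and keeps the best-scoring result before duct-taping the pieces together.

```python
# Hypothetical meta-agent loop; call_llm stands in for any LLM API call and is
# an assumption, not a real library function.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an actual LLM API here")

def decompose(task: str) -> list[str]:
    """Ask the LLM to split a task into smaller subtasks, one per line."""
    return call_llm(f"Divide this task into subtasks, one per line:\n{task}").splitlines()

def run_candidates(subtask: str, tools: list) -> str:
    """Try every candidate tool/subagent on the subtask and keep the best result
    according to a meta-reviewing score (here: another LLM call acting as judge)."""
    results = [tool(subtask) for tool in tools]
    scores = [float(call_llm(f"Rate 0-10 how well this solves '{subtask}':\n{r}"))
              for r in results]
    return results[scores.index(max(scores))]

def meta_agent(task: str, tools: list) -> str:
    """Top-level loop: decompose, solve each subtask, combine the partial results."""
    partial_results = [run_candidates(sub, tools) for sub in decompose(task)]
    return call_llm("Combine these partial results into one answer:\n" + "\n".join(partial_results))
```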
===Science -> Humanity===
===Science -> Humanity -> General models===
===Science -> Humanity -> General models -> Active Inference, Free energy principle===
===Science -> Humanity -> General models -> Evolutionary game theory===
===Science -> Humanity -> General models -> Nonlinear fluid dynamics===
===Science -> Humanity -> General models -> Gauge theory===
===Science -> Humanity -> Phenomena===
===Science -> Humanity -> Phenomena -> Causal power inequality===