Fundamental Principles of Intelligence

**Solomonoff Induction**
* Universal Prior, Algorithmic Probability, Kolmogorov Complexity, Occam's Razor (core formulas sketched after this section)
* **Algorithmic Information Theory**: Kolmogorov Complexity (variants and limitations), Program-size complexity, Self-delimiting code, Universal Turing Machines (UTMs), compilers, Levin Search (modifications and speed limitations), Algorithmic randomness & incomputability
* **Formal Systems**: Gödel Incompleteness Theorems, Turing Machines, Universal Turing Machines, Lambda Calculus
* **Computational Complexity**: Time and space complexity, Exponential vs. polynomial growth
* **Epistemology**: Inductive bias, The Problem of Induction, Bayesian inference
* **Algorithmic Complexity Theory**: Resource-bounded Kolmogorov complexity; Computable measures, Semi-measures, Levin's Kt complexity, Chaitin's Omega
* **Recursion Theory**: Recursively enumerable sets, The halting problem revisited, Self-delimiting programs, Speed prior
* **Philosophy of Science**: Occam's Razor formalization, Epistemic uncertainty vs. aleatoric uncertainty, The 'no free lunch' theorem
* **Metamathematics**: Proof theory, Consistency & soundness, Gödel's incompleteness theorems (first & second), Löb's theorem
* **Incompleteness Beyond Arithmetic**: Tiling problems (Wang tiles), Rice's theorem, Algorithmic randomness in dynamical systems
* **Solomonoff Induction Variants**: Speed prior limitations, Levin Search modifications, Incorporating domain-specific knowledge, Resource bounds & approximations
* **Program Synthesis**: Genetic programming, Inductive logic programming, Neural program synthesis
* **Meta-Induction & Hypercompression**: Learning how to learn through induction, Identifying higher-order program patterns, Compression of compressors
* **Logical Uncertainty**: Quantifying doubt in inductive conclusions, Measures of epistemic uncertainty, Relationship to Solomonoff's prior

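For reference, the standard textbook forms of the core quantities named above; everything is relative to a fixed universal (prefix/monotone) machine U and holds only up to machine-dependent constants:

```latex
\begin{align*}
  % Kolmogorov complexity of x with respect to a universal prefix machine U
  K_U(x) &= \min \{\, \ell(p) : U(p) = x \,\} \\
  % Algorithmic probability / Solomonoff's universal prior
  % (sum over programs whose output starts with x, on a monotone machine)
  M(x) &= \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)} \\
  % Sequence prediction with the universal prior
  M(x_{t+1} \mid x_{1:t}) &= \frac{M(x_{1:t}\, x_{t+1})}{M(x_{1:t})} \\
  % Levin's Kt complexity (time-bounded variant behind Levin search and the Speed Prior)
  Kt(x) &= \min_{p \,:\, U(p) = x} \bigl( \ell(p) + \log t(p) \bigr)
\end{align*}
```
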
**AIXI**
* Bayesian Reasoning, Reinforcement Learning, Environment Models, Time Horizons, Incomputability (action-selection equation sketched after this section)
* **Decision Theory**: Expected utility, Utility functions, Bellman Equations
* **Optimal Control**: Markov Decision Processes (MDPs), Partially-Observable MDPs (POMDPs), Dynamic Programming, Value iteration, Policy iteration
* **Exploration/Exploitation Tradeoff**: Epsilon-greedy strategies, Boltzmann exploration, Upper Confidence Bounds (UCB), Thompson sampling; Regret bounds and optimality analysis, Uncertainty quantification in RL
* **Game Theory**: Zero-sum, non-zero-sum, mixed strategies; Prisoner's dilemma, stag hunt, iterated games; Rationality vs. Bounded rationality; Nash equilibria and refinements; Multi-agent interactions and emergent complexity
* **Sequential Decision Theory**: Discounted vs. infinite-horizon rewards, Value function approximation, Monte Carlo Tree Search (MCTS), Markov Decision Processes (MDPs, POMDPs), State/action/reward structures, Time discounting and infinite horizons, Bellman equations, Dynamic programming, Value/policy iteration and convergence, Model-based vs. model-free reinforcement learning
* **Theoretical Computer Science**: Approximability results, Lower bounds on AIXI's performance, Formal verification
* **Arithmetical Hierarchy**: Sigma and Pi complexity classes, Degrees of unsolvability, AIXItl (time- and length-limited AIXI), Hypercomputation models
* **Oracle Machines**: Turing machines with oracles, Relativized computation, Relationships between complexity classes (P vs. NP with oracles)
* **Philosophy of Mind**: The Church-Turing Thesis, Computationalism, Searle's Chinese Room argument
* **Universal Artificial Intelligence Measures**: Legg-Hutter universal intelligence, Other formalizations of intelligence tests, Open-endedness in AI evaluation
* **Approximation Techniques**: Monte-Carlo approximations of AIXI & variants, Truncated computations, Context tree weighting (CTW) for compression, AIXItl (time- and length-bounded AIXI), Impossibility theorems for perfect AIXI computation, Heuristic search within AIXI's framework
* **Counterfactual Reasoning**: AIXI-like agents with causal reasoning, Hypothetical scenarios, Intervention vs. observational data
* **Open Problems**: Universal intelligence measures (beyond Turing tests), Proofs of performance limits or advantages, AIXI-like agents with practical implementations, the "Friendly AI" alignment problem, Implications for philosophy of mind
* **Information Theory**: Entropy, Shannon Information, Mutual Information, Channel Capacity
* **Complexity Theory**: Computational Complexity Classes (P, NP, NP-Complete, etc.), Turing Machines, Halting Problem

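As a pointer for the decision-theoretic core above, here is the Bellman optimality equation and a schematic form of Hutter's AIXI action selection (notation varies across presentations; this is a sketch rather than a full formal statement):

```latex
\begin{align*}
  % Bellman optimality equation for an MDP with discount factor gamma
  V^*(s) &= \max_{a} \sum_{s'} P(s' \mid s, a)\,\bigl[ R(s, a, s') + \gamma\, V^*(s') \bigr] \\
  % AIXI action selection: expectimax over future percepts up to horizon m,
  % with environments q weighted by a Solomonoff-style prior 2^{-length(q)}
  a_t &= \arg\max_{a_t} \sum_{o_t r_t} \cdots \max_{a_m} \sum_{o_m r_m}
         \bigl( r_t + \cdots + r_m \bigr)
         \sum_{q \,:\, U(q,\, a_{1:m}) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
\end{align*}
```
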
Theoretical Frameworks

**Statistical Learning Theory**
* **Probability Theory & Statistics**: Random variables, probability distributions, densities; Moments (expectation, variance, higher-order); Parametric vs. nonparametric distributions; Central limit theorem, law of large numbers; Maximum likelihood estimation (MLE); Bayesian inference, priors, posteriors, conjugacy
* **Theory of Generalization**: Bias-variance tradeoff, overfitting/underfitting, Vapnik-Chervonenkis (VC) dimension, Rademacher complexity, Sample complexity bounds, Probably Approximately Correct (PAC) learning, Uniform convergence for empirical risk minimization, PAC learnability
* **Model Selection**: Regularization (L1, L2, etc.), Structural risk minimization, Cross-validation
* **Probabilistic Models**: Naive Bayes, Gaussian Mixture Models, Kernel Density Estimation
* **Feature Engineering & Representation**: Dimensionality reduction (PCA, ICA, t-SNE), Feature selection (filter, wrapper, embedded methods), Kernel methods & nonlinear feature spaces, Manifold learning, Topological data analysis (TDA), Support Vector Machines (SVMs)
* **Boosting and Ensemble Methods**: AdaBoost, Gradient Boosting, Random Forests
* **Information Geometry**: Fisher information, Natural gradient, KL-divergence
* **Functional Analysis**: Reproducing Kernel Hilbert Spaces (RKHS), Mercer's theorem, Representer theorem
* **Concentration Inequalities**: Hoeffding's inequality, McDiarmid's inequality, Rademacher complexity
* **Regularization Theory**: Ill-posed problems, Sparsity & compressed sensing, Penalized loss functions (L1, L2, elastic net, etc.), Tikhonov regularization, Ivanov regularization, Sparsity-inducing priors, Regularization path algorithms, Structured sparsity (group lasso, etc.)
* **Bayesian Modeling**: Hierarchical Bayesian models, shrinkage, Gaussian Processes (GPs); Bayesian nonparametrics (Dirichlet processes, Chinese Restaurant Process, Beta process, Indian Buffet Process); Markov Chain Monte Carlo (MCMC) methods (Metropolis-Hastings, Gibbs sampling, Hamiltonian MC); Variational inference (VI) techniques
* **Graphical Models**: Bayesian networks, directed acyclic graphs (DAGs); Markov random fields (MRFs), undirected models; Factor graphs, sum-product algorithm; Inference algorithms (belief propagation, junction tree); Conditional independence, d-separation
* **Model Selection & Evaluation**: Cross-validation (k-fold, leave-one-out, etc.), Bootstrap resampling; Performance metrics (accuracy, precision, recall, F1-score, ROC curves, AUC), Loss functions (MSE, cross-entropy, etc.); Statistical hypothesis testing
* **Empirical Risk Minimization (ERM)**: Law of large numbers, Uniform convergence, Probably Approximately Correct (PAC) learning framework (a cross-validated regularized ERM sketch follows this section)
* **Bayesian Nonparametrics**: Dirichlet process mixtures, Beta processes, Indian Buffet Process, Hierarchical Bayesian models
* **Linear Models**: Ordinary least squares (OLS), Ridge regression, Lasso and variants, Generalized linear models (GLMs), Logistic regression, probit regression; Support vector machines (SVMs), Kernel trick, Mercer's Theorem, RKHS; Linear vs. nonlinear decision boundaries
* **Learning with Structured Data**: Time series modeling (ARIMA, state-space, etc.), Hidden Markov Models (HMMs), Sequential data and recurrent models (RNNs, LSTMs), Learning with text, natural language processing (NLP), Graph neural networks (GNNs)
* **Learning with Noisy Data**: Robust statistics, Outlier detection, Agnostic learning, Adversarial examples
* **Online Learning & Streaming Data**: Stochastic approximation algorithms, Stochastic gradient descent (SGD) variants, Regret bounds, adaptive learning rates, Online convex optimization, Concept drift & change detection, Lifelong learning, continual learning
* **Causality**: Causal inference beyond correlations, Potential outcomes framework, Rubin Causal Model, Do-calculus, causal discovery algorithms, Structural equation modeling (SEM), Propensity score matching / weighting

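A minimal sketch, on synthetic data with plain numpy, of regularized empirical risk minimization with k-fold cross-validation for model selection; the ridge/Tikhonov estimator is just an illustrative choice, and helper names like `ridge_fit` are ad hoc:

```python
# Regularized ERM + k-fold cross-validation on a toy linear-regression problem.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = Xw + noise
n, d = 200, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.5 * rng.normal(size=n)

def ridge_fit(X, y, lam):
    """Minimize (1/n)||Xw - y||^2 + lam*||w||^2 (closed form)."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

def k_fold_cv(X, y, lam, k=5):
    """Average held-out risk over k folds: an empirical stand-in for generalization error."""
    idx = np.arange(len(y))
    risks = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        w = ridge_fit(X[train], y[train], lam)
        risks.append(mse(X[fold], y[fold], w))
    return float(np.mean(risks))

# Model selection: pick the regularization strength with the lowest cross-validated risk.
lambdas = [1e-4, 1e-3, 1e-2, 1e-1, 1.0]
best = min(lambdas, key=lambda lam: k_fold_cv(X, y, lam))
print("selected lambda:", best, "CV risk:", k_fold_cv(X, y, best))
```

The selected regularization strength is where the bias-variance tradeoff from the bullets above shows up empirically: too little regularization overfits, too much underfits.
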
**Deep Learning**
* Neural Networks: Perceptrons, Activation Functions (Sigmoid, ReLU, etc.), Multilayer Architectures, Gradient Descent, Backpropagation, Optimization Algorithms (SGD, Adam, etc.), Loss Functions (a minimal training-loop sketch follows this section)
* **Tensor Calculus**: Tensors, Tensor operations
* **Automatic Differentiation**: Computational graphs
* **Optimization Theory**: Convex vs. non-convex optimization, Saddle points, Local vs. global minima, Stochastic Gradient Descent variants (Momentum, Adagrad, RMSprop, Adam)
* **Representational Power**: Universal Approximation Theorem, Manifold learning, Expressivity of deep architectures
* **Regularization Techniques**: Dropout, Early stopping, Batch normalization, Weight decay, Data augmentation
* **Differential Geometry**: Manifolds & Riemannian geometry, Parameter spaces as manifolds, Optimization on manifolds
* **Dynamical Systems**: Attractors, basins, limit cycles, Stability analysis of training, Chaos theory & DL
* **Network Theory**: Small-world networks, Scale-free networks, Network motifs in neural architectures, Graph Neural Networks (GNNs)
* **Attention Mechanisms**: Transformer models & self-attention, Query-key-value paradigm
* **Meta-Learning**
  * **Metric-Based Meta-Learning**: Prototypical Networks, Matching Networks, Relation Networks
  * **Model-Based Meta-Learning**: Memory-Augmented Neural Networks (MANNs), Neural Turing Machines (NTMs), Meta Networks
  * **Optimization-Based Meta-Learning**: MAML (Model-Agnostic Meta-Learning), Reptile, LSTM Meta-Learner
  * **Gradient-Based Hyperparameter Optimization**, **Bayesian Meta-Learning**, **Ensemble Meta-Learning**
  * **Zero-Shot and Few-Shot Learning** (often heavily utilize meta-learning principles)
* **Topology**: Homeomorphisms, diffeomorphisms, Topology of weight spaces, Loss function landscapes
* **Linear Algebra**: Matrix factorizations (SVD, PCA, etc.), Eigendecompositions & spectral theory, Low-rank approximations, Numerical stability in DL
* **Optimization Landscapes**: Critical points, Hessian matrices, Landscape visualization
* **Implicit Bias of Gradient Descent**: Tendency towards simpler solutions, Connections to kernel methods
* **Expressivity & Representation Learning**: Fourier analysis & neural nets, Neural Tangent Kernel (NTK), Information Bottleneck Theory, Disentangled representations
* **Implicit Regularization in DL**: Effects of network architecture, Frequency principle, Margin maximization
* **Deep Learning on Non-Euclidean Data**: Graph convolutional networks, Hyperbolic neural networks, Neural networks on manifolds
* **Algorithmic Neuroscience**: Backpropagation vs. biological plausibility, Dendritic computation, Spiking neural networks

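A minimal sketch of the training loop the first bullets above name: forward pass, loss, gradients via backpropagation, and an SGD update on a toy regression task, using only numpy (real systems would obtain the gradients from automatic differentiation on a computational graph):

```python
# Minimal two-layer MLP trained by manual backpropagation and (full-batch) gradient descent.
import numpy as np

rng = np.random.default_rng(1)

# Toy regression task: y = sin(x)
X = rng.uniform(-3, 3, size=(256, 1))
y = np.sin(X)

# Parameters of a 1 -> 32 -> 1 network
W1 = rng.normal(scale=0.5, size=(1, 32)); b1 = np.zeros(32)
W2 = rng.normal(scale=0.5, size=(32, 1)); b2 = np.zeros(1)
lr = 0.05  # learning rate

for step in range(2000):
    # Forward pass (ReLU hidden layer, linear output)
    h_pre = X @ W1 + b1
    h = np.maximum(h_pre, 0.0)           # ReLU activation
    y_hat = h @ W2 + b2
    loss = np.mean((y_hat - y) ** 2)     # MSE loss

    # Backward pass (chain rule, i.e. backpropagation)
    g_y = 2.0 * (y_hat - y) / len(X)     # dLoss/dy_hat
    g_W2 = h.T @ g_y
    g_b2 = g_y.sum(axis=0)
    g_h = g_y @ W2.T
    g_pre = g_h * (h_pre > 0)            # ReLU derivative
    g_W1 = X.T @ g_pre
    g_b1 = g_pre.sum(axis=0)

    # Gradient descent update
    W2 -= lr * g_W2; b2 -= lr * g_b2
    W1 -= lr * g_W1; b1 -= lr * g_b1

    if step % 500 == 0:
        print(step, round(float(loss), 4))
```
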
**Free Energy Principle**
* Active Inference, Markov Blankets, Variational Bayes, Predictive Coding (variational free energy sketched after this section)
* **Thermodynamics & Statistical Physics**: Helmholtz free energy, Entropy and negentropy, Phase transitions
* **Bayesian Brain Hypothesis**: Predictive processing, Hierarchical inference, Generative models of the world
* **Perception & Action**: Sensory attenuation, Active inference, Embodied cognition
* **Neuroanatomy**: Cortical hierarchies, Predictive coding circuits
* **Statistical Mechanics**: Boltzmann distributions, Ising models, Mean-field approximations
* **Control Theory**: Variational control, KL-control, Homeostasis & allostasis
* **Neuroscience**: Cortical layers & laminar connections, Neuronal oscillations & synchronization, Reinforcement, attention, and surprise
* **Embodiment & Robotics**: Morphological computation, Soft robotics, Developmental robotics
* **Non-equilibrium Thermodynamics**: Onsager reciprocal relations, Fluctuation theorems, Detailed balance, Jarzynski equality
* **Stochastic Processes**: Fokker-Planck equation, Langevin dynamics, Path integrals, Stochastic thermodynamics
* **Cybernetics**: Feedback loops, Regulation & control, Ashby's Law of Requisite Variety
* **Theoretical Biology**: Autopoiesis, Self-organization, Morphogenesis
* **Variational Bayesian Methods**: Evidence Lower Bound (ELBO) optimization, Amortized inference, Reparameterization trick, Normalizing flows
* **Embodied Predictive Processing**: Morphology and physical constraints, Sensorimotor contingencies, Affordances
* **Active Inference and Curiosity**: Information-seeking behaviors, Intrinsic motivation, Artificial curiosity models
* **Consciousness & FEP**: Subjective experience and generative models

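For reference, a standard way to write the variational free energy around which this section revolves, with observations o, hidden states s, recognition density q, and generative model p (the decompositions are equivalent; sign and naming conventions vary across the literature):

```latex
\begin{align*}
  % Variational free energy (negative ELBO)
  F[q] &= \mathbb{E}_{q(s)}\!\left[ \ln q(s) - \ln p(o, s) \right] \\
       &= D_{\mathrm{KL}}\!\left[ q(s) \,\Vert\, p(s \mid o) \right] - \ln p(o)
          && \text{(divergence minus log-evidence)} \\
       &= D_{\mathrm{KL}}\!\left[ q(s) \,\Vert\, p(s) \right]
          - \mathbb{E}_{q(s)}\!\left[ \ln p(o \mid s) \right]
          && \text{(complexity minus accuracy)}
  % Since KL >= 0, F >= -ln p(o): minimizing F both tightens the evidence bound
  % ("surprise") and drives q(s) toward the posterior p(s | o).
\end{align*}
```
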
Intelligence
* **Problem-Solving**: Search (A*, etc.), Planning, Knowledge Representation
* **Reasoning**: Logic (First-Order, etc.), Deduction, Induction, Abduction
* **Learning**: Supervised Learning, Unsupervised Learning, Reinforcement Learning
* **Embodiment**: Sensorimotor systems
* **Metacognition**

Specific Computational Models
* **Deep Learning Architectures**: Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs, LSTMs, GRUs), Transformers, Generative Adversarial Networks (GANs), Autoencoders, Diffusion, Mamba, Receptance Weighted Key Value (RWKV), JEPA
* **Probabilistic Graphical Models**: Bayesian Networks, Markov Random Fields, Hidden Markov Models
* **Reinforcement Learning Algorithms**: Q-Learning, SARSA, Policy Gradients, Deep Q-Networks (DQNs), Actor-Critic Methods

Neuroscience and Biological Intelligence
* **Neurons and Neural Circuits**: Synapses, Spike Trains, Action Potentials, Hebbian Learning
* **Brain Areas & Functions**: Perception, Attention, Memory (Working, Long-Term, etc.), Decision Making, Motor Control

Collective Intelligence
* Swarm Intelligence, Social Networks, Distributed Systems

https://twitter.com/burny_tech/status/1761966352988836119

Write the most information-dense, rich, general but hyper-concrete list of lists of concepts and subconcepts that govern how intelligence, general intelligence, artificial intelligence, biological intelligence, and collective intelligence work, from the most fundamental principles to emergent principles, from deeply mathematical, theoretically rich formalisms, frameworks, and theories to empirical models, from general principles to the most concrete models, using concepts from all related scientific fields and engineering disciplines, from Solomonoff induction and AIXI to statistical learning theory to deep learning theory to the free energy principle and their different components and subcomponents. Don't explain the concepts, only list them as keywords in a list of lists; list as many of them as possible, as densely as possible, as information-rich as possible, covering everything explaining intelligence as much as possible. Focus mainly on Solomonoff induction, AIXI, statistical learning theory, deep learning theory, and the free energy principle, and create a gigantic list of concepts in them without explaining them.

[[2309.17453] Efficient Streaming Language Models with Attention Sinks](https://arxiv.org/abs/2309.17453): an efficient framework that enables LLMs trained with a finite-length attention window to generalize to infinite sequence lengths without any fine-tuning

[The Laws of Thermodynamics, Entropy, and Gibbs Free Energy - YouTube](https://www.youtube.com/watch?v=8N1BxHgsoOw)

[Quantum capacity - Wikipedia](https://en.m.wikipedia.org/wiki/Quantum_capacity)

[Categorical Deep Learning - Categorical Deep Learning](https://categoricaldeeplearning.com/): the idea, key concepts, related fields

Open-sourced OpenAI tool leads to nuclear fusion control: https://twitter.com/8teAPi/status/1762190250355658885