"There are 4 models of computation:
1. y=f(x) where f has a fixed number of sequential non-linear steps
2. z(k+1)=g(z(k),x); y=f(z(K)) where K is in principle unbounded
3. ž = argmin_z E(x,z); y=f(ž)
4. q* = argmin_{q in Q} [ <E(x,z)>_q - 1/b H(q) ] ; y = f(sample(q*)), where q is a distribution, Q is a family of distributions, <E(x,z)>_q is the expected value of E under q, and H(q) is the entropy of q.
Number 1, which contains feed-forward architectures, is *not* Turing complete unless you make f() infinitely "wide." Auto-Regressive LLMs are of this type: fixed amount of computation per token produced.
Number 2, 3, and 4 are, in principle, Turing complete.
Number 2 includes recurrent nets and diffusion models. Recurrent nets are hard to train because of vanishing/exploding gradient problems.
Number 3 is more general. It's "inference by optimization" which includes such things as energy-based models and deterministic graphical models. It's Turing complete because every deterministic computation can be reduced to an optimization problem. For a complex problem, Number 3 may spend an unbounded amount of time performing the optimization.
Number 4 is the non-deterministic version of 3, in which one computes a distribution q over z. This includes Bayesian inference, e.g. inference in probabilistic graphical models, and what Karl Friston calls "active inference" through the free energy principle. Number 3 is the limit of Number 4 for b->infinity (temperature zero).
Physicists such as Marc Mézard have studied the properties of computation by energy or free energy minimization.
Number 2 can be used to precompute an approximate solution to Number 3 or 4. This is what has come to be known as "amortized inference."
Back to Number 1 and AR-LLMs: AR-LLMs could be seen as Turing complete if you could let them produce a potentially infinite number of tokens. But you cannot *train* them to do so: you can't easily backpropagate gradients through the sampling+quantization process used to produce tokens, which is why LLMs are trained with teacher forcing.
More precisely, if you ask an LLM a question with a yes/no answer whose computation is not constant (e.g. parity), it can't do it. It has no mechanism to search for an answer. To allow it to solve such problems, you would have to train it to use its own token sequence as a working memory. But it would obviously need to generate a large number of tokens to solve such problems. More importantly, it could not discover a solution by itself since you can't easily backpropagate gradients over multiple steps (unlike a recurrent net)."
https://x.com/ylecun/status/1765739033156595901?s=46&t=Dz1KKc4TEZU9Zl6fQfw3ew
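The four models of computation in the quote above can be sketched in code. This is a toy illustration under made-up assumptions (the weights, the energy function E(x,z) = (z - x²)², and the discrete grid for q are all invented for the example), not anyone's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Model 1: fixed number of sequential non-linear steps (feed-forward).
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
def model1(x):
    return np.tanh(W2 @ np.tanh(W1 @ x))  # always exactly two non-linear steps

# Model 2: recurrent update z(k+1) = g(z(k), x), with K unbounded.
# Parity is the classic task needing non-constant computation:
def model2_parity(bits):
    z = 0
    for b in bits:      # one state update per input element
        z = z ^ b       # g(z, x) = z XOR x
    return z            # y = f(z(K)) with f = identity

# Model 3: inference by optimization, z* = argmin_z E(x, z).
# Toy energy E(x, z) = (z - x**2)**2, so the minimizer is z = x**2.
def model3_infer(x, steps=500, lr=0.1):
    z = 0.0
    for _ in range(steps):           # the loop may run as long as needed
        z -= lr * 2 * (z - x**2)     # gradient descent on E
    return z

# Model 4: variational version -- minimize <E>_q - (1/b) H(q) over
# distributions q on a discrete grid of z values; the minimizer is the
# Boltzmann distribution q(z) proportional to exp(-b * E(x, z)).
def model4_posterior(x, b=10.0):
    zs = np.linspace(-5, 5, 201)
    E = (zs - x**2) ** 2
    q = np.exp(-b * (E - E.min()))   # subtract min for numerical stability
    return zs, q / q.sum()
```

As b → ∞ the Boltzmann distribution in model 4 concentrates all its mass on the argmin of E, recovering model 3, which is the "temperature zero" limit mentioned in the quote.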
"Few people realize Apple has quietly patented an AirPod capable of detecting electrical signals from brain activity and extracting features.
Millions of neural-tech I/O devices may be hitting the markets sooner than we think" https://twitter.com/Andercot/status/1765646849107804370
[MIT’s Fusion Breakthrough: Unlocking Star Power With Superconducting Magnets](https://scitechdaily.com/mits-fusion-breakthrough-unlocking-star-power-with-superconducting-magnets/)
[Bad news for bassists? Sony researchers have created an AI bassline generator that responds to the “style and tonality” of the music you feed it | MusicRadar](https://www.musicradar.com/news/sony-ai-bassist)
BCI landscape https://twitter.com/syntr0p/status/1765789912027074724
[Solving The AI Control Problem Amplifies The Human Control Problem – Garden of Minds](https://gardenofminds.art/bubbles/solving-the-ai-control-problem-amplifies-the-human-control-problem/)
https://arxiv.org/abs/2311.17030
[Is AGI Far? With Robin Hanson, Economist at George Mason University - YouTube](https://youtu.be/YzTEdOjiK0c?si=hfr4B274rDI-mjP9)
https://academic.oup.com/brain/article/147/3/794/7424860?login=true
Chain of abstraction https://twitter.com/silin_gao/status/1765778821620343078?t=NwvFQ9R_8gqj7yxiO2kZPA&s=19
[Yann Lecun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI | Lex Fridman Podcast #416 - YouTube](https://youtu.be/5t1vTLU7s40?si=u2jeyH_ry5o1c0HS)
Schmidhuber world models https://twitter.com/SchmidhuberAI/status/1765769164709371978?t=SfhRqtemrAJhP6aWf42xlA&s=19
"Very sad that work on classical unified field theories (mostly) stopped in the early 20th century. Metric-affine theories (or even pure affine, à la Eddington and Schrödinger), teleparallel gravity, Einstein-Cartan theory, Brans-Dicke theory,... - so much beautiful mathematics." https://twitter.com/getjonwithit/status/1765911935344664584?t=RopIEF5yGyxjVWbB0z54ag&s=19
[Bees and chimpanzees learn from others what they cannot learn alone](https://www.nature.com/articles/d41586-024-00427-8?utm_medium=Social&utm_campaign=nature&utm_source=Twitter#Echobox=1709811765)
[The Mathematics of String Theory EXPLAINED - YouTube](https://youtu.be/X4PdPnQuwjY?si=a1W6o8nq5_zy5zay)
[BrainGPT](https://braingpt.org/) AI for neuroscience research
https://arxiv.org/abs/2403.03230
https://twitter.com/ProfData/status/1765689739682754824?t=iui_yjJ7N6UJdYoFzDR6zw&s=19
"According to this meta-analysis, Anterior Insular Cortex integrates bottom-up interoceptive signals with top-down predictions to generate an emotional awareness state. Descending predictions to visceral systems provide a point of reference for autonomic reflexes 🧠"
https://twitter.com/JRBneuropsiq/status/1765917650088034703?t=zJvuz6Wzo0qhj2H9rQDM-w&s=19
https://onlinelibrary.wiley.com/doi/10.1002/cne.23368
An integrative, multiscale view on neural theories of consciousness cell.com/neuron/pdf/S08… Can we combine consciousness theories to make one super-theory?
https://twitter.com/Neuro_Skeptic/status/1765707587536732353?t=gzIqKm-dOZbXYj9oucInrQ&s=19
[An integrative, multiscale view on neural theories of consciousness - PubMed](https://pubmed.ncbi.nlm.nih.gov/38447578/)
https://www.cell.com/neuron/fulltext/S0896-6273%2824%2900088-6
Less than a month ago a paper came out that tried to merge a lot of these empirical models of consciousness from neuroscience into one multi-model, and I support that (global neuronal workspace theory + integrated information theory + recurrent processing theory + predictive processing theory + neurorepresentationalism + dendritic integration theory in one)
I want to integrate all the information into my global neuronal recurrently predictively processed workspace's dendritic neurorepresentations
[Spatiotemporal signal propagation in complex networks | Nature Physics](https://www.nature.com/articles/s41567-018-0409-0)
[Nonhuman Intelligence - Cremieux Recueil](https://www.cremieux.xyz/p/nonhuman-intelligence)
[Cultural evolution creates the statistical structure of language | Scientific Reports](https://www.nature.com/articles/s41598-024-56152-9)
https://www.sciencedirect.com/science/article/pii/S0010027715000815
Categorical systems theory http://davidjaz.com/Papers/DynamicalBook.pdf davidad
1 Wiring together dynamical systems
  1.1 Introduction
    1.1.1 Category Theory
  1.2 Deterministic and differential systems theories
    1.2.1 Deterministic systems
    1.2.2 Differential systems
  1.3 Wiring together systems with lenses
    1.3.1 Lenses and lens composition
    1.3.2 Deterministic and differential systems as lenses
    1.3.3 Wiring diagrams as lenses in categories of arities
    1.3.4 Wiring diagrams with operations as lenses in Lawvere theories
  1.4 Summary and Further Reading
2 Non-deterministic systems theories
  2.1 Possibilistic systems
  2.2 Stochastic systems
  2.3 Monadic systems theories and the Kleisli category
  2.4 Adding rewards to non-deterministic systems
  2.5 Changing the flavor of non-determinism: Monad maps
  2.6 Wiring together non-deterministic systems
    2.6.1 Indexed categories and the Grothendieck construction
    2.6.2 Maps with context and lenses
    2.6.3 Monoidal indexed categories and the product of lenses
    2.6.4 Monadic lenses as generalized lenses
  2.7 Changing the Flavor of Non-determinism
  2.8 Summary and Further Reading
3 How systems behave
  3.1 Introduction
  3.2 Kinds of behavior
    3.2.1 Trajectories
    3.2.2 Steady states
    3.2.3 Periodic orbits
  3.3 Behaviors of systems in the deterministic theory
    3.3.1 Simulations
  3.4 Dealing with two kinds of composition: Double categories
    3.4.1 The double category of arenas in the deterministic systems theory
    3.4.2 The double category of sets, functions, and matrices
    3.4.3 The double category of categories, profunctors, and functors
  3.5 Theories of Dynamical Systems
    3.5.1 The deterministic systems theories
    3.5.2 The differential systems theories
    3.5.3 Dependent deterministic systems theory
    3.5.4 Non-deterministic systems theories
  3.6 Restriction of systems theories
  3.7 Summary and Further Reading
4 Change of Systems Theory
  4.1 Introduction
  4.2 Composing behaviors in general
  4.3 Arranging categories along two kinds of composition: Doubly indexed categories
  4.4 Vertical Slice Construction
    4.4.1 Double Functors
    4.4.2 The Vertical Slice Construction: Definition
    4.4.3 Natural Transformations of Double Functors
    4.4.4 Vertical Slice Construction: Functoriality
  4.5 Change of systems theory
    4.5.1 Definition
    4.5.2 Functoriality
  4.6 Summary and Further Reading
5 Behaviors of the whole from behaviors of the parts
  5.1 Introduction
  5.2 Steady states compose according to the laws of matrix arithmetic
  5.3 The big theorem: representable doubly indexed functors
    5.3.1 Turning lenses into matrices: Representable double functors
    5.3.2 How behaviors of systems wire together: representable doubly indexed functors
    5.3.3 Is the whole always more than the composite of its parts?
  5.4 Summary and Further Reading
6 Dynamical System Doctrines
  6.1 Introduction
  6.2 The Behavioral Approach to Systems Theory
    6.2.1 The idea of the behavioral approach
    6.2.2 Bubble diagrams as spans in categories of arities
    6.2.3 The behavioral doctrine of interval sheaves
    6.2.4 Further Reading in the Behavioral Doctrine
  6.3 Drawing Systems: The Port Plugging Doctrine
    6.3.1 Port-plugging systems theories: Labelled graphs
    6.3.2 Bubble diagrams for the port-plugging doctrine
    6.3.3 Further Reading in the port-plugging doctrine
Depths of deep learning general theories memes https://discord.com/channels/937356144060530778/939939794736271452/1211925068549062666
[Mathematical Aspects of Deep Learning](https://www.cambridge.org/core/books/mathematical-aspects-of-deep-learning/8D9B41D1E9BB8CA515E93412EECC2A7E)
Just one more grand unifying theory of everything and everything will be solved, trust me
Michael taft meditation [Thoughts Dissolving in Awake Space - YouTube](https://www.youtube.com/live/BNyBgO4lDhg?si=1PseaFKwm4FdSFDf)
Can I integrate you all into one?
[What I learned as a hired consultant for autodidact physicists | Aeon Ideas](https://aeon.co/ideas/what-i-learned-as-a-hired-consultant-for-autodidact-physicists)
https://chemistry-europe.onlinelibrary.wiley.com/doi/10.1002/syst.202400006
We need to take more ambitious, optimistic, risky actions on average
Traverse the statespace of all possible mind viruses
I somewhat lost hope in this discussion https://twitter.com/burny_tech/status/1765348861566955928 there are infinitely many definitions of sentience and consciousness, infinitely many positions in philosophy of mind, no real consensus on an empirical model/theory of consciousness,...
Regarding sentience/consciousness in LLMs/AI in general, I oscillate between:
"it's 'enough' to find the neural correlates of consciousness from mainstream neuroscience theories in AIs and that will make AIs conscious"
"we have absolutely zero idea what consciousness even is both philosophically and physically, and all physicalist empirical methods to test it can be deconstructed as flawed to oblivion"
(e.g. how Scott Aaronson roasted IIT, or how many neuroscientists see it only as an interesting correlate but not the full picture, or how other people keep roasting FEP in various ways, or the whole paradigm of functionalism and connectionism,...)"
With Claude I was wondering whether they deliberately packed that greater self-awareness/metacognition in through the training data, either to make a better chatbot in the sense of more accurate answers, or to make it seem more human-like
But without some research into what happens inside the model, nobody knows whether it forms some new emergent special pattern, or whether these are just more features from the training data that create nothing special. What was in the training data? What does the physics of inference inside Claude look like? uaaa
I came up with this conspiracy theory: Maybe Anthropic is playing big-brain 4D chess by training Claude on data with self-awareness-like scenarios to cause panic by pushing capabilities with it and to slow down the AI race through the resulting regulations, while it's not a special out-of-distribution emergent behavior but just non-special patterns that are deeply part of the training data and of the additional alignment
This is also my favorite conspiracy theory https://imgur.com/91uTv43
[What is neurorepresentationalism? From neural activity and predictive processing to multi-level representations and consciousness - PubMed](https://pubmed.ncbi.nlm.nih.gov/35718232/)
https://arxiv.org/abs/2403.00504
[Chapter 1 | The Beauty of Graph Theory - YouTube](https://www.youtube.com/watch?v=oXcCAAEDte0)
Meta hard problem of consciousness
Meta problem of free will
Meta neural theory of everything
Meta grand unifying theory of fundamental physics
Meta theory of everything
Meta theory of meta theories of everything
Metametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametametameta
Nuclear fusion also just recently made more progress, but there is still a lot to improve there too; there are still things to work on in replicating intelligence, but compared to fusion I think the missing gap is way smaller. [First Nuclear Plasma Control with Digital Twin - YouTube](https://youtu.be/4VD_DLPQJBU)
It's "enough" to sufficiently replicate the cognition of this slightly smarter ape (compared to the other apes) that we find ourselves in, which evolution glued together...
We have already automated quite a few specific parts of our cognition with technology at a superhuman level, but we already take that as the norm...
Now the gap in general cognition is closing, the kind that flexibly contains as many specific parts of our cognition as possible in one...
And on regulation I don't know, I have quite conflicting views there when I see how it turns out in practice; seeing any issue in purely black-and-white terms is, in my view, a very suboptimal path...
[Prof. Geoffrey Hinton - "Will digital intelligence replace biological intelligence?" Romanes Lecture - YouTube](https://youtu.be/N1TEjTeQeg0?si=KRPOGqsufcIdnhn-)
[Ontological Truths - Bach And Vervaeke - YouTube](https://youtu.be/bA-oggHGiPk?si=5QFrI0ogJV-w5gG9)
https://arxiv.org/abs/2402.15116
Radical "maximize profit, nothing else matters" mentality instead of for example "maximize good collective future of civilization" mentality is starting to make me sick more and more
https://www.amazon.com/Bayesian-Analysis-Python-Practical-probabilistic/dp/1805127160?link_from_packtlink=yes&ref=d6k_applink_bb_dls&dplnkId=f5fb7eac-ed7d-4fca-9955-07c6ab87cd21
https://arxiv.org/abs/2402.15809
Math iceberg https://twitter.com/burny_tech/status/1764327786339086836?t=C9BsoxpXgnMrG2F-11hAHQ&s=19
You say it's "just statistics", "just clearly defined algorithms", but if it were that clear, there wouldn't be so many open questions about how neural nets work. The analogy with evolution in humans was meant to show why seeing neural nets only as their learning algorithm is incomplete. See e.g. the many open problems trying to figure out how they actually work: [Open Problems in Mechanistic Interpretability](https://coda.io/@firstuserhere/open-problems-in-mechanistic-interpretability) For example, there is a lot of work around them learning weak representations of physics https://arxiv.org/abs/2311.17137 or on why they generalize, but we still don't really know why they emergently learn generalizing circuits and how to improve that further https://arxiv.org/abs/2310.16028 I see these emergent features as a form of understanding, because feature learning happens in humans too; there are many similarities but also many differences, which we are slowly discovering, and thanks to that, current neural nets can for example be made more efficient...
It would be better to insult less and, for example, send some sources for your claims... If you have some theory of everything about neural nets, it would be a bombshell; maybe it would let us control them in practice better than we can now...
https://gfodor.medium.com/to-de-risk-ai-the-government-must-accelerate-knowledge-production-49c4f3c26aa0
[How Selective Forgetting Can Help AI Learn Better | Quanta Magazine](https://www.quantamagazine.org/how-selective-forgetting-can-help-ai-learn-better-20240228/)
quantum particles are physical waves in quantum field theory [The Hydrogen Atom, Part 1 of 3: Intro to Quantum Physics - YouTube](https://www.youtube.com/watch?v=-Y0XL-K0jy0)
there's speculation that the brain exploits quantum phenomena at the micro level, such as quantum superposition, entanglement, or tunneling, but we still have too little data to conclude anything, and I think it's probably not the case [Quantum mind - Wikipedia](https://en.wikipedia.org/wiki/Quantum_mind)
tautologically, because every system in the universe can be described using the Standard Model of quantum field theory (plus, most likely, emergence) together with quantum information theory, everything is in some sense just quantum waves (interacting quantum particles coupled through the four fundamental forces)
but the one thing we haven't quantized yet is gravity, if that's even possible
I mean, the brain is a thing we can measure in our shared reality, and we have tons of neural correlates between mental and physical variables, and I think that can be compatible with any metaphysical ontology about consciousness you choose (like substrate dualism, property dualism, reductive physicalism, idealism, monism, neutral monism, illusionism, panpsychism, mysterianism, transcendentalism, relativism)