https://www.lesswrong.com/posts/Btom6dX5swTuteKce/agi-will-be-made-of-heterogeneous-components-transformer-and
On the scientific side, see Free Energy Principle/Active Inference in all of its guises, Infra-Bayesianism, Vanchurin’s theory of machine learning (2021), James Crutchfield's "thermodynamic ML" (or, more generally, Bahri et al.’s review of statistical mechanics of deep learning (2022)), Chris Fields' quantum information theory, singular learning theory. (If you know more general frameworks like these, please post in the comments!)
On the engineering (but also research) side, see Doyle's system-level synthesis, DeepMind's causal incentives working group, the Gaia Network agenda, and compositional game theory. (If you know more agendas in this vein, please post in the comments!)
1. JPMorgan announces DocLLM: A layout-aware generative language model for multimodal document understanding [[2401.00908] DocLLM: A layout-aware generative language model for multimodal document understanding](https://arxiv.org/abs/2401.00908)
2. [No, GPT4 can't ace MIT](https://flower-nutria-41d.notion.site/No-GPT4-can-t-ace-MIT-b27e6796ab5a48368127a98216c76864) [try this archived version if the site loads too slowly: https://archive.is/dVmGL]
3. Stuxnet, not Skynet: Humanity's disempowerment by AI https://www.lesswrong.com/posts/tyE4orCtR8H9eTiEr/stuxnet-not-skynet-humanity-s-disempowerment-by-ai
4. aisafety.info — Answering questions about AI Safety https://www.lesswrong.com/posts/ZqxP6pJe53xRnRb4j/aisafety-info-the-table-of-content
5. Roman Yampolskiy & Robin Hanson Discuss AI Risk [Roman Yampolskiy & Robin Hanson Discuss AI Risk - YouTube](https://www.youtube.com/watch?v=aUfw7V_Wy4Y)
6. Daniel Dennett agrees with Douglas Hofstadter, is “very distressed” by AI https://twitter.com/AISafetyMemes/status/1741562666794041732
7. P=NP Implies What About AI? AI Implies What About P=NP? [Things We Did Not Know How to Compute | Gödel's Lost Letter and P=NP](https://rjlipton.wpcomstaging.com/2023/12/31/things-we-did-not-know-how-to-compute/)
Science and Technology:
1. Mechanism decoded: how synapses are formed [Mechanism decoded: how synapses are formed | Leibniz-FMP](https://leibniz-fmp.de/newsroom/news/detail/mechanism-decoded-how-synapses-are-formed-1)
2. Bats are the longest-lived mammals for their size. The energetic needs of having to fly mean they can exert control over body metabolism and inflammation, with a resistance to diseases that also makes them one of the biggest living reservoirs of viruses. https://ncbi.nlm.nih.gov/pmc/articles/PMC7341951/
3. “Livers are pure magic. Despite being the organ with one of the highest proliferative capacities, liver cancer is *exceedingly* rare (relative to other organs) unless you have cirrhosis / steatosis.” https://twitter.com/LNuzhna/status/1742635316434096252
4. SpaceX Launches Starlink Direct to Phone Satellites [SpaceX launches first batch of direct-to-cell Starlink satellites for testing this year | TechCrunch](https://techcrunch.com/2024/01/03/spacex-launches-first-batch-of-direct-to-cell-starlink-satellites-for-testing-this-year/)
5. SpaceX is aiming for 12 launches per month this year. https://twitter.com/TurkeyBeaver/status/1742524679091269656
Miscellaneous:
1. Harvard CS50 (2023) – Full Computer Science University Course [YouTube](https://www.youtube.com/watch?v=LfaMVlDaQ24)
All models are wrong, but some are more useful than others, approximating reality better than others
all models are wrong, some compress and extrapolate better than others
all models are wrong, some predict the future better than others
[Neural network Gaussian process - Wikipedia](https://en.m.wikipedia.org/wiki/Neural_network_Gaussian_process)
[Retrocausality - Wikipedia](https://en.m.wikipedia.org/wiki/Retrocausality)
Retrocausality [YouTube](https://youtu.be/RQv5CVELG3U?si=JFKZ2kM3Ix2EEeoe) [YouTube](https://youtu.be/iixrNh7Xp5M?si=rWY3oJy2DgD5q673)
Classical mathematics with the Gödel paradox can't be physically implemented https://twitter.com/tsarnick/status/1743837644038271076?t=YNYfATw-8TKcx0jvdJTHtw&s=19
Let's build reality from first principles.
Let's describe fundamental particles governed by the Standard Model of quantum field theory, add gravity, and see how together they compose and weakly emerge into the equations of quantum, classical, and statistical physics on higher levels, along with the equations of systems theory, information theory, and computation theory.
These interacting quantum fields with fundamental particles, under complex enough dynamics, give rise to the weakly emergent equations of chemistry, which together compose into biology, neuroscience, and social systems.
We use many of those laws to engineer computers, artificial intelligence, and all other technology, which has various mathematical similarities to and differences from biological systems, allowing for various types of compatibility between biological and artificial systems. These already allow us to understand and engineer better AI and to upgrade ourselves with neurotechnology, and this will only progress forward.
Planets, galaxies, and the whole universe operate under many other approximately deterministic laws.
We can predict fundamental particles well, but there seems to be inherent randomness in the quantum wave function of probabilities. Since many nontrivial systems cannot be solved analytically, we have to approximate them with perturbation theory. Chemistry and biology add tons of classical stochasticity and abstracting, approximating equations, because in many cases we cannot grasp all the variables in play exactly, though in chemical physics we can already simulate picoseconds of a very simple chemical reaction using quantum mechanics on quantum computers. Concrete societal complexity is very intractable, but we can figure out some very general physical and statistical laws, as in fluid dynamics, thermodynamics, and network theory. The more macroscopic we go, the less the fundamental quantum probabilities and fluctuations matter, as they cancel out and dynamics become more deterministic, with stochastic cybernetic controllers in nonequilibrium-steady-state pullback attractors forming a nested heterarchy of free-energy-minimizing complex Markov blankets undergoing all sorts of phase shifts in the configuration space of functional architectures, like organisms and society. Complexity grows as we go up the levels, up to the further tractable, somewhat deterministic dynamics of planets, galaxies, and cosmology.
We have yet to figure out so many better or new concepts, laws, frameworks, and technologies across all these fields and levels of analysis. The cosmological story of the universe, why and how there is something at all, is still a mystery beyond our limited modeling, approximating, compressing perspective, data, time, and computational power; in its fullness it is beyond our comprehension.
https://twitter.com/burny_tech/status/1743851964163621190?t=U_HQh9x-BbQYDi19VT2qWg&s=19
https://www.reddit.com/r/MachineLearning/s/HF60rx9JQ9 how brains prevent overfitting
Some models predict the future states of sensory data better than others
Quantum field theory math [ChatGPT](https://chat.openai.com/share/ef4f33c0-ef98-469e-8a19-2d344e53f944)
Chess LLM with internal world model https://twitter.com/a_karvonen/status/1743666230127411389?t=IV5N6kbG3KZrttVpUdbINg&s=19
Quantized Mixtral https://twitter.com/AlphaSignalAI/status/1744060911185473604?t=Zw-kbkzfk5OP-cqu_mXC9A&s=19
Psychedelic cyborg https://twitter.com/JasonSilva/status/1743562542432063990?t=RTcwoBcpy6koRs6vy4xQSw&s=19
The "LLMs have no true understanding" argument applied to people https://twitter.com/robertwiblin/status/1742595869310939641?t=jxloX9Fh-OkEy_zrrE5Vmw&s=19
SotA robotics from human video to instructions https://twitter.com/adcock_brett/status/1743987597301399852?t=_abSvmxQrNsxkfsFKCt51w&s=19
[Turing Complete Transformers: Two Transformers Are More Powerful Than One | OpenReview](https://openreview.net/forum?id=MGWsPGogLH) Turing Complete Transformers: Two Transformers Are More Powerful Than One
[[2301.04589] Memory Augmented Large Language Models are Computationally Universal](https://arxiv.org/abs/2301.04589) Memory Augmented Large Language Models are Computationally Universal
[Perturbation theory - Wikipedia](https://en.wikipedia.org/wiki/Perturbation_theory)
[picture of mechanics in nLab](https://ncatlab.org/nlab/show/picture+of+mechanics)
[Thoughts on LLM Agents](https://borretti.me/article/thoughts-llm-agents)
[AI Safety Memes Wiki](https://stampy.ai/?state=MEME_)
Layers of reality https://cdn.discordapp.com/attachments/991777324301299742/1193512473760899092/orders_of_magnitude_english_annotations_highres_horizontal_layout_25200x5760.png?ex=65acfc1c&is=659a871c&hm=625e90323002ec3ba2deb8effba78ffe9b05e92334f71e5ba90eab8ebb58f3ea&
I don’t think that 20th-century-style democracy is the future of governance, and neither are fascism, monarchy or communism. When everyone can use AI to integrate with reality, representing our combined interests through inert bureaucracies led by Biden or Trump seems laughable https://twitter.com/Plinz/status/1744131939211379112
MLPs can learn arbitrarily complex truth tables by scaling only the feature (hidden) dimension, as in the sketch below.
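A minimal sketch of the construction behind that claim (my own illustration, not from a specific paper): give the hidden layer one unit per input pattern that should output 1, so width alone buys the expressivity.

```python
import numpy as np

def mlp_for_truth_table(table):
    """table: dict mapping n-bit input tuples to 0/1 outputs."""
    on_patterns = [p for p, out in table.items() if out == 1]
    if not on_patterns:
        return lambda x: 0  # constant-false table
    # One hidden unit per 'on' pattern: +1 weights on its 1-bits, -1 on its
    # 0-bits, with a bias so the unit fires only on an exact match.
    W = np.array([[2 * bit - 1 for bit in p] for p in on_patterns], dtype=float)
    b = 0.5 - W.clip(min=0).sum(axis=1)  # w.x + b = 0.5 iff x equals the pattern
    def forward(x):
        h = (W @ np.asarray(x, dtype=float) + b > 0)  # hidden layer, step activation
        return int(h.any())                            # output unit: OR over detectors
    return forward

# Usage: 3-bit parity (XOR), a table famously impossible for a single perceptron.
xor3 = {(a, b, c): a ^ b ^ c for a in (0, 1) for b in (0, 1) for c in (0, 1)}
f = mlp_for_truth_table(xor3)
assert all(f(p) == out for p, out in xor3.items())
```

The width needed is the number of rows mapped to 1, so it can grow exponentially in the input size; this shows representability, not learnability by gradient descent.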
https://medium.com/nakshatra/the-nature-of-nothingness-understanding-the-vacuum-catastrophe-c04033e752f4
I won't be calm while there is still so little justice and so much suffering in this world, with people experiencing loneliness and a lack of love, connection, meaning, etc.
[Critical exponent - Wikipedia](https://en.wikipedia.org/wiki/Critical_exponent)
[Universality class - Wikipedia](https://en.wikipedia.org/wiki/Universality_class)
[Scale of the Universe](https://scaleofuniverse.com/en)
[[2307.11844] Bio-realistic Neural Network Implementation on Loihi 2 with Izhikevich Neurons](https://arxiv.org/abs/2307.11844) Bio-realistic Neural Network Implementation on Loihi 2 with Izhikevich Neurons
[[2311.10768] Memory Augmented Language Models through Mixture of Word Experts](https://arxiv.org/abs/2311.10768) Memory Augmented Language Models through Mixture of Word Experts
ChatGPT: You're on the quest to understand everything we know so far about deep structures of reality and integrating all fields by creating a giant repository of knowledge and mapping out where more progress needs to be done. You only have access to your weights and the internet. Don't make new institutions or interviews. Only create a giant piece of text. Begin. [ChatGPT](https://chat.openai.com/share/4429ea5e-0269-4e26-8d33-f51c402376db)
SotA open-source LLM? DeepSeek LLM: Scaling Open-Source Language Models with Longtermism [[2401.02954] DeepSeek LLM: Scaling Open-Source Language Models with Longtermism](https://arxiv.org/abs/2401.02954)
[[2301.10743] Tighter Bounds on the Expressivity of Transformer Encoders](https://arxiv.org/abs/2301.10743) Tighter Bounds on the Expressivity of Transformer Encoders
SotA visual recognition by visual search
V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs (Show, sEArch, and TelL)
[[2312.14135v2] V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs](https://arxiv.org/abs/2312.14135v2)
https://medium.com/the-physics-arxiv-blog/how-the-nature-of-information-could-resolve-one-of-the-great-paradoxes-of-cosmology-8c16fc714756
prompt engineering summary [[2312.16171v1] Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4](https://arxiv.org/abs/2312.16171v1)
[Intelligence amplification - Wikipedia](https://en.wikipedia.org/wiki/Intelligence_amplification)
backprop free alternatives https://twitter.com/fly51fly/status/1743995590412021990/photo/1
psychedelic music https://www.facebook.com/AumPsychedelia/posts/pfbid02RsBh34cZPpfPK3H2Fo6Jb7asgEWn1p1RZFBaibgJYwPFqEZDvzbYoCRsu9SxAk2Vl
[[2401.02412] LLM Augmented LLMs: Expanding Capabilities through Composition](https://arxiv.org/abs/2401.02412) LLM Augmented LLMs: Expanding Capabilities through Composition
https://twitter.com/dair_ai/status/1744040878161674326 ai papers
[[2312.10997] Retrieval-Augmented Generation for Large Language Models: A Survey](https://arxiv.org/abs/2312.10997) Retrieval-Augmented Generation for Large Language Models: A Survey
Improving Text Embeddings with Large Language Models (Microsoft, December 2023) [[2401.00368] Improving Text Embeddings with Large Language Models](https://arxiv.org/abs/2401.00368)
[Frontiers | Low Intensity Focused Ultrasound for Non-invasive and Reversible Deep Brain Neuromodulation—A Paradigm Shift in Psychiatric Research](https://www.frontiersin.org/articles/10.3389/fpsyt.2022.825802/full) Low Intensity Focused Ultrasound for Non-invasive and Reversible Deep Brain Neuromodulation—A Paradigm Shift in Psychiatric Research
[[1807.03819] Universal Transformers](https://arxiv.org/abs/1807.03819) Universal Transformers
[Turing Machines are Recurrent Neural Networks](https://users.ics.aalto.fi/tho/stes/step96/hyotyniemi1/) Turing Machines are Recurrent Neural Networks
[Neural Turing machine - Wikipedia](https://en.wikipedia.org/wiki/Neural_Turing_machine)
https://news.ycombinator.com/item?id=33869533 "It is possible to construct an (infinite) recurrent neural network that emulates a Turing Machine. But the fact that a Turing Machine can be built out of perceptrons is neither surprising nor interesting. It's pretty obvious that you can build a NAND gate out of perceptrons, and so of course you can build a Turing Machine out them. In fact, it's probably the case that you can build a NAND gate (and hence a TM) out of any non-linear transfer function. I'd be surprised if this is not a known result one way or the other." "you can approximate any nonlinear multivariable function arbitrarily with a multi-layer perceptron with any non-polynomial nonlinear function, applied after the linear weights and bias"
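As a sanity check on the quoted claim (my own toy example, not from the thread): a single perceptron computes NAND, and NAND is functionally complete, so perceptron networks can express any Boolean circuit, and with unbounded memory, a Turing machine.

```python
# step(w.x + b) with weights (-2, -2) and bias 3 implements NAND.
def perceptron_nand(a, b):
    return int(-2 * a - 2 * b + 3 > 0)

assert [perceptron_nand(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [1, 1, 1, 0]
```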
AGI could be achieved by combining just about five core types of DNN (Transformers, selective SSMs, GNNs, etc.), a few dozen classical algorithms (search, graph algorithms, dynamic programming), and the tools required for generality (a physics simulator like Julia's, a math prover like Lean, a symbolic engine like Wolfram) https://www.lesswrong.com/posts/Btom6dX5swTuteKce/agi-will-be-made-of-heterogeneous-components-transformer-and
Is mechanistic interpretability a path to alignment, e.g. bottom-up or top-down localization and control of deception or other Machiavellian patterns? [GitHub - JShollaj/awesome-llm-interpretability: A curated list of Large Language Model (LLM) Interpretability resources.](https://github.com/JShollaj/awesome-llm-interpretability)
[[2310.06824] The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets](https://arxiv.org/abs/2310.06824)
[Representation Engineering: A Top-Down Approach to AI Transparency](https://ai-transparency.org)
[Spaghettification - Wikipedia](https://en.wikipedia.org/wiki/Spaghettification)
What is missing in robotics?
Aloha - this robot got way more capable than I thought, and its software and hardware are open source
https://fxtwitter.com/zipengfu/status/1742973258528612724
- do laundry 👔👖
- self-charge ⚡️
- use a vacuum
- water plants 🌳
- load and unload a dishwasher
- use a coffee machine ☕️
- obtain drinks from the fridge and open a beer 🍺
- open doors 🚪
- play with pets 🐱
- throw away trash
- turn on/off a lamp 💡
OpenAI/Anthropic/Google have a bit of it, but that's a very high bar, so I don't know; then there are a lot of committed groups outside them, e.g. Max Tegmark's or Connor Leahy's, or various AI safety grants, e.g. from Lightspeed Grants (there's somewhat of a money crisis right now), or OpenAI just gave out some grants.
Here's a map: [AI Existential Safety Map](https://aisafety.world/)
Now I'm going to replicate a few studies, save up for traveling, write to all of them, and ideally visit them by traveling to various conferences like NeurIPS and similar ones I have an inkling of.
In the end I don't really care about money; I see it as a middleman that adds more opportunities. I want us to understand neural networks.
On the other hand, if I had a lot of money I'd fund reverse engineering of neural networks in large part, so money isn't all that unimportant to me after all.
Money is power; for now it's the reward signal the world runs on today.
But maybe in the world after the AI revolution money will no longer be relevant and we'll switch to a different system.
A lot of people who earn insane amounts in big tech redistribute money this way toward these purposes, which are heavily underfunded.
And I wish to get my family out of their work prison.
But I really don't know what they would do if they didn't go to work, because it essentially squeezes out all of their time, and the only thing they've learned and the only thing they live in is hating going to work.
I wish everyone unhappy with what they have to do could get out of their prisons, or whatever else generates intense suffering, like poverty, sickness, wars, climate change... but that's in large part intractable for us now... but we can collaborate on it, as some already do... so I can at least try to continue on the path I'm on now... I believe in the power of AI to transform the world for good, if it's controllable and if the power is in the hands of people who care for other people.
Hmm, I teared up a little.
there is hope for a future where humanity is more happy and flourishes ❤️
For such a future, we have to create the right incentives in terms of where money, compute, data, intelligence, and overall power go and what attracts people's attention.
Many people are trying to do this already, but it's not enough; we need more.
Einstein on God ✍️
I am not an Atheist. I do not know if I can define myself as a Pantheist. The problem involved is too vast for our limited minds. May I not reply with a parable? The human mind, no matter how highly trained, cannot grasp the universe. We are in the position of a little child, entering a huge library whose walls are covered to the ceiling with books in many different tongues. The child knows that someone must have written those books. It does not know who or how. It does not understand the languages in which they are written. The child notes a definite plan in the arrangement of the books, a mysterious order, which it does not comprehend, but only dimly suspects. That, it seems to me, is the attitude of the human mind, even the greatest and most cultured, toward God.
[[2305.14292v2] WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia](https://arxiv.org/abs/2305.14292v2) WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
[Kaggle - LLM Science Exam](https://www.kaggle.com/competitions/kaggle-llm-science-exam/leaderboard) leaderboard for as scientific an LLM as possible; RAG hacks
[Kaggle - LLM Science Exam](https://www.kaggle.com/competitions/kaggle-llm-science-exam/discussion/446422)
- RAG on Wikipedia chunks
- Public embedding models; e5 worked best for us
- Custom PyTorch cosine similarity code, no need for FAISS etc.; can just do it without any memory issue in chunks on GPU (see the sketch after this list)
- Using five chunks as context was best for us, with 1k max_length
- Mostly an ensemble of 7B LLMs; one 13B is also in, but larger ones were not useful
- We tried to make DeBERTa work, but LLMs were superior, not even helpful in the ensemble
- Ensemble of different wiki/chunking/embedding techniques with late fusion (i.e., each LLM gets a different technique)
- All LLMs fine-tuned with H2O LLM Studio with a binary classification head
- Training data was not too important; we tried a lot there, but just using the early data shared by @radek1 (thanks!) or generating similar data yourself does the job quite well
- Two types of model head architectures (details will follow)
- Managed to select the highest private LB, which is also our highest CV (on a set of 6k samples)
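A minimal sketch of what the FAISS-free chunked GPU cosine similarity could look like (my own illustration, not the team's actual code; `top_k_cosine` is a hypothetical helper): normalize embeddings, stream the corpus through the GPU in memory-sized chunks, and keep a running top-k.

```python
import torch
import torch.nn.functional as F

def top_k_cosine(query, corpus, k=5, chunk_size=100_000, device="cuda"):
    """query: (d,) CPU tensor; corpus: (N, d) CPU tensor of passage embeddings.
    Pass device="cpu" if no GPU is available."""
    q = F.normalize(query.to(device), dim=0)
    best_scores = torch.full((k,), -2.0, device=device)  # cosine is always >= -1
    best_idx = torch.zeros(k, dtype=torch.long, device=device)
    for start in range(0, corpus.shape[0], chunk_size):
        chunk = F.normalize(corpus[start:start + chunk_size].to(device), dim=1)
        sims = chunk @ q  # (chunk_len,) cosine scores against the query
        # Merge this chunk's scores with the running top-k, keep the best k overall.
        merged = torch.cat([best_scores, sims])
        merged_idx = torch.cat(
            [best_idx, torch.arange(start, start + chunk.shape[0], device=device)])
        best_scores, order = merged.topk(k)
        best_idx = merged_idx[order]
    return best_scores.cpu(), best_idx.cpu()

# Usage: retrieve the 5 nearest of 1M 768-dim embeddings to one query.
corpus = torch.randn(1_000_000, 768)
scores, idx = top_k_cosine(torch.randn(768), corpus)
```

Only one chunk lives on the GPU at a time, so peak memory is bounded by `chunk_size * d` regardless of corpus size, which is why no ANN index is needed at this scale.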
advanced rag [Introducing Query Pipelines — LlamaIndex, Data Framework for LLM Applications](https://blog.llamaindex.ai/introducing-query-pipelines-025dc2bb0537?gi=5cffe1c160e2)
[Advanced Retrieval for AI with Chroma - DeepLearning.AI](https://www.deeplearning.ai/short-courses/advanced-retrieval-for-ai/)
RAG hallucination capabilities
https://fxtwitter.com/omarsar0/status/1742633831234994189?t=yAWdq3gnC_p6gYPZmOgsmA&s=19
https://fxtwitter.com/jerryjliu0/status/1743680481504514049?t=oc_4XsfPKWot-FoB5fqiHw&s=19
https://www.reddit.com/r/MachineLearning/s/Cjlb8ZQYXY human brain flops estimate
The future is bright. We will direct systemic money, attentional, and power incentives (money, compute, data) towards the flourishing of all sentience through bottom-up and top-down action. There won't be dystopian tyranny or the catastrophic extinction of all sentience. There will be a technological AGI revolution that benefits everyone, not just the powerful few, nonsentient systems, or nobody. Health and immortality will be cracked. Intelligence augmentation will be engineered. Collective-intelligence decision making, for the most optimal bottom-up and top-down governance maximizing progress, freedom, safety, resilience, and adaptivity, will be realized. Technology will be steered. AGI/ASI will be understood internally. The Singularity is nearer, leading to biological, silicon, and other sentient systems merging, upgrading, growing to trillions and fusing into hiveminds, or playing the cosmic games on their own, spreading through the whole universe, experiencing infinite wellbeing, connection, meaning, and love, having infinite intelligence, fully reverse-engineering the deep fundamental mathematical patterns of the reality we are embedded in, where reality, the universe, is equivalent to collective consciousness itself waking up to itself in its infinite complexity and beauty.
https://twitter.com/burny_tech/status/1744578320639877567?t=xyv0gC0vmGoYIQwnb8BM2A&s=19
https://twitter.com/main_horse/status/1734287592932417988?t=X8xeMwoVEIpSZuQxx5DdNw&s=19
i suspect the main benefit of MoE is ⬆ representational capacity, ⬇ compression necessary, fewer features in superposition required, etc
https://twitter.com/main_horse/status/1744554104805048755?t=kJjhiDP7a7LQZYRHC9zoSw&s=19
"Surprisingly, we do not observe obvious patterns in the assignment of experts based on the topic."
Perhaps we can finally end the part of MoE discourse that treats domain experts as a viable goal...
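Since the thread is about what MoE buys you, here's a minimal sketch of a top-2 token-choice MoE layer (an illustrative toy, not Mixtral's actual implementation; `MoELayer` and its parameters are mine): parameter count grows with the number of experts while per-token compute stays roughly constant, which fits the "more representational capacity" hypothesis above.

```python
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, d_model)
        gates = self.router(x).softmax(dim=-1)             # (tokens, E) routing probs
        weights, chosen = gates.topk(self.top_k, dim=-1)   # top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize over chosen
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_idx, slot = (chosen == e).nonzero(as_tuple=True)
            if token_idx.numel():  # run expert e only on the tokens routed to it
                out[token_idx] += weights[token_idx, slot].unsqueeze(-1) * expert(x[token_idx])
        return out

# Usage: 10 tokens through the layer; only 2 of the 8 expert MLPs run per token.
y = MoELayer()(torch.randn(10, 64))
```

Total parameters scale with `n_experts`, but each token only pays for `top_k` expert forward passes; the quoted Mixtral finding is that which expert fires doesn't track topic in any obvious way.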
Training and deploying open-source LLMs; the open-source LLM landscape https://twitter.com/NielsRogge/status/1744351291839521016?t=JhFIyaQnsyKes9mHsYwrVw&s=19