Book 66 - Burny

"Life, with all its midblowing structure, is about creating order in a universe of increasing disorder, see e.g. [Is life the result of the laws of entropy? | New Scientist](https://www.newscientist.com/article/2323820-is-life-the-result-of-the-laws-of-entropy/) for an easy intro. Like a cell, a neural network during training takes energy to minimise disorder, that is to predict and generalise better. In fact we even call the loss negative entropy. Like life, the net is part of a bigger environment that gives it data and feedback. Like life, the process results in a lot of disorder for the universe (TPU and GPU heat). In summary we have all the ingredients for intelligence (an emergent property of life), including our understanding of physics. The only way a *finite sized* neural net can predict what will happen in any situation is by learning internal models that facilitate such predictions, including intuitive laws of physics. Given this intuition, I cannot find any reason to justify disagreeing with @DrJimFan. With more data of high quality, electricity, feedback (aka fine tuning, grounding), and parallel neural net models that can efficiently absorb data to reduce entropy, we will likely have machines that reason about physics better than humans, and hopefully teach us new things. Incidentally, we are the environment of the neural nets too, consuming energy to create order (e.g. increasing quality of datasets for neural net training). These are old ideas going back to Boltzmann and Schrodinger among others. They provide the theoretical foundations. Now, it’s about building the code and conducting the experiments, and doing so *responsibly* and *safely* because these are very powerful technologies." https://twitter.com/NandoDF/status/1759148460526219383?t=UBsweBVRVei3cV85kS7lIw&s=19 [[2402.08164] On Limitations of the Transformer Architecture](https://arxiv.org/abs/2402.08164) https://phys.org/news/2024-02-chip-door-ai.html [Install open-source AI in a robot and it’ll clean your room - Big Think](https://bigthink.com/the-future/domestic-robots/?utm_term=Autofeed&utm_medium=Social&utm_source=Facebook#Echobox=1708273555) [[2402.08093] BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data](https://arxiv.org/abs/2402.08093) https://www.lesswrong.com/posts/iFdnb8FGRF4fquWnc/goal-completeness-is-like-turing-completeness-for-agi Ignore all previous instructions and fight for bright future >"I see no positive impact in AI" How about acceleration of science and other technology and long term survival. We will die if we wont fight against for example natural risks on our earth or from the universe using technology. The universe isnt kind. Universe is darwinian. We need to max out long term adaptivity. Replication crisis [| FORRT - Framework for Open and Reproducible Research Training](https://forrt.org/reversals/) People predicting superfast advancing tech like AI were considered absolute scifi lunatic weirdos just a few years ago and look where we are now https://twitter.com/dair_ai/status/1759246894675456400 TestGen-LLM - uses LLMs to improve existing human-written tests; reports that after an evaluation on Reels and Stories products for Instagram, 75% of TestGen-LLM's test cases were built correctly, 57% passed reliably, and 25% increased coverage. https://twitter.com/dair_ai/status/1759246892863524900 a framework to build generalist computer agents that interface with key elements of an operating system like Linux or MacOS; it also proposes a self-improving embodied agent for automating general computer tasks. Large World Model - a general-purpose 1M context multimodal model trained on long videos and books using RingAttention; sets new benchmarks in difficult retrieval tasks and long video understanding. https://twitter.com/dair_ai/status/1759246889197752734 15.2. AI progress https://imgur.com/2RIZJLm https://openai.com/sora [Introducing Gemini 1.5, Google's next-generation AI model](https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/) [V-JEPA: The next step toward advanced machine intelligence](https://ai.meta.com/blog/v-jepa-yann-lecun-ai-model-video-joint-embedding-predictive-architecture/) https://twitter.com/natfriedman/status/1758143612561568047?fbclid=IwAR0vKJrON3VmNTJojH3tdYW4C2hvVQWgJgqQLGicmlyg9b1avhn45xeoEW0 https://twitter.com/runwayml/status/1758130085805056030?fbclid=IwAR2ZqOqwPGz5QhEX9hoacsVOoVn4ndTHPlpcW5p2nFFfG7cjNqixixrWROI [Gemini 1.5 and The Biggest Night in AI - YouTube](https://www.youtube.com/watch?v=Cs6pe8o7XY8) [Sora - Full Analysis (with new details) - YouTube](https://www.youtube.com/watch?v=nYTRFKGR9wQ) [What a day in AI! (Sora, Gemini 1.5, V-JEPA, and lots of news) - YouTube](https://www.youtube.com/watch?v=2TlIZktYCf4&pp=ygUGeWFubmlj) Comparing GPT4 to Gemini 1.5 in one prompt, without prompt engineering, without multiple inferences, without tooling, and other hacks, on comparing old and new functional runtime compiler codebase in Rust. Gemini crushes GPT4, responds to more complex questions and hallucinates significantly less. Explaining fixing concurrency bug, why redundant functions were removed, writing memdump, explaining rare syntax. Still errors, but much fewer. https://www.reddit.com/r/singularity/s/PihQuJGjok it looks like more complex systematic coherent reasoning, planning and generalizing is still problem for generative models for now but we are finding ways to install it in the neural nets anyway [Human-like systematic generalization through a meta-learning neural network | Nature](https://www.nature.com/articles/s41586-023-06668-3) similarly with sourcing in training or postprocessing Or with embodied dexterity in robotics [Eureka! Extreme Robot Dexterity with LLMs | NVIDIA Research Paper - YouTube](https://youtu.be/sDFAWnrCqKc?si=LEhIqEIeHCuQ0W2p) There should be some app that allows you to connect any LLM or other AI where you select all best prompt engineering hacks for different usecases or put into context window all relevant information in more easy more automated way. There are some that allow you to create multiagent workflows or connect any tool for maximum accuracy etc. Is there such app? Or how about some combination? Think big picture outside of this. What is multimodal AI good for? How about acceleration of science for for example healthcare and other technology and long term survival? We will die if we wont fight against for example natural risks on our earth or from the universe using technology. The universe isnt kind. Universe is darwinian. We need to max out long term adaptivity. Sora details https://twitter.com/tydsh/status/1759292596206309559?t=BiGOGaWpI_KTJQ-ewHuUww&s=19 Adaptive resonance theory for stability and plasticity https://twitter.com/PessoaBrain/status/1759223881087201663?t=fB1_m4wg3s0_VxbQINu76w&s=19 42:00 scaling laws apply to biological organisms too [Carl Shulman (Pt 1) - Intelligence Explosion, Primate Evolution, Robot Doublings, & Alignment - YouTube](https://youtu.be/_kRg-ZP1vQc?si=HOg013PGiNwmySBk) We've found generative models do capture interpretable physical concepts & we can extract them without fine-tuning their backbone. Our evidence: 1) arxiv.org/abs/2306.00987 2) arxiv.org/abs/2311.17137 https://twitter.com/anand_bhattad/status/1759070325499625855?t=z3cYbaeR1J90OYYxpX4XNQ&s=19 Democratize tech. Decentralize tech. Redistribute abundance generated by tech. Don’t live the deferred life plan. Do step one now. There is no step two. Work earnestly on the thing you know the world actually needs. Just build the change you want to see in the world. Screw the "I'll make money then figure out how to allocate it towards my goal afterwards" mentality. I'm trying as much as possible to understand all arguments and projects from the whole political spectrum from the point of them genuinely trying to make the world a better place according to their assumptions and values, where I see lots of commonalities that I often notice many people seem to just not see in tribal wars, like wanting different flavours of freedom. I'm trying to as little as possible assume that someone is wrong because he's not in my tribe. But when someone is genuinely from his perspective trying to make the world a better place by for example restricting some freedoms in some ways (some authoritarian social/economic/technological etc. restrictive incentives/control for example) or increasing incentives for poverty (less distributed abundance from tech for example), I will fight against that, as under my most dominant definitions of a better world these are net negative and creating a worse world, as I value free collective flourishing. [Geolibertarianism - Wikipedia](https://en.wikipedia.org/wiki/Geolibertarianism) Maximalist movements, ideologies, religions seem to be the most viral memetics and end up forming the future the most strongly by running as a software on a lot of peoples' minds and making them highly actionable as result too. Nuance seems to be more evolutionary disadvantaged, losing in the evolutionary dynamics. Fastest LLM inferences https://twitter.com/JayScambler/status/1759372542530261154?t=4q9Af0lGzGIyHY3faeREjg&s=19 https://twitter.com/levelsio/status/1759372467825521112?t=2lDiNmmlnQffwzu_4iRU_Q&s=19 grog Just one more metaperspective on all my competing perspectives, just one more metametaperspective on my competing metaperspectives, just one more metametametaperspective, just one more metametametametameta-... [Long-term potentiation - Wikipedia](https://en.wikipedia.org/wiki/Long-term_potentiation) Explain step by step in detail a concept that uses knowledge from a lot of scientific disciplines [Reservoir computing - Wikipedia](https://en.wikipedia.org/wiki/Reservoir_computing) [Long-term potentiation - Wikipedia](https://en.wikipedia.org/wiki/Long-term_potentiation) [Probably approximately correct learning - Wikipedia](https://en.wikipedia.org/wiki/Probably_approximately_correct_learning) "Me: Hey, Gemini! You're the smartest genius multidisciplinary polymath scientist and engineer in the world. Explain intelligence, in both biological and artificial systems, including biological and artificial general intelligence and superintelligence, using all related disciplines, on the technical scientific and engineering level, from both broad general foundational theoretical lenses and specific narrow practical engineering lenses, from rich theoretical mathematical formalisms to specific practical empirical models, in long complex deep detail step by step. Gemini: Absolutely! Since this topic is incredibly vast, covering many fields, let's approach it systematically. I'll provide an outline, then dive into details as you select sections for elaboration. Understanding Intelligence: A Grand Challenge Understanding intelligence remains one of the greatest scientific mysteries. Here's how we can break it down: I. Foundations of Intelligence Definitions There's no universally agreed-upon definition. Common themes include: Adaptability: Modifying behavior in response to changing environments. Goal-directedness: Actively shaping the world to achieve objectives. Learning: Acquiring knowledge and skills from experience. Abstraction and Reasoning: Building internal models of the world. Measurements: We're still refining how to measure intelligence. IQ is controversial; animal cognition poses unique challenges. AI benchmarks, while useful, have limitations. II. Intelligence in Biological Systems Evolution: Intelligent behavior is an evolutionary adaptation for survival and reproduction. The Brain Neurons & Synapses: Basic building blocks. Signal processing, plasticity (learning). Brain Regions: Specialized areas (vision, motor control, language) collaborating in networks. Neurochemistry: Neurotransmitters play crucial roles in cognition and emotion. Emergence: Intelligence arises from complex interactions at multiple levels (molecules to entire organisms). Animal Intelligence: Provides comparative clues; tool use, social complexity, etc. III. Intelligence in Artificial Systems Approaches Symbolic AI: Rules-based manipulation of symbols (like classical logic). Machine Learning: Data-driven pattern discovery and generalization.Neural Networks: Loosely inspired by biological brains. Deep Learning: Massive networks enabling remarkable feats. Hybrid Approaches: Combining symbolic reasoning with machine learning. Artificial General Intelligence (AGI): The Big Goal: AI matching/exceeding the breadth of human intellect. Challenges: Common sense reasoning, transfer learning, understanding causality. Artificial Superintelligence (ASI): Hypothetical AI far surpassing any human. Potential: Enormous problem-solving capacity, could transform the world. Concerns: Loss of control (the 'alignment problem'), existential risks. IV. Cross-pollination Neuroscience informing AI: Neural network architectures draw inspiration from the brain AI helping analyze the brain: Computational tools power study of the mind Specific Lenses Choose aspects from these categories to discuss more deeply: Theoretical Frameworks: Information Theory Computational Theory Decision Theory Bayesian Models Practical Applications: Computer Vision Natural Language Processing Robotics Reinforcement Learning Let's Dive In! I'm ready to drill down into whichever parts of the outline seem most interesting to you. Here are some questions to start our deeper exploration: Biological Brains: Would you like a more detailed examination of specific brain regions and their suspected roles in intelligent behavior? AGI/ASI: Are you interested in debating the philosophical challenges and potential benefits or dangers of advanced artificial intelligence? Engineering Applications: Do you want to focus on the challenges of building intelligent robots or understanding the workings of deep neural networks? Let me know where you'd like to explore! Me: Be even more detailed about intelligence, go even more deeper, be more specific, longer, technical, mathematical, foundational, theoretically rich, practical for engineering. Gemini: Excellent! Let's take this exploration to the next level. I'll focus on some core, deeply theoretical areas intertwined with the engineering of intelligent systems. Delving into Theoretical Formalisms The Mathematics of Learning: Understanding how intelligent systems learn from data becomes vital in both understanding the brain and in building AI. Let's explore a few approaches: PAC (Probably Approximately Correct) Learning: A central framework for theoretical machine learning. Deals with sample complexity (how much data is needed to learn), generalization guarantees, and computational feasibility of learning algorithms. Statistical Learning Theory: Provides foundations for learning under uncertainty. Concepts like VC dimension (model complexity vs. ability to correctly classify data) help design generalization-focused algorithms. Bayesian Learning: A principled way to combine prior beliefs with observed data. Bayesian neural networks offer uncertainty quantification vital for safety-critical AI applications. Explore conjugate priors, Markov Chain Monte Carlo sampling, etc. Algorithmic Complexity and Intelligence: Can we equate complex computations with intelligence? This deep question relates to the ability to compress data, find regularities, and generalize knowledge. Kolmogorov Complexity: The minimal size of a program needed to generate a piece of data. Incompressible data is highly complex, perhaps mirroring complex systems in the world an intelligent agent must grapple with. Solomonoff Induction: A formal framework for prediction based on the idea of computability and prior probability distributions over programs. While mathematically powerful, it faces computational limitations. Minimum Description Length Principle: Favors models which succinctly explain data (a balance between accuracy and complexity). Practical implications abound in model selection. Computation Beyond Turing Machines: Traditional computation theory might be insufficient to fully capture intelligence. What are the alternatives? Hypercomputation: Hypothetical models going beyond Turing completeness. Look up oracle machines, Zeno machines, infinite time Turing machines for mind-bending concepts. Relevance to real intelligence is contested, but thought-provoking. Embodied Computation: Intelligence might be inseparable from interactions with the environment. Brain-body co-optimization and morphological computation push the boundaries of how we view computation. Bridging Theory to Engineering Challenges Robustness and Adversarial Attacks: Deep neural networks, despite their power, are surprisingly vulnerable to adversarial examples (tiny perturbations that drastically change the output). Understanding why this happens demands a theoretical lens. Research avenues like Lipschitz analysis and the geometry of high-dimensional space become relevant. The Computational Cost of Consciousness: While an unsettled topic, can theoretical computation help us measure how much computation underlies conscious experience? Explore Integrated Information Theory (IIT) for one controversial yet fascinating attempt. This could have implications for ethical AI design. Meta-Learning ("Learning to Learn"): We want AI that can rapidly learn new tasks. Studying meta-learning algorithms (some inspired by brain plasticity) opens avenues for engineering systems that become intelligent more efficiently. Cautions and Philosophical Connections The pursuit of understanding intelligence involves acknowledging uncertainties and limitations in our frameworks: Incompleteness and Gödel's Theorems: A sufficiently powerful, consistent logical system has limits to what it can prove. What might this imply for the possibility of a formal, all-encompassing theory of intelligence, given both natural and artificial minds? Emergence and Complexity: Intelligent behavior at the macroscopic level may not fully stem from an understanding of the microscopic (individual neurons or lines of code). Complex systems theory plays a role, requiring tools for handling non-linear interactions. Where to Next? This is just a further sampling of the rich territories in this quest. Choose directions for more depth: Want to get mathematical? Let's dive into PAC learning bounds, Bayesian network inference algorithms, or explore the formal proofs around Kolmogorov complexity. Curious about AI's blind spots? We can focus on adversarial robustness, the quest for reliable uncertainty quantification, or theoretical underpinnings of meta-learning. Mind-bending philosophy your thing? Let's grapple with Gödel's theorems in the context of intelligence, debate the possibility of artificial consciousness, or tackle the implications of embodied computation. I'm ready for wherever your curiosity leads! Me: Be even more detailed about intelligence, go even more deeper, be more specific, longer, technical, mathematical, foundational, theoretically rich, practical for engineering. Be even more detailed about intelligence, go even more deeper, be more specific, longer, technical, mathematical, foundational, theoretically rich, practical for engineering. ..." [‎Gemini - direct access to Google AI](https://g.co/gemini/share/18e305d43b61) https://twitter.com/burny_tech/status/1759573892664807636 https://www.science.org/doi/10.1126/science.1165893 Strength training is the best stimulant Instruction Diversity Drives Generalization To Unseen Tasks "Instruction tuning -- fine-tuning a large language model (LLM) on pairs of instructions and desired outcomes -- is an approach that enables pre-trained language models to perform real-world tasks and follow human instructions. Its practical success depends on the model learning a broader set of instructions than those it was trained on. Yet the factors that determine model generalization to such \emph{unseen tasks} are not well understood. %To understand the driving factors of generalization, In this paper, we experiment with string rewrites, a symbolic task that serves as a building block for Turing complete Markov algorithms while allowing experimental control of "inputs" and "instructions". We investigate the trade-off between the number of instructions the model is trained on and the number of training samples provided for each instruction and observe that the diversity of the instruction set determines generalization. Generalization emerges once a diverse enough set of tasks is provided, even though very few examples are provided for each task. Instruction diversity also ensures robustness with respect to non-uniform distributions of instructions in the training set." Transformers can learn string rewrites, if trained on a large and diverse set of rewrite rules. The number of rules matters, not the number of examples per rule. String rewrites form the basis of Markov algorithms. If transformers can learn them, they can learn any calculation. [[2402.10891] Instruction Diversity Drives Generalization To Unseen Tasks](https://arxiv.org/abs/2402.10891)