[The Last 6 Decades of AI — and What Comes Next | Ray Kurzweil | TED - YouTube](https://www.youtube.com/watch?v=uEztHu4NHrs)
"Today we launch MentatBot: a GitHub-native, SOTA coding agent. Scored 38% on SWE-bench lite (previous SOTA was 33% 🤯)" https://x.com/granawkins/status/1806343015625355695 https://x.com/skcd42/status/1806640696662675469?t=sr4Kbf-jR8X74i3QfxmDOQ&s=19
[Simulating 500 million years of evolution with a language model | bioRxiv](https://www.biorxiv.org/content/10.1101/2024.07.01.600583v1)
https://www.cell.com/current-biology/fulltext/S0960-9822(24)00805-4
[[2407.01286] Learning data efficient coarse-grained molecular dynamics from forces and noise](https://arxiv.org/abs/2407.01286)
Metaintelligence
[[2407.01219] Searching for Best Practices in Retrieval-Augmented Generation](https://arxiv.org/abs/2407.01219)
[[2407.01502] AI Agents That Matter](https://arxiv.org/abs/2407.01502)
DeepSeek [FT article](https://www.ft.com/content/357f3c68-b866-4c2e-b678-0d075051a260?shareType=nongift)
[[2405.17399] Transformers Can Do Arithmetic with the Right Embeddings](https://arxiv.org/abs/2405.17399)
[[2406.17863] What type of inference is planning?](https://arxiv.org/abs/2406.17863)
[A sensory–motor theory of the neocortex | Nature Neuroscience](https://www.nature.com/articles/s41593-024-01673-9)
[Transformers Explained From The Atom Up (REVISED!) - YouTube](https://youtu.be/7lJZHbg0EQ4?si=bavZ1_cjNll7JzEj)
The goal is simple: a scientific theory of everything, a complete model of intelligence, a philosophical theory of everything, fundamentally fulfilling experiences, populating the universe, beating the heat death of the universe. After heat death: optimize all computation in the universe infinitely for intelligence, fulfillment, etc.
What if the attention mechanism in transformers attended not just to different tokens, but also to different subdimensions, with different query matrices? (One possible reading of this is sketched in the code after this block of notes.)
But LLMs are fuzzy subgraph matching machines (at *massive* scale). https://x.com/jeremyphoward/status/1807162705675272651?t=odeSitM46iOAifx94yLjCg&s=19
[OSF](https://osf.io/preprints/psyarxiv/tz6an) https://x.com/anilkseth/status/1807356419362189498?t=BfXO_MiysewyfFY59OVqxQ&s=19 Conscious artificial intelligence and biological naturalism, Anil Seth
E/accs and EAs want the current and future big AI revolution to go well for everyone but disagree on AI p(doom) priors.
We are fun apes.
Does Sam Altman believe in parallel in longtermism for all but also in power concentration for himself, since these beliefs can go hand in hand?
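A minimal sketch of one possible reading of the attend-to-subdimensions note above, purely as a toy: each token uses a learned query to weight channel groups of its own representation rather than other tokens. The class name, group count and weighting scheme are all made up for illustration, not an established architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SubdimensionAttention(nn.Module):
    """Toy layer: each token attends over groups of feature dimensions
    (channel groups) of its own representation, rather than over other
    tokens. A hypothetical reading of the note above, not an established
    architecture."""

    def __init__(self, d_model: int, n_groups: int):
        super().__init__()
        assert d_model % n_groups == 0
        self.n_groups = n_groups
        self.d_group = d_model // n_groups
        self.q = nn.Linear(d_model, n_groups)            # one query score per channel group
        self.v = nn.Linear(self.d_group, self.d_group)   # per-group value transform

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape                                 # (batch, seq, d_model)
        groups = x.view(b, t, self.n_groups, self.d_group)
        weights = F.softmax(self.q(x) / self.d_group ** 0.5, dim=-1)  # (b, t, n_groups)
        out = self.v(groups) * weights.unsqueeze(-1)      # weight each channel group
        return out.reshape(b, t, d)

x = torch.randn(2, 5, 64)
print(SubdimensionAttention(64, 8)(x).shape)              # torch.Size([2, 5, 64])
```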
Which kind of machine God are we building?
Connectors = people who connect other people through their ideas, ideologies, personalities, hobbies, mental frameworks and so on.
[[2406.02061] Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models](https://arxiv.org/abs/2406.02061)
Generalization of statistical machine learning: [Michele Caprio: Imprecise Probabilistic Machine Learning: Being Precise About Imprecision - YouTube](https://youtu.be/toezdhXQoXo?si=bMrWpHWYNs3maFiA)
Latent.space AI wars landscape, subreddits, Discords
Let's turn the Kurzweilian-utopia civilizational-architecture transformation vector into reality by action. https://x.com/tsarnick/status/1806434872686178784?t=toJLv9OgpuU5-F203opfzA&s=19 [The Last 6 Decades of AI — and What Comes Next | Ray Kurzweil | TED - YouTube](https://youtu.be/uEztHu4NHrs?si=wvXTNg4J85aNk-pT)
General learning algorithm as deep-learning-guided MCTS program search in the space of mathematical systems such as concrete differential equations
Claude 3.5 Sonnet is great because of mechanistic interpretability https://x.com/deedydas/status/1806530943102112057?t=ozUsPF0xD0G_hB5uitxG5g&s=19
[The economy and national security after AGI | Carl Shulman (Part 1) - YouTube](https://youtu.be/wTci0CdOPIc?si=UjNF3fmKIR8va1ft)
LlamaIndex multiagent framework https://x.com/llama_index/status/1806116419995844947?t=p_DVmpnvGKL3OG-KjtEjXg&s=19
AutoGen multiagent framework
[[2205.13147] Matryoshka Representation Learning](https://arxiv.org/abs/2205.13147) https://x.com/rohanpaul_ai/status/1805959145549353299?t=IPSy_PPxGsMHQBK4pVADsQ&s=19 (a rough sketch of the nested-prefix loss idea is in the code after this block of notes)
[Device keeps brain alive, functioning separate from body : Newsroom - UT Southwestern, Dallas, Texas](https://www.utsouthwestern.edu/newsroom/articles/year-2023/oct-device-keeps-brain-alive.html) [Maintenance of pig brain function under extracorporeal pulsatile circulatory control (EPCC) | Scientific Reports](https://www.nature.com/articles/s41598-023-39344-7#Sec1)
[[2406.18532] Symbolic Learning Enables Self-Evolving Agents](https://arxiv.org/abs/2406.18532)
[[2406.19108] Computational Life: How Well-formed, Self-replicating Programs Emerge from Simple Interaction](https://arxiv.org/abs/2406.19108) https://x.com/RandazzoEttore/status/1806657863303000368?t=NxsgwINqzcXg66ALtenDrg&s=19 Survival of the most replicative
Automated software engineering https://x.com/skcd42/status/1806640696662675469?t=sr4Kbf-jR8X74i3QfxmDOQ&s=19 [Our SOTA multi-agent coding framework](https://aide.dev/blog/sota-on-swe-bench-lite) https://x.com/codestoryAI/status/1803789551564968219?t=KXskXrQKqXF-weLa0yXTfQ&s=19
Multiagent IDEs: swarms of programming agents in programming IDEs incoming
Atlas of cognitive disorders https://imgur.com/vmnMjhU
[[2406.05183] The Factorization Curse: Which Tokens You Predict Underlie the Reversal Curse and More](https://arxiv.org/abs/2406.05183)
World models https://x.com/micheli_vincent/status/1806697975713866005?t=hfAZ9OAiQI-PmIhMj7RsmQ&s=19 [[2406.19320] Efficient World Models with Context-Aware Tokenization](https://arxiv.org/abs/2406.19320)
The age of neurosymbolics is emerging [[2406.13892] Adaptable Logical Control for Large Language Models](https://arxiv.org/abs/2406.13892) https://x.com/HonghuaZhang2/status/1806727439823102325?t=B5ovBjcpq-AAGYO95ZrLow&s=19
[SGLT2 inhibition eliminates senescent cells and alleviates pathological aging | Nature Aging](https://www.nature.com/articles/s43587-024-00642-y)
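A rough sketch of the Matryoshka Representation Learning idea linked above, assuming a simple classification setup: the loss is applied to nested prefixes of the embedding (first 8, 16, 32, ... dims) so that truncated vectors stay useful. The encoder, the one-head-per-prefix arrangement, and the toy data here are my assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def matryoshka_loss(embeddings, labels, classifier_heads, dims=(8, 16, 32, 64)):
    """Sum a classification loss over nested prefixes of the embedding,
    so truncating to the first 8/16/32/... dimensions still works.
    A rough sketch of the Matryoshka Representation Learning idea
    (arXiv:2205.13147); the head-per-prefix setup is a simplification."""
    total = 0.0
    for d, head in zip(dims, classifier_heads):
        logits = head(embeddings[:, :d])        # use only the first d dimensions
        total = total + F.cross_entropy(logits, labels)
    return total

# Toy usage: a 64-d encoder and one linear head per truncation length.
encoder = nn.Linear(128, 64)
heads = nn.ModuleList([nn.Linear(d, 10) for d in (8, 16, 32, 64)])
x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
loss = matryoshka_loss(encoder(x), y, heads)
loss.backward()
```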
Reconstructing seen movies from mouse visual cortex 🐭🧠 [Movie reconstruction from mouse visual cortex activity | bioRxiv](https://www.biorxiv.org/content/10.1101/2024.06.19.599691v1) (video: [Movie reconstructions from mouse visual cortex activity - YouTube](https://www.youtube.com/watch?v=TA6Oi5NfuMs))
No formal sciences, natural sciences, applied sciences etc. are inscrutable. All the math, code, systems etc. become more scrutable if you give them enough time and have enough computational resources. Our accuracy is limited by our computational resources, our current understanding, the laws of physics, our limited points of view as observers etc., but there's always low hanging fruit to figure out and approximate reality more!
"Outside of the challenge, here is what respected leaders in AI are looking at for AGI: @GaryMarcus: neurosymbolic AI @bengoertzel: @opencog Hyperon @ylecun: JEPAs and @ilyasut is back with @SSI to build superintelligence (not general intelligence) as the goal 🎯" https://x.com/vidbina/status/1806704695185912044
[[2406.19201] Evolving reservoir computers reveals bidirectional coupling between predictive power and emergent dynamics](https://arxiv.org/abs/2406.19201)
https://www.turingpost.com/p/jepa This technology will be big once it diffuses through society.
Romain Huet, Head of Developer Experience at OpenAI, shows a live demo of how GPT-4o can interact with the world through a webcam, understanding what it sees and reads https://x.com/tsarnick/status/1806526891354132604
Robotics methods survey [Survey](https://convince-project.eu/news/survey)
Even if cognition might be analytically uncomputable, there might be an ecosystem of approximations from different angles, accurate enough for practical purposes, waiting to be found. [Frontiers | Naturalizing relevance realization: why agency and cognition are fundamentally not computational](https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2024.1362658/full)
Soccer World Cup, but it's people watching and cheering on programming nerds from AGI labs behind computers, competing over who will build better artificial general superintelligence faster. Wait, I think I just described AI Twitter!
[Liquid Neural Networks: Definition, Applications, & Challenges - Unite.AI](https://www.unite.ai/liquid-neural-networks-definition-applications-challenges/)
Technology feels like surfing mother nature's laws by reorganizing them in useful ways via various empirical magic spells [How This Microscope Can Move Atoms - YouTube](https://youtu.be/4VI12FNDAds?si=bUr0Bzi-vYPXgjM5)
https://psycnet.apa.org/record/2024-93961-003
I wish to see more results like this: [FunSearch: Making new discoveries in mathematical sciences using Large Language Models - Google DeepMind](https://deepmind.google/discover/blog/funsearch-making-new-discoveries-in-mathematical-sciences-using-large-language-models/) Though for this use case I wish more people scaled various neurosymbolic approaches [AlphaGeometry: An Olympiad-level AI system for geometry - Google DeepMind](https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/) And combined them with other smart search approaches from the past, like dynamic programming, evolutionary algorithms, other neural algorithms and architectures, other metaheuristics, maybe some Bayesian stuff. (A minimal FunSearch-style propose-evaluate loop is sketched in the code after this block of notes.)
[Transformers Explained From The Atom Up (REVISED!) - YouTube](https://youtu.be/7lJZHbg0EQ4?si=GldHgmr7Qn9wLskc)
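A minimal sketch of a FunSearch-style propose-evaluate loop as referenced above: a proposer mutates the best program found so far, a cheap deterministic evaluator scores every candidate, and better candidates replace the parent. The proposer here is a trivial random tweak standing in for the LLM, and the scoring function is a made-up toy; the real system also evolves islands of candidates rather than a single greedy best.

```python
import random

def evaluate(program_src: str) -> float:
    """Problem-specific scorer; should be cheap and deterministic.
    Here: a dummy score that rewards larger f(10) and shorter programs."""
    try:
        scope = {}
        exec(program_src, scope)            # compile and run the candidate
        return scope["f"](10) - 0.01 * len(program_src)
    except Exception:
        return float("-inf")                # broken programs score worst

def propose(parent_src: str) -> str:
    """Stand-in for the LLM that rewrites the best program so far.
    In FunSearch this is a code model prompted with top candidates;
    here it is a trivial random constant tweak just to make the loop run."""
    constant = random.randint(-5, 5)
    return parent_src.replace("0", str(constant), 1)

def search(seed_src: str, iterations: int = 200):
    best_src, best_score = seed_src, evaluate(seed_src)
    for _ in range(iterations):
        child = propose(best_src)
        score = evaluate(child)
        if score > best_score:              # greedy hill-climb; FunSearch keeps islands of candidates
            best_src, best_score = child, score
    return best_src, best_score

seed = "def f(x):\n    return x + 0\n"
print(search(seed))
```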
https://x.com/emollick/status/1807495230452596854?t=L18VLVdDzxBDbTLiLXaFUw&s=19 Abundance is near: "In 1865 the average British worker worked 124,000 hours over their life (like the US & Japan). In 1980, only 69,000 hours, despite living longer. Since then, it dropped slower, by 6%. We went from spending 50% of our lives working to 20%!"
https://www.sciencedirect.com/science/article/pii/S0079610723001128 This is cool, but in neurosymbolic architectures this could be handled by a symbolic component that is more effective for this task, while deep learning or another symbolic part of the architecture takes care of other things.
"Introducing 🧮Abacus Embeddings, a simple tweak to positional embeddings that enables LLMs to do addition, multiplication, sorting, and more. Our Abacus Embeddings trained only on 20-digit addition generalise near perfectly to 100+ digits. 1/n" https://x.com/SeanMcleish/status/1795481814553018542?t=JeGOLcZgEiRqchPfV39wJQ&s=19 [[2405.17399] Transformers Can Do Arithmetic with the Right Embeddings](https://arxiv.org/abs/2405.17399) (a rough sketch of the digit-position idea is in the code after this block of notes)
[[2406.02657] Block Transformer: Global-to-Local Language Modeling for Fast Inference](https://arxiv.org/abs/2406.02657) https://x.com/rohanpaul_ai/status/1807551317759455559?t=T3PWtSv_sWlaNY0PU4e7tQ&s=19
Has any AI company actually tried to scale neurosymbolics or other alternatives to raw deep learning with transformers and had successful, popular products in industry when it comes to generally intelligent chatbots? Why is there nothing else anywhere that can be used practically right now, easily, by anyone? Did anyone try and fail? Did transformers eat all the publicity? Did transformers eat all the funding? I know Verses is trying to scale Bayesian AI and had an interesting demo recently; I wonder what will evolve out of that! I wanna see more benchmarks! But what else is out there when it comes to alternatives to transformers like Mamba, RWKV, xLSTM etc., neurosymbolics, Bayesian methods, evolutionary methods, reinforcement learning, Yann LeCun's alternative to autoregressive models etc. that people have tried to scale, successfully or unsuccessfully?
Reality is weak emergence of all scales (studied in different sciences) from those below them, with causality between them in different directions, and the bottommost scale is the standard model (with some quantum gravity solution). [This New Idea Could Explain Complexity - YouTube](https://youtu.be/Tw9sr05Vtso?si=BXkOU-GOlEijdZ-P)
I think interpolation is an important part of intelligence, but definitely not the full picture. [The ultimate intro to Graph Neural Networks. Maybe. - YouTube](https://youtu.be/me3UsMm9QEs?si=cATWg0MV_xNcvrfZ)
I think an underrated argument against LLMs is that many human wordcels, too, aren't generally intelligent in a way that allows them to solve problems, but mostly use shallow pattern-matching to imitate predictive outputs in costly-to-verify domains.
"100% Fully Software 2.0 computer. Just a single neural net and no classical software at all. Device inputs (audio video, touch etc) directly feed into a neural net, the outputs of it directly display as audio/video on speaker/screen, that's it." https://x.com/karpathy/status/1807497426816946333?t=nTcORr3SYO8viSQ-1XstNg&s=19
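A best-effort sketch of the Abacus Embeddings idea quoted above, as I read the announcement: every digit token gets a positional index counting its place inside the current number (restarting at each new number), plus a random training-time offset so longer numbers still land inside seen index ranges. The function name, vocabulary, and details are assumptions, not the paper's exact recipe.

```python
import torch

def abacus_positions(token_ids, digit_ids, max_offset=100, train=True):
    """Assign each digit token a positional index 1, 2, 3, ... within the
    number it belongs to, restarting at every non-digit token; non-digit
    tokens get index 0. During training a random offset is added so that
    indices beyond the training lengths are still visited. Best-effort
    sketch of arXiv:2405.17399, not the exact published recipe."""
    positions = torch.zeros_like(token_ids)
    for b in range(token_ids.shape[0]):
        count = 0
        for i, tok in enumerate(token_ids[b].tolist()):
            if tok in digit_ids:
                count += 1              # position within the current number
            else:
                count = 0               # number ended; reset the counter
            positions[b, i] = count
    if train:
        positions = positions + torch.randint(0, max_offset, (1,)) * (positions > 0)
    return positions

# Toy usage with a made-up vocabulary where ids 0-9 are the digits.
ids = torch.tensor([[3, 1, 4, 10, 2, 7]])   # "314" + separator + "27"
print(abacus_positions(ids, digit_ids=set(range(10)), train=False))
# tensor([[1, 2, 3, 0, 1, 2]])
```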
ASI vs CCP and authoritarian governments. https://x.com/BasedBeffJezos/status/1807587628591747557?t=Xe94vcbUEHStxnBfGizbnQ&s=19
I think when people appeal to ASI, they oscillate between two poles: something trivially (but substantially) better at intelligence than high-performing humans are, and something that can do magic by thinking hard. If we discount the latter, the former is still interesting.
Saying "I don't believe in ASI" is just the most insane cope. Let's say Einstein-level intelligence truly is some sort of universal intelligence speed limit. What do you think thousands of Einsteins thinking together thousands of times faster than humanly possible looks like?
[[2406.07522] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling](https://arxiv.org/abs/2406.07522)
My world model when it wasn't a microdose https://x.com/d_feldman/status/1806950107872469069?t=V9vWeWPNdCxigMA3ENwIHA&s=19
"• qualia ← predictive processing • sentience ← information closure • valence ← active inference ← neural Darwinism • self-awareness ← higher-order theory • differentiated being ← neural subjective frame ← GWT" https://x.com/davidad/status/1807595420144795964?t=T3g4KCbzEMwx2MykQdKLRg&s=19 https://x.com/davidad/status/1807603500832235990?t=MHiN65YU-ArblfBdLHDauA&s=19
Old expert systems used to be if-statements, like decision trees etc., and they are still used, but modern machine learning is gradient-based curve-fitting: differentiable parametric curves made of a lot of matrix multiplications, stacked linear regressions with nonlinear activation functions like ReLU between them, plus other hacks such as convolutions or attention blocks. (A toy contrast between the two is sketched in the code after this block of notes.)
[[2403.06963] The pitfalls of next-token prediction](https://arxiv.org/abs/2403.06963)
[[2404.00859] Do language models plan ahead for future tokens?](https://arxiv.org/abs/2404.00859)
[[2405.14831] HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models](https://arxiv.org/abs/2405.14831)
Hierarchy of self-adaptation: every bit of information is being optimized via post-selection for the growth of the meta-organism it's part of. https://x.com/BasedBeffJezos/status/1801790704815116768?t=2NzADrf-54k4NQgNFMisrA&s=19
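A toy contrast for the expert-systems-versus-curve-fitting note above: hand-written if/else rules next to a small stack of linear layers with ReLU nonlinearities between them, fit by gradient descent. The rules, features, and labels are invented purely for illustration.

```python
import torch
import torch.nn as nn

# Old-style expert system: hand-written if/else rules, no learning.
def rule_based_credit_check(income, debt):
    if income > 50_000 and debt < 10_000:
        return "approve"
    elif income > 100_000:
        return "approve"
    else:
        return "reject"

# Modern ML: a differentiable parametric curve (stacked linear maps with
# ReLU nonlinearities between them) whose parameters are fit by gradient descent.
model = nn.Sequential(
    nn.Linear(2, 16), nn.ReLU(),
    nn.Linear(16, 16), nn.ReLU(),
    nn.Linear(16, 1),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.BCEWithLogitsLoss()

x = torch.rand(256, 2)                        # toy features (income, debt), scaled to [0, 1]
y = ((x[:, 0] > 0.5) & (x[:, 1] < 0.3)).float().unsqueeze(1)  # toy labels

for _ in range(200):                          # the curve-fitting loop
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

print(rule_based_credit_check(60_000, 5_000), loss.item())
```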
"I think that the current (almost always) transformer-based (almost always) chatbots in the (almost always) imitation learning paradigm are not really such a big danger, but I'm open to changing my mind. Maybe mathematical synthetic data optimized for intelligence will let more superintelligent circuits emerge? But future systems with architectures that have actual misaligned self-directedness etc. might be a danger, though the benefits can be ENORMOUS! All the amazing transhumanist longtermist dreams of AGSI companions helping us reverse engineer the universe, cure diseases, upgrade biological intelligences, spread throughout the whole universe, merging with them in hybrid forms and so on! And I think it's worth the risk, but we still need to minimize the risks! I think these are solvable technical problems! That's partially why I want to accelerate mechanistic interpretability research that reverse engineers and steers black-box models, and research into interpretable AI architectures that are white-box from the beginning, to minimize the controllability risks and maximize the benefits of having more reliable and intelligent systems! Another issue is digital sentience: whether these systems can suffer is a question that should be addressed by more people from ethical perspectives! And I think we also need better societal coordination so that these very powerful, more controllable and capable systems don't end up in the hands of a small number of powerful people misaligned with humanity and sentience more generally!"
[[2406.19502] Investigating How Large Language Models Leverage Internal Knowledge to Perform Complex Reasoning](https://arxiv.org/abs/2406.19502) https://x.com/miyoung_ko/status/1807753375196270716?t=f_y9CYjwMT9stlVtGPKQMg&s=19
"I think smarter-than-human AIs are clearly possible. But I'm skeptical of the premise that they will quickly turn into near-omnipotent gods with seemingly unlimited powers of persuasion and deception." https://x.com/MatthewJBar/status/1807650395784966641
"chasing the bellman eq of my markov space via ballmer peak despite nyquist limit optimizing lagrange multipliers navigating stochastic gradient harnessing eigenvalues leveraging fourier transforms exploring hilbert spaces considering bayesian priors solving for schrodinger's mind" https://x.com/voidcaapi/status/1807904594694914447
[ActInf ModelStream 012.1 ~ Fisher, Whyte, and Hohwy: An Active Inference Model of the Optimism Bias - YouTube](https://www.youtube.com/live/u0y-NFl8emU)
[[2406.19501] Monitoring Latent World States in Language Models with Propositional Probes](https://arxiv.org/abs/2406.19501) https://x.com/feng_jiahai/status/1807838795225784579?t=eujFp-ipmCegwBQfxNt-pA&s=19
[Scientists claim the biggest antibiotic discovery in history, made with the help of artificial intelligence — ČT24 — Česká televize](https://ct24.ceskatelevize.cz/clanek/veda/vedci-tvrdi-ze-se-jim-podaril-nejvetsi-objev-antibiotik-v-historii-vyuzili-umelou-inteligenci-350014?fbclid=IwZXh0bgNhZW0CMTEAAR2E7u0bvE-vN37bIVbAdlTu11oipozD-ZZtgDwHpcTbHBXliyqeRiY0gHo_aem_57zK-0VO1r5jki5l8e8z_g)
We're pretty advanced, aren't we? James Webb Space Telescope, Sora, quantum computers, GPT-4o, Udio, BCIs, AlphaFold 3, CRISPR, mRNA tech, 6G, GPUs, autonomous vehicles, humanoid robots, Starship, smart contact lenses, transparent foldable 100" OLED TVs, AR/VR, drones, robots on Mars, AI-driven drug discovery, advanced materials (e.g. graphene), holographic displays. I mean, it could be a lot worse, you could have been born in the 1700s.
Data is all you need https://x.com/simonw/status/1807594609692090601?t=o-a95-Wa5S3fLvxB3XKwlQ&s=19
[Neuroscience & YouTube | Artem Kirsanov Interview - YouTube](https://youtu.be/OhxG-U8CXdg?si=RMjXCCX3Et4mqH0O)
"Accelerate" vs "align" https://x.com/daniel_271828/status/1724737205590200610?t=ehXh9kaJN3zhzPJXUz4Krw&s=19
Both misaligned selfish billionaires and misaligned rogue self-directed AIs can be risks at the same time. Even if misaligned corporations feel like a bigger risk currently, it doesn't mean rogue AI risk is a distraction, even if potentially misaligned billionaires use that narrative to their advantage in various contexts, and even if more technological advancement is needed.
Too much hyperdoomerism; here's a Kurzweilian hyperoptimism memetic antidepressant 😄 [The Last 6 Decades of AI — and What Comes Next | Ray Kurzweil | TED - YouTube](https://youtu.be/uEztHu4NHrs?si=o_M5O-GTu0iM__CQ)
[[2407.01219] Searching for Best Practices in Retrieval-Augmented Generation](https://arxiv.org/abs/2407.01219)
The fact that, with all the abundance-generating technology we have, there are still (even hardworking) people on this planet who don't have access to basic shelter, food etc. is not right. Abundance for every being! [I Built 100 Houses And Gave Them Away! - YouTube](https://www.youtube.com/watch?v=KkCXLABwHP0) https://x.com/MrBeast/status/1807397374186098960
History of physics: Aristotle said a bunch of stuff that was wrong. Galileo and Newton fixed things up. Then Einstein broke everything again. Now, we've basically got it all worked out, except for small stuff, big stuff, hot stuff, cold stuff, fast stuff, heavy stuff, dark stuff, turbulence, and the concept of time.
Cautious realistic optimism accelerationism!
Longestermism = planning for after beating the heat death of the universe
[[2406.09787] Evolving Self-Assembling Neural Networks: From Spontaneous Activity to Experience-Dependent Learning](https://arxiv.org/abs/2406.09787) https://x.com/eplantec/status/1808918092497703188
"we'll need to be much more deliberate with the types of data we train on in order to unlock bigger reasoning gains, and that model depth may become a more important hyperparam vs context length. not clear that large-scale internet text pretraining facilitates useful grokking at all" https://x.com/willccbb/status/1809055472202178773?t=mz4kS_CJtIVxVUbblpD1fw&s=19
https://pubs.acs.org/doi/10.1021/acs.jctc.4c00315