Central light is the force of hope, machinery being the cellular automaton of reality with emerging technologies in the Gaia cyborg we are in, embedded in the vast universe as a superorganism growing to the stars; rainbows are the diversity of it all, spirality is the rabbitholeness of it all, but overall it's the unity of it all, beyond all, including all, loving all, being none and all, neither one nor both nor none nor all, life and death being one [x.com](https://twitter.com/burny_tech/status/1769157666713022682?t=JLJlcnYwI9YwVyyvCruXhg&s=19)

[[2302.00487] A Comprehensive Survey of Continual Learning: Theory, Method and Application](https://arxiv.org/abs/2302.00487)

Don't you dream of a post-labour economics world? Or of superproductivity and intelligence for those who want to expand to the stars?

[Physicists Finally Find a Problem Only Quantum Computers Can Do | Quanta Magazine](https://www.quantamagazine.org/physicists-finally-find-a-problem-only-quantum-computers-can-do-20240312/) [OSEL.CZ - Scientists hunting for quantum gravity have measured the gravity of a microscopic object](https://www.osel.cz/13361-vedci-pri-honbe-za-kvantovou-gravitaci-zmerili-gravitaci-mikroskopickeho-objektu.html) https://phys.org/news/2024-02-scientists-closer-quantum-gravity-theory.html https://www.science.org/doi/10.1126/sciadv.adk2949

The future will be the one that we build [x.com](https://twitter.com/burny_tech/status/1769198269584802011)

A single neuron firing is noise, but ten billion firing in concert are you.

Reinforcement learning resources [x.com](https://twitter.com/miniapeur/status/1769019169721225670/)

Landscape of experiments in quantum gravity [x.com](https://twitter.com/Kaju_Nut/status/1769065754769473859?t=rizywD7Zvg0P-ia0VTHCLw&s=19)

The universe is one large-scale optimization run with a handful of fixed hyperparameters. https://www.cell.com/trends/neurosciences/fulltext/S0166-2236(24)00022-5

Wittgenstein's project was to turn philosophy into a code generation problem, using regularized natural language. The first half of the idea is sound, but the few steps of low-dimensional discrete compositional operators afforded by language are too limited to model cognition.

[[2306.09205] Reward-Free Curricula for Training Robust World Models](https://arxiv.org/abs/2306.09205) ['A single chip to outperform a small GPU data center': Yet another AI chip firm wants to challenge Nvidia's GPU-centric world — Taalas wants to have super specialized AI chips | TechRadar](https://www.techradar.com/pro/a-single-chip-to-outperform-a-small-gpu-data-center-yet-another-ai-chip-firm-wants-to-challenge-nvidias-gpu-centric-world-taalas-wants-to-have-super-specialized-ai-chips)

I am a rebel, an anarchist; I don't conform. I don't want to be told what to think. I want errors in my thinking pointed out, with evidence supporting them, so I can accelerate towards the ground truth. [[2002.06177] The Next Decade in AI: Four Steps Towards Robust Artificial Intelligence](https://arxiv.org/abs/2002.06177)

Nooo, you can't make high-level programming languages! Think of all the jobs where people program using punch cards; they'll be homeless!
Become all possible structures in parallel. Become the becoming itself.

[#51 FRANCOIS CHOLLET - Intelligence and Generalisation - YouTube](https://youtu.be/J0p_thJJnoo?si=VsoI2D6y5SK5Qm2l) For intelligence you need to be optimizing for generality itself. Generalization is the ability to mine previous experience to make sense of future novel situations. Generalization describes a knowledge differential: it characterizes the ratio between known information and the space of possible future situations. Generalization power is sensitivity to abstract analogies: to what extent can we analogize the knowledge we already have into simulacrums that apply widely across the experience space? I like his analysis of LLMs in this framework: [x.com](https://twitter.com/fchollet/status/1763692655408779455) and here's his paper: [[1911.01547] On the Measure of Intelligence](https://arxiv.org/abs/1911.01547) How could these ideas be made as practical as possible? Has anyone attempted to build a generalization-maximizing cost function from them? Maybe you could design a cost function that incentivizes more predictive construction of alternative world models?
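As a thought experiment on that question, here is a minimal sketch of what such a cost function could look like: ordinary training loss plus a penalty on the gap between performance on seen data and on a held-out, distribution-shifted variation of the task, so the optimizer is rewarded for transfer rather than memorization. This is my own toy construction, not something from Chollet's paper; the name `generalization_aware_loss` and the gap weight `lam` are made up for illustration.

```python
import numpy as np

def mse(w, X, y):
    """Mean squared error of a linear model w on data (X, y)."""
    return np.mean((X @ w - y) ** 2)

def generalization_aware_loss(w, seen, novel, lam=1.0):
    """Toy cost: loss on seen data plus a penalty on the generalization
    gap to a held-out, distribution-shifted variation of the task."""
    train_loss = mse(w, *seen)
    gap = max(0.0, mse(w, *novel) - train_loss)  # only positive gaps are penalized
    return train_loss + lam * gap

# Tiny demo with random search so the sketch stays dependency-free.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
X_seen = rng.normal(size=(64, 2))
X_novel = rng.normal(loc=3.0, size=(64, 2))  # shifted inputs stand in for "novel situations"
seen = (X_seen, X_seen @ w_true)
novel = (X_novel, X_novel @ w_true)

best_w, best_loss = None, np.inf
for _ in range(2000):
    w = rng.normal(size=2)
    loss = generalization_aware_loss(w, seen, novel)
    if loss < best_loss:
        best_w, best_loss = w, loss
print("best w:", best_w, "loss:", best_loss)
```

In a real setting the "novel" set would itself have to be generated, e.g. by procedurally varying environments, which is roughly the direction of the reward-free curricula paper linked earlier.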
Past a certain level of complexity every system starts looking like a living organism. We are surrounded by isomorphisms: just like a kaleidoscope, the world creates a remarkable richness of patterns from a tiny bit of information. [Can Learning Be Described By Physics?!?!? - YouTube](https://youtu.be/2sQmhTmvgeg?si=sC4L-01_ZrFsSY1X) [[2402.03620] Self-Discover: Large Language Models Self-Compose Reasoning Structures](https://arxiv.org/abs/2402.03620) [x.com](https://twitter.com/ecardenas300/status/1769396057002082410)

The learning algorithm and the architecture are not all there is to an ML system, as you discover once you start exploring mechanistic interpretability and physics theories of deep learning.

Mindblowing music [x.com](https://twitter.com/burny_tech/status/1769154854818037977)

[x.com](https://twitter.com/MLStreetTalk/status/1769434043232243851)

Verse 1:
From the depths of physics, a principle emerges
The Free Energy Principle, a theory that surges
Across all disciplines, it holds the key
To understanding the complexity we see

Chorus:
The Free Energy Principle, a unifying force
From biology to neuroscience, it charts the course
Minimizing surprise, maintaining a steady state
A universal law, that we can't negate

Verse 2:
In the realm of biology, life's a game
Organisms strive to minimize surprise, to stay the same
Adapting to the environment, in a constant dance
The Free Energy Principle, gives them a chance

Verse 3:
Neuroscience reveals, the brain's a prediction machine
Minimizing error, between what's expected and seen
Perception and action, guided by this rule
The Free Energy Principle, a cognitive tool

(Chorus)

Verse 4:
In psychology, behavior is explained
Minimizing uncertainty, is the aim
Seeking out information, to update beliefs
The Free Energy Principle, provides relief

Verse 5:
Even in economics, the principle applies
Agents make decisions, based on what they surmise
Minimizing surprise, in a world of change
The Free Energy Principle, helps them arrange

(Chorus)

Bridge:
From the smallest cell, to the cosmos vast
The Free Energy Principle, holds fast
A theory that bridges, all scientific realms
Guiding us forward, as our knowledge overwhelms

(Chorus)

Outro:
The Free Energy Principle, a beacon of light
Illuminating the mysteries, that keep us up at night
A universal law, that we can't deny
Guiding us forward, as we reach for the sky

[x.com](https://twitter.com/burny_tech/status/1769427883607417117) [[2201.06387] The free energy principle made simpler but not too simple](https://arxiv.org/abs/2201.06387)
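Since the lyrics compress the free energy principle down to "minimizing surprise" and "minimizing error between what's expected and seen", here is the smallest possible sketch of that reading: an agent updates its belief `mu` about a hidden cause by gradient descent on squared prediction error. This is a toy caricature of predictive coding, not the variational machinery of the Friston paper linked above; the function name and parameters are illustrative.

```python
def prediction_error_descent(observation: float, mu: float = 0.0,
                             lr: float = 0.1, steps: int = 50) -> float:
    """Update the belief mu to minimize 0.5 * (observation - mu)**2,
    i.e. 'minimize the error between what's expected and seen'."""
    for _ in range(steps):
        err = observation - mu  # prediction error
        mu += lr * err          # gradient descent step on the squared error
    return mu

print(prediction_error_descent(3.0))  # the belief converges toward the observation, ~3.0
```

Perception, in this caricature, is updating `mu`; action, in the full account, would instead change the world so the observation comes to match the prediction.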
[Truffle®](https://preorder.itsalltruffles.com/) Truffle-1 is an AI inference engine designed to run open-source models at home, on 60 watts.

AI critics are like: *uses the worst tool in the worst way on the worst cherry-picked edge-case use case that nobody uses in practice* "Yep, LLMs are total garbage and can't be used for absolutely anything." [x.com](https://twitter.com/burny_tech/status/1769521823555739970)

"My thoughts on Musk destabilizing other gigantic players in the intelligence wars by possibly leading open source using Grok:

Grok 1 is a 314B-parameter model with a mixture-of-experts architecture. The largest open-source LLM until now was Falcon, which has 180B parameters but doesn't use mixture of experts, the architecture that all the best closed- and open-source models now seem to use because it's better. Grok 1 was released as closed source in November 2023 before they open-sourced it today. Since then they've been working on the next version, Grok 1.5, so we'll see if they release that one as open source too; I'd be surprised if they didn't, after everything that's happening. According to Elon a month ago, Grok 1.5 is supposed to come out this month; I expected them to drop it today.

Open-source Mixtral has 56B parameters and is a mixture-of-experts architecture that's now very popular, unlike Llama 2, which has 70B parameters but isn't mixture of experts. The closed-source Grok 1 seemed to beat all other open-source LLMs, and GPT3.5 Turbo, on benchmarks from November 2023, but this newly released version of Grok 1 may be slightly different. It probably has much less RLHF and fewer other safety mechanisms (how easy will it be to activate the GPT4chan personality?). Grok 1 at 314B is probably the best open-source model out there currently, not just because of its size, but because of its different architecture compared to Falcon 180B. Plus, there are a bazillion new tricks for improving the intelligence of LLMs. If Elon and his team haven't messed it up somehow, we'll see the results on new benchmarks, which might be the same as the old ones. (Twitter anime-profile-picture anons will figure out the details.)

But I'm guessing that in a few months Mistral Large, which looks like it beats the old Grok 1 on benchmarks, will be open-sourced; or Llama 3 from Meta will come out and wipe the floor with Grok; or better models will come from Google (which just open-sourced Gemma 7B, a bit better than the current smaller versions of Llama 2, Mixtral, etc.); or from OpenAI, which is also rumored to want to release smaller open-source models.

The free version of ChatGPT uses GPT3.5 Turbo, which probably has 20-175B parameters, so Grok is now two or more times larger than the closed-source free version of ChatGPT. GPT4 probably still has 1760B parameters, but 314B is closer than 180B, and GPT5 is rumored to have 10000+B parameters. OpenAI's GPT4 (1000-2000B) to GPT5 (10000B-20000B) would be a roughly 10-fold increase in parameters, which might be similar for Google's Gemini 1 to Gemini 2 and Anthropic's Claude 2 to Claude 3, and we'll see what develops from Meta's Llama 3, as Meta planned 150K H100 GPUs for 2023 and plans 350K for 2024, on a similar level to Microsoft and the other tech giants, with total compute equivalent to 600K H100s when counting its other GPUs.

Elon really hates how OpenAI has evolved; he's now suing OpenAI for being closed source, for partnering with Microsoft, and so on. Musk co-founded OpenAI and then tried to become the CEO and merge it with Tesla, but they rejected that offer, so he left; now he's made his own company releasing Grok, while OpenAI is instead almost merging with Microsoft, which Musk hates, similarly to how he hates Google. When Musk, who wants to be at the top of the technology, sees how big the other big players' models are, I partly understand why he's so angry. I also find it interesting that he's now probably the leader in open source while at the same time being extremely worried about existential risks from AI. On the one hand he has a large concentration of wealth and power; on the other hand, the open-source 314B Grok announcement is a form of redistribution of power and intelligence.

I see this from Musk mainly as a destabilizing attack on all those tech giants he hates, which are starting to concentrate power and control over everything, from culture to technology to politics to intelligence, mainly for themselves through closed sourcing. He wants that power and control over everything for himself! Similarly, the lawsuit against OpenAI is also a form of attack. He's probably fighting them because he thinks it's not good for Google and Microsoft to hold the future of humanity in their hands by having the greatest amount of wealth, power, intelligence, etc. (he writes that about Google in his lawsuit against OpenAI). So he's doing everything he can to destabilize their concentration of power with all kinds of attacks, to prevent them from holding and monopolizing general superintelligence, the holy grail of technological power over everything.

Besides intelligence, there's also the big factor that Elon is a lot more politically right-wing than Google, Microsoft, etc., so there's a solid cultural memetic war going on too; we'll see which culture wins as polarization continues to grow. It will be fun if OpenAI's Sam Altman actually manages to raise the roughly 7 trillion dollars or more for chip factories and for training and running models that he has talked about (1.4x what America spent on WW2 in today's dollars, 2x India's GDP).

But why does Musk still have such a weak model relative to his wealth? Besides money, I think one of the biggest factors in how successful these various AGI labs are is their ability to attract the greatest talent, which is also related to being culturally similar to that talent. Is that what's stopping Musk in AI? Or is it that he came too late with his AGI company relative to the others? There's a huge advantage in being early (building infrastructure, improving from feedback, etc.). Or maybe his focus is spread everywhere else: the majority of Musk's wealth comes from his stakes in Tesla ($127 billion) and SpaceX ($71.2 billion), even though he talks a lot about AI being really important.
Maybe in the future, people will have to choose whether they want to live in the Microsoft empire, Google empire, Meta empire, Musk empire, Amazon empire, or Apple empire. These intelligence wars could be a great AI-generated movie soon."
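For the mixture-of-experts architecture the essay above keeps leaning on (Grok 1, Mixtral): the point is that a learned router sends each token through only a few small expert networks, so total parameter count grows without a proportional increase in per-token compute. Here is a minimal sketch of top-k routing, assuming nothing about xAI's or Mistral's actual implementations; the sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

# Each "expert" here is a tiny linear map; real MoE experts are feed-forward blocks.
experts = [rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(n_experts)]
router = rng.normal(scale=0.1, size=(d_model, n_experts))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """For each token, pick the top_k experts by router score and mix
    their outputs with softmax weights. Only top_k of n_experts run per
    token, which is how a huge-parameter model stays cheap to run."""
    logits = x @ router                            # (tokens, n_experts) router scores
    chosen = np.argsort(logits, axis=-1)[:, -top_k:]  # indices of the top_k experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        scores = logits[t, chosen[t]]
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                   # softmax over the chosen experts only
        for w, e in zip(weights, chosen[t]):
            out[t] += w * (x[t] @ experts[e])
    return out

tokens = rng.normal(size=(3, d_model))
print(moe_layer(tokens).shape)  # (3, 8)
```

This is also why Mixtral gets counted as "56B" (8 experts of roughly 7B each) even though only a fraction of those parameters are active for any given token.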
The heat death of the universe will end with the battle of the galaxy-sized superintelligent radically left-wing Google Gemini titan against the superintelligent radically right-wing Musk Grok titan.

"Anybody who says that there is a 0% chance of AIs being sentient is overconfident. Nobody knows what causes consciousness. We currently have no way of detecting it, and we can barely agree on a definition of it. You can only be certain that you yourself are conscious. Everything else is speculation and so should be less than 100% certainty if you are being intellectually rigorous." I agree, but regarding "You can only be certain that you yourself are conscious": illusionists would disagree that consciousness even exists, and open individualists would argue that there is no individual you :D

Highly curious people only want one thing, and that's to develop a comprehensive first-principles view of the entire world: all of physics, history, and everything in existence.

Quality dataset is all you need [x.com](https://twitter.com/teortaxesTex/status/1769469624108695660)

Intelligence, both biological and artificial: this is the most important thing we can study. "Solve intelligence, then use that intelligence to solve everything else." - Demis Hassabis

[[2402.08164] On Limitations of the Transformer Architecture](https://arxiv.org/abs/2402.08164) [[2309.01622] Concepts is All You Need: A More Direct Path to AGI](https://arxiv.org/abs/2309.01622)

Pure mathematics is just applied mathematics applied to mathematics.