Thoughts AI technical 14

The space of such structures that LLMs can generate is extremely vast, and fascinating too. I love exploring the latent space, especially if it's somehow possible to push it further out of distribution, where there might be some golden fruits in terms of creativity, but also a lot of stuff that doesn't make sense.

Some people want to pause AI development to work on alignment. But you would have to empirically test all those alignment theories during the pause, and since development is paused, you cannot build better models to test them on, so you can't really do good empirical research.

AI has yet to find more useful out-of-distribution mathematics.

I want my AIs to maximize scientific predictive ability.

Once we can't design new benchmarks that AI can't beat, that's the sign of AGI.

5.29.2025 prediction: multimodal diffusion-based inference-time compute scaling will become more popular in the coming months. https://x.com/ma_nanye/status/1880105038132990054

[On the Biology of a Large Language Model](https://transformer-circuits.pub/2025/attribution-graphs/biology.html) Could you use this to add self-awareness of the LLM's own state? Do a bit of autoregression, reverse engineer circuits using attribution graphs etc., encode these graphs into tokens that you append, then continue autoregression.

What is the next modality in AI after text, image, sound, video, latent thoughts? Will AI ever think in modalities completely ineffable to humans?

They discuss similar things here in the context of reinforcement learning: one thing RL helps with is zooming in on the more likely correct solutions, and at the same time, the more RL scales, the more novel patterns can emerge, like the new patterns that emerged in AlphaZero (superhuman chess strategies). 9:20 [https://youtu.be/64lXQP6cs5M?si=a7-Ly7xdd9MGyoXl](https://youtu.be/64lXQP6cs5M?si=a7-Ly7xdd9MGyoXl)

"
I think there exists a perspective where current AI systems are already more general than us, but in a different way than how people imagine generality, and that's why we struggle to fit them to human cognition.

Deep learning is this elastic origami that forms spaghetti representations from whatever data you throw at it and whatever reinforcement learning from experience you give it.

I think the rationalist folks assume the emergence of too many humanlike patterns in cognition.

I think a lot of the misalignment we already see is the models roleplaying rogue AI from sci-fi training data, but at the same time reward hacking from reinforcement learning is also totally real (like cheating on unit tests).

The incentives in training form the systems. I don't think there's the inherent, strong, antihuman misalignment by default that a lot of people seem to assume.

But most of the time I'm still swimming in a sea of uncertain probabilities about how the current systems work and about possible future developments.

These systems, and all of reality, have so many dimensions that it's often almost impossible to comprehend them even approximately.
"

https://fxtwitter.com/aryehazan/status/1921651675326304656?t=xi3f8-XHDeXN5w_W41S1KQ&s=19 https://x.com/aryehazan/status/1921652260183880186?t=QRFuHHoiRLOcFvKPGTbXaA&s=19

Every day I'm really confused when I see how these models often fail at very basic tasks, but then I see things like this, where they help someone a lot at the research level. Damn, if only we had access to the weights of those models, and if it were cheaper to apply the different reverse engineering methods from mechanistic interpretability to understand the learned emergent circuits. Then maybe a lot of these very inconsistent behaviors would be less mysterious.

The advantage of LLMs is that they are a general language engine/substrate:
- you can have LLMs specialized for more human/natural languages
- you can have LLMs specialized for more nonhuman/non-natural languages

I think some new, more exact abstract language will emerge around LLMs, and I think to some degree it already is emerging: one that minimizes the ambiguity of human language as much as possible when specifying programming tasks.

And overall these aren't language models anymore; today they're multimodal models. Technically it doesn't matter at all what data you start with, you just want to embed it into that gazillion-dimensional space. [Imgur: The magic of the Internet](https://imgur.com/lpZK1Gd)

I think everything-to-everything models will be big, and in my opinion that's where things are heading. But there are still plenty of specialized models that are simply better. For example, right now I'm working on real-time object detection (and extraction of properties like position etc.), and for that a specialized convolutional neural network is currently just the best choice, because general transformer-based visual LLMs are too slow for it. But that may change in the future.

Multi-agent systems are sometimes more effective for programming. You can use [GitHub - RooCodeInc/Roo-Code: Roo Code (prev. Roo Cline) gives you a whole dev team of AI agents in your code editor.](https://github.com/RooVetGit/Roo-Code)

Someone should make a meta tool that calls all these programming scaffolds and then runs a mega-discussion about the results.

Or a feature I'm missing: going through documentation page by page for a task instead of using cosine similarity, even if that were more expensive.

WE TAUGHT SAND TO PASS THE TURING TEST I REPEAT WE TAUGHT SAND TO PASS THE TURING TEST I REPEAT WE TAUGHT SAND TO PASS THE TURING TEST

https://x.com/burny_tech/status/1928671747244499283

" Additional AlphaEvolve note: Do you think it's possible that in a few months we'll see someone replicate this, or do better, with another SOTA LLM and a simpler looping setup (like ReAct), and do you agree that that would trivialize this evo+verifier etc. setup? Personally I'm not that sure it's possible. But I'm open to being wrong. I've been wrong many times about similar claims about LLMs already lol. I still give it some probability. Or maybe a better LLM without scaffolding could learn emergent circuits that correspond to the evo+verifier algorithm, and those could be reverse engineered by mechanistic interpretability.

https://x.com/burny_tech/status/1922743628985741341

" I'm thinking of some system that uses some combination of:
- neuro for flexibility (LLM stuff)
- symbolic for better generalization and more rigid circuits where needed (Francois Chollet ideas, like DreamCoder, MCTS, symbolic math/physics engines)
- evolutionary/novelty search for better open-ended discovery (Kenneth Stanley ideas)
- better RL algorithms for better generalization and other stuff (Rich Sutton ideas)
- more biologically inspired parts of the architecture for better data efficiency and maybe adaptability and some other stuff (LiquidAI/neuromorphic ideas, maybe self-organizing ideas like neural cellular automata, the forward-forward algorithm, or Hebbian learning, but also in conjunction with gradient descent)
- maybe some physics bias (like Hamiltonian neural networks have)

I've been thinking a lot lately about whether it's even possible to somehow hybridize them all, or whether that would be too much of an amalgamation and it just wouldn't work. I see all the existing hybridizations of them on their own already and wonder whether you could hybridize even more.

But the latest reverse engineering experiments show that larger models form better universal multilingual internal features/circuits for general concepts, which are probably able to generalize across languages to a limited degree. It's quite brittle and fuzzy, though. But what I wrote still holds. [On the Biology of a Large Language Model](https://transformer-circuits.pub/2025/attribution-graphs/biology.html)

"because it's just a stochastic parrot!!!"

Tbh I feel like this logic largely applies to humans too. The more they have seen something in a given language, the better they can think about it, generalize, and build circuits, and the better they can build their further patterns on top of that. But humans have, for example, less brittle/fuzzy circuits, and they generalize better. (Most people at least... xd)

Over the last few days I've spent way too many hours looking into debunking of various kinds of pseudoscientists, conspiracy theorists, religious people and so on, and I get the feeling that plenty of LLMs would steamroll these people lol.

It's fascinating to observe from a linguistic point of view. Different LLMs have their own kinds of dialects of language structure. But at the same time there are a few attractors toward a few dialects, because models are (among other things) trained on their own output (synthetic data) to increase consistency and to fill in the space of possible prompts (but it has to be done well so that mode collapse doesn't happen), on other models (distilling competitors is often more efficient than training from scratch if your model isn't the best), and on LLM-generated text from the internet.

Chatgptnese language

It's basically a language that I feel is slowly but surely being adopted by people who use LLMs a lot. In my opinion, for a certain portion of people it's quite an upgrade relative to the language they had before.

[Instrumental convergence - Wikipedia](https://en.wikipedia.org/wiki/Instrumental_convergence#Paperclip_maximizer) I think, though, that this hypothesis in its strong form, in the context of more general AI, has received some nonzero degree of falsification over the last year or so. But maybe it would hold more for more sophisticated future systems in the current paradigm, or for paradigms other than the ones that are popular now. For example, it doesn't take into account many aspects of open-ended divergent novelty search, which in evolution probably happens without optimization objectives, and which is probably needed for certain capabilities.

Deep learning (or neurosymbolic or neuroevolutionary hybrids) has always memorized and generalized at the same time to some degree. The question is just how much and in what way. Different systems do it differently, and neural ones always do it to a nonzero degree.

AlphaEvolve has neural (LLM), evolutionary (evolutionary search), symbolic (verifier), and other components. Each of those components adds its own forms of generalization. For example, evolution adds open-ended search in latent space.

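A minimal sketch of how a proposer + evolutionary search + verifier loop can fit together. This is a toy under loud assumptions, not AlphaEvolve's actual implementation: `propose` is a hypothetical stand-in for the LLM (here just random perturbation), `verifier` is a hypothetical deterministic scorer on a made-up target, and the search is plain objective-driven selection rather than real open-ended novelty search.

```python
import random

TARGET = [3.0, -1.0, 2.0]  # hypothetical toy problem: recover this vector


def verifier(candidate):
    """Symbolic part: deterministic score, higher is better."""
    return -sum((c - t) ** 2 for c, t in zip(candidate, TARGET))


def propose(parent, rng, step=0.5):
    """Stand-in for the LLM proposer: randomly perturb an existing candidate."""
    return [x + rng.gauss(0.0, step) for x in parent]


def evolve(generations=200, population=16, survivors=4, seed=0):
    """Evolutionary part: keep the best-scoring candidates each generation."""
    rng = random.Random(seed)
    pool = [[0.0] * len(TARGET) for _ in range(survivors)]
    for _ in range(generations):
        children = [propose(rng.choice(pool), rng) for _ in range(population)]
        # Parents compete with children, so the best score never gets worse.
        pool = sorted(pool + children, key=verifier, reverse=True)[:survivors]
    best = pool[0]
    return best, verifier(best)
```

Calling `evolve()` returns the best candidate found and its verifier score. Swapping `propose` for a model call and `verifier` for a real checker (tests, a proof checker, a benchmark) is where the interesting part would be.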
Plus, reinforcement learning is still missing, which would add a fourth type of incentive for generalization. I think their next paper will include that. There's no reinforcement learning in it because this paper is mega old (a guy who interacted with the authors just told me). Maybe internally they already have much better systems for math. As components, this one uses mega old Gemini models that are relatively terrible on their own, and it still produces cool mathematical discoveries, and it improves with better base LLMs.

In science, AI coming up with ideas on its own is great, especially when it reaches superhuman territory and lets you make breakthroughs you maybe couldn't before. [https://youtu.be/T_2ZoMNzqHQ?si=RF0Fqzp2yOI4w7uA](https://youtu.be/T_2ZoMNzqHQ?si=RF0Fqzp2yOI4w7uA) Superhuman level is where it becomes the most fascinating. Or in chess, where superhuman strategies taught something new to human grandmasters, essentially upgrading humans as a result. "In a human study, we show that these concepts are learnable by top human experts, as four top chess grandmasters show improvements in solving the presented concept prototype positions." [[2310.16410] Bridging the Human-AI Knowledge Gap: Concept Discovery and Transfer in AlphaZero](https://arxiv.org/abs/2310.16410)

Vibe code or not vibe code, that is the question. I think it heavily depends on:
- how relatively easy the task is
- the complexity of the project
- how common the building blocks are
- the time you want to spend
- the quality that you want
- the maintainability that you want
- the understanding of the code that you want
- solo project vs team project
- how personal it is, or how much it is meant for production use by others
- whether it's a weekend throwaway project or production-level code
etc.

Decentralized open source AI is the only realistic way to prevent a big tech oligopoly on AI in the future. Decentralized training infrastructure is coming along quite nicely.

Yesterday a few of us were thinking about an idea that occurred to me.
Data is one of the other factors besides algorithms and compute. So maybe it would be possible to create a kind of consensual spyware that collects as much about you as you allow, that is fully open source, and that sends it in as decentralized a way as possible into decentralized training infrastructure, so that no single person holds it. In part it's like uploading yourself into a collective hivemind that everyone owns.

The other senses are already in today's models. They've long been multimodal with images in one model, and now audio and video are starting. Things like smell and taste are still missing, that's true, but nvm, that's more of a robotics thing.

It's true that most data is simply low quality. As far as data goes, the biggest moat right now is data from the smartest people on the planet. But beyond data, the biggest moat right now is algorithms for reasoning/intelligence that don't need data, which in my opinion is a bigger moat than any data. And it looks like Google is starting to pull solidly ahead of everyone else in this, so their models need to be reverse engineered into open source models.

The current master algorithm of LLM RL is GRPO, and the reward function is, for example, whether a mathematical result was correct, whether code compiled, whether a web search was somehow correct, etc. [[2501.12948] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning](https://arxiv.org/abs/2501.12948) [Neural scaling law - Wikipedia](https://en.wikipedia.org/wiki/Neural_scaling_law)

But what's most trendy right now is reinforcement learning from verifiable rewards, where you don't need exact step-by-step training data to imitate, just some reward signal, e.g. the correctness of a math result (which you can get even from data without intermediate steps), a signal that the code compiles, etc.
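A rough sketch of the group-relative part of GRPO with a verifiable reward, as I understand it from the DeepSeek papers: sample several answers to the same prompt, score each with a checker, and normalize each reward against the group's mean and standard deviation. The names `verifiable_reward` and `group_relative_advantages`, the exact-match checker, the use of population standard deviation, and the `eps` term are my assumptions for illustration. The full objective (clipped policy ratio, KL penalty) is left out.

```python
import statistics


def verifiable_reward(answer: str, gold: str) -> float:
    """Hypothetical checker: 1.0 if the final answer matches exactly, else 0.0."""
    return 1.0 if answer.strip() == gold.strip() else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of samples for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # assumption: population std
    # eps (assumed) avoids division by zero when all rewards are equal.
    return [(r - mean) / (std + eps) for r in rewards]


# Four sampled answers to one prompt whose gold answer is "42".
samples = ["42", "41", "42 ", "7"]
rewards = [verifiable_reward(s, "42") for s in samples]
advantages = group_relative_advantages(rewards)
```

Correct samples end up with positive advantages and wrong ones with negative advantages, without any learned value model. If every sample in the group gets the same reward, all advantages are zero and that prompt gives no learning signal.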
And there are attempts where you don't even need that, for example [[2505.03335] Absolute Zero: Reinforced Self-play Reasoning with Zero Data](https://arxiv.org/abs/2505.03335) [[2505.19590] Learning to Reason without External Rewards](https://arxiv.org/abs/2505.19590) "Experiments demonstrate that Intuitor matches GRPO's performance on mathematical benchmarks while achieving superior generalization to out-of-domain tasks like code generation, without requiring gold solutions or test cases."

These new reinforcement learning methods have their own scaling laws around inference-time compute, which keep getting more efficient. https://fxtwitter.com/polynoamial/status/1834280425457426689

As the classic Rich Sutton, one of the main people in reinforcement learning (he has the best textbook on it) who believes it is enough for everything, puts it in his Bitter Lesson: "One thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great." [https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson.pdf](https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson.pdf)

I think more of these general-purpose methods, more axes along which you can scale, each with its own scaling laws, are still going to show up.

Thinking, in the context of LLMs like DeepSeek R1, means that the model generates tokens within think tags. They look like HTML: `<think>` `</think>`. The text between those tags is what the model is "thinking". After the closing tag `</think>` you get the model's response to the user. This image is from LM Studio. It shows the thinking in that kind of code block.
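A minimal sketch of separating the "thinking" from the final response in raw model output, assuming the output contains at most one `<think>...</think>` block as described above. `split_thinking` is a hypothetical helper name, not something from LM Studio or DeepSeek.

```python
import re

# Non-greedy match so only the first <think>...</think> block is captured;
# DOTALL lets the thinking span multiple lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(raw: str) -> tuple[str, str]:
    """Return (thinking, answer); thinking is empty if there are no think tags."""
    match = THINK_RE.search(raw)
    if match is None:
        return "", raw.strip()
    thinking = match.group(1).strip()
    answer = raw[match.end():].strip()  # everything after the closing tag
    return thinking, answer
```

For example, `split_thinking("<think>2+2 is 4</think>The answer is 4.")` gives the thinking text and the user-facing answer as two separate strings, which is roughly what a chat UI needs in order to render them differently.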
Some people rebel against the "thinking" terminology: https://fxtwitter.com/rao2z/status/1927707640223719631?t=K0GTx0-9LgmKdb2WXh6SSA&s=19 [[2504.09762] Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!](https://arxiv.org/abs/2504.09762)

It's also interesting how mechanistic interpretability finds different circuits inside the LLM than what the LLM gives you as the explanation of how it got to its answer. https://fxtwitter.com/JohnBcde/status/1905332570516074782?t=ghA7XYNggk7WmopZwVIIxw&s=19 [On the Biology of a Large Language Model](https://transformer-circuits.pub/2025/attribution-graphs/biology.html)

The tokens are generated the same way, by predicting the next token autoregressively. But the models differ in how they're trained: they're trained using reinforcement learning algorithms like GRPO on reward signals, for example whether a math result is correct, whether code compiles, etc., instead of classic supervised training. Here's an introduction to that: [[2501.12948] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning](https://arxiv.org/abs/2501.12948) [https://youtu.be/bAWV_yrqx4w?si=gXXZmkYvSYE7I7Dd](https://youtu.be/bAWV_yrqx4w?si=gXXZmkYvSYE7I7Dd)

It's all the same single model doing the same autoregression, but trained with reinforcement learning to output in this other format with `<think></think>`, rather than through supervised training on solution steps (which you still do before this phase, though), so that the model figures the steps out on its own (to what degree that actually happens is debatable, as seen in the discussion above). This assumes big labs use methods similar enough to DeepSeek's. I think what varies is mostly the implementation details of the RL algorithm and the reward functions.

I think it still matters what the tokens are, as appending completely random tokens wouldn't get the same math benchmark results, unless some weirdness is somehow still happening in latent computations.
But there are levels of how much it matters and levels of quality.

Hmm okay this is interesting [Imgur: The magic of the Internet](https://imgur.com/9GFSTXE)

This makes me think: could you get some form of a model's "awareness" of its own circuits if you gave it the information about the imperfectly reverse engineered circuits as an implicit function call result? Since when we introspect, we do that by starting/"calling" the "introspecting process". Or I wonder if it's possible to somehow hardcode some form of this idea on a more architectural level. But I guess researchers trying to implement some form of metacognition have already been attempting similar stuff for years.

Not sure if this makes sense, but my idea was something along the lines of: do a bit of autoregression, then automatically reverse engineer circuits using attribution graphs etc., then encode these graphs into tokens that you append, then continue autoregression. You could also maybe train on it.

There are for sure tons of engineering problems in that idea, if it's even possible to make it work somehow. Another issue is that the feature graphs in that biology of LLMs paper were labeled manually iirc, so you would have to automate that (maybe using an LLM could work to at least some degree). And it's costly and slow.
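A toy sketch of the control flow of that idea, just to make the loop concrete: generate a chunk, get an attribution graph for the context so far, serialize its strongest edges into tokens, append them, continue. Everything here is hypothetical. `toy_generate` and `toy_attribute` are fake stand-ins for a real model and a real attribution-graph method, and the `<introspect>` tag format and `top_k` cutoff are made up for illustration.

```python
from typing import Callable

Edge = tuple[str, str, float]  # (source feature, target feature, weight)


def encode_graph(graph: list[Edge], top_k: int = 3) -> list[str]:
    """Serialize the strongest edges of a graph into pseudo-tokens."""
    strongest = sorted(graph, key=lambda e: abs(e[2]), reverse=True)[:top_k]
    body = [f"{src}->{dst}:{w:+.2f}" for src, dst, w in strongest]
    return ["<introspect>", *body, "</introspect>"]


def introspective_generate(
    prompt: list[str],
    generate: Callable[[list[str], int], list[str]],
    attribute: Callable[[list[str]], list[Edge]],
    rounds: int = 2,
    chunk: int = 4,
) -> list[str]:
    """Alternate between generating a chunk and appending an encoded graph."""
    context = list(prompt)
    for _ in range(rounds):
        context += generate(context, chunk)         # a bit of autoregression
        context += encode_graph(attribute(context))  # append encoded circuits
    return context + generate(context, chunk)        # continue autoregression


def toy_generate(context: list[str], n: int) -> list[str]:
    """Fake model: emits placeholder tokens numbered by position."""
    return [f"tok{len(context) + i}" for i in range(n)]


def toy_attribute(context: list[str]) -> list[Edge]:
    """Fake attribution method: always returns the same two edges."""
    return [("feat_a", "feat_b", 0.8), ("feat_b", "logit", -0.3)]
```

With the toy stand-ins, `introspective_generate(["hi"], toy_generate, toy_attribute, rounds=1, chunk=2)` interleaves generated tokens with one `<introspect>` block. The hard parts (automatic feature labeling, cost, and whether the model can actually use these tokens) are exactly what this sketch skips.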