- [GitHub - naklecha/llama3-from-scratch: llama3 implementation one matrix multiplication at a time](https://github.com/naklecha/llama3-from-scratch)
- [GitHub - trotsky1997/MathBlackBox](https://github.com/trotsky1997/MathBlackBox?tab=readme-ov-file#news)
- Implementation details of o1: https://x.com/casper_hansen_/status/1852342730527023473?t=hwqUzQ5Gv00pqh4eQdB9lA&s=19
- [[2501.09686] Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models](https://arxiv.org/abs/2501.09686)
- o1 replication: [[2410.18982] O1 Replication Journey: A Strategic Progress Report -- Part 1](https://arxiv.org/abs/2410.18982)
- [GitHub - hijkzzz/Awesome-LLM-Strawberry: A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.](https://github.com/hijkzzz/Awesome-LLM-Strawberry)
- Democratization of o1 is happening: [GitHub - srush/awesome-o1: A bibliography and survey of the papers surrounding o1](https://github.com/srush/awesome-o1)
- https://x.com/DrJimFan/status/1834279865933332752
- Large Language Monkeys: Scaling Inference Compute with Repeated Sampling, Brown et al. [[2407.21787v1] Large Language Monkeys: Scaling Inference Compute with Repeated Sampling](https://arxiv.org/abs/2407.21787v1) (a minimal best-of-N sketch follows this list)
- [[2408.03314] Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters](https://arxiv.org/abs/2408.03314)
- https://x.com/rohanpaul_ai/status/1835443326205517910
- https://x.com/terryyuezhuo/status/1834286548571095299
- ReFT: Reasoning with Reinforced Fine-Tuning [[2401.08967] ReFT: Reasoning with Reinforced Fine-Tuning](https://arxiv.org/abs/2401.08967)
- Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning [[2402.05808] Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning](https://arxiv.org/abs/2402.05808)
- Tree search distillation + RL post-training! https://x.com/iamgingertrash/status/1834297595486675052
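The two test-time-compute papers above rest on the same mechanism: sample many candidates and keep one that a verifier accepts, so coverage grows with the sampling budget, roughly as 1 - (1 - p)^n for per-sample success rate p. A minimal best-of-N sketch; `generate` and `verify` are hypothetical stand-ins for an LLM call and a checker, with a toy sampler so the script actually runs:

```python
import random

def generate(prompt):
    # stand-in for an LLM sampling call; this toy version is right ~10% of the time
    return "42" if random.random() < 0.1 else str(random.randint(0, 99))

def verify(prompt, candidate):
    # stand-in for a verifier: unit tests, an answer checker, or a reward model
    return candidate == "42"

def best_of_n(prompt, n):
    # repeated sampling: return the first candidate the verifier accepts
    for _ in range(n):
        candidate = generate(prompt)
        if verify(prompt, candidate):
            return candidate
    return None

# coverage rises with n roughly as 1 - (1 - p)^n
for n in (1, 10, 100):
    hits = sum(best_of_n("2*21=?", n) is not None for _ in range(1000))
    print(f"n={n:3d}  solve rate ≈ {hits / 1000:.2f}")
```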
- https://x.com/rm_rafailov/status/1834291016192360743
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking [[2403.09629] Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking](https://arxiv.org/abs/2403.09629) (a sketch of the underlying STaR loop follows this list)
- https://x.com/laion_ai/status/1834564564601729421
- Let's Verify Step by Step [[2305.20050] Let's Verify Step by Step](https://arxiv.org/abs/2305.20050)
- [https://www.youtube.com/watch?v=hZTZYffRsKI](https://www.youtube.com/watch?v=hZTZYffRsKI)
- [Reverse engineering o1's architecture (r/LocalLLaMA)](https://www.reddit.com/r/LocalLLaMA/comments/1fgr244/reverse_engineering_o1_architecture_with_a_little/)
- [Reverse engineering OpenAI's o1 - by Nathan Lambert](https://www.interconnects.ai/p/reverse-engineering-openai-o1)
- [LLM Reasoning Papers - a philschmid Collection](https://huggingface.co/collections/philschmid/llm-reasoning-papers-66e6abbdf5579b829f214de8)
- https://x.com/srush_nlp/status/1846599704194302130?t=_vLPjFMc0JCaHYXz37IATQ&s=19
- [[2411.16489] O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?](https://arxiv.org/abs/2411.16489)
- Elvis' AI research news: [https://youtu.be/vSm9wNw1Yek?si=ONbLSlGnpHIYE4mv](https://youtu.be/vSm9wNw1Yek?si=ONbLSlGnpHIYE4mv)
- [Trending Papers](https://trendingpapers.com)
- [The AI Timeline](https://mail.bycloud.ai/)
- Llama from scratch: [GitHub - rasbt/LLMs-from-scratch: Implement a ChatGPT-like LLM in PyTorch from scratch, step by step](https://github.com/rasbt/LLMs-from-scratch?tab=readme-ov-file#bonus-material), [YouTube search: implementing llama from scratch](https://www.youtube.com/results?search_query=implementing+llama+from+scratch), [https://www.youtube.com/watch?v=lrWY4O5kUTY](https://www.youtube.com/watch?v=lrWY4O5kUTY), [https://www.youtube.com/watch?v=oM4VmoabDAI](https://www.youtube.com/watch?v=oM4VmoabDAI)
- [GitHub - NirDiamant/RAG_Techniques: This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and contextually rich responses.](https://github.com/NirDiamant/RAG_Techniques)
- [A recipe for frontier model post-training](https://www.interconnects.ai/p/frontier-model-post-training)
- https://x.com/srchvrs/status/1830333083066814507?t=onr6xERkw5pUDj-qbcWUwQ&s=19
- Agents survey: [[2404.11584v1] The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey](https://arxiv.org/abs/2404.11584v1)
- [AI Agents Landscape & Ecosystem (April 2025): Complete Interactive Map](https://aiagentsdirectory.com/landscape)
- [The Prompt Report](https://trigaten.github.io/Prompt_Survey_Site/)
- Computable Artificial General Intelligence [[2205.10513] Computable Artificial General Intelligence](https://arxiv.org/abs/2205.10513)
- Agents in software engineering: [[2409.09030] Agents in Software Engineering: Survey, Landscape, and Vision](https://arxiv.org/abs/2409.09030)
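Quiet-STaR above generalizes the original STaR recipe, whose outer loop is simple enough to sketch. A hedged reading of arXiv 2203.14465; `sample_rationale` and `finetune` are hypothetical stand-ins for your inference and training stack:

```python
def sample_rationale(model, question, hint=None):
    """Sample (chain_of_thought, answer) from the model; `hint` injects the
    gold answer for STaR's 'rationalization' fallback. Hypothetical stub."""
    raise NotImplementedError  # plug in your LLM inference API

def finetune(base_model, examples):
    """Fine-tune on (question, rationale, answer) triples. Hypothetical stub."""
    raise NotImplementedError  # plug in your training API

def star(base_model, dataset, rounds=5):
    model = base_model
    for _ in range(rounds):
        keep = []
        for question, gold in dataset:
            cot, answer = sample_rationale(model, question)
            if answer != gold:
                # rationalization: re-generate with the gold answer as a hint
                cot, answer = sample_rationale(model, question, hint=gold)
            if answer == gold:
                # only keep rationales that actually reach the right answer
                keep.append((question, cot, gold))
        # the paper fine-tunes from the *base* model each round to limit drift
        model = finetune(base_model, keep)
    return model
```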
- "90% of frontier ai research is already on arxiv, x, or company blog posts. q* is just STaR, search is just GoT/MCTS, continuous learning is clever graph retrieval, +1 OOM efficiency gains in deepseek-coder paper" https://x.com/aidan_mclau/status/1811898849189069068?t=Sp5B-qFF_7xohpTwNvbnHg&s=19
- Foundation models for music: [[2408.14340] Foundation Models for Music: A Survey](https://arxiv.org/abs/2408.14340)
- Multimodal diffusion plus next-token prediction: [[2408.11039] Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model](https://www.arxiv.org/abs/2408.11039)
- Llama 3.1 details: https://x.com/danielhanchen/status/1815798460052038095
- Distilling transformers to Mamba: [[2408.10189] Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models](https://arxiv.org/abs/2408.10189)
- Agent 2.0: [[2406.18532] Symbolic Learning Enables Self-Evolving Agents](https://arxiv.org/abs/2406.18532)
- Constraining LLMs using automata: [[2407.08103v1] Automata-based constraints for language model decoding](https://arxiv.org/abs/2407.08103v1)
- [FermiNet: Quantum physics and chemistry from first principles - Google DeepMind](https://deepmind.google/discover/blog/ferminet-quantum-physics-and-chemistry-from-first-principles/)
- Lagrangian neural networks: [https://youtu.be/HLUIx6FqAvg?si=irQp7qKm-7kW5Wf4](https://youtu.be/HLUIx6FqAvg?si=irQp7qKm-7kW5Wf4)
- Quiet-STaR (Self-Taught Reasoner) explainer: [https://youtu.be/T9gAg_IXB5w?si=VzdBeoei0qLQsQ5Z](https://youtu.be/T9gAg_IXB5w?si=VzdBeoei0qLQsQ5Z)
- LSTM plus Mamba: [[2403.16536v2] VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotemporal Forecasting](https://arxiv.org/abs/2403.16536v2)
- Foundation models course: https://x.com/rohanpaul_ai/status/1828577063378547017 [CS 886: Recent Advances on Foundation Models](https://cs.uwaterloo.ca/~wenhuche/teaching/cs886/)
- "Funny how secret sauce methods become openly published just as the frontier moves forward. - Data composition solved - LR schedules 80% solved - efficient RLHF ≈solved. New frontier: actually strong comprehensive *pretraining* synthetic data recipes, distillation, LLM RL…" https://x.com/teortaxesTex/status/1812282965935673545?t=iazC_r-JAdiGSbnXC8-vXQ&s=19
- New papers: https://x.com/TheAITimeline/status/1812216803323461736 [[2407.12077] GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression](https://arxiv.org/abs/2407.12077)
- Prompt engineering survey: https://x.com/bindureddy/status/1814409737557160044?t=6PqfEy63nWOQJ5lIEKBZEg&s=19
- "self-modeling to artificial networks causes a significant reduction in network complexity" https://x.com/juddrosenblatt/status/1814463465572184133
- Paper roundup:
  1. Interfacing an LLM with a reliable symbolic system (Prolog) raises math performance near ceiling. Tested on an *entirely new* collection of math word problems, the Non-Linear Reasoning (NLR) dataset, to ensure all were outside the LLM's training set: GPT fails completely, but GPT writing Prolog code succeeds near ceiling. [[2407.11373] Reliable Reasoning Beyond Natural Language](https://arxiv.org/abs/2407.11373) (a minimal sketch of this pipeline follows this list)
  2. How can informal reasoning improve formal theorem proving? Lean-STaR: a framework for learning to interleave informal thoughts with steps of formal proving. Training language models to produce informal thoughts prior to each step of a proof improves the model's formal theorem-proving capabilities. [[2407.10040] Lean-STaR: Learning to Interleave Thinking and Proving](https://arxiv.org/abs/2407.10040)
  3. Adding self-modeling to artificial networks causes a significant reduction in network complexity. When artificial networks learn to predict their internal states as an auxiliary task, they change in a fundamental way. [[2407.10188] Unexpected Benefits of Self-Modeling in Neural Systems](https://arxiv.org/abs/2407.10188)
  4. A system that incorporates both natural-language pre-training and reinforcement learning from the start. [[2308.01399] Learning to Model the World with Language](https://arxiv.org/abs/2308.01399)
  5. Georgia Tech researchers have developed a neural network, RTNet, that mimics human decision-making processes, including confidence and variability, improving its reliability and accuracy in tasks like digit recognition. [A New Neural Network Makes Decisions Like a Human Would | Research](https://research.gatech.edu/new-neural-network-makes-decisions-human-would)
  9. Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive? [[2406.04391] Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?](https://arxiv.org/abs/2406.04391)
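Item 1 above is easy to picture concretely: the LLM translates the word problem into a small logic program, and a real solver does the deduction. A minimal sketch assuming SWI-Prolog and the pyswip bridge are installed; the `goal` string is hand-written here in place of actual LLM output:

```python
from pyswip import Prolog

prolog = Prolog()

# e.g. "Alice and Bob have 11 apples together; Alice has 3 more than Bob."
# An LLM would emit a constraint program like this; the solver, not the LLM,
# performs the actual reasoning.
goal = "between(0, 11, Bob), Alice is Bob + 3, 11 is Alice + Bob"

for solution in prolog.query(goal):
    print(solution)   # {'Bob': 4, 'Alice': 7}
```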
- [[2402.14735] How Transformers Learn Causal Structure with Gradient Descent](https://arxiv.org/abs/2402.14735) https://x.com/EshaanNichani/status/1762122616255709665?t=rdJ34C0am2JCWkSFLREQYA&s=19
- AI with universal physical understanding: https://x.com/AnimaAnandkumar/status/1815441489398546741?t=3VhKU4MHfe6v25BrxeBZug&s=19
- Model distillation: https://x.com/rohanpaul_ai/status/1815464921032991003?t=yE3LN0PAdL9GzhKBk07Cwg&s=19 [[2402.13116] A Survey on Knowledge Distillation of Large Language Models](https://arxiv.org/abs/2402.13116) (a sketch of the classic logit-matching loss follows this list)
- Jailbreaking techniques: [[2403.04786v1] Breaking Down the Defenses: A Comparative Survey of Attacks on Large Language Models](https://arxiv.org/abs/2403.04786v1)
- Training works at the edge of stability; the edge of chaos is optimal: https://x.com/Ethan_smith_20/status/1815759691068350875
- AI × math and reasoning: [AI achieves silver-medal standard solving International Mathematical Olympiad problems - Google DeepMind](https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/) [Harmonic - News](https://www.harmonic.fun/news) https://x.com/reach_vb/status/1811069858584363374 [[2404.06405] Wu's Method can Boost Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry](https://arxiv.org/abs/2404.06405) [FunSearch: Making new discoveries in mathematical sciences using Large Language Models - Google DeepMind](https://deepmind.google/discover/blog/funsearch-making-new-discoveries-in-mathematical-sciences-using-large-language-models/) [[2006.08381] DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning](https://arxiv.org/abs/2006.08381) [[2203.14465] STaR: Bootstrapping Reasoning With Reasoning](https://arxiv.org/abs/2203.14465) [GitHub - teacherpeterpan/self-correction-llm-papers: This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.](https://github.com/teacherpeterpan/self-correction-llm-papers)
- Making LLMs do arithmetic: https://x.com/jxmnop/status/1816958426385383753
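The distillation survey above is about compressing a teacher LLM into a smaller student; the classic logit-matching objective (Hinton et al., 2015) is a temperature-scaled KL term mixed with the ordinary task loss. A minimal PyTorch sketch, with random tensors standing in for real model outputs:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # soft-target term: KL between temperature-softened distributions,
    # scaled by T^2 so gradient magnitude stays comparable across temperatures
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # hard-target term: ordinary cross-entropy on the true labels
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student_logits = torch.randn(4, 10, requires_grad=True)  # stand-in for a student model
teacher_logits = torch.randn(4, 10)                      # stand-in for a frozen teacher
labels = torch.randint(0, 10, (4,))
distillation_loss(student_logits, teacher_logits, labels).backward()
```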
- Synthetic data engineering: https://x.com/alexandr_wang/status/1816491442069782925?t=jDuJ7Ka8wLKVvJPXoXvrLA&s=19
- OpenAI's Strawberry hypothesis, active inference: https://x.com/tedx_ai/status/1819761571952042448 https://x.com/tedx_ai/status/1820425220991393929
- AI resources: https://x.com/sebkrier/status/1824849583362723975?t=4eznZWSih89sslTfnSLXOA&s=19
- [The future of AI is analogue | Jonathan Peters » IAI TV](https://iai.tv/articles/the-future-of-ai-is-analogue-auid-2917)
- [AI made of jelly ‘learns’ to play Pong — and improves with practice](https://www.nature.com/articles/d41586-024-02704-y)
- [GameNGen](https://gamengen.github.io/)
- Quantum neural networks: https://www.nature.com/articles/s41467-024-50966-x [[2408.12739] Quantum Convolutional Neural Networks are (Effectively) Classically Simulable](https://arxiv.org/abs/2408.12739)
- [[2403.01946] A Generative Model of Symmetry Transformations](https://arxiv.org/abs/2403.01946)
- Chomsky on impossible language models: [https://www.youtube.com/watch?v=8lU6dGqR26s](https://www.youtube.com/watch?v=8lU6dGqR26s&feature=youtu.be)
- [Novel Chinese computing architecture 'inspired by human brain' can lead to AGI, scientists say | Live Science](https://www.livescience.com/technology/artificial-intelligence/novel-chinese-computing-architecture-inspired-by-human-brain-can-lead-to-agi-scientists-say) [Network model with internal complexity bridges artificial intelligence and neuroscience | Nature Computational Science](https://www.nature.com/articles/s43588-024-00674-9)
- [How the Higgs Field (Actually) Gives Mass to Elementary Particles | Quanta Magazine](https://www.quantamagazine.org/how-the-higgs-field-actually-gives-mass-to-elementary-particles-20240903/)
- Open-source implementation of AlphaFold3: [GitHub - Ligo-Biosciences/AlphaFold3: Open source implementation of AlphaFold3](https://github.com/Ligo-Biosciences/AlphaFold3)
- https://honeycomb.sh/blog/swe-bench-technical-report
- [Allen School News » Mind over model: Allen School's Rajesh Rao proposes brain-inspired AI architecture to make complex problems simpler to solve](https://news.cs.washington.edu/2024/08/19/mind-over-model-allen-schools-rajesh-rao-proposes-brain-inspired-ai-architecture-to-make-complex-problems-simpler-to-solve/) [A sensory–motor theory of the neocortex | Nature Neuroscience](https://www.nature.com/articles/s41593-024-01673-9)
- [MarioVGG](https://virtual-protocol.github.io/mario-videogamegen/) [[2408.14837] Diffusion Models Are Real-Time Game Engines](https://arxiv.org/abs/2408.14837)
- [[2405.05254] You Only Cache Once: Decoder-Decoder Architectures for Language Models](https://arxiv.org/abs/2405.05254)
- [GitHub - verazuo/jailbreak_llms: [CCS'24] A dataset consists of 15,140 ChatGPT prompts from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak prompts).](https://github.com/verazuo/jailbreak_llms)
- Transformers solve an open problem in symbolic mathematics: https://x.com/f_charton/status/1834725632728613094 [https://www.youtube.com/watch?v=yCzV97QNG8w](https://www.youtube.com/watch?v=yCzV97QNG8w)
- Denny Zhou (founded and leads the reasoning team at Google DeepMind): "We have mathematically proven that transformers can solve any problem, provided they are allowed to generate as many intermediate reasoning tokens as needed. Remarkably, constant depth is sufficient." https://fxtwitter.com/denny_zhou/status/1835761801453306089
- Diagram of Thought for math: [[2409.10038] On the Diagram of Thought](https://arxiv.org/abs/2409.10038)
- Parables on the Power of Planning in AI: From Poker to Diplomacy, by Noam Brown (researches reasoning at OpenAI): [https://www.youtube.com/watch?v=eaAonE58sLU](https://www.youtube.com/watch?v=eaAonE58sLU) https://x.com/burny_tech/status/1836991461692231707
- DeepMind's table-tennis robot, trained with evolutionary methods rather than gradients: https://x.com/davidad/status/1821622048596144490 https://x.com/GoogleDeepMind/status/1821562365931855970 [Competitive robot table tennis (project site)](https://sites.google.com/view/competitive-robot-table-tennis/home?utm_source&utm_medium&utm_campaign&utm_content&pli=1)
- [Scalable spatiotemporal prediction with Bayesian neural fields | Nature Communications](https://www.nature.com/articles/s41467-024-51477-5) https://x.com/GoogleAI/status/1839730191347699820?t=C0zweN806imi8-S_gXD_4g&s=19
- Flow matching: [[2210.02747] Flow Matching for Generative Modeling](https://arxiv.org/abs/2210.02747) [https://www.youtube.com/watch?v=7NNxK3CqaDk](https://www.youtube.com/watch?v=7NNxK3CqaDk) (the core objective is written out after this list)
- Liquid foundation models, transformer killers? https://x.com/LiquidAI_/status/1840768728402993352?t=GhYuA85zs6FmKec0qNxgig&s=19 [From Liquid Neural Networks to Liquid Foundation Models](https://www.liquid.ai/blog/liquid-neural-networks-research)
- Automated design of agentic systems (multi-agent systems creating multi-agent systems): [[2408.08435] Automated Design of Agentic Systems](https://arxiv.org/abs/2408.08435) https://x.com/rohanpaul_ai/status/1841502578028503298
- Thermodynamic AI: https://twitter.com/MaxAifer/status/1841671025991229874 https://twitter.com/MaxAifer/status/1841902089015824393 [[2410.01793] Thermodynamic Bayesian Inference](https://arxiv.org/abs/2410.01793) [https://www.youtube.com/watch?v=6DrCq8Ry2cw](https://www.youtube.com/watch?v=6DrCq8Ry2cw)
- Human-Timescale Adaptation in an Open-Ended Task Space [[2301.07608] Human-Timescale Adaptation in an Open-Ended Task Space](https://arxiv.org/abs/2301.07608)
- POET: endlessly generating increasingly complex and diverse learning environments and their solutions: [[1901.01753] Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions](https://arxiv.org/abs/1901.01753) https://www.uber.com/en-CZ/blog/enhanced-poet-machine-learning/
- RAG over a lot of files in practice: https://x.com/shanselman/status/1842667956100296722?t=AeWTwMZPkya78xAOTllngA&s=19
- [[2410.08146] Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning](https://arxiv.org/abs/2410.08146)
- LeanAgent: Lifelong Learning for Formal Theorem Proving [[2410.06209] LeanAgent: Lifelong Learning for Formal Theorem Proving](https://arxiv.org/abs/2410.06209)
- Recursively self-improving swarms, maybe soon: [GitHub - openai/swarm: Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.](https://github.com/openai/swarm)
- AI agents: [GitHub - hyp1231/awesome-llm-powered-agent: Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...](https://github.com/hyp1231/awesome-llm-powered-agent)
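For orientation on the flow-matching entry above: in its simplest conditional form (the straight-line path; the paper's OT path reduces to this as σ_min → 0), training is plain regression of a velocity field, and sampling integrates the learned ODE from noise to data:

$$
x_t = (1-t)\,x_0 + t\,x_1, \qquad x_0 \sim \mathcal{N}(0, I),\; x_1 \sim p_{\text{data}},\; t \sim \mathcal{U}[0,1]
$$

$$
\mathcal{L}_{\mathrm{CFM}}(\theta) = \mathbb{E}_{t,\,x_0,\,x_1}\left\| v_\theta(x_t, t) - (x_1 - x_0) \right\|^2, \qquad \text{sampling: } \frac{dx}{dt} = v_\theta(x, t) \text{ from } t=0 \text{ to } t=1.
$$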
- Auto-rotating neural networks: [Auto-Rotating Neural Networks: An Alternative Approach for Preventing Vanishing Gradients | OpenReview](https://openreview.net/forum?id=Gg3JlR9btC)
- Cool playlist of machine-learning resources: [YouTube playlist](https://www.youtube.com/playlist?list=PLg642XuzWS1Ts3ED6ym_HJ9hqq8QOdB9Z)
- Understanding LLMs: A Comprehensive Overview from Training to Inference [[2401.02038] Understanding LLMs: A Comprehensive Overview from Training to Inference](https://arxiv.org/abs/2401.02038)
- Automated agent workflow generation: [[2410.10762] AFlow: Automating Agentic Workflow Generation](https://arxiv.org/abs/2410.10762)
- Ilya Sutskever's recommended AI papers: https://x.com/Sumanth_077/status/1853078836239613970
- Automated prompt engineering: https://x.com/cwolferesearch/status/1853465302031556725?t=nMBZJzaz_xMWFklZCxcnAQ&s=19
- Differential Transformer: [https://www.youtube.com/watch?v=6Xy0tCidPl0](https://www.youtube.com/watch?v=6Xy0tCidPl0)
- Harvard Presents NEW Knowledge-Graph AGENT (MedAI): [https://www.youtube.com/watch?v=Fm68I-phaiY](https://www.youtube.com/watch?v=Fm68I-phaiY)
- Alternatives to LangChain: [r/LangChain thread](https://www.reddit.com/r/LangChain/s/IdKXYMpiua)
- A better optimizer than Adam, ADOPT: [[2411.02853] ADOPT: Modified Adam Can Converge with Any $β_2$ with the Optimal Rate](https://arxiv.org/abs/2411.02853) https://x.com/ishohei220/status/1854051859385978979
- Jailbreaks: [A list of various Jail Breaks for different models](https://rentry.org/jb-listing); local models: [The meta list of various /aicg/ guides on running local](https://rentry.org/meta_golocal_list)
- Best neurosymbolic AI architecture, recommended by Chollet: https://x.com/fchollet/status/1855362153563463762?t=hAojgyzrh4468vLJH-MnQA&s=19 [[2410.23156] VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning](https://arxiv.org/abs/2410.23156)
- MIT researchers develop an efficient way to train more reliable AI agents: [MIT researchers develop an efficient way to train more reliable AI agents | MIT News | Massachusetts Institute of Technology](https://news.mit.edu/2024/mit-researchers-develop-efficiency-training-more-reliable-ai-agents-1122)
- Model-Based Transfer Learning for Contextual Reinforcement Learning [[2408.04498] Model-Based Transfer Learning for Contextual Reinforcement Learning](https://arxiv.org/abs/2408.04498): "Experimental results suggest that MBTL can achieve up to 50x improved sample efficiency compared with canonical independent training and multi-task training. This work lays the foundations for investigating explicit modeling of generalization, thereby enabling principled yet effective methods for contextual RL. We introduce Model-Based Transfer Learning (MBTL), which layers on top of existing RL methods to effectively solve contextual RL problems. MBTL models the generalization performance in two parts: 1) the performance set point, modeled using Gaussian processes, and 2) performance loss (generalization gap), modeled as a linear function of contextual similarity. MBTL combines these two pieces of information within a Bayesian optimization (BO) framework to strategically select training tasks." (a toy sketch of this selection loop follows this list)
- [GitHub - NirDiamant/Prompt_Engineering: This repository offers a comprehensive collection of tutorials and implementations for Prompt Engineering techniques, ranging from fundamental concepts to advanced strategies. It serves as an essential resource for mastering the art of effectively communicating with and leveraging large language models in AI applications.](https://github.com/NirDiamant/Prompt_Engineering?tab=readme-ov-file#-advanced-strategies)
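The MBTL abstract above is concrete enough to sketch: a GP models the performance "set point" over a context space, a fixed linear slope models the generalization gap, and the next training task is the candidate that most raises predicted coverage. A toy reading of the idea (not the authors' code), using scikit-learn; the landscape, slope, and acquisition rule are all assumptions:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

contexts = np.linspace(0.0, 1.0, 101)  # 1-D context space (e.g. a task parameter)
SLOPE = 0.8                            # assumed generalization-gap slope

def train_and_evaluate(c):
    # stand-in for actually training an RL policy on context c
    return float(np.exp(-3.0 * (c - 0.4) ** 2))  # toy performance landscape

def coverage(sources, set_points):
    # transferred performance at every context = best over trained tasks of
    # (source set point - SLOPE * |context distance|), per the paper's model
    m = np.array([[p - SLOPE * abs(c - s) for s, p in zip(sources, set_points)]
                  for c in contexts])
    return m.max(axis=1)

trained, perf = [0.0], [train_and_evaluate(0.0)]
for step in range(4):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2))
    gp.fit(np.array(trained)[:, None], np.array(perf))
    mu, sigma = gp.predict(contexts[:, None], return_std=True)
    optimistic = mu + sigma  # simple UCB-style estimate of a candidate's set point
    current = coverage(trained, perf)
    gains = [np.maximum(current, coverage([x], [m])).sum() - current.sum()
             for x, m in zip(contexts, optimistic)]
    nxt = float(contexts[int(np.argmax(gains))])
    trained.append(nxt)
    perf.append(train_and_evaluate(nxt))
    print(f"round {step}: next training context = {nxt:.2f}")
```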
- Llama 3.1 405B: [https://youtu.be/68_BKkymYs4?si=HGGP6UIJHCcvHP1e](https://youtu.be/68_BKkymYs4?si=HGGP6UIJHCcvHP1e)
- Brain learning algorithm: [Inferring neural activity before plasticity as a foundation for learning beyond backpropagation | Nature Neuroscience](https://www.nature.com/articles/s41593-023-01514-1)
- Database of how companies actually deploy LLMs in production (300+ technical case studies, including self-hosted): https://x.com/rohanpaul_ai/status/1863606556228788354?t=6ei5ThiAa7oGY_lVZx0SGw&s=19
- [AI Engineering: Building Applications with Foundation Models (Amazon)](https://www.amazon.com/AI-Engineering-Building-Applications-Foundation/dp/1098166302)
- AlphaZero from scratch: [https://www.youtube.com/watch?v=wuSQpLinRB4](https://www.youtube.com/watch?v=wuSQpLinRB4) [MuZero - Wikipedia](https://en.wikipedia.org/wiki/MuZero) (a bare-bones MCTS sketch follows this list)
- AlphaFold from scratch: [YouTube playlist](https://www.youtube.com/playlist?list=PLJ0WcPQS7xJVJr6ceIPFSkAGAgrkmw1c9)
- DeepSeek-V3 technical report: [🚀 Introducing DeepSeek-V3 | DeepSeek API Docs](https://api-docs.deepseek.com/news/news1226) https://x.com/nrehiew_/status/1872318161883959485?t=SB9rLEF4MoCy4Gt9FBVicQ&s=19
- [Physics-informed neural networks - Wikipedia](https://en.wikipedia.org/wiki/Physics-informed_neural_networks)
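To go with the AlphaZero-from-scratch links: the search half of AlphaZero is MCTS, and the bare UCT loop fits on a page. A minimal sketch on a toy game (players alternately add 1 or 2; whoever reaches 10 wins), with random rollouts where AlphaZero would instead consult a policy/value network:

```python
import math
import random

WIN = 10  # players alternately add 1 or 2; whoever reaches exactly 10 wins

def moves(n):
    return [m for m in (1, 2) if n + m <= WIN]

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.untried = [], moves(state)
        self.wins, self.visits = 0, 0  # wins for the player who moved INTO this node

def uct_child(node, c=math.sqrt(2)):
    return max(node.children, key=lambda ch:
               ch.wins / ch.visits + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(state):
    # random playout; returns 0 if the player to move at `state` wins, else 1
    to_move = 0
    while moves(state):
        state += random.choice(moves(state))
        if state == WIN:
            return to_move
        to_move ^= 1
    return 1  # no move possible: the previous mover already won

def mcts(root_state, iterations=2000):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        while not node.untried and node.children:      # 1. selection
            node = uct_child(node)
        if node.untried:                               # 2. expansion
            child = Node(node.state + node.untried.pop(), parent=node)
            node.children.append(child)
            node = child
        result = rollout(node.state)                   # 3. simulation
        while node is not None:                        # 4. backpropagation
            node.visits += 1
            node.wins += (result == 1)  # credit the player who moved into `node`
            result ^= 1                 # flip perspective at each level up
            node = node.parent
    best = max(root.children, key=lambda ch: ch.visits)
    return best.state - root_state

print("best first move from 0:", mcts(0))  # optimal play here is to add 1
```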