LLMs can create novel content
given examples can be extended: the logical patterns are apparently localized and reused
local, continuous, weak generalization seems to happen, is what I mean
not a novel alien mind, but not just regurgitation
I'm surprised that all the infinite gazillion-trillion-dollar supergigacorporations didn't connect their multimodal LLMs turned into chatbot products to some ground-truth knowledge base, like Wikipedia or the web in general, as efficiently as Perplexity does. And why don't they use Python more often, or WolframAlpha etc., every time they do arithmetic or other math where that's more efficient, or use physics engines, symbolic engines for reasoning, data science and other machine learning and AI tools and frameworks, and so on, calling them in the cases where they're more accurate, more efficient, and so on?
Perplexity does really well in a big subset of fact-retrieval use cases IMO, Consensus AI is great for studies, the Wolfram engine works well with integrals for example, etc. I think the technology is already out there, just implemented by others.
I think it's both a fundamental research problem and a software engineering problem. LLMs are still fundamentally limited in many ways no matter how many tools you give them. I'm advocating for using the tools in the use cases where they're already much better; but for many use cases you will need an upgrade of the fundamental technology or of the technologies connected to it.
DreamCoder: the minimum description length principle guiding a neural network guiding a symbolic domain-specific-language discrete program search, with similarly successful programs generalized together during sleep
[Chollet's ARC Challenge + Current Winners - YouTube](https://youtu.be/jSAT_RuJ_Cg?si=TQxPY9MOcg5kDAc0)
Domain-specific program BFS, comparing candidates with Hamming distance
DreamCoder: neural-network-guided program search
LLM program sampling and infinite refining with a domain-specific system prompt
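The first approach above can be sketched in miniature. This is a hedged toy example, not any actual ARC submission: breadth-first search over a hypothetical domain-specific language of grid operations, checking candidates against the target with Hamming distance. The DSL primitives (`reverse`, `inc`, `shift`) are invented for illustration.

```python
from collections import deque

# Toy DSL over small integer grids (tuples); primitives are hypothetical.
PRIMITIVES = {
    "reverse": lambda g: tuple(reversed(g)),
    "inc":     lambda g: tuple((x + 1) % 10 for x in g),
    "shift":   lambda g: g[-1:] + g[:-1],
}

def hamming(a, b):
    """Number of cells where two equal-length grids differ."""
    return sum(x != y for x, y in zip(a, b))

def bfs_program_search(inp, target, max_depth=4):
    """Breadth-first search over sequences of DSL primitives; returns the
    shortest program transforming `inp` into `target` (Hamming distance 0),
    or None if no program exists within `max_depth` steps."""
    queue = deque([(inp, [])])
    seen = {inp}
    while queue:
        grid, prog = queue.popleft()
        if hamming(grid, target) == 0:
            return prog
        if len(prog) >= max_depth:
            continue
        for name, fn in PRIMITIVES.items():
            nxt = fn(grid)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, prog + [name]))
    return None
```

For example, `bfs_program_search((1, 2, 3), (4, 3, 2))` finds a two-step program; the combinatorial explosion this naive search hits on realistic grids is exactly what the neural guidance in the other approaches is meant to tame.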
Transformers have attention, which is a permutation symmetry; future models will have hybrid symmetries (graphs, CNNs, ...) and hybrid neurosymbolic architectures with neurally guided feature and program-sampling search, with similarly successful short programs generalized together, planning, etc., in a multimodal embodied context
[[2406.03689] Evaluating the World Model Implicit in a Generative Model](https://arxiv.org/abs/2406.03689)
https://x.com/keyonV/status/1803838591371555252?t=OvX5fqwR7HeCgMZ1g3RHQw&s=19
GPT420: Approximating the manifold of the whole space of all possible tasks in all possible modalities and all possible solutions to them and the whole universe
Various forms of superintelligence will look at humans the way we look at plants: they sometimes react a bit, on a long information-processing timescale relative to us. Superintelligence might be 100,000x faster, with more optimally predictive models of the whole universe than ours.
Small gpt car
https://x.com/ax_pey/status/1804209628680720746?t=W6rBRjOvvF2I_e7xdizoJg&s=19
Accelerating grokking
[[2405.20233] Grokfast: Accelerated Grokking by Amplifying Slow Gradients](https://arxiv.org/abs/2405.20233)
https://x.com/_ironjr_/status/1798733867303772607?t=ap-se3Q1rJ5IRm5veEMHKw&s=19
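The Grokfast idea above can be sketched as a small gradient filter: keep an exponential moving average of past gradients (the slow component the paper associates with generalization) and add it back, amplified, before each optimizer step. A minimal sketch in the spirit of the paper's EMA variant; the `alpha` and `lamb` values here are illustrative, not the paper's tuned defaults.

```python
import numpy as np

class GrokfastEMA:
    """Gradient filter in the spirit of Grokfast's EMA variant
    (arXiv:2405.20233): maintain an exponential moving average of
    past gradients (the slow, generalization-carrying component in
    the paper's framing) and add it back, amplified by `lamb`,
    to the raw gradient before the optimizer step."""
    def __init__(self, alpha=0.98, lamb=2.0):
        self.alpha, self.lamb = alpha, lamb
        self.ema = None

    def filter(self, grad):
        if self.ema is None:
            self.ema = np.zeros_like(grad)
        self.ema = self.alpha * self.ema + (1 - self.alpha) * grad
        return grad + self.lamb * self.ema
```

Usage: `w -= lr * gf.filter(g)` replaces the plain `w -= lr * g` update, leaving the rest of the training loop untouched.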
https://x.com/davidad/status/1804550585124864123?t=3p6pDrDs3TSBCuazqoHQWg&s=19
Culture is software for the biological superintelligence that runs on the human neural meta-connectome
Automated AI software engineering
[Code Droid Technical Report](https://www.factory.ai/news/code-droid-technical-report)
https://x.com/svpino/status/1804140282243322106?t=8svyxLLVn2Tpd0H3rhAz8w&s=19
Past, present, and future gone to rest in timeless this-ness
Omnicausal singularity
[How To Hire AI Engineers — with James Brady & Adam Wiggins of Elicit](https://www.latent.space/p/hiring)
https://x.com/swyx/status/1804271733618413886?t=EtsaHRXv_lQwFJoDstCywg&s=19
Symmetry metalearning in geometric deep learning sense
['Humanzee' was grown in a lab before scientists euthanized it not long afterwards](https://www.unilad.com/news/animals/humanzee-grown-lab-scientists-062600-20240620?fbclid=IwZXh0bgNhZW0CMTEAAR0yi-hiO4F7PR6Rs2qp1g8FaeqSCzK3-pQqkG2bBSEkcBgpW_MkaJOLHYM_aem_cE2zj2cOHG3VdMOhw9Yrwg)
[[2406.14562] Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities](https://arxiv.org/abs/2406.14562)
https://news.ycombinator.com/item?id=40685752
[[2405.09637] CLASSP: a Biologically-Inspired Approach to Continual Learning through Adjustment Suppression and Sparsity Promotion](https://arxiv.org/abs/2405.09637)
[[2310.01365] Elephant Neural Networks: Born to Be a Continual Learner](https://arxiv.org/abs/2310.01365)
https://www.reddit.com/r/MachineLearning/s/NxM9X0b5Wi
causal modeling, strong generalization, continuous learning, data & compute efficiency, controllability and stability/reliability in implicit symbolic reasoning, agency, more complex tasks across time and space, long term planning, multimodal embodiment
Honestly, people saying that current AI systems can do something they for now cannot do (like fully automated software engineering) are in many ways harmful to the ecosystem, as that lowers people's trust, even though in many cases it can be a self-fulfilling prophecy where the resulting extra investment produces better systems
On a day to day basis do you *feel* like you are a bag of a bunch of cells solving a massive collective action problem to instantiate a semi-coherent agent?
[Phys. Rev. Lett. 127, 241103 (2021) - Real-Time Gravitational Wave Science with Neural Posterior Estimation](https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.127.241103)
https://x.com/AnnalenaKofler/status/1803724069327568973
[Crossable Wormholes? - YouTube](https://www.youtube.com/watch?v=bv-2SP7-ZEY)
[Why do people hate scientists? - YouTube](https://www.youtube.com/watch?v=7OSstBWdRTs)
[ICLR 2024 — Best Papers & Talks (Benchmarks, Reasoning & Agents) — ft. Graham Neubig, Aman Sanger, Moritz Hardt)](https://www.latent.space/p/iclr-2024-benchmarks-agents)
[What are Breakthroughs in Physics? - Edward Witten - Closer to Truth](https://closertotruth.com/video/wited-005/)
[Edward Witten - Why the ‘Unreasonable Effectiveness’ of Mathematics - YouTube](https://www.youtube.com/watch?v=1-Zl9o7I4Fo)
[Edward Witten - How Do Scientific Breakthroughs Happen? - YouTube](https://www.youtube.com/watch?v=YKQvj11tuKU)
[Edward Witten - What are Breakthroughs in Science? - YouTube](https://www.youtube.com/watch?v=bKapdscHwJ0)
[Edward Witten - How is Mathematics Truth and Beauty? - YouTube](https://www.youtube.com/watch?v=O3isFuQ2q2A)
Astrology driven development in machine learning only has one answer:
Try your random idea and see if it works.
[[2406.07496] TextGrad: Automatic "Differentiation" via Text](https://arxiv.org/abs/2406.07496)
https://x.com/james_y_zou/status/1800917174124740667?t=Ikmkj_0dJaSlRlNUZkZipA&s=19
[[2406.11179] Learning Iterative Reasoning through Energy Diffusion](https://arxiv.org/abs/2406.11179)
thinking about thinking about thinking about thinking about thinking about thinking about thinking about thinking about thinking about thinking about … (and so on, recursively)
I no longer want to lower my thinking
I want to embrace my thinking
New AI architecture
[[2401.17948] HyperZ$\cdot$Z$\cdot$W Operator Connects Slow-Fast Networks for Full Context Interaction](https://arxiv.org/abs/2401.17948)
https://x.com/LeopolisDream/status/1804627325583327358
https://x.com/HyperEvolAILab/status/1753251800906707288
[Edward Witten: On the Shoulders of Giants - YouTube](https://www.youtube.com/watch?v=2UQ8teAebcg)
[We Need To Stop OpenAI - YouTube](https://www.youtube.com/watch?v=XyiTDbKndNM)
I'm noticing that my mind often cannot grasp how many people often do not care about the future of humanity or sentience in general going well
[The Dangerous Propaganda Of Techno-Optimism - YouTube](https://www.youtube.com/watch?v=5iEUAp0QuPg) oh man, this is pure doom, I need to take some Kurzweil antidepressant after this
i believe there will be technological solutions to the water problem
Machine learning applied to water cooling optimization in data centers
Here's memetic antidepressant:
[Joe Rogan Experience #2117 - Ray Kurzweil - YouTube](https://youtu.be/w4vrOUau2iY?si=HCi_yFf0QR__6GYu)
We must prevent power centralization
Stay alive for longevity escape velocity thanks to AGI
https://x.com/Dr_Singularity/status/1804962619859533929?t=AkZ3Em6_UuAKmsv4RyWVEg&s=19
https://www.reddit.com/r/singularity/s/4v5LWm4i18
https://x.com/bshlgrs/status/1802766374961553887
[Getting 50% (SoTA) on ARC-AGI with GPT-4o](https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt)
I'd say his solution is still relatively weak, though, even if it's a great milestone, since it's kind of brute force. But not completely naive brute force: with a more naive brute-force search you'd get an insane combinatorial explosion in the space of all possible programs, so I think there is some (weaker) statistical inference/reasoning in there, even if IMO it's veeery locally weak
But I think there are better methods for doing smarter search in program space than this LLM-sampling program search, e.g. DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian typed program learning [[2006.08381] DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning](https://arxiv.org/abs/2006.08381)
^In my opinion, all the approaches used on the ARC-AGI challenge are best summarized in the first half hour here: [Chollet's ARC Challenge + Current Winners - YouTube](https://www.youtube.com/watch?v=jSAT_RuJ_Cg); the first attempts were, for example, still BFS over LLMs
21:15:
"He also said that the submission is ineligible for the ARC-AGI prize and the main leaderboard, as it uses a closed source model and too much runtime compute. And I think, you know, in principle it does go against the spirit of the ARC challenge because, you know, all he's done is kind of like used the memorization power of the underlying language model to generate a whole bunch of programs and then done a brute force search with an evaluation over the generated programs.
So you could argue that it's not it's not really, you know, reasoning efficiency in the spirit of Chollet's ARC"
Personally I think it's a bit more than just memorization, and that it isn't naive brute force 😄 there are learned emergent weakly generalizing circuits in there, so the model doesn't drown in the combinatorial explosion
But he sees attention mechanisms in transformers, which essentially implement permutation symmetry, as a form of weak reasoning; he just (and I do too) pushes hard for implementing more forms of reasoning in AI systems than only this one, whether other symmetries or neurosymbolics like DreamCoder (the minimum description length principle guiding a neural network guiding a symbolic domain-specific-language discrete program search, with similarly successful programs generalized together in "sleep")
The one who got 50% has nice future predictions about the benchmark:
https://imgur.com/7kyEEsv
Recent scaled-up reverse engineering of these giga-models, even though still relatively weak, shows interesting internals
[Mapping the Mind of a Large Language Model \ Anthropic](https://www.anthropic.com/research/mapping-mind-language-model)
https://openai.com/index/extracting-concepts-from-gpt-4/
Claude 3 Sonnet is so great as a copilot helping to digest ML papers
Tricks:
Think of it like playing chess: you need to get the LLM to end up somewhere after several steps. Somewhere = exactly what you want.
Be generic: when you try to prevent the LLM from doing something, don't react to one problem, react to a pattern.
You can use smarter models (but you need a detailed system prompt) to help you develop the prompt.
It's all about context. If the LLM makes mistakes, think about what information (or data) it should have known to make a better decision.
Do lots of runs before you change the prompt, to see the frequency of the mistakes.
Ideally change one thing at a time.
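The "do lots of runs" trick can be sketched as a tiny harness that tallies how often each failure mode appears before you touch the prompt. Both `call_llm` and the failure predicates are hypothetical placeholders, not any real API:

```python
from collections import Counter

def mistake_frequency(call_llm, prompt, checks, n_runs=20):
    """Run the same prompt `n_runs` times and tally how often each
    failure mode fires, so prompt edits target the most frequent
    pattern instead of a single bad sample. `call_llm` is a
    hypothetical callable (prompt -> output string); `checks` maps
    a failure label to a predicate over the output."""
    counts = Counter()
    for _ in range(n_runs):
        output = call_llm(prompt)
        for label, is_failure in checks.items():
            if is_failure(output):
                counts[label] += 1
    return {label: counts[label] / n_runs for label in checks}
```

Once the frequencies are measured, change one thing in the prompt at a time and re-run, so you can attribute any shift in failure rate to that one change.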
Sama and co are the hardest people to read; they're playing all sides all at once
[Inventor and futurist talks his hopes for the advancement of AI and technology - YouTube](https://www.youtube.com/watch?v=9v5WlPBVnqU)
Do you perform lasso regularization or ridge regularization in your world model? Are your representations sparse? Do you give nonzero weight to all priors or completely remove some priors from being considered? Does new evidence tend to update you relatively strongly or weakly?
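For the regularization analogy above, here is a minimal sketch of the difference in weight space: lasso (L1, via soft-thresholding) drives the weight of an irrelevant "prior" exactly to zero, while ridge (L2) only shrinks it toward zero. A toy least-squares problem with illustrative hyperparameters, nothing tuned:

```python
import numpy as np

def fit(X, y, penalty, lam=0.1, lr=0.01, steps=5000):
    """Least squares with either ridge (L2) gradient descent or
    lasso (L1) via proximal / soft-threshold steps."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        if penalty == "l2":
            w -= lr * (grad + lam * w)
        else:  # "l1": plain gradient step, then soft-threshold
            w -= lr * grad
            w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + 1.5 * X[:, 1]   # feature 2 carries no signal
w_l1 = fit(X, y, "l1")
w_l2 = fit(X, y, "l2")
# w_l1[2] lands exactly at zero (sparse, from soft-thresholding);
# w_l2[2] is merely small but nonzero.
```

In the world-model analogy: L1-like updating can completely remove a prior from consideration, while L2-like updating keeps every prior with nonzero weight, just downweighted.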
Yann Lecun:
"Can generative image models be good world models?
This work from @Meta FAIR shows that there is a tradeoff between realism and diversity.
The more realistic a generative model becomes, the less diverse it becomes.
Realism comes at the cost of coverage.
In other words, the most realistic systems are mode-collapsed.
My hunch, supported by a growing amount of empirical evidence, is that world models should *not* be generative.
They should make predictions in representation space.
In representation space, unpredictable or otherwise irrelevant information is absent.
This is the main argument in favor of JEPA (Joint Embedding Predictive Architectures)."
[[2406.10429] Consistency-diversity-realism Pareto fronts of conditional image generative models](https://arxiv.org/abs/2406.10429)
https://x.com/ylecun/status/1803677519314407752
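LeCun's point can be sketched by contrasting the two losses: a generative world model is penalized for every unpredictable detail of the future observation, while a JEPA-style model is penalized only in representation space, where the encoder has already discarded unpredictable information. Everything below (shapes, the toy encoders, the linear predictor) is a hypothetical illustration, and the real anti-collapse machinery (e.g. an EMA target encoder) is omitted:

```python
import numpy as np

rng = np.random.default_rng(1)

def encoder(x, W):
    """Toy nonlinear encoder into a low-dim representation space."""
    return np.tanh(x @ W)

# Hypothetical shapes: 8-dim observations, 3-dim representations.
W_ctx = rng.normal(size=(8, 3))   # context encoder weights
W_tgt = rng.normal(size=(8, 3))   # target encoder (stop-gradient in real JEPA)
predictor = rng.normal(size=(3, 3))

def generative_loss(x_ctx, x_tgt, decode):
    """Generative world model: reconstruct the raw future observation,
    so every unpredictable detail contributes to the loss."""
    return float(np.mean((decode(x_ctx) - x_tgt) ** 2))

def jepa_loss(x_ctx, x_tgt):
    """JEPA-style: predict the representation of the future observation;
    information the encoder discards never enters the loss."""
    pred = encoder(x_ctx, W_ctx) @ predictor
    target = encoder(x_tgt, W_tgt)
    return float(np.mean((pred - target) ** 2))
```

The design difference is where the error lives: `generative_loss` is measured in observation space, `jepa_loss` in the 3-dim representation space, which is the tradeoff the thread above is about.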
I wonder if I should stay in STEM or try politics. I want both. Maybe both in parallel will be the result.
https://towardsdatascience.com/foundation-models-in-graph-geometric-deep-learning-f363e2576f58
Even if AGI is achieved in the next year and can do all tasks that humans can, integrating it into the state and corporations full of infinitely slow bureaucracy will take longer than the heat death of the universe
https://www.reddit.com/r/QuantumComputing/comments/1dmlbz9/favourite_quantum_youtube_channels/
Understanding does not depend on knowing a lot of facts as such, but on having the right concepts, explanations, and theories.
@DavidDeutschOxf
Ontologically agnostic ontology
Is us learning mathematics literally learning generalizing circuits in the brain?
What different strengths do different models like GPT-4, GPT-4o, Claude 3 Opus, Claude 3.5 Sonnet, Gemini 1.5 Pro etc. have?
I'm thinking for example:
Claude 3 Opus and Claude 3.5 Sonnet for complex coding, STEM, structuring, creativity.
GPT-4o for easier code.
Gemini 1.5 Pro for long context retrieval and text structuring.
[Anthropic CEO Says We'll Need More Than UBI to Solve Inequality - Business Insider](https://www.businessinsider.com/anthropic-ceo-dario-amodei-universal-basic-income-ubi-ai-inequality-2024-6)
New Study Suggests Universal Laws Govern Brain Structure From Mice to Men
[Unveiling universal aspects of the cellular anatomy of the brain | Communications Physics](https://www.nature.com/articles/s42005-024-01665-y)
[Brain’s structure hangs in ‘a delicate balance’ - Northwestern Now](https://news.northwestern.edu/stories/2024/06/brains-structure-hangs-in-a-delicate-balance/)
"Transformers really do learn world models, but they're thickly fouled by incoherence, and this makes them behave unreliably, especially out-of-distribution." — davidad [[2406.03689] Evaluating the World Model Implicit in a Generative Model](https://arxiv.org/abs/2406.03689)
[I Optimised My Game Engine Up To 12000 FPS - YouTube](https://www.youtube.com/watch?v=40JzyaOYJeY)
[Mapping the Brain - YouTube](https://www.youtube.com/watch?v=VSG3_JvnCkU)
[Grokking Group Multiplication with Cosets | OpenReview](https://openreview.net/forum?id=hcQfTsVnBo)
Clifford-Steerable Convolutional Neural Networks
"We significantly and consistently outperform baseline methods on fluid dynamics as well as relativistic electrodynamics forecasting tasks."
[[2402.14730] Clifford-Steerable Convolutional Neural Networks](https://arxiv.org/abs/2402.14730)
https://www.pnas.org/doi/10.1073/pnas.2410196121
[Making Our Own Working Neuron Arrays! | DOOM Neurons Part 2 - YouTube](https://youtu.be/c-pWliufu6U)
"AGI could be one of the most beneficial technologies ever created, but it may also be extremely dangerous. It is absolutely crucial that we get it right."
https://x.com/NeelNanda5/status/1805033158745710821?t=lS3u1UXopWU53kNEE8ERQQ&s=19
I think AI engineers and researchers are biased about how good current LLMs are at parts of their job, since the models are most likely trained on ML coding the most, so they perform best there: I bet they have seen the PyTorch library and the Attention Is All You Need paper orders of magnitude more times than other stuff. AI engineers and researchers also have theoretical knowledge of how the models work and how to prompt them, have on average more practical experience using them for coding and math tasks in their ML domain where the models are heavily trained, know their limitations and capabilities, and know how to take advantage of connecting the AI systems to other systems, which gives them a big comparative advantage in terms of skill.
Is GPT-4/Claude 3.5 as smart as a high schooler, or as Wikipedia, or something else?
Nuclear fission > nuclear fusion
https://x.com/fchollet/status/1805343413669446022?t=JV_1ieABkqG0-XwnjUj0qQ&s=19
Theory of everything as a treat
Joscha Bach [Joscha at Microsoft - YouTube](https://youtu.be/XsGfCfMQgNs?si=BZsCdBuXDndsoPph)
[[2406.08862] Cognitively Inspired Energy-Based World Models](https://arxiv.org/abs/2406.08862)
[The Insane Engineering of MRI Machines - YouTube](https://youtu.be/NlYXqRG7lus?si=zSjUcwNaaSm8t61o)
No matter how long you train on RNA & DNA data alone, you will never know protein function & structure. No matter how long you train on protein function & structure alone, you will never know what reaction pathways they participate in.
Need to train across as many modalities as possible to see big gains in performance. But this requires way more data and way more compute because you have to encode across many different tokenization schemes.
Anyone who claims that their political party does no wrong and the other party does no right is either a liar, a fool or both.
[Brain-reading device is best yet at decoding ‘internal speech’](https://www.nature.com/articles/d41586-024-01424-7?utm_medium=Social&utm_campaign=nature&utm_source=Twitter#Echobox=1715613447)
https://www.liebertpub.com/doi/10.1089/genbio.2024.0011
https://x.com/SimonDBarnett/status/1805242329948725314?t=HLlTkLUuQW2vRG_vok05Ag&s=19
https://www.researchgate.net/publication/357097879_The_Many_Faces_of_Information_Geometry
https://x.com/BensenHsu/status/1805192217004568650?t=abdvauHEGDgo7EtZYjbVaw&s=19
A conspiracy theory is a narrative that gives meaning to the world by connecting significant dots with confabulation and motivated reasoning, making truth and fiction indistinguishable. The consensus narrative does that too, but it aims at unifying society, not splintering it.
https://x.com/Plinz/status/1489838425129635843?t=PDVjZXW_VoPTHuIbu_Khsg&s=19
[Does intelligence lead to suffering? | Joscha Bach and Lex Fridman - YouTube](https://youtu.be/xStS6pft1SI?si=zN3RfCepQ-9QTSRG)
So many people fighting over the validity of intelligence and artificial intelligence without even defining intelligence, basing it just on vibes, is so confusing to me
Reddit is basically an autistic dream because there is a subreddit for any special interest that you currently hyperfocus on
LLMs are superhuman at next token prediction
[Frontiers | Naturalizing relevance realization: why agency and cognition are fundamentally not computational](https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2024.1362658/full)
https://x.com/yuqirose/status/1805327242790420610?t=u2NlvpRyN-D4ST1jtR5A5g&s=19
https://www.pnas.org/doi/10.1073/pnas.2311808121
[Local Network Interaction as a Mechanism for Wealth Inequality | Nature Communications](https://www.nature.com/articles/s41467-024-49607-0)
Basically all narratives are oversimplifications of the actual highly nonlinear dynamics they are trying to approximate
LLMs are superhuman at identity/persona shapeshifting
[Movie reconstruction from mouse visual cortex activity | bioRxiv](https://www.biorxiv.org/content/10.1101/2024.06.19.599691v1)
https://x.com/Neuro_Joel/status/1805221959191437356?t=G4e3CugvAYlYxuL9NRzVFQ&s=19
[Evolutionary Scale · ESM3: Simulating 500 million years of evolution with a language model](https://www.evolutionaryscale.ai/blog/esm3-release)
https://x.com/alexrives/status/1805559211394277697?t=GSLNRxt8b_FIYsIscQErXQ&s=19
https://www.lesswrong.com/posts/YmkjnWtZGLbHRbzrP/
https://x.com/NeelNanda5/status/1805697564542746719?t=ZKBWYAlclrvm1Gn1Cn_-5g&s=19
AI Transformer into hardware
https://x.com/Etched/status/1805625693113663834?t=rNizwGEgiHlI_4T0wUKvmw&s=19
You are my bravest theorem prover
You are my bravest coder