Articles - Burny

- [[GLM-5.2 moves from GRPO (Group Relative Policy Optimization) to PPO (Proximal Policy Optimization)]]
- [[How are current AI systems mostly created, and who is pursuing alternatives]]
- [[Current state and future of recursive AI self-improvement research. In how strong form is it real]]
- [[AI tribes from The Master Algorithm and what camps would I add]]
- [[Comment on Measuring Reward-Seeking by Instilling Contrastive Beliefs paper from mechanistic interpretability perspective]]