Training an LLM from Scratch, Locally — Angelos Perivolaropoulos, ElevenLabs
A practical guide to the engineering principles and trade-offs involved in training a small language model from scratch on a local machine, based on a workshop by Angelos Perivolaropoulos from ElevenLabs.
Beyond Bigger Models: Recursion As The Next Scaling Law In AI
Recent advancements with Hierarchical Reasoning Models (HRM) and Tiny Recursive Models (TRM) show how recursion at inference time enables small, 7-million-parameter models to outperform models 1000x their size on complex reasoning tasks. Recursion gives these models the compute depth to break through the inherent reasoning ceilings of standard feed-forward Transformers.
Will machines ever be intelligent?
Doug Burger, Nicolò Fusi, and Subutai Ahmad explore the intelligence of AI, contrasting transformer-based LLMs with the human brain's distributed, continuously learning architecture. They delve into differences in efficiency, representation, and sensory-motor grounding, debating what intelligence truly means and how future AI might bridge the gap.
This Technology Scares OpenAI (Here's Why)
Jeff Hawke, CTO at Odyssey, provides a deep dive into the emerging field of "world models"—AI systems that generate continuous, interactive simulations. He draws parallels to the "GPT-2 era" of LLMs, outlining the current state, core research challenges like coherence and control, and the immense potential for applications in gaming, robotics, and content creation. Hawke also clarifies the confusing terminology, distinguishing canonical world models from spatial intelligence and generative video models like Sora.
What are State Space Models? Redefining AI & Machine Learning with Data
State Space Models (SSMs) are emerging as a powerful and efficient alternative to Transformers for handling sequential data. Aaron Baughman explains the core concepts of SSMs, their mathematical foundations, and how architectures like S4 and Mamba address the memory and scalability challenges inherent in Transformers, leading to a new generation of faster, more intelligent hybrid AI models.
The Limits of Today’s AI Models
Karan Goel, CEO of Cartesia, discusses the fundamental limitations of Transformer architectures, arguing they behave more like retrieval systems than learning systems. He explains how State Space Models (SSMs) enable compression and abstraction, and why Cartesia is tackling multimodal intelligence by first solving for voice AI, aiming to develop a transferable 'recipe' for end-to-end representation learning.
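The two SSM summaries above both turn on the idea that a state space model carries a fixed-size state forward through a sequence instead of attending over all past tokens. As a rough illustration only, here is a minimal sketch of a scalar linear state space recurrence. The function name `ssm_scan` and the parameters `a`, `b`, `c` are hypothetical and chosen for this example; this is not the S4 or Mamba implementation, which use learned, structured, and (in Mamba's case) input-dependent parameters.

```python
def ssm_scan(inputs, a, b, c):
    """Run a scalar linear state space recurrence over a sequence.

    h_t = a * h_{t-1} + b * u_t   (state update)
    y_t = c * h_t                 (readout)

    All names here are illustrative assumptions, not a library API.
    """
    h = 0.0  # the entire memory of the past is this single number
    outputs = []
    for u in inputs:
        h = a * h + b * u      # compress the new input into the state
        outputs.append(c * h)  # read the output from the state
    return outputs


# Feed a single impulse and watch it decay through the state.
impulse_response = ssm_scan([1.0, 0.0, 0.0], a=0.5, b=1.0, c=2.0)
print(impulse_response)
```

The state `h` stays the same size however long the input is, so memory per step is constant. That fixed-size state is what the summaries mean by compression, in contrast to a Transformer's attention cache, which grows with sequence length.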