Reinforcement Learning

Apr 30, 2026

Robotics' End Game: Nvidia's Jim Fan

Jim Fan of Nvidia outlines the endgame for robotics, arguing it will mirror the successful playbook of large language models. He introduces "The Great Parallel," a roadmap where world models replace language models, and data collection shifts from limited teleoperation to scalable egocentric video, culminating in a future of physical APIs and automated research.

Apr 29, 2026

Everything I Learned Training Frontier Small Models — Maxime Labonne, Liquid AI

Maxime Labonne from Liquid AI shares a playbook for post-training frontier small models (under 1GB) for on-device deployment. The talk breaks down the LFM2.5 recipe, which includes on-policy preference alignment and agentic reinforcement learning, and addresses unique challenges at the 1B scale, such as capability interference and 'doom loops', offering concrete solutions to build efficient models for tasks like data extraction and tool use.

Apr 15, 2026

Why Uber, Nissan, and Mercedes Chose This Self-Driving Startup | Alex Kendall, Wayve

Wayve CEO Alex Kendall discusses the company's contrarian, AI-first approach to autonomous driving. He explains its journey from a garage prototype using reinforcement learning to developing a generalizable AI driver that has driven zero-shot in over 500 cities. Kendall emphasizes a strategy of licensing this embodied AI for mass-market consumer vehicles (a 100-million-unit-per-year opportunity) rather than building bespoke robotaxis. He argues that the future is an AI that can drive any car, anywhere.

Apr 08, 2026

From Neural Networks to Digital Brains: The Next Leap in AI • Daniel Lütgehetmann • GOTO 2025

Daniel Lütgehetmann of inait introduces "digital brains," biologically accurate computational models of real brains, as a solution to current AI's limitations in physical world interaction. Unlike traditional AI that struggles with dynamic environments and skill accumulation, these digital brains leverage biologically inspired learning rules to achieve dramatically faster learning in robotics and complex systems, demonstrating potential for real-world adaptability and efficiency.

Mar 10, 2026

10 years of AlphaGo: The turning point for AI | Thore Graepel & Pushmeet Kohli

Ten years after the historic match between AlphaGo and Lee Sedol, Google DeepMind's Thore Graepel and Pushmeet Kohli reflect on its legacy. They discuss how AlphaGo's blend of deep learning and tree search conquered the game of Go, the significance of creative breakthroughs like 'Move 37', and how these foundational concepts evolved into systems like AlphaZero, which learns without human data. The conversation bridges the gap from game-playing to solving scientific grand challenges, detailing how the same principles are now used in tools like AlphaTensor to discover novel, more efficient algorithms for fundamental problems like matrix multiplication.

Feb 10, 2026

Building the GitHub for RL Environments: Prime Intellect's Will Brown & Johannes Hagemann

Prime Intellect's Will Brown and Johannes Hagemann discuss the paradigm shift from static prompting to dynamic, environment-based AI development. They introduce their Environments Hub, a platform aimed at democratizing frontier-level training and enabling companies to build specialized models by compounding institutional knowledge.
