World models

Prompt to Pipeline: Building with Google's Gen Media Stack — Paige & Guillaume, Google DeepMind

A comprehensive overview of Google DeepMind's latest advancements, featuring Paige Bailey demonstrating Gemini 1.5 Flash's cost-effective video analysis and AI Studio's single-prompt app generation. Guillaume Vernade showcases a full generative media pipeline, turning a public domain book into an illustrated, animated, and scored project using Gemini, Nano Banana, Veo, and Lyria. Ian Valentine closes with the power of Gemma 4, demonstrating on-device, multi-agent code generation and debugging without cloud APIs.

Waymo's Dmitri Dolgov: 20 Million Rides and the Road to Full Autonomy

Dmitri Dolgov, co-CEO of Waymo, discusses the 20-year journey from the DARPA challenge to full autonomy. He explains the Waymo Foundation Model—a multimodal world action model powering the driver, simulator, and critic—and how their "end-to-end plus" architecture enables superhuman safety and exponential scaling.

Robotics' End Game: Nvidia's Jim Fan

Jim Fan of Nvidia outlines the endgame for robotics, arguing it will mirror the successful playbook of Large Language Models. He introduces "The Great Parallel," a roadmap where World Models replace Language Models, and data collection shifts from limited teleoperation to scalable egocentric video, culminating in a future of physical APIs and automated research.

Why Most Robot Demos Are Fake

Changan Chen, co-founder of Rhoda AI, discusses their vision-first approach to building foundation models for robotics. The conversation covers their unique training pipeline, the distinction between policy and world models, and the path to deploying data-efficient, reliable robots in real-world industrial settings.

Why Uber, Nissan, and Mercedes Chose This Self-Driving Startup | Alex Kendall, Wayve

Wayve CEO Alex Kendall discusses their contrarian, AI-first approach to autonomous driving. He explains their journey from a garage prototype using reinforcement learning to developing a generalizable AI driver that has driven zero-shot in over 500 cities. Kendall emphasizes a strategy focused on licensing this embodied AI for mass-market consumer vehicles—a 100-million-unit-per-year opportunity—rather than building bespoke robotaxis, arguing that the future is an AI that can drive any car, anywhere.

Moonlake: Multimodal, Interactive, and Efficient World Models — with Fan-yun Sun and Chris Manning

Moonlake AI presents a distinctive approach to world modeling, prioritizing interactive, action-conditioned environments built on symbolic representations and game engines over purely pixel-based generative models. This method focuses on causal reasoning, long-term consistency, and programmable rendering (via their 'Reverie' diffusion model) to create dynamic, multiplayer worlds, positioning itself as a platform for training embodied AI and revolutionizing game development.