Asynchronous RL

Code World Model: Building World Models for Computation – Jacob Kahn, FAIR Meta

Jacob Kahn of FAIR at Meta introduces the Code World Model (CWM), a new paradigm for AI models that learn from program execution rather than from code syntax alone. By training on detailed execution traces, CWM builds an internal world model of computation that lets it predict a program's behavior. This talk explores CWM's architecture, its highly scalable asynchronous reinforcement learning setup, and applications such as a 'neural debugger' that understands user intent from code structure, along with the potential to approximate answers to undecidable problems like the halting problem.
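To make "execution trace" concrete, here is a minimal sketch of recording one in Python with the standard `sys.settrace` hook. It is only an illustration of the general idea of pairing executed lines with program state; the trace format, the helper name `trace_execution`, and the `gcd` example are assumptions for this sketch, not CWM's actual data format or tooling.

```python
import sys


def trace_execution(func, *args):
    """Run func(*args); record (line offset, locals snapshot) for each executed line.

    Hypothetical helper for illustration only, not part of CWM.
    """
    steps = []
    code = func.__code__

    def tracer(frame, event, arg):
        # Only follow the function under inspection, not anything it calls.
        if frame.f_code is not code:
            return None
        if event == "line":
            # The "line" event fires just before the line runs, so the
            # snapshot is the state the line is about to operate on.
            offset = frame.f_lineno - code.co_firstlineno
            steps.append((offset, dict(frame.f_locals)))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore any earlier trace hook
    return result, steps


def gcd(a, b):
    while b:
        a, b = b, a % b
    return a


result, steps = trace_execution(gcd, 12, 18)
for offset, local_vars in steps:
    print(offset, local_vars)
```

Each recorded step is a (line, state) pair; a model trained on sequences like this sees how state evolves as code runs, rather than only the source text.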

Efficient Reinforcement Learning – Rhythm Garg & Linden Li, Applied Compute

At Applied Compute, efficient reinforcement learning is critical for delivering business value. This talk explores the transition from inefficient synchronous RL to a high-throughput asynchronous 'Pipeline RL' system. The core challenge is managing 'staleness', a side effect of in-flight weight updates that can destabilize training. The speakers detail their first-principles systems model, based on the Roofline model, which they use to simulate and find the optimal allocation of GPU resources between sampling and training, balancing throughput against algorithmic stability and achieving significant speedups.
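The allocation question can be illustrated with a toy calculation. The sketch below is not the speakers' model: it applies the textbook Roofline bound (attainable FLOP/s is the smaller of peak compute and memory bandwidth times arithmetic intensity) and then brute-forces the sampler/trainer split that maximizes steady-state tokens per second. Every number (hardware peaks, arithmetic intensities, model size) is a made-up assumption, the 2 and 6 FLOPs per parameter per token figures are common rules of thumb, and staleness is not modeled at all.

```python
def roofline_flops(peak_flops, mem_bandwidth, arithmetic_intensity):
    """Attainable FLOP/s under the Roofline model."""
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)


def best_split(total_gpus, sample_rate, train_rate):
    """Return (samplers, tokens_per_sec) maximizing pipeline throughput.

    In steady state the pipeline runs at the pace of its slower stage,
    so throughput is the minimum of the two stage rates.
    """
    best = None
    for samplers in range(1, total_gpus):
        trainers = total_gpus - samplers
        throughput = min(samplers * sample_rate, trainers * train_rate)
        if best is None or throughput > best[1]:
            best = (samplers, throughput)
    return best


# Hypothetical hardware and model figures, chosen only for illustration.
PEAK_FLOPS = 1e15        # FLOP/s per GPU
MEM_BANDWIDTH = 3e12     # bytes/s per GPU
PARAMS = 8e9             # model parameters

# Assumed arithmetic intensities (FLOP per byte): decoding is typically
# memory-bound, training is typically compute-bound.
sample_flops = roofline_flops(PEAK_FLOPS, MEM_BANDWIDTH, 50)
train_flops = roofline_flops(PEAK_FLOPS, MEM_BANDWIDTH, 500)

# Rule-of-thumb cost per token: ~2 FLOPs/param forward, ~6 forward+backward.
sample_rate = sample_flops / (2 * PARAMS)   # tokens/s per sampling GPU
train_rate = train_flops / (6 * PARAMS)     # tokens/s per training GPU

samplers, throughput = best_split(16, sample_rate, train_rate)
print(samplers, 16 - samplers, round(throughput))
```

A fuller model would also have to bound how many weight updates can land while a sample is still in flight, since the split that maximizes raw throughput may not be the one that keeps staleness within a stable range.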