Reinforcement learning

Reinforcement Learning for Agents — with Amazon AGI Labs’ Antje Barth

Antje Barth from Amazon's AGI Labs discusses Nova Act, a new service for building reliable AI agents. She explores how they achieve over 90% reliability using reinforcement learning in 'web gyms', the shift towards 'normcore' agents for practical automation, and the future of AI as a digital co-worker.

How Ricursive Intelligence’s Founders are Using AI to Shape The Future of Chip Design

Anna Goldie and Azalia Mirhoseini of Ricursive Intelligence discuss how their work on Google's AlphaChip, which used AI to design TPUs, is now being extended to automate the entire chip design process. They explain their vision for a 'designless' industry and a recursive self-improvement loop where AI designs better chips, which in turn accelerates AI development.

Collaborative AI Agents At OpenAI

Robert from OpenAI discusses the critical role of structured evaluations (evals) and graders for developing advanced collaborative agents. He explores the limitations of 'vibe-based' assessments, introduces a maturity model for evals, and presents a comprehensive rubric for measuring agent performance beyond simple accuracy, connecting these concepts to the power of Reinforcement Fine-Tuning (RFT).

Post-training best-in-class models in 2025

An expert overview of post-training techniques for language models, covering the entire workflow from data generation and curation to advanced algorithms like Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning (RL), along with practical advice on evaluation and iteration.

AGI: The Path Forward – Jason Warner & Eiso Kant, Poolside

In a live demo, Poolside's co-founders showcase their second-generation model, the Malibu agent, by migrating a complex codebase from Ada to Rust, including automated testing and iterative feature development. They outline their vision for achieving AGI through a full-stack approach combining proprietary models, reinforcement learning, and massive-scale compute, with plans for a public model release in early 2025.

What are we scaling?

A critical analysis of AI progress, arguing that short AGI timelines are unlikely given the current reliance on pre-baking skills via reinforcement learning. The author contends that true AGI requires on-the-job, continual learning, a capability current models lack. The modest economic impact of AI is presented not as a diffusion lag but as direct evidence of this capability gap. The author expects the future of AI to be a gradual, competitive race to solve continual learning, not a sudden takeoff.