Reinforcement Learning

Dec 07, 2025

The 100-person lab that became Anthropic and Google's secret weapon | Edwin Chen (Surge AI)

Edwin Chen, founder and CEO of Surge AI, discusses his contrarian, bootstrapped approach to building a billion-dollar company, the critical role of high-quality data and 'taste' in training advanced AI models, the pitfalls of current benchmarks, and why Reinforcement Learning environments are the next frontier in AI.

Dec 02, 2025

Shaping Model Behavior in GPT-5.1— the OpenAI Podcast Ep. 11

OpenAI's Christina Kim (Research) and Laurentia Romaniuk (Product) discuss the development of GPT-5.1, detailing the shift to universal "reasoning models" to enhance both IQ and EQ. They explore the nuances of "model personality," the technical challenges of balancing steerability with safety, and how features like Memory create a more personalized, context-aware user experience.

Nov 24, 2025

Agents are Robots Too: What Self-Driving Taught Me About Building Agents — Jesse Hu, Abundant

Drawing surprising parallels between AI agents and robotics, this talk argues that the agent development community is repeating a key mistake from the self-driving industry: underestimating the difficulty of action and over-focusing on reasoning. It covers essential robotics concepts like DAgger, MDPs, simulation, and the critical importance of a robust offline infrastructure, explaining why perfect reasoning doesn't guarantee successful execution in the real world.

Nov 24, 2025

Fully Connected 2025 kickoff: The rise (and the challenges) of the agentic era

Robin Bordoli of Weights & Biases explores AI's exponential growth, from past achievements to the current agentic landscape. He discusses the rise of reinforcement learning, the challenge of productionizing reliable agents, and highlights how foundational issues in AI development persist even as model capabilities soar.

Nov 24, 2025

Fully Connected keynote: Building tools for agents at Weights & Biases

A summary of the keynote by Lukas Biewald (Weights & Biases) and Camille Fournier (CoreWeave) at Fully Connected London 2025. They discuss recent product updates for W&B Models and Weave and the synergy behind the CoreWeave acquisition, then take a deep dive into building and automating an autonomous software engineer agent.
