Reinforcement learning

OpenAI Researchers Break Down GPT-5

OpenAI researchers discuss the step-change in capabilities in GPT-5, from coding and reasoning to creative writing. They detail the data-centric training process, the shift toward asynchronous agentic workflows, and the future of AI development and its impact on the startup ecosystem.
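
As a loose illustration of what an asynchronous agentic workflow looks like in code, the sketch below dispatches several long-running agent tasks concurrently with Python's asyncio and collects results as they finish; `run_agent` and the task instructions are hypothetical stand-ins, not OpenAI's API.

```python
# Minimal sketch of an asynchronous agentic workflow: several long-running
# agent tasks are dispatched concurrently and collected as they complete.
# `run_agent` is a hypothetical stand-in for a call into an agent runtime.
import asyncio


async def run_agent(task_id: str, instruction: str) -> str:
    """Hypothetical agent call; sleep stands in for a long-running task."""
    await asyncio.sleep(1.0)  # e.g., tool use, code execution, web browsing
    return f"[{task_id}] result for: {instruction}"


async def main() -> None:
    instructions = {
        "fix-tests": "Make the failing unit tests pass.",
        "write-docs": "Draft API documentation for the new module.",
        "refactor": "Extract the parsing logic into its own package.",
    }
    # Launch all tasks at once; the caller is free to do other work
    # while agents run in the background.
    tasks = [asyncio.create_task(run_agent(tid, text))
             for tid, text in instructions.items()]
    for finished in asyncio.as_completed(tasks):
        print(await finished)


if __name__ == "__main__":
    asyncio.run(main())
```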

DeepMind's Secret AI Project That Will Change Everything [EXCLUSIVE]

Google DeepMind's Genie 3 is a new generative interactive environment that creates photorealistic, controllable 3D worlds from text prompts in real time. This summary explores its architecture, the concept of emergent consistency, and its primary application as a powerful simulator for training embodied AI agents.
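
As a rough illustration of the simulator idea, the sketch below wraps a hypothetical generated world in a gym-style interface that an embodied agent can step through; `WorldModelEnv`, its prompt seeding, and the placeholder reward are all assumptions, since Genie 3's actual API is not public.

```python
# Sketch of using a generated interactive world as an RL training
# environment. `WorldModelEnv` is a hypothetical gym-style wrapper
# around a Genie-like model, not a real interface.
import random


class WorldModelEnv:
    """Hypothetical env: a world model generates the next frame on the fly."""

    def __init__(self, prompt: str):
        self.prompt = prompt  # text prompt that seeds the generated world
        self.t = 0

    def reset(self):
        self.t = 0
        return {"frame": f"frame_0 of '{self.prompt}'"}

    def step(self, action: str):
        # In the real system the model would render the next frame
        # conditioned on the interaction history (emergent consistency).
        self.t += 1
        obs = {"frame": f"frame_{self.t} after action '{action}'"}
        reward = random.random()  # placeholder task reward
        done = self.t >= 10
        return obs, reward, done


env = WorldModelEnv("a photorealistic warehouse with movable boxes")
obs = env.reset()
done = False
while not done:
    action = random.choice(["forward", "left", "right", "grasp"])
    obs, reward, done = env.step(action)  # agent learns from (obs, reward)
```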

Computational models for brain science

Dr. Laschowski discusses his lab's research in computational neuroscience, focusing on three core areas: reverse-engineering human motor control using reinforcement learning and optimal control models, developing high-accuracy neural decoding algorithms for brain-machine interfaces (BMIs), and creating brain-inspired deep learning models for computer vision. The talk highlights a long-term vision of discovering the fundamental principles of intelligence to build more efficient and robust AI.
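
To make the decoding idea concrete, here is a minimal sketch of a classic BMI baseline: ridge regression from neural firing rates to 2-D cursor velocity. The synthetic data and regularization constant are assumptions for illustration; the lab's actual decoders are far more sophisticated.

```python
# Minimal sketch of a linear neural decoder: ridge regression from neural
# firing rates to 2-D cursor velocity, on synthetic data only.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_neurons = 2000, 64

# Synthetic ground truth: velocity is a hidden linear readout of firing rates.
W_true = rng.normal(size=(n_neurons, 2))
rates = rng.poisson(lam=5.0, size=(n_samples, n_neurons)).astype(float)
velocity = rates @ W_true + rng.normal(scale=0.5, size=(n_samples, 2))

# Ridge regression: W = (X^T X + lambda I)^{-1} X^T Y
lam = 1.0
XtX = rates.T @ rates + lam * np.eye(n_neurons)
W_hat = np.linalg.solve(XtX, rates.T @ velocity)

# Decoding quality on the training data (R^2).
pred = rates @ W_hat
r2 = 1 - ((velocity - pred) ** 2).sum() / ((velocity - velocity.mean(0)) ** 2).sum()
print(f"decoding R^2: {r2:.3f}")
```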

OpenAI’s IMO Team on Why Models Are Finally Solving Elite-Level Math

OpenAI team members Alex Wei, Sheryl Hsu, and Noam Brown discuss their model's historic gold-medal performance at the International Mathematical Olympiad (IMO). They detail their approach of applying general-purpose reinforcement learning to hard-to-verify tasks, the model's surprising self-awareness, and the vast gap that remains between solving competition problems and achieving true mathematical research breakthroughs.
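
As a heavily simplified sketch of the hard-to-verify-task setting, the snippet below scores sampled candidate solutions with a grader in place of an exact checker and keeps the best as the reward signal; `sample_solution` and `grade` are hypothetical stand-ins, and OpenAI's actual method is not public.

```python
# Crude sketch of RL-style training on hard-to-verify tasks: with no exact
# checker, a grader scores candidate solutions and the best-scoring one
# supplies the learning signal. All functions here are hypothetical.
import random


def sample_solution(problem: str) -> str:
    """Hypothetical policy sample: one candidate proof attempt."""
    return f"proof attempt {random.randint(0, 999)} for: {problem}"


def grade(problem: str, solution: str) -> float:
    """Hypothetical grader: a scalar plausibility score in [0, 1]."""
    return random.random()


def best_of_n(problem: str, n: int = 8) -> tuple[str, float]:
    """Sample n candidates; the top grade becomes the reward."""
    candidates = [sample_solution(problem) for _ in range(n)]
    score, best = max((grade(problem, c), c) for c in candidates)
    return best, score


best, reward = best_of_n("a hard olympiad-style problem")
print(reward, best)
```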

Scaling and the Road to Human-Level AI | Anthropic Co-founder Jared Kaplan

Jared Kaplan, co-founder of Anthropic, explains how the discovery of predictable, physics-like scaling laws in AI training provides a clear roadmap for progress. He details the two main phases of model training (pre-training and RL), discusses how scaling compute predictably unlocks longer-horizon task capabilities, and outlines the remaining challenges—memory, nuanced oversight, and organizational knowledge—on the path to human-level AI.
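
The scaling-law claim has a simple empirical form: loss falls as a power law in compute, L(C) = a * C^(-b), so log L is linear in log C and the exponent can be fit by least squares. The sketch below does exactly that on synthetic data; the constants are invented, not from any real training run.

```python
# Sketch of the scaling-law idea: fit L(C) = a * C**(-b) by linear
# regression in log-log space. All numbers here are synthetic.
import numpy as np

rng = np.random.default_rng(1)
compute = np.logspace(18, 24, 20)  # training FLOPs (synthetic grid)
a_true, b_true = 1.0e2, 0.05
loss = a_true * compute ** (-b_true) * np.exp(rng.normal(scale=0.01, size=20))

# Linear fit in log-log space: log L = log a - b * log C
slope, intercept = np.polyfit(np.log(compute), np.log(loss), deg=1)
b_hat, a_hat = -slope, np.exp(intercept)
print(f"fitted exponent b = {b_hat:.3f} (true {b_true})")

# The point of such fits is extrapolation: predicted loss at 10x the
# largest observed compute budget.
print(f"predicted loss at 1e25 FLOPs: {a_hat * 1e25 ** (-b_hat):.3f}")
```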

[Full Workshop] Building Metrics that actually work — David Karam, Pi Labs (fmr Google Search)

This workshop, led by David Karam of Pi Labs, a former Google Search product director, introduces a methodology for building reliable and tunable evaluation metrics for LLM applications. It details how to create granular 'scoring systems' that break down complex evaluations into simple, objective signals, and then use those systems for model comparison, prompt optimization, and online reinforcement learning.
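
As a minimal sketch of the scoring-system idea, the snippet below breaks an evaluation into small, objective checks and rolls them up into one weighted score; the check names and weights are illustrative, not Pi Labs' actual rubric.

```python
# Sketch of a 'scoring system': a complex evaluation decomposed into
# small, objective checks whose results roll up into one tunable score.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    name: str
    weight: float
    fn: Callable[[str], bool]  # objective, binary signal on the output


# Illustrative rubric for a product-description task (invented checks).
checks = [
    Check("mentions_price",  0.4, lambda out: "$" in out),
    Check("under_100_words", 0.3, lambda out: len(out.split()) < 100),
    Check("no_hedging",      0.3, lambda out: "maybe" not in out.lower()),
]


def score(output: str) -> float:
    """Weighted fraction of checks passed; usable for model comparison,
    prompt optimization, or as an online RL reward."""
    total = sum(c.weight for c in checks)
    return sum(c.weight for c in checks if c.fn(output)) / total


print(score("The gadget costs $19 and ships tomorrow."))  # 1.0
```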