Reinforcement learning

Computational models for brain science

Dr. Laschowski discusses his lab's research in computational neuroscience, focusing on three core areas: reverse-engineering human motor control using reinforcement and optimal control models, developing high-accuracy neural decoding algorithms for brain-machine interfaces (BMIs), and creating brain-inspired deep learning models for computer vision. The talk highlights a long-term vision of discovering the fundamental principles of intelligence to build more efficient and robust AI.

OpenAI’s IMO Team on Why Models Are Finally Solving Elite-Level Math

OpenAI team members Alex Wei, Sheryl Hsu, and Noam Brown discuss their model's historic gold-medal performance at the International Mathematical Olympiad (IMO). They detail their approach of applying general-purpose reinforcement learning to hard-to-verify tasks, the model's surprising self-awareness, and the vast gap that remains between solving competition problems and achieving genuine mathematical research breakthroughs.

Scaling and the Road to Human-Level AI | Anthropic Co-founder Jared Kaplan

Jared Kaplan, co-founder of Anthropic, explains how the discovery of predictable, physics-like scaling laws in AI training provides a clear roadmap for progress. He details the two main phases of model training (pre-training and RL), discusses how scaling compute predictably unlocks longer-horizon task capabilities, and outlines the remaining challenges—memory, nuanced oversight, and organizational knowledge—on the path to human-level AI.

[Full Workshop] Building Metrics that actually work — David Karam, Pi Labs (fmr Google Search)

This workshop, led by David Karam of Pi Labs (formerly of Google Search), introduces a methodology for building reliable and tunable evaluation metrics for LLM applications. It details how to create granular "scoring systems" that break complex evaluations down into simple, objective signals, and then how to use those systems for model comparison, prompt optimization, and online reinforcement learning.

No Priors Ep. 124 | With SurgeAI Founder and CEO Edwin Chen

Edwin Chen, CEO of Surge AI, discusses the critical role of high-quality human data in training frontier models, the flaws in current evaluation benchmarks like LMSys and IF-Eval, the future of complex RL environments, and why he bootstrapped Surge to over $1 billion in revenue.

The U.S. Can’t Build AI Without These Materials

The Western mining industry is broken, hampered by a talent drain, slow technology adoption, and misaligned incentives. A new, vertically integrated, software-first approach that leverages reinforcement learning (RL) and LLMs could build and operate mines and refineries faster, cheaper, and more flexibly, addressing critical geopolitical supply-chain risks.