How We Built a Leading Reasoning Model (Olmo 3)

A comprehensive overview of the process behind building Olmo 3 Think, covering the full stack from pre-training architecture and data selection to the post-training recipe of SFT and DPO, plus a deep dive into the advanced infrastructure for scaling reinforcement learning (RL). The talk closes with critical reflections on the challenges and nuances of evaluating modern reasoning models.
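Since the recipe names DPO as a post-training stage, a minimal sketch of the standard DPO objective for a single preference pair may help; the beta value and function names here are illustrative assumptions, not details from the Olmo 3 talk.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair:
    -log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))).
    Inputs are sequence log-probs under the policy and a frozen reference model.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy agrees with the reference (zero margin) the loss is log 2; widening the chosen-vs-rejected margin drives it toward zero.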

Efficient Reinforcement Learning – Rhythm Garg & Linden Li, Applied Compute

A deep dive into the challenges and solutions for efficient Reinforcement Learning (RL) in enterprise settings. The talk contrasts synchronous and asynchronous RL, explains the critical trade-off of "staleness" versus stability, and details a first-principles system model used to optimize GPU allocation for maximum throughput.
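The GPU-allocation idea can be sketched as a toy first-principles model: split a fixed GPU budget between rollout generation and training so that neither stage bottlenecks the other. The per-GPU rates below are invented for illustration, not numbers from the talk.

```python
def steady_state_throughput(total_gpus, gen_gpus,
                            gen_rate_per_gpu=2.0,    # rollouts/sec per generation GPU (assumed)
                            train_rate_per_gpu=3.0): # rollouts consumed/sec per training GPU (assumed)
    """Steady-state throughput is capped by the slower of the two stages."""
    train_gpus = total_gpus - gen_gpus
    if gen_gpus <= 0 or train_gpus <= 0:
        return 0.0
    return min(gen_gpus * gen_rate_per_gpu, train_gpus * train_rate_per_gpu)

def best_split(total_gpus):
    """Brute-force the generation/training split that maximizes throughput."""
    return max(range(1, total_gpus),
               key=lambda g: steady_state_throughput(total_gpus, g))
```

With these assumed rates, a 10-GPU budget balances at 6 generation GPUs; the real system model additionally has to account for staleness, since asynchronous generation trains on samples from a slightly older policy.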

Compilers in the Age of LLMs — Yusuf Olokoba, Muna

Yusuf Olokoba, founder of Muna, details a compiler-based approach to transform Python AI functions into self-contained native binaries. This talk explores the technical pipeline, including custom AST-based tracing, type propagation, and the strategic use of LLMs for code generation, enabling a universal, OpenAI-style client for running any model on any platform.
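As a toy illustration of the AST-based tracing step, the sketch below walks a Python function's syntax tree and collects the functions it calls; Muna's actual pipeline (type propagation, binary emission) is far more involved, and the example function names are invented.

```python
import ast

# Hypothetical user function that a compiler front-end might trace.
SOURCE = """
def predict(x):
    y = preprocess(x)
    return run_model(y)
"""

def traced_calls(source: str) -> list[str]:
    """Walk the AST of `source` and collect names of directly called functions."""
    calls = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            calls.append(node.func.id)
    return calls
```

A real tracer would then resolve each callee, propagate argument types, and lower the whole graph to native code.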

The Unbearable Lightness of Agent Optimization — Alberto Romero, Jointly

This talk introduces Meta-ACE, a learned meta-optimization framework that dynamically orchestrates multiple strategies (context evolution, adaptive compute, hierarchical verification, and more) to maximize AI agent performance. The framework profiles each task to select an optimal strategy bundle, overcoming the single-dimension limitations of previous methods.
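A minimal sketch of the profile-then-select idea, in the spirit of Meta-ACE: the profiling features, thresholds, and strategy names below are invented for illustration and are not the framework's actual interface.

```python
def profile_task(task: str) -> dict:
    """Extract coarse features from a task description (hypothetical profiler)."""
    return {
        "long_horizon": len(task.split()) > 50,
        "needs_verification": any(k in task.lower()
                                  for k in ("prove", "verify", "check")),
    }

def select_bundle(task: str) -> list[str]:
    """Map a task profile to a bundle of optimization strategies."""
    profile = profile_task(task)
    bundle = ["context_evolution"]  # assumed always-on baseline strategy
    if profile["long_horizon"]:
        bundle.append("adaptive_compute")
    if profile["needs_verification"]:
        bundle.append("hierarchical_verification")
    return bundle
```

The point of the learned version is that this mapping is trained rather than hand-written, so the bundle adapts per task instead of committing to one strategy globally.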

Inside the AI Black Box

Emmanuel Ameisen of Anthropic's interpretability team explains the inner workings of LLMs, drawing analogies to biology. He covers surprising findings on how models plan and represent concepts across languages, as well as the mechanistic causes of hallucinations, and offers practical advice for developers on evaluation and post-training strategies.

I’m Teaching AI Self-Improvement Techniques

Aman Khan from Arize discusses the challenges of building reliable AI agents and introduces a novel technique called "metaprompting". This method uses continuous, natural language feedback to optimize an agent's system prompt, effectively training its "memory" or context, leading to significant performance gains even for smaller models.
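The core loop can be sketched in a few lines: after each run, natural-language feedback is fed to an optimizer model that rewrites the agent's system prompt. `call_llm` is a placeholder for any chat-completion API, and the prompt wording is an assumption, not Arize's actual implementation.

```python
def call_llm(prompt: str) -> str:
    """Placeholder; plug in your model provider's completion call here."""
    raise NotImplementedError

def metaprompt_step(system_prompt: str, task: str, feedback: str,
                    llm=call_llm) -> str:
    """One metaprompting iteration: ask an optimizer model to revise the
    system prompt in light of reviewer feedback on the last attempt."""
    return llm(
        "You improve agent system prompts.\n"
        f"Current system prompt:\n{system_prompt}\n"
        f"Task attempted:\n{task}\n"
        f"Reviewer feedback:\n{feedback}\n"
        "Return only the improved system prompt."
    )
```

Run in a loop over evaluation tasks, this accumulates lessons into the prompt itself, which is why the technique is described as training the agent's "memory" or context.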