Debugging

Inside the AI Black Box

Emmanuel Ameisen of Anthropic's interpretability team explains the inner workings of LLMs, drawing analogies to biology. He covers surprising findings on how models plan and represent concepts across languages, and on the mechanistic causes of hallucinations, and offers practical advice for developers on evaluation and post-training strategies.

The Debugging Book • Andreas Zeller & Clare Sudbery

Professor Andreas Zeller discusses his interactive 'Debugging Book,' arguing that systematic, automated debugging is a critical but neglected skill. He explores powerful techniques like delta debugging and automated repair, explaining how developers can build their own tools to make debugging a more plannable and efficient process.

Building a Debugger • Sy Brand & Tim Misiak

Sy Brand, author of "Building a Debugger," and Tim Misiak explore how implementing a debugger is one of the most effective ways to gain a deep understanding of operating systems, compilers, and hardware. They delve into the unexpected complexities of core features like stack unwinding and code stepping, the challenges posed by legacy APIs like ptrace and debug formats like DWARF, and the future of debugging with advancements in time travel debugging and tools for optimized code.

AI traces are worth a thousand logs

An exploration of how a single structured trace, based on OpenTelemetry standards, offers a better way to debug, test, and understand AI agent behavior than traditional logging. Learn how programmatic access to traces enables robust evaluation and the creation of golden datasets for building more reliable autonomous systems.

AI Coding Agents Change Software Development Forever

A discussion of the promise and limitations of coding agents, covering key challenges like verification and debugging, and exploring how agents can support developers through improved abstraction, better collaboration, and the handling of long-term tasks.