LLM

Context Engineering for Coding Agents

A deep dive into context-engineering techniques for coding agents, focusing on managing context effectively in LLMs such as Claude. The talk introduces a practical framework built on a brain-inspired analogy: a Markdown-based 'wiki' serves as long-term memory that augments the agent's limited context window. The approach is demonstrated on a real-world task, extracting structured data from technical drawings.
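As a rough illustration of the Markdown 'wiki' idea, the sketch below stores notes as `.md` files and pulls matching notes back into a context string under a size budget. The function names (`save_note`, `recall`), the keyword match, the character budget, and the note contents are all hypothetical choices made for this example; the talk's actual implementation is not described in this summary.

```python
import tempfile
from pathlib import Path


def save_note(wiki_dir: Path, title: str, body: str) -> Path:
    """Write one Markdown page to the wiki directory (hypothetical helper)."""
    path = wiki_dir / f"{title.lower().replace(' ', '-')}.md"
    path.write_text(f"# {title}\n\n{body}\n", encoding="utf-8")
    return path


def recall(wiki_dir: Path, keyword: str, max_chars: int = 2000) -> str:
    """Collect pages mentioning the keyword, stopping before the budget is exceeded."""
    chunks, used = [], 0
    for path in sorted(wiki_dir.glob("*.md")):
        text = path.read_text(encoding="utf-8")
        if keyword.lower() in text.lower() and used + len(text) <= max_chars:
            chunks.append(text)
            used += len(text)
    return "\n".join(chunks)


# Illustrative notes only; the contents are made up for this sketch.
wiki = Path(tempfile.mkdtemp())
save_note(wiki, "Title Block", "The drawing number sits in the bottom-right title block.")
save_note(wiki, "Tolerances", "Default tolerance is stated in the general notes.")

# Only the matching page is loaded into the agent's working context.
context = recall(wiki, "title block")
print(context)
```

The point of the design is that the wiki can grow without bound on disk while only the relevant pages occupy the limited context window.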

Building Agent Interfaces: Lessons from Chrome DevTools (MCP) for Agents — Michael Hablich, Google

Michael Hablich from the Chrome DevTools team shares hard-won engineering lessons on building effective and secure interfaces for AI agents. The talk covers moving from raw data to semantic summaries, measuring interface efficiency with 'tokens per successful outcome', designing for error recovery, and the critical importance of trust boundaries and deliberate friction in UI design for agents.
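The 'tokens per successful outcome' metric can be sketched as a simple ratio. The definition used here (all tokens spent, including failed attempts, divided by the number of successful runs) is an assumption, since the summary does not spell out the formula. The `Run` type and the sample numbers are likewise illustrative, not measurements from the talk.

```python
from dataclasses import dataclass


@dataclass
class Run:
    tokens_used: int  # total tokens consumed in one agent attempt
    succeeded: bool   # whether the attempt reached the desired outcome


def tokens_per_successful_outcome(runs):
    """Assumed definition: total tokens across all runs / number of successful runs."""
    total_tokens = sum(r.tokens_used for r in runs)
    successes = sum(1 for r in runs if r.succeeded)
    if successes == 0:
        return float("inf")  # no success: the interface is infinitely expensive
    return total_tokens / successes


# Made-up numbers comparing a raw-data interface with a semantic-summary interface.
raw_dump = [Run(12000, True), Run(15000, False), Run(13000, True)]
summary = [Run(3000, True), Run(3500, True), Run(2500, False)]

print(tokens_per_successful_outcome(raw_dump))  # 20000.0
print(tokens_per_successful_outcome(summary))   # 4500.0
```

Counting the tokens of failed attempts in the numerator means an interface is penalized both for verbosity and for causing failures, which seems to be the spirit of the metric.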

BDD, ADR, PRD, WTF: Capturing Decisions for Humans and AI Alike — Michal Cichra, Safe Intelligence

Michal Cichra from Safe Intelligence explains how to maintain consistency in AI-driven software development by capturing decisions and enforcing rules. He argues for reviving Behavior-Driven Development (BDD) with Cucumber to close the loop left by spec-driven development. The core idea is to enforce architectural and product decisions (ADRs, PRDs) through an automated loop of git hooks and CI, ensuring both human and AI developers adhere to established standards.

Scaling Meta's Multi-Agent Systems to a Billion Videos

Meta's approach to solving modality misalignment and content theft in short-form video using a multi-agent system of smaller, specialized models instead of a single large LLM. The talk covers the architecture (Perceiver, Retriever, Reasoner), evaluation stack, and key cost-saving optimizations.

How Google DeepMind Runs Agents at Scale — KP Sawhney & Ian Ballantyne, Google DeepMind

KP Sawhney from Google DeepMind discusses the internal strategies for scaling agentic AI, including managing token-hungry workflows, curating a 'Darwinian' skills library, and evolving the Deep Research pipeline from large context blobs to a collaborative file system.

CAG vs Long Context: How AI Models Use and Remember Information

Martin Keen explains how Long Context and Cache-Augmented Generation (CAG) serve as alternatives to RAG for supplying external knowledge to LLMs. This summary covers the mechanics of each approach, the role of the KV cache, the practical application through prompt caching, and the trade-offs in performance, cost, and latency for real-world AI workloads.
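To make the cost intuition behind prompt caching concrete, here is a toy model that only counts how many tokens need fresh processing per request. It is a deliberate simplification: a real KV cache stores attention key/value tensors for a prefix, whereas this sketch just remembers which prefixes it has seen and counts whitespace-separated words as tokens. The class and method names are hypothetical.

```python
class ToyPrefixCache:
    """Toy stand-in for prompt caching: remembers which prefixes were already processed."""

    def __init__(self):
        self._seen_prefixes = set()

    def tokens_to_process(self, prefix: str, question: str) -> int:
        """Return how many whitespace-separated tokens need fresh processing."""
        fresh = len(question.split())  # the new question is always processed
        if prefix not in self._seen_prefixes:
            fresh += len(prefix.split())  # first time: the whole prefix is processed
            self._seen_prefixes.add(prefix)
        return fresh


manual = "word " * 1000  # placeholder for a large shared document (1000 tokens)
cache = ToyPrefixCache()

first = cache.tokens_to_process(manual, "What is the warranty period?")
second = cache.tokens_to_process(manual, "How do I reset it?")
print(first, second)  # 1005 5
```

The first request pays for the whole document, and later requests that share the same prefix pay only for the new question, which is where the latency and cost savings described above would come from.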