Scaling Meta's Multi-Agent Systems to a Billion Videos
This talk presents Meta's approach to modality misalignment and content theft in short-form video: a multi-agent system of smaller, specialized models instead of a single large LLM. It covers the architecture (Perceiver, Retriever, Reasoner), the evaluation stack, and key cost-saving optimizations.
Code Mode - Sunil Pai, Cloudflare
Sunil Pai from Cloudflare introduces "Code Mode," a paradigm where AI agents generate and execute code (like JavaScript) instead of using traditional JSON-based tool calling. This approach enables more efficient, stateful, and complex interactions with large-scale systems by leveraging the inherent capabilities of programming languages.
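To make the contrast in this summary concrete, here is a minimal sketch of the two styles. It is written in Python purely for illustration (the talk itself mentions JavaScript), and every name in it (INVENTORY, list_items, get_stock, run_json_tool_call, run_code_mode, GENERATED_CODE) is hypothetical rather than part of any Cloudflare API. The "generated" snippet is hard-coded to stand in for model output.

```python
import json

# Hypothetical backing data and tools; nothing here comes from the talk.
INVENTORY = {"widget": 4, "gadget": 0, "gizmo": 7}


def list_items() -> list[str]:
    return sorted(INVENTORY)


def get_stock(name: str) -> int:
    return INVENTORY[name]


TOOLS = {"list_items": list_items, "get_stock": get_stock}


def run_json_tool_call(call_json: str):
    """JSON-style tool calling: one tool invocation per model turn."""
    call = json.loads(call_json)
    return TOOLS[call["tool"]](**call.get("args", {}))


# Stand-in for code an agent might generate. It loops, filters, and keeps
# intermediate state in ordinary variables, all within a single execution.
GENERATED_CODE = """
in_stock = [name for name in list_items() if get_stock(name) > 0]
result = {"in_stock": in_stock, "total": sum(get_stock(n) for n in in_stock)}
"""


def run_code_mode(source: str) -> dict:
    """Code-style execution: run the snippet with only the tools in scope.

    Assumption: restricting __builtins__ like this is NOT a real sandbox;
    a production system would need proper isolation.
    """
    namespace = {"__builtins__": {"sum": sum}, **TOOLS}
    exec(source, namespace)
    return namespace["result"]


if __name__ == "__main__":
    print(run_json_tool_call('{"tool": "get_stock", "args": {"name": "widget"}}'))
    print(run_code_mode(GENERATED_CODE))
```

With JSON-style calls, answering "what is in stock and how much in total" would take one round trip per tool invocation. In the code-style version, the same question is answered by a single snippet that composes the tools directly.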
From Chaos to Choreography: Multi-Agent Orchestration Patterns That Actually Work — Sandipan Bhaumik
Sandipan Bhaumik from Databricks explains that scaling from one to many AI agents is a distributed systems problem, not an AI one. He details common architectural anti-patterns like shared mutable state that cause race conditions and silent failures. The talk provides a practical framework based on distributed systems engineering, covering crucial patterns like choreography vs. orchestration, immutable state management with versioning, data contracts, and failure recovery using circuit breakers and compensation (Saga) patterns. Bhaumik illustrates how to build a robust, production-grade multi-agent architecture using tools like Databricks, LangGraph, and MLflow.
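Two of the patterns named in this summary, immutable state with versioning and circuit breakers, can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the speaker's implementation: it does not use Databricks, LangGraph, or MLflow, and all class and method names (AgentState, StateStore, VersionConflict, CircuitBreaker) are hypothetical. The circuit breaker is deliberately simplified and has no timed half-open or reset phase.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple


@dataclass(frozen=True)
class AgentState:
    """Immutable snapshot: agents derive a new version instead of mutating."""
    version: int
    data: Tuple[Tuple[str, str], ...] = ()

    def with_update(self, key: str, value: str) -> "AgentState":
        merged = dict(self.data)
        merged[key] = value
        return replace(self, version=self.version + 1,
                       data=tuple(sorted(merged.items())))


class VersionConflict(Exception):
    """Raised when a write is based on a stale snapshot."""


class StateStore:
    """Holds the current snapshot and rejects writes from stale readers."""

    def __init__(self) -> None:
        self._current = AgentState(version=0)

    def read(self) -> AgentState:
        return self._current

    def commit(self, base: AgentState, new: AgentState) -> None:
        # Optimistic concurrency: the writer must have started from the
        # snapshot that is still current, otherwise it has to re-read.
        if base.version != self._current.version:
            raise VersionConflict(
                f"based on v{base.version}, store is at v{self._current.version}")
        self._current = new


class CircuitBreaker:
    """Stops calling a failing agent after max_failures consecutive errors."""

    def __init__(self, max_failures: int = 3) -> None:
        self.max_failures = max_failures
        self.failures = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.max_failures

    def call(self, fn: Callable, *args):
        if self.is_open:
            raise RuntimeError("circuit open: call skipped")
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            raise
        self.failures = 0  # a success resets the consecutive-failure count
        return result
```

In this sketch, a stale write fails loudly with VersionConflict instead of silently overwriting another agent's update, which is the kind of race the summary attributes to shared mutable state. Once the breaker opens, callers get an immediate error rather than repeatedly invoking an agent that keeps failing.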
Cognitive Exhaust Fumes, or: Read-Only AI Is Underrated — Šimon Podhajský, Head of AI, Waypoint
A deep dive into a "read-only" personal AI system that analyzes your digital footprint, or "cognitive exhaust fumes," from sources like email, notes, and browsing history. Podhajský argues that this observer approach provides more profound insights and is inherently safer than action-oriented AI agents, because it prevents data contamination and mitigates the high-stakes risks of write-access errors.
Why AI Agents Forget Everything (And How To Fix That)
Mem0 is building a model-neutral, persistent memory layer for AI agents to solve the fundamental statelessness of LLMs. Co-founders Taranjeet Singh and Deshraj Yadav discuss their hybrid memory architecture, which reduces cost and latency compared to context stuffing, and their vision for a future where user memory is portable across all AI applications.
AI Needs Memory - Here's How It Works
A deep dive into the architectural and economic foundations of memory for AI agents. The talk explores the core tradeoffs between classical data storage and dynamic agent behavior, introduces a human-inspired framework for memory, and discusses practical strategies and future directions for building reliable, evolving AI systems.