Agentic systems

Multi-Agent Systems for the Misinformation Lifecycle

A detailed overview of a modular, five-agent system designed to counter digital misinformation across its entire lifecycle. Based on an ICWSM research paper, this practitioner's guide details the roles of the Classifier, Indexer, Extractor, Corrector, and Verifier agents. The system emphasizes scalability, explainability, and high precision, moving beyond the limitations of single-LLM solutions. The talk covers the complete blueprint, from agent coordination and MLOps to holistic evaluation and optimization strategies for production environments.
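A minimal sketch of how the five agents might coordinate as a sequential pipeline. The agent names come from the abstract; the data model, prompts, and `call_llm` helper are illustrative assumptions, not the paper's actual code.

```python
# Hypothetical sketch of the five-agent misinformation pipeline.
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    label: str | None = None                            # set by the Classifier
    evidence: list[str] = field(default_factory=list)   # set by Indexer/Extractor
    correction: str | None = None                       # set by the Corrector
    verified: bool = False                              # set by the Verifier

def call_llm(prompt: str) -> str:
    """Placeholder for whatever LLM backend the system uses."""
    raise NotImplementedError

def classifier(claim: Claim) -> Claim:
    claim.label = call_llm(f"Classify as misinformation or not: {claim.text}")
    return claim

def indexer(claim: Claim) -> Claim:
    # The paper's Indexer would query a document index; stubbed here.
    claim.evidence = [call_llm(f"Retrieve evidence relevant to: {claim.text}")]
    return claim

def extractor(claim: Claim) -> Claim:
    claim.evidence = [call_llm(f"Extract key facts from: {e}") for e in claim.evidence]
    return claim

def corrector(claim: Claim) -> Claim:
    claim.correction = call_llm(
        f"Write a correction for '{claim.text}' using evidence: {claim.evidence}"
    )
    return claim

def verifier(claim: Claim) -> Claim:
    verdict = call_llm(f"Does this correction hold up? Answer yes/no: {claim.correction}")
    claim.verified = verdict.strip().lower().startswith("yes")
    return claim

def run_pipeline(text: str) -> Claim:
    claim = Claim(text)
    for agent in (classifier, indexer, extractor, corrector, verifier):
        claim = agent(claim)
    return claim
```

Keeping each agent behind a narrow interface is what makes the modularity and explainability claims concrete: each stage's output can be inspected and evaluated in isolation.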

Build Hour: Agentic Tool Calling

A deep dive into building agentic systems using OpenAI's latest APIs. The session covers the core concept of 'agentic tool calling' (reasoning + tools), outlines a four-part framework (Agent, Infrastructure, Product, Evaluation) for designing long-horizon tasks, and provides a hands-on demonstration of building a non-blocking task processing system with a real-time progress UI.
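For readers unfamiliar with the "reasoning + tools" loop the session centers on, here is a small sketch using the OpenAI Chat Completions API. The `get_status` tool is a made-up example standing in for the session's task-processing demo; the loop itself is the standard tool-calling pattern.

```python
import json
from openai import OpenAI

client = OpenAI()

# A single illustrative tool: report progress of a long-running task.
tools = [{
    "type": "function",
    "function": {
        "name": "get_status",
        "description": "Return the status of a long-running task.",
        "parameters": {
            "type": "object",
            "properties": {"task_id": {"type": "string"}},
            "required": ["task_id"],
        },
    },
}]

def get_status(task_id: str) -> str:
    return json.dumps({"task_id": task_id, "state": "running", "progress": 0.4})

messages = [{"role": "user", "content": "How far along is task abc-123?"}]

while True:
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=tools
    )
    msg = response.choices[0].message
    if not msg.tool_calls:       # model answered directly: we're done
        print(msg.content)
        break
    messages.append(msg)         # keep the assistant's tool request in context
    for call in msg.tool_calls:  # run each requested tool, feed results back
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": get_status(**args),
        })
```

The non-blocking system demonstrated in the session builds on exactly this loop, running it in the background while a progress UI polls the task state.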

Too much lock-in for too little gain: agent frameworks are a dead-end // Valliappa Lakshmanan

Lak Lakshmanan presents a robust architecture for building production-quality, framework-agnostic agentic systems. He advocates for using simple, composable GenAI patterns, off-the-shelf tools for governance, and a strong emphasis on a human-in-the-loop design to create continuously learning systems that avoid vendor lock-in.
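To make the "simple, composable patterns instead of a framework" argument concrete, here is a hedged sketch: plain Python functions chained into a pipeline, with `llm` as a placeholder for any provider's SDK call. The step names are illustrative, not from the talk.

```python
from typing import Callable

Step = Callable[[str], str]

def llm(prompt: str) -> str:
    """Placeholder for a direct provider SDK call (OpenAI, Gemini, etc.)."""
    raise NotImplementedError

def compose(*steps: Step) -> Step:
    """Chain simple GenAI steps into one pipeline; no framework required."""
    def pipeline(text: str) -> str:
        for step in steps:
            text = step(text)
        return text
    return pipeline

def summarize(doc: str) -> str:
    return llm(f"Summarize the following document:\n{doc}")

def polish(draft: str) -> str:
    return llm(f"Rewrite this summary for clarity and concision:\n{draft}")

summarize_and_polish = compose(summarize, polish)
```

The lock-in argument falls out of the structure: swapping providers means changing only `llm`, while a framework's agent abstraction would be woven through every step.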

From Spikes to Stories: AI-Augmented Troubleshooting in the Network Wild // Shraddha Yeole

Shraddha Yeole from Cisco ThousandEyes explains how the team is transforming network observability by moving from complex dashboards to AI-augmented storytelling. The session details their use of an LLM-powered agent to interpret vast telemetry data, accelerate fault isolation, and improve mean time to resolution (MTTR), covering the technical architecture, advanced prompt engineering techniques, evaluation strategies, and key challenges.
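An illustrative sketch (not ThousandEyes' actual code) of the core "spikes to stories" idea: condense raw telemetry into a compact factual summary first, then ask an LLM to narrate the probable fault. All names and thresholds here are assumptions.

```python
from statistics import mean

def summarize_latency(samples_ms: list[float], threshold_ms: float = 200.0) -> str:
    """Reduce a latency series to the facts worth showing the model."""
    spikes = [s for s in samples_ms if s > threshold_ms]
    return (
        f"mean={mean(samples_ms):.0f}ms, max={max(samples_ms):.0f}ms, "
        f"{len(spikes)}/{len(samples_ms)} samples above {threshold_ms:.0f}ms"
    )

def build_prompt(path: str, summary: str) -> str:
    """Prompt the agent to turn a spike summary into a troubleshooting story."""
    return (
        "You are a network troubleshooting assistant.\n"
        f"Path: {path}\nLatency summary: {summary}\n"
        "Explain the most likely fault location and the next diagnostic step."
    )
```

Pre-aggregating the telemetry keeps the prompt small and grounded, which is also what makes the agent's output auditable during fault isolation.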

Evaluation-Driven Development with MLflow 3.0

Yuki Watanabe from Databricks introduces Evaluation-Driven Development (EDD) as a critical methodology for building production-ready AI agents. This talk explores the five pillars of EDD and demonstrates how MLflow 3.0's new features—including one-line tracing, automated evaluation, human-in-the-loop feedback, and monitoring—provide a comprehensive toolkit to ensure agent quality and reliability.
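A small sketch of the "one-line tracing" feature mentioned above, based on MLflow's tracing API (`mlflow.openai.autolog` and the `@mlflow.trace` decorator); exact behavior may vary across 3.x releases, so treat this as an assumption-laden example rather than the talk's demo.

```python
import mlflow
from openai import OpenAI

mlflow.set_experiment("agent-quality")
mlflow.openai.autolog()  # the one line: OpenAI calls are now traced automatically

client = OpenAI()

@mlflow.trace  # group the whole agent step into a single trace
def answer(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

answer("What does evaluation-driven development mean?")
```

Traces captured this way are the raw material for the other EDD pillars the talk covers: automated evaluation, human feedback, and monitoring all attach to the same trace records.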

Why Language Models Need a Lesson in Education

Stephanie Kirmer, a staff machine learning engineer at DataGrail, draws on her experience as a former professor to address the challenge of evaluating LLMs in production. She proposes a robust methodology using LLM-based evaluators guided by rigorous, human-calibrated rubrics to bring objectivity and scalability to the subjective task of assessing text generation quality.
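A hedged sketch of a rubric-guided LLM evaluator in the spirit of the talk. The rubric text, scoring scale, and `call_llm` placeholder are all illustrative assumptions, not Kirmer's actual rubric.

```python
import json

RUBRIC = """Score the answer from 1-5 on each criterion:
- accuracy: factual claims are correct
- completeness: the question is fully addressed
- tone: appropriate for the audience
Return JSON: {"accuracy": n, "completeness": n, "tone": n}"""

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # plug in your model API here

def grade(question: str, answer: str) -> dict[str, int]:
    """Ask the judge model to score one answer against the rubric."""
    prompt = f"{RUBRIC}\n\nQuestion: {question}\nAnswer: {answer}"
    return json.loads(call_llm(prompt))

# The calibration step the talk emphasizes: compare judge scores against
# a small human-graded set before trusting the judge at scale.
def agreement(judge: list[int], human: list[int], tolerance: int = 1) -> float:
    hits = sum(abs(j - h) <= tolerance for j, h in zip(judge, human))
    return hits / len(human)
```

Measuring judge-human agreement first is what keeps the evaluator "human-calibrated" rather than merely automated.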