AI agents

12-factor Agents - Patterns of reliable LLM applications // Dexter Horthy

Drawing from conversations with top AI builders, Dex argues that production-grade AI agents are not magical loops but well-architected software. This talk introduces "12-Factor Agents," a methodology centered on "Context Engineering" to build reliable, high-performance LLM-powered applications by applying rigorous software engineering principles.

EDD: The Science of Improving AI Agents // Shahul Elavakkattil Shereef // Agents in Production 2025

This talk introduces Eval-Driven Development (EDD) as a scientific alternative to 'vibe-based' iteration for improving AI agents. It covers quantitative evaluation (choosing strong end-to-end metrics, aligning LLM judges) and qualitative evaluation (using error and attribution analysis to debug failures), providing a structured framework for consistent agent improvement.
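One way to make "aligning LLM judges" concrete is to measure how often a judge agrees with human reviewers on the same outputs, corrected for chance. The sketch below uses Cohen's kappa for that; it is an illustration rather than the method from the talk, and the function name `cohen_kappa` and the "pass"/"fail" labels are assumptions.

```python
from collections import Counter


def cohen_kappa(human_labels, judge_labels):
    """Chance-corrected agreement between human and LLM-judge verdicts.

    Hypothetical helper for illustration; labels can be any hashable values.
    """
    if len(human_labels) != len(judge_labels) or not human_labels:
        raise ValueError("label lists must be non-empty and the same length")

    n = len(human_labels)
    # Fraction of items where the judge and the human gave the same verdict.
    observed = sum(h == j for h, j in zip(human_labels, judge_labels)) / n

    # Agreement expected by chance, from each rater's label frequencies.
    human_counts = Counter(human_labels)
    judge_counts = Counter(judge_labels)
    labels = set(human_labels) | set(judge_labels)
    expected = sum(human_counts[l] * judge_counts[l] for l in labels) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Assumed example data: four agent outputs graded by a human and by a judge.
human = ["pass", "pass", "fail", "fail"]
judge = ["pass", "fail", "fail", "fail"]
print(cohen_kappa(human, judge))  # 0.5
```

A low score suggests the judge prompt or rubric needs revision before its verdicts are used as an end-to-end metric.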

How Grounded Synthetic Data is Saving the Publishing Industry // Robert Caulk

Robert from Emergent Methods discusses how grounded synthetic news data can solve the publisher revenue crisis in the AI era. He details the process of 'Context Engineering' news into token-optimized, objective data for high-stakes AI agent tasks, covering their open-source models for entity extraction and bias mitigation, and the on-premise infrastructure that protects publisher content.

AI Agents for Cybersecurity: Enhancing Automation & Threat Detection

An exploration of how LLM-powered AI agents are transforming cybersecurity by moving beyond traditional static rules to provide dynamic, adaptive security operations. The summary covers key applications in threat detection and incident response, while also addressing critical risks like hallucinations and adversarial manipulation, emphasizing a "human-in-the-loop" approach.

Streamline evaluation, monitoring, optimization of AI data flywheel with NVIDIA and Weights & Biases

A walkthrough of the NVIDIA Data Flywheel Blueprint, demonstrating how to use production data and Weights & Biases to systematically fine-tune AI agents. This process enhances model accuracy and efficiency by creating a continuous improvement cycle, moving beyond the limitations of prompt engineering.

The Hidden Bottlenecks Slowing Down AI Agents

Paul van der Boor and Bruce Martens from Prosus discuss the real bottlenecks in AI agent development, arguing that the primary challenges are not tools but evaluation, data quality, and feedback loops. They detail their 'buy-first' philosophy, the practical reasons they often build in-house anyway, and how new coding agents like Devin and Cursor are changing their development workflows.