Production AI

Oct 13, 2025

How to build agents that take ACTION

Alex Salazar, CEO of Arcade, argues that the true value of AI is not in chatbots but in agents that can take real-world actions. He details the primary reasons agents fail to reach production—security, cost, latency, and accuracy—and introduces an "Agent Hierarchy of Needs" as a framework for building robust, production-ready agents. The talk emphasizes a critical shift from exposing raw APIs to building intention-based tools and solving the complex challenge of agent authorization through a delegated model.

Oct 09, 2025

Why Your Cloud Isn't Ready for Production AI

Zhen Lu, CEO of Runpod, discusses the shift from Web 2.0 architectures to an "AI-first" cloud. The conversation covers the unique hardware and software requirements for production AI, key use cases like generative media and enterprise agents, and the critical challenges of reliability and operationalization in the new AI stack.

Sep 17, 2025

Production monitoring for AI applications using W&B Weave

Learn how W&B Weave's online evaluations enable real-time monitoring of AI applications in production, allowing teams to track performance, catch failures, and iterate on quality over time using LLM-as-a-judge scores.

Aug 19, 2025

Why Language Models Need a Lesson in Education

Stephanie Kirmer, a staff machine learning engineer at DataGrail, draws on her experience as a former professor to address the challenge of evaluating LLMs in production. She proposes a methodology that uses LLM-based evaluators guided by rigorous, human-calibrated rubrics to bring objectivity and scalability to the subjective task of assessing text generation quality.

Aug 13, 2025

12-factor Agents - Patterns of reliable LLM applications // Dexter Horthy

Drawing on conversations with top AI builders, Dexter Horthy argues that production-grade AI agents are not magical loops but well-architected software. This talk introduces "12-Factor Agents," a methodology centered on "Context Engineering" to build reliable, high-performance LLM-powered applications by applying rigorous software engineering principles.

Aug 03, 2025

Practical tactics to build reliable AI apps — Dmitry Kuchin, Multinear

Moving an AI PoC from 50% to 100% reliability requires a new development paradigm. This talk introduces a practical, evaluations-first approach, reverse-engineering tests from real-world user scenarios and business outcomes to build a robust benchmark, prevent regressions, and enable confident optimization.
