LLMOps

The Production AI Playbook: Deploying Agents at Enterprise Scale — Sandipan Bhaumik, Databricks

Sandipan Bhaumik presents a five-pillar framework for successfully moving AI systems from demos to production, inspired by a retail bank's failed chatbot PoC. The framework covers defining numerical success (Evaluation), tracing every AI decision (Observability), building robust data pipelines (Data Foundation), managing multiple AI interactions (Multi-agent Orchestration), and ensuring accountability and security (Governance). He illustrates these concepts with a banking chatbot case study, emphasizing continuous evaluation, data quality, and a proactive incident playbook.

Scaling Agents on Kubernetes with acpx and ACP — Onur Solmaz, OpenClaw

Onur Solmaz from OpenClaw discusses the challenge of managing 300-500 pull requests a day, often AI-generated. He introduces acpx, a headless CLI for the Agent Client Protocol (ACP), designed to automate PR triage through a node-based workflow. The talk culminates in a vision for on-demand, disposable agent pods on Kubernetes, managed by a Go operator that provisions and tears down a full compute environment per task and wires it into chat platforms like Slack.

Judge the Judge: Building LLM Evaluators That Actually Work with GEPA — Mahmoud Mabrouk, Agenta AI

This workshop by Mahmoud Mabrouk, CEO of Agenta AI, covers building calibrated LLM-as-a-judge evaluators that reliably align with human judgment. It shows how miscalibrated judges lead to false confidence and presents a practical workflow: designing use-case-specific metrics, annotating data in detail, and optimizing judge prompts with the GEPA algorithm. The talk emphasizes the importance of iterative debugging, model selection, and custom reflection templates for achieving trustworthy and effective LLM evaluations.

A Playground for AI Engineers

Paulo Vasconcellos from Hotmart details their journey of building "Agent as a Product", explaining how they blend classic ML models with LLMs for efficiency, evolve their MLOps platform for the generative AI era, and create real business value through AI-powered tutors and sales agents.

Fully Connected keynote: Building tools for agents at Weights & Biases

A summary of the keynote by Lukas Biewald (Weights & Biases) and Camille Fournier (CoreWeave) at Fully Connected London 2025. They discuss recent product updates for W&B Models and Weave, the synergy behind the CoreWeave acquisition, and a deep dive into building and automating an autonomous software engineer agent.

Surviving the AI Workforce Shakeup

Ben Lorica and Evangelos Simoudis analyze the nuances of AI-driven layoffs, categorizing them into upskilling gaps, automation, and strategic R&D shifts. They also explore the immense pressure for ROI on AI infrastructure investments, leading to the emergence of LLMOps as a form of financial management and the critical need for breaking down organizational silos.