Production AI

Beyond the Gold Standard: Evaluating and Trusting Agents in the Wild // Sanjana Sharma

A deep dive into the challenges of deploying AI agents in production, arguing that reliability stems not from model intelligence but from a "system-first" approach. The talk introduces a new architecture that separates the LLM's reasoning from a versioned, auditable "Context Layer" containing business logic and expert knowledge, which is continuously updated through a "Living Ground Truth" loop driven by expert feedback.

Tool Calling

A panel discussion with experts from Arcade, Prosus Group, and MeaningStack who argue that most teams are building agents incorrectly. They dissect the failures of simple API wrappers, the pros and cons of MCP, and the critical role of governance and organizational structure in deploying agents successfully.

Building Agentic Tools for Production // Sam Partee

Sam Partee, CTO of Arcade AI, explains that building production-grade agentic systems requires moving beyond simple chatbots. He details the critical components for creating reliable, secure, and scalable tools, including rigorous schema management, the principle of least privilege, continuous evaluation, and a crucial distinction between 'exploratory' and 'operational' tools.

Fully Connected 2025 kickoff: The rise (and the challenges) of the agentic era

Robin Bordoli of Weights & Biases explores AI's exponential growth, from past achievements to the current agentic landscape. He discusses the rise of reinforcement learning and the challenge of productionizing reliable agents, and highlights how foundational issues in AI development persist even as model capabilities soar.

How to build agents that take ACTION

Alex Salazar, CEO of Arcade, argues that the true value of AI is not in chatbots but in agents that can take real-world actions. He details the primary reasons agents fail to reach production—security, cost, latency, and accuracy—and introduces an "Agent Hierarchy of Needs" as a framework for building robust, production-ready agents. The talk emphasizes a critical shift from exposing raw APIs to building intention-based tools and solving the complex challenge of agent authorization through a delegated model.

Why Your Cloud Isn't Ready for Production AI

Zhen Lu, CEO of Runpod, discusses the shift from Web 2.0 architectures to an "AI-first" cloud. The conversation covers the unique hardware and software requirements for production AI, key use cases like generative media and enterprise agents, and the critical challenges of reliability and operationalization in the new AI stack.