AI agents

Why building eval platforms is hard — Phil Hetzel, Braintrust

An evaluation platform is more than a simple test runner; it's a complex system for creating shared definitions of quality. This talk explores the evolution of eval platforms from basic spreadsheets to sophisticated, integrated systems, highlighting the hidden data and systems engineering challenges involved in making them credible, scalable, and usable for building trustworthy AI agents.

Box CEO: Why Big Companies Are Falling Behind on AI | a16z

Steven Sinofsky, Aaron Levie, and Martin Casado of a16z dissect the reality of AI adoption within large enterprises. They explore the significant gap between Silicon Valley's developer-centric culture and the complex, legacy-driven world of established organizations, explaining why many top-down AI initiatives fail. The discussion introduces a key architectural shift—treating AI agents as users rather than integrated software—and analyzes the immense integration, security, and data challenges that agents face. Ultimately, they argue that AI, rather than eliminating jobs, will create new ones by increasing system complexity and enabling professionals to operate at a higher level of abstraction.

Why Agents are Driving Software Development to the Cloud

Zach Lloyd, CEO of Warp, explains why the future of software development is moving from local, interactive agents to cloud-native, collaborative systems. He discusses the flaws in the "dev box" sandbox model, the decline of traditional SaaS interfaces in favor of "just-in-time apps," and how platforms like Warp's Oz are providing the necessary orchestration, observability, and access control for teams to effectively deploy AI agents at scale.

What is OpenClaw? Inside AI Agents, LLMs and the Agentic Loop

AI agents represent a paradigm shift from conversational AI to autonomous systems that can perform actions. This is achieved through an 'agentic loop' combining Large Language Models (LLMs) with tools, as exemplified by the OpenClaw framework, which enables complex, automated workflows while also raising important security considerations.
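
As a rough illustration of the agentic loop described above, here is a minimal Python sketch. It is not OpenClaw's API: `TOOLS`, `fake_model`, and `run_agent` are hypothetical names, and the "model" is a hard-coded stand-in so the example runs without any LLM. The loop asks the model what to do, runs the requested tool, feeds the result back, and stops at a final answer or a step limit.

```python
from typing import Callable

# Hypothetical tool registry (name -> callable); an assumption, not OpenClaw's API.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}


def fake_model(history: list[str]) -> str:
    """Stand-in for an LLM call: request a tool once, then answer with its result."""
    for entry in reversed(history):
        if entry.startswith("observation:"):
            return "final:" + entry.removeprefix("observation:")
    return "tool:add:2+3"


def run_agent(task: str, max_steps: int = 5) -> str:
    """Loop: ask the model, run the tool it names, append the observation."""
    history = ["task:" + task]
    for _ in range(max_steps):
        decision = fake_model(history)
        if decision.startswith("final:"):
            return decision.removeprefix("final:")
        _, name, arg = decision.split(":", 2)
        history.append("observation:" + TOOLS[name](arg))
    # The step limit keeps a confused model from looping forever.
    raise RuntimeError("step limit reached without a final answer")


result = run_agent("What is 2 + 3?")
print(result)
```

In a real agent, the tool calls are where the security considerations mentioned above come in, since each tool is an action taken on the model's say-so.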

Every API Is a Tool for Agents - Matt Carey, Cloudflare

This talk explores how to overcome the context window limitations that prevent AI agents from accessing large APIs. It introduces "Codemode," a technique where agents write code against a typed SDK in a secure, sandboxed environment, moving beyond static tool definitions and enabling full API accessibility.
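
The idea of an agent writing code against a typed SDK, rather than choosing from a long list of static tool definitions, can be sketched roughly as follows. Everything here is an assumption for illustration: `FakeSdk`, `Zone`, and `run_generated` are hypothetical, the talk's actual implementation may differ, and Python's `exec` is used only to keep the sketch runnable. It is not a secure sandbox.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    id: str
    name: str


class FakeSdk:
    """Hypothetical typed SDK; a real one would wrap HTTP calls to the API."""

    def list_zones(self) -> list[Zone]:
        return [Zone("z1", "example.com"), Zone("z2", "example.org")]


# Code an agent might generate: it only needs to know the SDK's types,
# not one tool definition per API endpoint.
agent_code = """
names = [zone.name for zone in sdk.list_zones() if zone.name.endswith(".org")]
result = ", ".join(names)
"""


def run_generated(code: str, sdk: FakeSdk) -> str:
    # WARNING: exec with emptied builtins is NOT a security boundary.
    # A real system would run this in an isolated sandbox.
    scope = {"__builtins__": {}, "sdk": sdk}
    exec(code, scope)
    return scope["result"]


output = run_generated(agent_code, FakeSdk())
print(output)
```

The point of the design is that the context window holds a compact type description of the SDK instead of every endpoint's tool schema.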

AgentCraft: Putting the Orc in Orchestration — Ido Salomon

As the number of AI agents grows, humans become the bottleneck. This talk introduces AgentCraft, a game-inspired orchestrator that draws lessons from Real-Time Strategy (RTS) games to enhance human-agent collaboration, improve visibility, and increase agent autonomy, ultimately raising the ceiling of what's possible.