Posts

It's 2026, and We're Still Talking Evals

Maggie Konstanty, AI Product Manager at Prosus, offers a candid look at the realities of LLM evaluation in production. She argues that standard metrics like accuracy are misleading and advocates a culture of continuous, goal-oriented evaluation built on deep failure analysis and an understanding of real user behavior. She also asserts that mature teams inevitably build custom tooling to meet their specific needs.

Why Agents are Driving Software Development to the Cloud

Zach Lloyd, CEO of Warp, explains why the future of software development is moving from local, interactive agents to cloud-native, collaborative systems. He discusses the flaws in the "dev box" sandbox model, the decline of traditional SaaS interfaces in favor of "just-in-time apps," and how platforms like Warp's Oz are providing the necessary orchestration, observability, and access control for teams to effectively deploy AI agents at scale.

What is OpenClaw? Inside AI Agents, LLMs and the Agentic Loop

AI agents represent a paradigm shift from conversational AI to autonomous systems that can perform actions. This is achieved through an 'agentic loop' combining Large Language Models (LLMs) with tools, as exemplified by the OpenClaw framework, which enables complex, automated workflows while also raising important security considerations.
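As a rough illustration of the agentic loop described above, here is a minimal Python sketch. It is not OpenClaw's API: `fake_model`, `TOOLS`, `run_agent`, and the `CALL:`/`FINAL:` message format are all hypothetical names invented for this example, and the model is a stub standing in for a real LLM call.

```python
from typing import Callable

# Hypothetical tool registry (an assumption for illustration).
# Only tools listed here can ever be executed by the loop.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split())),
}


def fake_model(history: list[str]) -> str:
    """Stub standing in for an LLM: asks for a tool once, then answers."""
    last = history[-1]
    if last.startswith("TOOL_RESULT:"):
        return "FINAL:" + last.removeprefix("TOOL_RESULT:")
    return "CALL:add:2 3"


def run_agent(task: str,
              model: Callable[[list[str]], str] = fake_model,
              max_steps: int = 5) -> str:
    """Alternate between model replies and tool calls until a final answer."""
    history = [task]
    for _ in range(max_steps):
        reply = model(history)
        if reply.startswith("FINAL:"):
            return reply.removeprefix("FINAL:")
        # Assumed reply format: "CALL:<tool name>:<argument string>"
        _, name, arg = reply.split(":", 2)
        tool = TOOLS.get(name)
        result = tool(arg) if tool else f"unknown tool {name}"
        # Feed the tool output back so the model can use it on the next turn.
        history.append("TOOL_RESULT:" + result)
    return "stopped: step limit reached"


print(run_agent("What is 2 + 3?"))
```

The allow-listed registry and the step limit are two simple guards in the spirit of the security considerations the summary mentions; a real framework would need considerably more than this.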

Collaborative AI Engineering — Maggie Appleton, GitHub Next

Maggie Appleton from GitHub Next argues that current agentic tools are flawed because they focus on individual productivity and ignore the collaborative nature of software development. She introduces ACE (Agent Collaboration Environment), a multiplayer platform designed to solve team alignment issues by integrating planning, development, and shared context in a real-time, sandboxed environment.

Every API Is a Tool for Agents - Matt Carey, Cloudflare

This talk explores how to overcome the context window limitations that prevent AI agents from accessing large APIs. It introduces "Codemode," a technique where agents write code against a typed SDK in a secure, sandboxed environment, moving beyond static tool definitions and enabling full API accessibility.

AgentCraft: Putting the Orc in Orchestration — Ido Salomon

As the number of AI agents grows, humans become the bottleneck. This talk introduces AgentCraft, a game-inspired orchestrator that draws lessons from Real-Time Strategy (RTS) games to enhance human-agent collaboration, improve visibility, and increase agent autonomy, ultimately raising the ceiling of what's possible.