AI Agents

Oct 10, 2025

IBM partners with Anthropic, plus OpenAI drops AgentKit

A deep dive into OpenAI's AgentKit, the IBM-Anthropic partnership focusing on the Agent Development Lifecycle (ADLC), the mathematical concept of modular manifolds for stabilizing model training, and a critical analysis of AI's real-world impact on professions like radiology.

Oct 10, 2025

Evals Aren't Useful? Really?

A deep dive into why robust evaluation is critical for building reliable AI agents. It covers bootstrapping evaluation sets, advanced testing techniques such as multi-turn simulations and red teaming, and the need to integrate traditional software engineering and MLOps practices into the agent development lifecycle.

Oct 09, 2025

Building with MCP and the Claude API

A discussion with Anthropic engineers Alex Albert, John Welsh, and Michael Cohen about the Model Context Protocol (MCP). They cover its origins as an open standard, best practices for tool design and prompt engineering, and the future of the ecosystem where high-quality MCP servers will become a key competitive advantage.

Oct 09, 2025

Scale AI CEO on Meta’s $14B deal, scaling Uber Eats to $80B, & what frontier labs are building next

Jason Droege, CEO of Scale AI, discusses the evolution of AI training from simple labeling to complex, expert-driven tasks. He shares insights on the future of AI agents, the reality of enterprise AI adoption, and business lessons learned from building Uber Eats from zero to a multibillion-dollar business.

Oct 08, 2025

Evals in Action: From Frontier Research to Production Applications

An overview of OpenAI's approach to AI evaluation, covering the GDPval benchmark for frontier models and the practical tools available for developers to evaluate their own custom agents and applications.

Oct 07, 2025

Every AI Founder Should Be Asking These Questions

Jordan Fisher, co-founder of Standard AI and now at Anthropic, poses critical questions for startup founders facing the imminent arrival of AGI. He explores challenges from software commoditization and building trust in automated teams to finding durable moats and the ethical responsibility of building world-changing technology.
