Posts

Beyond the Gold Standard: Evaluating and Trusting Agents in the Wild // Sanjana Sharma

A deep dive into the challenges of deploying AI agents in production, arguing that reliability stems not from model intelligence but from a "system-first" approach. The talk introduces a new architecture that separates the LLM's reasoning from a versioned, auditable "Context Layer" containing business logic and expert knowledge, which is continuously updated through a "Living Ground Truth" loop driven by expert feedback.

Rethinking Notebooks Powered by AI

Vincent Warmerdam from marimo discusses marimo's recent acquisition by Weights & Biases and the future of Python notebooks. He argues that notebooks should evolve from static scratchpads into dynamic, AI-powered applications, highlighting marimo's features for LLM integration, agentic workflows, and creating interactive, reproducible development environments.

AWS is Too Expensive: Here is the Open Source Alternative

Umur Cubukcu, co-founder of Ubicloud, explains the principles of an "open cloud," centered on an open-source control plane, portability, and freedom from data lock-in. He details Ubicloud's strategy to compete with hyperscalers by offering superior price-performance on core services like PostgreSQL and compute, particularly for startups and enterprises seeking control and data sovereignty.

Guide to Architect Secure AI Agents: Best Practices for Safety

AI agents offer immense power but come with significant security risks. This guide outlines a comprehensive architecture for securing AI agents using DevSecOps, robust access controls, threat monitoring, and a principle-of-least-privilege approach to mitigate dangers like prompt injection and data leaks.

From SaaS to AI-First: How Companies Are Reshaping Innovation

Hosts Sarah and Elad discuss the "SaaS-apocalypse," arguing that while AI is fundamentally changing software, the death of SaaS is overstated in the short term. They explore the unprecedented speed of revenue growth and collapsing token costs in the AI era, new engineering-management challenges such as "coding slop," and the strategic imperative for founders to build durable, multi-product companies in a rapidly changing landscape.

Build Hour: Prompt Caching

Explore prompt caching to significantly reduce latency and costs for your AI applications. This guide breaks down the mechanics of KV caching, best practices for maximizing cache hits using `prompt_cache_key` and the Responses API, and real-world implementation insights from the agentic development platform, Warp.
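
As a rough illustration of the cache-hit advice summarized above, the sketch below keeps the unchanging part of a prompt at the front and appends per-request text at the end, so that consecutive requests share the longest possible prefix. It makes no API calls. The constants, `build_prompt`, `cache_key`, and `shared_prefix_len` are all hypothetical names invented for this example, and deriving a key from a tenant ID plus a prefix hash is only one assumed way to produce a stable value that could be supplied as `prompt_cache_key`.

```python
import hashlib

# Hypothetical static content: identical on every request, so it goes first.
STATIC_INSTRUCTIONS = "You are a support assistant. Follow the policy below.\n"
TOOL_DOCS = "Tools: search(query), lookup(order_id)\n"


def build_prompt(user_message: str) -> str:
    """Place static content first and per-request content last."""
    return STATIC_INSTRUCTIONS + TOOL_DOCS + "User: " + user_message


def cache_key(tenant_id: str) -> str:
    """Derive a stable key from the tenant and the static prefix.

    Assumption: a value like this is what a caller might pass as
    `prompt_cache_key`; the exact format is made up for illustration.
    """
    prefix = STATIC_INSTRUCTIONS + TOOL_DOCS
    digest = hashlib.sha256(prefix.encode("utf-8")).hexdigest()[:16]
    return f"{tenant_id}:{digest}"


def shared_prefix_len(a: str, b: str) -> int:
    """Count the leading characters two prompts have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


if __name__ == "__main__":
    p1 = build_prompt("Where is my order?")
    p2 = build_prompt("Cancel my plan")
    print(shared_prefix_len(p1, p2), "shared leading characters")
    print(cache_key("acme"))
```

If a timestamp or user name were placed at the top of the prompt instead, the shared prefix would shrink to almost nothing, which is the situation the static-first ordering is meant to avoid.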