AI agents

Beyond the Gold Standard: Evaluating and Trusting Agents in the Wild // Sanjana Sharma

A deep dive into the challenges of deploying AI agents in production, arguing that reliability stems not from model intelligence but from a "system-first" approach. The talk introduces a new architecture that separates the LLM's reasoning from a versioned, auditable "Context Layer" containing business logic and expert knowledge, which is continuously updated through a "Living Ground Truth" loop driven by expert feedback.

Guide to Architect Secure AI Agents: Best Practices for Safety

AI agents offer immense power but come with significant security risks. This guide outlines a comprehensive architecture for securing AI agents using DevSecOps, robust access controls, threat monitoring, and a principle-of-least-privilege approach to mitigate dangers like prompt injection and data leaks.

Simple AI Upsells 30% Better Than Trained Reps

Founders of Simple AI, Catheryn Li & Zach Kamran, discuss their journey from building consumer apps to creating an AI sales agent that handles inbound calls for major brands. They cover their pivot, the technical challenges of integrating with legacy systems, and how their AI outperforms human reps by leveraging hyper-personalization and rapid A/B testing.

Boris Cherny: How We Built Claude Code

Boris Cherny, creator of Claude Code, shares the development philosophy behind the AI coding tool, emphasizing building for future models, leveraging latent user demand, and the surprising longevity of the terminal interface.

You Asked About AI: Agents, Hacking & LLMs

An exploration of the evolving AI landscape, covering the paradigm shift in cybersecurity due to AI agents, the practicalities of running local LLMs with tools like Ollama and vLLM, and the emerging stack for agent-to-agent communication.

The Shadow AI Problem Nobody's Talking About

Euro Beinat (Prosus Group) and Mert Öztekin (Just Eat Takeaway.com) discuss the practical challenges of scaling AI, focusing on developer productivity, the role of AI agents in automating the 'long tail' of tasks, and the critical importance of change management and governance to foster an AI-native culture without stifling innovation.