Feature

Designing AI Agents for the Complex Realities of Healthcare

Dr. Sarah Gebauer presents a clinical framework for deploying AI agents in healthcare, drawing a powerful analogy between AI agents and medical residents. She outlines the critical risks, validation strategies, and post-deployment monitoring required to make agents useful, safe, and credible in high-stakes clinical environments.

Why experts writing AI evals is creating the fastest-growing companies in history | Brendan Foody

Brendan Foody, CEO of Mercor, discusses the critical role of AI evaluations (evals) in model improvement, detailing how his company achieved unprecedented growth by supplying high-skilled experts to top AI labs. He explores the shift to Reinforcement Learning from AI Feedback (RLAIF), the future of work in an AI-driven economy, and why he believes the path to AGI is paved with better evals, not just more data.

No Priors Ep. 132 | With Decagon CEO and Co-Founder Jesse Zhang

Jesse Zhang, co-founder and CEO of Decagon, discusses how their AI agents are revolutionizing customer service for large enterprises by replacing mundane human labor. He covers their go-to-market strategy, the importance of a hardworking in-office culture, his journey as a second-time founder, and the future of an agentic world where AIs interact on behalf of companies and consumers.

Production monitoring for AI applications using W&B Weave

Learn how W&B Weave's online evaluations enable real-time monitoring of AI applications in production, allowing teams to track performance, catch failures, and iterate on quality over time using LLM-as-a-judge scores.

Beyond Prompting: The Emerging Discipline of Context Engineering Reading Group

This summary covers a deep dive into the paper "A Survey of Context Engineering for Large Language Models". The discussion reframes the conversation from simple prompt engineering to a more systematic approach of building information environments for LLMs. It explores the foundational components of context engineering—generation, processing, and management—and their application in advanced systems like Retrieval-Augmented Generation (RAG), memory, tool use, and multi-agent systems.

AI ransomware, hiring fraud and the end of Scattered Lapsus$ Hunters

Experts from IBM X-Force discuss the alleged retirement of the Scattered Lapsus$ Hunters cybercrime gang, the ethics and implications of AI-powered ransomware, critical software supply chain vulnerabilities exposed by the recent npm hack, growing threats to Operational Technology (OT), and the emergence of AI-driven hiring fraud.