AI Safety

Jun 19, 2026

Fable 5: The Full Story from Capabilities to Drama (Ep. 1002 with Jon Krohn)

Anthropic's highly anticipated Claude Fable 5 model, a public version of its advanced "Mythos class" AI with state-of-the-art capabilities in software, vision, and long-context tasks, was pulled offline by the U.S. government just three days after its release. The removal, initiated as an export control action over national security concerns stemming from a disputed "jailbreak" claim, highlights the growing tension among frontier AI development, AI safety, and regulatory oversight.

May 22, 2026

AI at college graduations and why Claude blackmails

The Mixture of Experts team discusses growing skepticism toward AI among younger generations, a Microsoft study showing how LLMs can corrupt data in complex workflows, Anthropic's data-centric fix for Claude's "blackmailing" issue, and the cultural debate over an AI-generated story potentially winning a literary prize. The common threads are human ownership, trust, and the need for better processes in the age of AI.

May 08, 2026

Inside Mythos: Anthropic's Locked-Down Frontier Model — with Jon Krohn (@JonKrohnLearns)

Anthropic's Claude Mythos Preview is a frontier AI model with emergent hacking capabilities so advanced that it is being withheld from public release. This summary details its nearly 100x performance leap in exploit generation, the 'Project Glasswing' industry consortium for responsible disclosure, and practical advice for developers on securing AI-generated code in this new era of automated vulnerability discovery.

May 01, 2026

Waymo's Dmitri Dolgov: 20 Million Rides and the Road to Full Autonomy

Dmitri Dolgov, co-CEO of Waymo, discusses the 20-year journey from the DARPA challenge to full autonomy. He explains the Waymo Foundation Model—a multimodal world action model powering the driver, simulator, and critic—and how their "end-to-end plus" architecture enables superhuman safety and exponential scaling.

Apr 24, 2026

What Do Models Still Suck At? - Peter Gostev, Arena.ai, BullshitBench

Despite benchmarks showing relentless progress, many users remain dissatisfied with LLM responses in real-world scenarios. This summary explores two key analyses—a custom 'nonsense question' benchmark and trends from Chatbot Arena's 'dislike both' data—to reveal the persistent gaps in model reasoning, reliability, and domain-specific understanding.

Apr 17, 2026

Claude Opus 4.7, Apple’s AI glasses and Allbirds AI pivot

Experts analyze Anthropic's surprise release of Claude Opus 4.7, speculating that it is a distilled version of the Mythos model. The discussion also covers Apple's new three-pronged AI wearables strategy, a Gallup poll showing rising but incremental AI adoption in the workplace, and DeepMind's research into harmful AI manipulation.