AI Safety

Sep 30, 2025

How To Train An LLM with Anthropic's Head of Pretraining

Anthropic's Head of Pretraining, Nick Joseph, details the immense engineering and infrastructure challenges behind training frontier models like Claude. He covers the evolution from early-stage custom frameworks to debugging hardware at massive scale, balancing pretraining with RL, and the strategic importance of data quality and team composition.

Sep 26, 2025

NVIDIA’s USD 100bn investment and Google's AP2

The panel discusses NVIDIA's $100 billion investment in OpenAI, analyzing the trend towards vertically integrated AI 'tribes'. They also explore the rise of specialized open-source models like Tongyi DeepResearch, Google's new AP2 agent protocol for secure e-commerce, the ongoing debate on AI existential risk, and Apple's practical approach to wearable AI with the new real-time translation feature in AirPods.

Sep 19, 2025

Designing AI Agents for the Complex Realities of Healthcare

Dr. Sarah Gebauer presents a clinical framework for deploying AI agents in healthcare, drawing a powerful analogy between AI agents and medical residents. She outlines the critical risks, validation strategies, and post-deployment monitoring required to make agents useful, safe, and credible in high-stakes clinical environments.

Aug 28, 2025

Why 70% of Companies Are FAILING at AI Safety (Shocking Survey Data): 2025 AI Governance Survey

Ben Lorica and David Talby of 'The Data Exchange' podcast analyze the 2025 AI Governance Survey, revealing a significant gap between AI adoption and mature risk management. While 30% of organizations have models in production, many lack robust governance frameworks, incident response plans, and comprehensive monitoring, often prioritizing speed-to-market over safety and compliance.

Aug 27, 2025

Threat Intelligence: How Anthropic stops AI cybercrime

Anthropic's Threat Intelligence team discusses its new report on how AI models are being used in sophisticated cybercrime operations. The team covers the concept of "vibe hacking," a large-scale employment scam run by North Korea, and Anthropic's multi-layered strategy to detect and counteract these threats.

Aug 22, 2025

Gen AI pilots fail, GPT-5's hidden prompt revealed, reasoning model flaws and Claude closing chats

A deep dive into why most enterprise GenAI pilots are failing, the debate around hidden system prompts in models like GPT-5, new research questioning the reliability of "chain of thought" reasoning, and the controversy over Anthropic's "AI welfare" justification for shutting down conversations.
