Voice AI

Jan 09, 2026

The Limits of Today’s AI Models

Karan Goel, CEO of Cartesia, discusses the fundamental limitations of Transformer architectures, arguing they behave more like retrieval systems than learning systems. He explains how State Space Models (SSMs) enable compression and abstraction, and why Cartesia is tackling multimodal intelligence by first solving for voice AI, aiming to develop a transferable 'recipe' for end-to-end representation learning.

Dec 22, 2025

AI Agents in 2026 | 3 Predictions For What’s To Come (a16z Big Ideas)

This episode explores three major shifts shaping the future of AI products. The first is the 'death of the prompt box', as products move toward proactive AI that acts like a top-tier employee. The second is a new design paradigm of 'machine legibility', in which we create for agents instead of humans. The third is the practical, real-world deployment of AI voice agents in enterprise sectors like healthcare and finance, signaling a move from AI as something you ask to AI as something that does.

Nov 13, 2025

Building Voice Agents Just Got Easier

Anoop Dawar from Deepgram discusses the evolution of voice AI, from basic transcription to sophisticated, real-time voice agents. He covers the key technical challenges in production, such as latency and interruption handling, and introduces Deepgram's Flux system. The talk concludes with a look at the future of speech-to-speech models that can understand emotional nuance, moving closer to passing the audio Turing Test.

Nov 04, 2025

ElevenLabs CEO: Why Voice is the Next AI Interface

Mati Staniszewski, CEO of ElevenLabs, discusses the company's strategy for rapidly shipping research-grade AI. He covers their organizational structure of small, autonomous teams, a global and remote-first hiring philosophy, the transition from a creator-focused product to an enterprise platform, and the lessons learned in navigating complex licensing and scaling a go-to-market team.

Sep 21, 2025

How LiveKit Became An AI Company By Accident

Russ d'Sa, CEO of LiveKit, recounts the company's unexpected journey from a pandemic-era open-source WebRTC project to becoming a crucial infrastructure provider for AI voice interfaces, most notably for OpenAI's ChatGPT. He details the serendipitous moments that led to this pivot and shares his vision for LiveKit as the nervous system for a multimodal AI future.

Aug 28, 2025

Introducing gpt-realtime in the API

An overview of the new gpt-realtime speech-to-speech model and the general availability of the Realtime API, detailing its architecture, advanced capabilities like image input and multilingualism, training methodology, and new enterprise-ready features.
