Voice AI

Building Voice Agents Just Got Easier

Anoop Dawar from Deepgram discusses the evolution of voice AI, from basic transcription to sophisticated, real-time voice agents. He covers the key technical challenges in production, such as latency and interruption handling, and introduces Deepgram's Flux system. The talk concludes with a look at the future of speech-to-speech models that can understand emotional nuance, moving closer to passing the audio Turing Test.

ElevenLabs CEO: Why Voice is the Next AI Interface

Mati Staniszewski, CEO of ElevenLabs, discusses the company's strategy for rapidly shipping research-grade AI. He covers their organizational structure of small, autonomous teams, a global and remote-first hiring philosophy, the transition from a creator-focused product to an enterprise platform, and the lessons learned in navigating complex licensing and scaling a go-to-market team.

How LiveKit Became An AI Company By Accident

Russ d'Sa, CEO of LiveKit, recounts the company's unexpected journey from a pandemic-era open-source WebRTC project to becoming a crucial infrastructure provider for AI voice interfaces, most notably for OpenAI's ChatGPT. He details the serendipitous moments that led to this pivot and shares his vision for LiveKit as the nervous system for a multimodal AI future.

Introducing gpt-realtime in the API

An overview of the new gpt-realtime speech-to-speech model and the general availability of the Realtime API, detailing its architecture, advanced capabilities such as image input and multilingual support, training methodology, and new enterprise-ready features.

Full Workshop: Realtime Voice AI — Mark Backman, Daily

An in-depth look at building real-time, production-grade voice AI agents using the open-source Pipecat framework. This summary covers the core concepts of voice AI pipelines, the shift to speech-to-speech models like Gemini Live, and advanced techniques for managing latency, context, and turn-taking.

Pipecat Cloud: Enterprise Voice Agents Built On Open Source - Kwindla Hultman Kramer, Daily

A deep dive into the challenges of building production-grade, low-latency voice AI agents, and how the open-source, vendor-neutral framework Pipecat provides a comprehensive solution for development, deployment, and scaling. Learn about voice AI architecture, the trade-offs between speech-to-speech and text-based models, and practical deployment strategies.