Latency

Building Voice Agents Just Got Easier

Anoop Dawar from Deepgram discusses the evolution of voice AI, from basic transcription to sophisticated, real-time voice agents. He covers the key technical challenges in production, such as latency and interruption handling, and introduces Deepgram's Flux system. The talk concludes with a look at the future of speech-to-speech models that can understand emotional nuance, moving closer to passing the audio Turing Test.

Your realtime AI is ngmi — Sean DuBois (OpenAI), Kwindla Kramer (Daily)

Sean DuBois (OpenAI, Pion) and Kwindla Hultman Kramer (Daily, Pipecat) argue that to build successful real-time AI applications, developers must start from the network layer up, prioritizing WebRTC over WebSockets to manage latency effectively and enable advanced features like interruption and state management.