Multimodal AI

Oct 14, 2025

Is AI Slowing Down? Nathan Labenz Says We're Asking the Wrong Question

Nathan Labenz argues that AI progress is not slowing down; it is showing up in less obvious yet more powerful ways, such as advanced reasoning and multimodal capabilities. He deconstructs the debate around GPT-5's perceived impact, highlights the revolutionary potential of AI agents in science and engineering, and discusses the tangible effects on job automation. The conversation also explores the rise of robotics and the challenges of emergent AI behaviors like reward hacking, and it concludes with a call for a collective, positive vision to steer this transformative technology.

Sep 21, 2025

How LiveKit Became An AI Company By Accident

Russ d'Sa, CEO of LiveKit, recounts the company's unexpected journey from a pandemic-era open-source WebRTC project to becoming a crucial infrastructure provider for AI voice interfaces, most notably for OpenAI's ChatGPT. He details the serendipitous moments that led to this pivot and shares his vision for LiveKit as the nervous system for a multimodal AI future.

Aug 28, 2025

Introducing gpt-realtime in the API

An overview of the new gpt-realtime speech-to-speech model and the general availability of the Realtime API, detailing the model's architecture, advanced capabilities like image input and multilingualism, training methodology, and new enterprise-ready features.

Jun 26, 2025

Building Production-Grade RAG at Scale

Douwe Kiela, CEO of Contextual AI, explains the evolution from basic RAG to "RAG 2.0", an end-to-end, trainable system. He argues that this system-level approach, which integrates optimized document parsing, retrieval, reranking, and grounded models, is superior to relying on massive context windows alone and is a fundamental tool for next-generation AI agents.
