Lyria

From Transcription to Live Music: Gemini's Audio Stack — Thor Schaeff, Google DeepMind

Thor Schaeff from Google DeepMind demos Gemini's audio stack, starting with a single API call to Gemini for rich transcription (speaker names, emotions, translation). He then shows speech generation directed by "director's notes" instead of a voice catalog, the real-time, sound-to-sound Gemini 1.5 Flash Live model, and a live demo in which Gemini Live uses the Lyria 2 model as a tool to generate a full song on stage.

Let's go Bananas with GenMedia — Guillaume Vernade, Google DeepMind

Guillaume Vernade from Google DeepMind demonstrates a full generative media pipeline, using Gemini to read a public domain book and act as a master prompt engineer for other models. Imagen generates character portraits, Veo animates scenes into video, Lyria composes a unique soundtrack for each chapter, and a clever TTS trick creates a multi-character audiobook.

Build & deploy AI-powered apps — Paige Bailey, Google DeepMind

A developer-focused, demo-heavy session on rapid AI prototyping with the Google DeepMind stack. It covers what AI Studio can do: video analysis and code execution with Gemini 3.1 Flash, building full-stack applications with databases, and generative media with Genie 3, Veo 3.1 Lite, and Lyria 3.