Gemini

From Transcription to Live Music: Gemini's Audio Stack — Thor Schaeff, Google DeepMind

Thor Schaeff from Google DeepMind demos the advanced audio AI stack, starting with a single API call to Gemini for rich transcription (speaker names, emotions, translation). He showcases speech generation directed by "director's notes" instead of a voice catalog, the real-time, sound-to-sound Gemini 1.5 Flash Live model, and a live demo of Gemini Live using the Lyria 2 model as a tool to generate a full song on stage.

⚡️ Google's Open AI Strategy — Omar Sanseviero, Google DeepMind

An in-depth look at Gemma 4's novel transformer architecture with per-layer embeddings, enabling efficient parameter offloading for on-device inference. The discussion also covers its native multimodality, the state of fine-tuning, text-based diffusion models, and the growing intersection of research and engineering.

Let's go Bananas with GenMedia — Guillaume Vernade, Google DeepMind

Guillaume Vernade from Google DeepMind demonstrates a full generative media pipeline, using Gemini to read a public domain book and act as a master prompt engineer for other models. Imagen generates character portraits, Veo animates scenes into video, Lyria composes a unique soundtrack for each chapter, and a clever TTS trick creates a multi-character audiobook.

How to Build the Future: Demis Hassabis

Demis Hassabis, CEO of Google DeepMind, outlines the remaining challenges on the path to AGI, including memory, continual learning, and true reasoning. He discusses how learnings from AlphaGo are shaping agent development, the strategic importance of powerful small models like Gemma, and his vision for AI as the ultimate tool for scientific discovery, offering a framework for identifying breakthrough opportunities and advice for founders building in the age of AI.

Build & deploy AI-powered apps — Paige Bailey, Google DeepMind

A developer-focused, demo-heavy session on rapid AI prototyping using the Google DeepMind stack. It covers how to leverage the full capabilities of AI Studio, from video analysis and code execution with Gemini 3.1 Flash, to building full-stack applications with databases, and exploring the frontiers of generative media with Genie 3, Veo 3.1 Lite, and Lyria 3.

Full Workshop: Build Your Own Deep Research Agents - Louis-François Bouchard, Paul Iusztin, Samridhi

This hands-on workshop walks through building a two-part AI system for producing high-quality technical content. The first part is an MCP-powered deep research agent that autonomously plans, searches the web, and analyzes sources such as YouTube to synthesize a grounded research artifact. The second part is a constrained, deterministic writing workflow that turns this research into polished, non-sloppy content, using an "Evaluator-Optimizer" pattern for iterative refinement. The session emphasizes core AI engineering principles, such as choosing between agentic and workflow-based architectures, and closes with a deep dive into implementing practical observability and evaluation pipelines so that the system is both measurable and improvable.
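
For readers unfamiliar with the "Evaluator-Optimizer" pattern mentioned above, the loop can be sketched roughly as follows. This is an illustrative sketch, not the workshop's code: the names `Review`, `evaluate`, `optimize`, and `refine`, the banned-phrase list, and the word limit are all hypothetical, and both steps are simple deterministic stand-ins for what would presumably be model calls in a real system.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Review:
    passed: bool
    feedback: List[str]


def evaluate(draft: str, banned: List[str], max_words: int) -> Review:
    """Evaluator step (hypothetical stand-in for an LLM judge): list concrete problems."""
    feedback = []
    lowered = draft.lower()
    for phrase in banned:
        if phrase in lowered:
            feedback.append(f"remove:{phrase}")
    if len(draft.split()) > max_words:
        feedback.append("shorten")
    return Review(passed=not feedback, feedback=feedback)


def optimize(draft: str, review: Review, max_words: int) -> str:
    """Optimizer step (hypothetical stand-in for an LLM rewrite): apply each note."""
    for note in review.feedback:
        if note.startswith("remove:"):
            phrase = note[len("remove:"):]
            draft = re.sub(re.escape(phrase), "", draft, flags=re.IGNORECASE)
        elif note == "shorten":
            draft = " ".join(draft.split()[:max_words])
    # Normalize whitespace left behind by the removals.
    return " ".join(draft.split())


def refine(draft: str, banned: List[str], max_words: int,
           max_rounds: int = 3) -> Tuple[str, List[Review]]:
    """Alternate evaluate and optimize until the draft passes or rounds run out."""
    history: List[Review] = []
    for _ in range(max_rounds):
        review = evaluate(draft, banned, max_words)
        history.append(review)
        if review.passed:
            break
        draft = optimize(draft, review, max_words)
    return draft, history


if __name__ == "__main__":
    text = ("In today's fast-paced world, agents plan, search, and "
            "synthesize sources into a grounded report.")
    final, history = refine(text, ["in today's fast-paced world,"], max_words=12)
    print(final)
    print(len(history), "evaluation rounds")
```

The bounded `max_rounds` keeps the loop from running indefinitely when the evaluator never passes a draft; in that case the last rewrite is returned without a final re-evaluation, which a production pipeline would likely want to add.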