Gemini

How Google’s Nano Banana Achieved Breakthrough Character Consistency

Nicole Brichtova and Hansa Srinivasan, the leads behind Google's Nano Banana image model, detail the technical breakthroughs in character consistency. They discuss how a focus on high-quality data, Gemini's multimodal architecture, and rigorous human evaluation enabled the model to realistically represent individuals from a single photo. The conversation covers the future of visual AI, moving beyond text prompts to specialized UIs, and the ultimate goal of a single, powerful model that can transform any modality into another, unlocking new applications in personalized education, professional design, and creative storytelling.

Inside Google's AI turnaround: AI Mode, AI Overviews, and vision for AI-powered search | Robby Stein

Robby Stein, VP of Product at Google, shares insights on the development of Google's AI products, including AI Mode and AI Overviews. He discusses the product principles that guided the creation of billion-user products like Instagram Stories, the philosophy of "relentless improvement," and why AI is expanding, not replacing, Google Search.

Beyond Chatbots: How to build Agentic AI systems with Google Gemini // Philipp Schmid

A deep dive into the evolution from static chatbots to dynamic, agentic AI systems. Philipp Schmid of Google DeepMind explores how to design, build, and evaluate AI agents that leverage structured outputs, function calling, and workflow orchestration with Google Gemini, covering key agentic patterns and the future of AI development.

Monster prompt, OpenAI’s business play, nano-banana and US Open experimentations

The panel discusses KPMG's 100-page prompt for its TaxBot, debating the future of prompt engineering versus fine-tuning. They also analyze OpenAI's potential move into selling cloud infrastructure, the impressive capabilities of Google's new image model, Nano-Banana, and new AI-powered fan experiences at the US Open.

Pipecat Cloud: Enterprise Voice Agents Built On Open Source - Kwindla Hultman Kramer, Daily

A deep dive into the challenges of building production-grade, low-latency voice AI agents, and how the open-source, vendor-neutral framework Pipecat provides a comprehensive solution for development, deployment, and scaling. Learn about voice AI architecture, the trade-offs between speech-to-speech and text-based models, and practical deployment strategies.

No Priors Ep. 123 | With ReflectionAI Co-Founder and CEO Misha Laskin

Misha Laskin, co-founder of Reflection AI and former researcher at Google DeepMind, discusses the company's mission to build superhuman autonomous systems. He introduces Asimov, a code comprehension agent aimed at the 80% of an engineer's time that goes to understanding complex systems rather than generating code. Laskin delves into the intricacies of co-designing product and research, the critical role of customer-driven evaluations, the bottlenecks in scaling reinforcement learning (RL), particularly the "reward problem," and why he believes the future is one of "jagged superintelligence" emerging in specific, high-value domains like coding.