Gemma

May 24, 2026

⚡️ Google's Open AI Strategy — Omar Sanseviero, Google DeepMind

An in-depth look at Gemma 4's novel transformer architecture with per-layer embeddings, enabling efficient parameter offloading for on-device inference. The discussion also covers its native multimodality, the state of fine-tuning, text-based diffusion models, and the growing intersection of research and engineering.

May 05, 2026

Accelerating AI on Edge — Chintan Parikh and Weiyi Wang, Google DeepMind

A deep dive into Google's AI Edge stack for on-device AI, covering the new Gemma 4 models, the LiteRT framework for cross-platform deployment, and practical use cases in agent skills, tool calling, and hardware acceleration on CPUs, GPUs, and NPUs.

May 03, 2026

TLMs: Tiny LLMs and Agents on Edge Devices with LiteRT-LM — Cormac Brick, Google

Cormac Brick from Google's AI Edge team details the dual trends of on-device AI: large, system-level models like Gemma 4 enabling complex agent skills, and fine-tuned tiny LLMs for high-performance, in-app tasks. The summary covers the architecture of on-device function calling, the engineering trade-offs for edge deployment, and the practical workflow for fine-tuning and deploying models under 1B parameters on platforms like Android and iOS.

Apr 29, 2026

Build & deploy AI-powered apps — Paige Bailey, Google DeepMind

A developer-focused, demo-heavy session on rapid AI prototyping using the Google DeepMind stack. It covers how to use the full capabilities of AI Studio, from video analysis and code execution with Gemini 3.1 Flash to building full-stack applications with databases and exploring the frontiers of generative media with Genie 3, Veo 3.1 Lite, and Lyria 3.

Apr 20, 2026

Running LLMs on your iPhone: 40 tok/s Gemma 4 with MLX — Adrien Grondin, Locally AI

Adrien Grondin, developer of the Locally AI app, provides a technical walkthrough on running large language models like Google's Gemma on an iPhone using Apple's MLX framework. The talk covers the necessary tools, performance expectations, the importance of quantization, and the growing MLX ecosystem.

Apr 20, 2026

Gemma, DeepMind's Family of Open Models — Omar Sanseviero, Google DeepMind

A deep dive into Google DeepMind's Gemma 4, the latest family of open models. This summary covers new architectural features such as per-layer embeddings, on-device agentic capabilities, multimodal features, and the growing ecosystem of fine-tuned applications from medicine to sovereign AI.