Gemma

⚡️ Google's Open AI Strategy — Omar Sanseviero, Google DeepMind

An in-depth look at Gemma 4's novel transformer architecture with per-layer embeddings, enabling efficient parameter offloading for on-device inference. The discussion also covers its native multimodality, the state of fine-tuning, text-based diffusion models, and the growing intersection of research and engineering.

Accelerating AI on Edge — Chintan Parikh and Weiyi Wang, Google DeepMind

A deep dive into Google's AI Edge stack for on-device AI, covering the new Gemma 4 models, the LiteRT framework for cross-platform deployment, and practical use cases in agent skills, tool calling, and hardware acceleration on CPUs, GPUs, and NPUs.

TLMs: Tiny LLMs and Agents on Edge Devices with LiteRT-LM — Cormac Brick, Google

Cormac Brick from Google's AI Edge team details the dual trends of on-device AI: large, system-level models like Gemma 4 enabling complex agent skills, and fine-tuned tiny LLMs for high-performance, in-app tasks. The summary covers the architecture of on-device function calling, the engineering trade-offs for edge deployment, and the practical workflow for fine-tuning and deploying models under 1B parameters on platforms like Android and iOS.

Build & deploy AI-powered apps — Paige Bailey, Google DeepMind

A developer-focused, demo-heavy session on rapid AI prototyping using the Google DeepMind stack. It covers the full range of AI Studio capabilities: video analysis and code execution with Gemini 3.1 Flash, building full-stack applications with databases, and exploring the frontiers of generative media with Genie 3, Veo 3.1 Lite, and Lyria 3.

Running LLMs on your iPhone: 40 tok/s Gemma 4 with MLX — Adrien Grondin, Locally AI

Adrien Grondin, developer of the Locally AI app, provides a technical walkthrough on running large language models like Google's Gemma on an iPhone using Apple's MLX framework. The talk covers the necessary tools, performance expectations, the importance of quantization, and the growing MLX ecosystem.

Gemma, DeepMind's Family of Open Models — Omar Sanseviero, Google DeepMind

A deep dive into Google DeepMind's Gemma 4, the latest family of open models. This summary covers new architectural features such as per-layer embeddings, on-device agentic capabilities, multimodal features, and the growing ecosystem of fine-tuned applications from medicine to sovereign AI.