Mixture of experts

Sovereign Escape Velocity: Ownership w Open Models — Gus Martins, & Ian Ballantyne, Google DeepMind

Google DeepMind's Ian Ballantyne and Gus Martins introduce Gemma 4, a family of open models that deliver state-of-the-art performance at comparatively small sizes. They discuss how models such as the 31B variant outperform competitors 2 to 20 times their size while running on a single GPU, the shift to an Apache 2.0 license to foster sovereignty and adoption, and the new economics of running powerful agentic workloads on hardware ranging from a Pixel phone to a single enterprise GPU.

How Cursor Trained Composer on Fireworks: Distributed Infrastructure for High-Performance RL

Cursor's Federico Cassano and Fireworks' Dmytro Dzhulgakov detail their collaboration on Composer 2, a specialized foundation model for software engineering. They discuss their top-down training strategy, the infrastructure challenges of large-scale distributed Reinforcement Learning on sparse models, and how model specialization achieves frontier performance with superior efficiency.

Granite 4.1, IBM Bob & building a quantum ecosystem

This episode of Mixture of Experts breaks down IBM's enterprise-focused Granite 4.1 and Project Bob, Google DeepMind's DiLoCo distributed training method, the inference-efficient DeepSeek V4 model, and IBM's strategy for achieving quantum advantage through partnerships.

Open Models at Google DeepMind — Cassidy Hardin, Google DeepMind

Cassidy Hardin from Google DeepMind introduces Gemma 4, a new family of open-weight models with significant architectural and performance improvements. This summary covers the four new models (31B Dense, 26B MoE, and two "Effective" on-device models), examines architectural changes such as mixed global/local attention and Per-Layer Embeddings (PLE), and details the new native multimodal capabilities for vision and audio.
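
As a rough illustration of what "mixed global/local attention" generally means, the sketch below builds causal attention masks for a global layer (every earlier token is visible) and a local layer (only a sliding window of recent tokens is visible), then interleaves the two layer types. The function names, the window size, and the ratio of local to global layers are hypothetical choices for this example and are not taken from the talk.

```python
def causal_mask(seq_len, window=None):
    """mask[q][k] is True when query position q may attend to key position k.

    window=None gives global causal attention (all earlier tokens visible).
    An integer window gives local attention over the last `window` tokens.
    """
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            visible = k <= q  # causal: never look at future tokens
            if window is not None:
                visible = visible and (q - k) < window
            row.append(visible)
        mask.append(row)
    return mask


def layer_pattern(num_layers, global_every):
    """Hypothetical interleaving: every `global_every`-th layer is global."""
    return ["global" if (i + 1) % global_every == 0 else "local"
            for i in range(num_layers)]


# Assumed toy values, chosen only to keep the output readable.
pattern = layer_pattern(6, global_every=3)
local = causal_mask(5, window=2)
full = causal_mask(5)
```

Local layers keep the attention cost and cache per token bounded by the window size, while the occasional global layer still lets information flow across the whole sequence.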

Gemma, DeepMind's Family of Open Models — Omar Sanseviero, Google DeepMind

A deep dive into Google DeepMind's Gemma 4, the latest family of open models. This summary covers new architectural features such as per-layer embeddings, on-device agentic capabilities, multimodal features, and the growing ecosystem of fine-tuned applications from medicine to sovereign AI.

Granite 4.0: Small AI Models, Big Efficiency

IBM's Granite 4.0 models introduce a hybrid architecture that combines Mamba-2 and Transformer blocks with a Mixture of Experts (MoE) design. This approach lets smaller models gain in performance, speed, and memory efficiency, even outperforming much larger models on key enterprise tasks while running on consumer-grade hardware.
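
For readers new to the term, a Mixture of Experts layer typically sends each token to only a few "expert" sub-networks chosen by a router, so most parameters stay idle for any given token. The following is a minimal sketch of generic top-k routing in plain Python; it is not Granite's implementation, and the toy experts, router scores, and function names are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, router_logits, experts, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Hypothetical toy experts: scalar functions standing in for sub-networks.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
logits = [0.0, 0.0, -50.0, -50.0]  # assumed router scores for one token

routes = top_k_route(logits, k=2)
y = moe_forward(3.0, logits, experts, k=2)
```

Here only experts 0 and 1 run for the token, which is the source of the compute savings: the model can hold many experts in total while paying for just k of them per token.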