State space models

Granite 4.0: Small AI Models, Big Efficiency

IBM's Granite 4.0 models introduce a hybrid architecture that interleaves Mamba-2 state-space blocks with Transformer attention blocks and adds a Mixture of Experts (MoE) feed-forward design. The combination lets smaller models match or beat much larger models on key enterprise tasks while delivering higher speed and lower memory use, even on consumer-grade hardware.
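For intuition, here is a minimal sketch of the hybrid layer pattern described above, not IBM's implementation: a stack that interleaves state-space token mixers with occasional attention blocks, each followed by a routed MoE feed-forward. The toy diagonal SSM, the top-1 router, and the one-attention-block-in-four ratio are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class ToySSM(nn.Module):
    """Toy diagonal linear state-space mixer (stand-in for a Mamba-2 block).
    Its fixed-size recurrent state is why SSM layers need no growing KV cache."""
    def __init__(self, d):
        super().__init__()
        self.decay = nn.Parameter(torch.rand(d))   # per-channel state decay
        self.in_proj = nn.Linear(d, d)
        self.out_proj = nn.Linear(d, d)

    def forward(self, x):                          # x: (batch, seq, d)
        u = self.in_proj(x)
        a = torch.sigmoid(self.decay)              # keep recurrence in (0, 1)
        state = torch.zeros_like(u[:, 0])
        outs = []
        for t in range(u.shape[1]):
            state = a * state + (1 - a) * u[:, t]  # constant memory per step
            outs.append(state)
        return self.out_proj(torch.stack(outs, dim=1))

class MoEFFN(nn.Module):
    """Top-1 routed mixture of experts (simplified: no load balancing)."""
    def __init__(self, d, n_experts=4, d_ff=1024):
        super().__init__()
        self.router = nn.Linear(d, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d, d_ff), nn.GELU(), nn.Linear(d_ff, d))
            for _ in range(n_experts)
        ])

    def forward(self, x):
        flat = x.reshape(-1, x.shape[-1])
        choice = self.router(flat).argmax(dim=-1)  # one expert per token
        out = torch.zeros_like(flat)
        for i, expert in enumerate(self.experts):
            mask = choice == i
            if mask.any():
                out[mask] = expert(flat[mask])     # only chosen experts run
        return out.view_as(x)

class HybridBlock(nn.Module):
    """One layer: token mixer (SSM or attention) + MoE feed-forward."""
    def __init__(self, d, use_attention):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(d), nn.LayerNorm(d)
        self.use_attention = use_attention
        self.mixer = (nn.MultiheadAttention(d, num_heads=8, batch_first=True)
                      if use_attention else ToySSM(d))
        self.moe = MoEFFN(d)

    def forward(self, x):
        h = self.norm1(x)
        if self.use_attention:
            h, _ = self.mixer(h, h, h, need_weights=False)
        else:
            h = self.mixer(h)
        x = x + h
        return x + self.moe(self.norm2(x))

# Mostly SSM blocks, with attention every fourth layer (ratio assumed).
model = nn.Sequential(*[HybridBlock(256, use_attention=(i % 4 == 3))
                        for i in range(8)])
print(model(torch.randn(2, 32, 256)).shape)        # torch.Size([2, 32, 256])
```

The efficiency story follows from the structure: SSM layers carry a fixed-size state instead of a per-token KV cache, and the MoE router activates only a fraction of the feed-forward parameters per token.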

929: Dragon Hatchling: The Missing Link Between Transformers and the Brain — with Adrian Kosowski

Adrian Kosowski of Pathway introduces the Baby Dragon Hatchling (BDH), a groundbreaking post-transformer architecture inspired by neuroscience. BDH uses sparse, positive activations to mimic how biological neurons fire, offering a path to effectively limitless context, stronger reasoning, and far greater computational efficiency, and potentially addressing key limitations of current large language models.
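As a toy illustration of the "sparse, positive activations" idea (a sketch under stated assumptions, not Pathway's BDH code): activations are clamped non-negative and then thresholded so only the k strongest units fire per input, leaving most neurons silent, roughly as in cortical firing patterns. The layer sizes and k are made up for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparsePositiveLayer(nn.Module):
    """Non-negative (ReLU) activations, hard-sparsified to the top-k units."""
    def __init__(self, d_in, d_out, k=32):
        super().__init__()
        self.proj = nn.Linear(d_in, d_out)
        self.k = k

    def forward(self, x):
        h = F.relu(self.proj(x))               # positivity: no negative firing
        topk = torch.topk(h, self.k, dim=-1)   # keep only the k strongest units
        sparse = torch.zeros_like(h)
        sparse.scatter_(-1, topk.indices, topk.values)
        return sparse

layer = SparsePositiveLayer(d_in=256, d_out=1024, k=32)
h = layer(torch.randn(4, 256))
print((h > 0).float().mean().item())  # at most ~0.03: <=3% of units active
```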