AI infrastructure

The AI Frontier: from FLOPs to Megawatts — Anjney Midha, AMP

Anjney Midha unpacks the critical bottlenecks in AI scaling beyond just GPU acquisition, advocating for responsible infrastructure, community-aligned data centers, and an independent system operator model for compute. He discusses the perils of research hoarding, the rise of researcher CEOs, and how Anthropic's culture of "preparedness" and "output maxing" led to its success, while also highlighting his personal mission to use AI for precise end-of-life prediction.

How Cursor Trained Composer on Fireworks: Distributed Infrastructure for High-Performance RL

Cursor's Federico Cassano and Fireworks' Dmytro Dzhulgakov detail their collaboration on Composer 2, a specialized foundation model for software engineering. They discuss their top-down training strategy, the infrastructure challenges of large-scale distributed Reinforcement Learning on sparse models, and how model specialization achieves frontier performance with superior efficiency.

Scaling the Next Paradigm of Heterogeneous Intelligence — Adrian Bertagnoli, Callosum

Adrian Bertagnoli from Callosum argues that the era of scaling monolithic models on homogeneous GPU clusters is ending. He introduces "heterogeneous intelligence," a new paradigm where model architectures, chip types, and workflows are optimized together. By routing subtasks to the most efficient model and hardware, this approach achieves significant performance gains, as demonstrated by two key results: a 7x cost reduction in recursive reasoning tasks using Cerebras, and state-of-the-art performance on the Video Web Arena benchmark, outperforming leading GPT and Gemini models at a fraction of the cost and time.

Tokenmaxxing vs AI Hardware Bottlenecks — with Jon Krohn (@JonKrohnLearns)

While the 'tokenmaxxing' trend grows, the AI industry faces severe physical infrastructure bottlenecks. This summary explores the four key constraints choking AI compute: GPU packaging (CoWoS), high-bandwidth memory (HBM), the surprising surge in CPU demand from agentic AI, and critical electricity shortages, revealing how these challenges are shaping the future of AI development.

Why AI needs a new kind of supercomputer network — the OpenAI Podcast Ep. 18

OpenAI's Mark Handley and Greg Steinbrecher detail Multipath Reliable Connection (MRC), a new networking protocol designed to overcome the unique challenges of large-scale AI model training. They explain how moving intelligence to the network's edge creates a resilient, efficient, and simple system that handles constant hardware failures without disrupting massive, synchronized GPU workloads.

Baseten CEO Tuhin Srivastava on Custom Models, and Building the Inference Cloud

Baseten CEO Tuhin Srivastava discusses the explosive growth in AI inference, driven by the adoption of specialized and post-trained open-source models. He covers the strategic importance of owning the software layer on top of compute, navigating the severe GPU supply crunch with a multi-cloud fabric, the evolving landscape of AI workloads, and the operational lessons learned from scaling 30x in one year.