Inference

The Story Behind Cerebras’ $63 Billion IPO with Founder and CEO Andrew Feldman

Cerebras CEO Andrew Feldman discusses the company's journey from a contrarian bet on wafer-scale computing to a $63 billion public company. He details the technical breakthroughs, the challenge of being ahead of the market, and how the recent explosion in demand for fast AI inference validated the company's architecture, leading to a landmark $20 billion deal with OpenAI.

Baseten CEO Tuhin Srivastava on Custom Models, and Building the Inference Cloud

Baseten CEO Tuhin Srivastava discusses the explosive growth in AI inference, driven by the adoption of specialized and post-trained open-source models. He covers the strategic importance of owning the software layer on top of compute, navigating the severe GPU supply crunch with a multi-cloud fabric, the evolving landscape of AI workloads, and the operational lessons learned from scaling 30x in one year.

The Moonshot Podcast Season 2, Episode 6: Silicon Horizons

This podcast episode explores two X moonshot projects aimed at revolutionizing computer chips. Project Positron focused on creating specialized chips for real-time AI inference, acting as 'brains for robots'. Project Bodger took a meta-approach, using AI and inverse design to automate the chip design process itself, aiming to overcome the limitations of Moore's Law.

LLM Compression Explained: Build Faster, Efficient AI Models

Learn how AI model compression and quantization techniques are essential for optimizing Large Language Model (LLM) performance and significantly reducing inference costs in production. This deep dive covers practical examples, benefits like reduced latency and increased throughput, and strategies for different AI use cases, demonstrating how to deploy scalable AI with minimal accuracy degradation.

Greetings, Earthlings: Philip Johnston of Starcloud on Data Centers in Space

Philip Johnston of Starcloud argues that space will become the primary location for AI compute within a decade. He explains how plummeting launch costs, superior solar energy economics in orbit, and the physics of heat dissipation will soon make space-based data centers cheaper and more scalable than their terrestrial counterparts, predicting a future where nearly a trillion dollars in annual CapEx shifts to space.

How Capital is Powering the AI Infrastructure Buildout with Magnetar Capital's Neil Tiwari

Neil Tiwari of Magnetar Capital explains the creative debt structures and financial innovations fueling the multi-trillion-dollar AI infrastructure buildout. He debunks the myths around GPU collateral, revealing that the real security lies in contracted cash flows from investment-grade partners, and details how the industry's bottlenecks are shifting from chips to power distribution, steel, and specialized labor.