Scalability

The GPU Uptime Battle

Andy Pernsteiner, Field CTO of VAST Data, discusses the challenges of moving AI projects from prototype to production. He highlights the critical role of data infrastructure, the high cost of GPU downtime, and the need to build resilient, scalable platforms that can withstand real-world failures such as power outages in massive data centers. The conversation calls for a shift in mindset: more empathy, better requirements gathering, and closer collaboration between data scientists and platform engineers to bridge the gap between development and operations.

Flipping the Inference Stack — Robert Wachen, Etched

AI hardware expert Robert Wachen argues that the current AI inference stack, reliant on general-purpose GPUs, is economically and technically unsustainable for real-time AI at scale. In his view the future is specialized hardware, such as Transformer-specific ASICs, which can address the latency and cost-per-user challenges that currently bottleneck applications like real-time video, code generation, and large-scale enterprise deployments.