Scalability

Building AI Agent Systems and Scaling Challenges in Agentic AI

Scaling agentic AI systems presents challenges beyond those of traditional software scaling. This talk explains why expanding a single agent's capabilities leads to non-linear increases in cost, latency, and failure propagation. It frames scaling as a systems design problem, solved by moving from a monolithic agent to a multi-agent architecture with distributed responsibilities, and it explores the architectural trade-offs between horizontal and vertical scaling of agent capabilities.

The GPU Uptime Battle

Andy Pernsteiner, Field CTO of VAST Data, discusses the immense challenges of transitioning AI projects from prototype to production. He highlights the critical role of data infrastructure, the high cost of GPU downtime, and the necessity of building resilient, scalable platforms that can withstand real-world failures like power outages in massive data centers. The conversation emphasizes a shift in mindset towards empathy, better requirement gathering, and closer collaboration between data scientists and platform engineers to bridge the gap between development and operations.

Flipping the Inference Stack — Robert Wachen, Etched

The current AI inference stack, which relies on general-purpose GPUs, is economically and technically unsustainable for real-time AI at scale. AI hardware expert Robert Wachen argues that the future lies in specialized hardware, such as Transformer-specific ASICs, which can unlock applications that are bottlenecked today, including real-time video, code generation, and large-scale enterprise deployments, by addressing their latency and cost-per-user constraints.