GPU

Efficient Reinforcement Learning – Rhythm Garg & Linden Li, Applied Compute

A deep dive into the challenges and solutions for efficient Reinforcement Learning (RL) in enterprise settings. The talk contrasts synchronous and asynchronous RL, explains the critical trade-off of "staleness" versus stability, and details a first-principles system model used to optimize GPU allocation for maximum throughput.
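
To make the staleness-versus-stability trade-off concrete, here is a minimal, hypothetical sketch (not the speakers' system) of an asynchronous RL learner that only trains on rollouts generated by a sufficiently recent policy version; the `Rollout` structure and the `MAX_STALENESS` budget are illustrative assumptions.

```python
# Hypothetical illustration of staleness vs. stability in asynchronous RL.
# Names and the staleness budget are assumptions, not details from the talk.
from dataclasses import dataclass, field


@dataclass
class Rollout:
    policy_version: int  # version of the policy that generated this trajectory
    trajectory: list = field(default_factory=list)  # (state, action, reward) steps


MAX_STALENESS = 2  # assumed budget: accept rollouts at most 2 policy versions old


def filter_fresh(rollouts, current_version, max_staleness=MAX_STALENESS):
    """Keep only rollouts within the staleness budget.

    A larger budget keeps generation GPUs busy (fewer discarded samples) but
    trains on more off-policy data; a smaller budget behaves closer to
    synchronous RL: more stable updates, but idle GPUs and wasted rollouts.
    """
    return [r for r in rollouts
            if current_version - r.policy_version <= max_staleness]
```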

The CEO Behind the Fastest-Growing AI Inference Company | Tuhin Srivastava

Tuhin Srivastava, CEO of Baseten, joins Gradient Dissent to discuss the core challenges of AI inference, from infrastructure and runtime bottlenecks to the practical differences between vLLM, TensorRT-LLM, and SGLang. He shares how Baseten navigated years of searching for a market before the explosion of large-scale models, emphasizing a company-building philosophy focused on avoiding premature scaling and "burning the boats" to chase the biggest opportunities.
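
As a point of reference for the serving stacks mentioned above, here is a minimal, hedged sketch of offline batch inference with vLLM's Python API; the model identifier and sampling settings are placeholders, not configurations discussed in the episode.

```python
# Minimal vLLM offline-inference sketch; the model name and sampling
# parameters below are placeholders, not settings from the episode.
from vllm import LLM, SamplingParams

llm = LLM(model="your-org/your-model")  # placeholder model identifier
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["What is continuous batching?"], params)
for request_output in outputs:
    print(request_output.outputs[0].text)
```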

The GPU Uptime Battle

Andy Pernsteiner, Field CTO of VAST Data, discusses the immense challenges of transitioning AI projects from prototype to production. He highlights the critical role of data infrastructure, the high cost of GPU downtime, and the necessity of building resilient, scalable platforms that can withstand real-world failures like power outages in massive data centers. The conversation emphasizes a shift in mindset towards empathy, better requirements gathering, and closer collaboration between data scientists and platform engineers to bridge the gap between development and operations.

Building the Real-World Infrastructure for AI, with Google, Cisco & a16z

AI is driving an unprecedented buildout of physical infrastructure. Experts from Google and Cisco discuss the "AI industrial revolution," where power, compute, and networking are the new scarce resources, demanding a complete reinvention of the technology stack from silicon to software.

Nvidia CTO Michael Kagan: Scaling Beyond Moore's Law to Million-GPU Clusters

Nvidia CTO Michael Kagan explains how the Mellanox acquisition was key to scaling AI infrastructure from single GPUs to million-GPU data centers. He covers the critical role of networking in system performance, the shift from training to inference workloads, and his vision for AI's future in scientific discovery.

931: Boost Your Profits with Mathematical Optimization, feat. Jerry Yurchisin

Gurobi's Jerry Yurchisin explains the power of mathematical optimization, a prescriptive approach that complements AI's predictive capabilities. This summary covers how to get started with free resources, the use of GPUs and LLMs to enhance optimization, real-world applications at companies like Toyota, and its relationship with quantum computing.
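
To show what "prescriptive" means in practice, here is a minimal linear-programming sketch using Gurobi's gurobipy API (a Gurobi license is required); the product mix, coefficients, and constraints are made-up illustrations, not figures from the episode.

```python
# Toy product-mix linear program in gurobipy; all numbers are illustrative.
# Prescriptive step: given predicted profit per unit, decide what to produce.
import gurobipy as gp
from gurobipy import GRB

m = gp.Model("product_mix")
x = m.addVar(lb=0, name="widgets")
y = m.addVar(lb=0, name="gadgets")

m.setObjective(3 * x + 2 * y, GRB.MAXIMIZE)    # assumed profit per unit
m.addConstr(x + y <= 4, name="machine_hours")  # assumed shared capacity
m.addConstr(x + 3 * y <= 6, name="materials")  # assumed material budget

m.optimize()
print(f"widgets={x.X:.1f}, gadgets={y.X:.1f}, profit={m.ObjVal:.1f}")
```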