AI systems

Why Agentic AI Fails: Infinite Loops, Planning Errors, and More

Agentic AI failures are often predictable system design flaws, not random hallucinations. This summary explores the top three failure modes—infinite loops, hallucinated planning, and unsafe tool use—and provides practical strategies for designing more reliable and robust AI agents.

Performance Optimization and Software/Hardware Co-design across PyTorch, CUDA, and NVIDIA GPUs

Chris Fregly discusses his new book, "AI Systems Performance Engineering", covering the co-design and optimization of hardware, software, and algorithms across PyTorch, CUDA, and NVIDIA GPUs. The talk explores GPU architecture, system-level reliability challenges, and the use of modern coding agents for low-level kernel optimization.

Why AI Engineers Need to Understand GPU Hardware (with Chris Fregly)

Chris Fregly, author of "AI Systems Performance Engineering", explains that true performance gains in AI come not from raw compute but from a deep, holistic understanding of the entire hardware and software stack. He emphasizes that memory bandwidth is the most critical GPU metric and introduces the concept of "mechanical sympathy" (the co-design of hardware, software, and algorithms) as the key to unlocking efficiency and overcoming modern bottlenecks.

Building Production-Grade RAG at Scale

Douwe Kiela, CEO of Contextual AI, explains the evolution from basic RAG to "RAG 2.0", an end-to-end, trainable system. He argues that this system-level approach, which integrates optimized document parsing, retrieval, reranking, and grounded models, is superior to relying on massive context windows alone and is a fundamental tool for next-generation AI agents.