Feature

Lessons from 25 Trillion Tokens — Scaling AI-Assisted Development at Kilo

Scott Breitenother, CEO of Kilo, discusses how software development is evolving as engineers shift from writing code to orchestrating AI agents. He shares lessons from processing 25 trillion tokens, emphasizing the critical role of trust, the importance of end-to-end ownership, and how this new paradigm leads to a 10x increase in shipping velocity.

Attention, World Models and the Future of AI — with Prof. Kyunghyun Cho

Professor Kyunghyun Cho, a co-author of the first paper on attention, discusses the future of AI. He argues that today’s models have already captured most correlations in passive data, so the real challenge is now actively choosing which data to collect. He also explores the open debate around world models, the surprising lack of coding agent adoption among his students, and the foundational work that led to Retrieval-Augmented Generation (RAG).

AI Models as a Service: Powering Agentic AI, Privacy, & RAG

Cedric Clyburn explains the Models-as-a-Service (MaaS) pattern, detailing how organizations can build their own private AI infrastructure to deploy models like LLMs securely and at scale. He covers the benefits over public APIs, including cost control, data sovereignty, and lifecycle management, and outlines a technical architecture using Kubernetes, API gateways, and observability tools.

Performance Optimization and Software/Hardware Co-design across PyTorch, CUDA, and NVIDIA GPUs

Chris Fregly discusses his new book, "AI Systems Performance Engineering", covering the co-design and optimization of hardware, software, and algorithms across PyTorch, CUDA, and NVIDIA GPUs. The talk explores GPU architecture, system-level reliability challenges, and the use of modern coding agents for low-level kernel optimization.

The Q/A Layer for the AI Coding Era

Weiwei Wu and Jeff An, co-founders of Momentic, discuss their AI-powered testing platform that acts as a verification layer for software. They explore how the rise of AI-generated code makes robust testing more critical than ever and share their vision for a future of "truth-driven development" where engineers write specs, not code.

Kubernetes at the Edge • Charles Humble & Hannah Foxwell • GOTO 2026

Charles Humble discusses his e-book "Kubernetes at the Edge," exploring the definition of edge computing, its practical applications in industries like agriculture and healthcare, vendor selection strategies, and the critical importance of Day-2 operations. The conversation also delves into how edge computing promotes sustainability and concludes with a thoughtful examination of the tech industry's ethical responsibilities in the age of generative AI.