The Production AI Playbook: Deploying Agents at Enterprise Scale — Sandipan Bhaumik, Databricks

Sandipan Bhaumik presents a five-pillar framework for successfully moving AI systems from demos to production, inspired by a retail bank's failed chatbot PoC. The framework covers defining numerical success (Evaluation), tracing every AI decision (Observability), building robust data pipelines (Data Foundation), managing multiple AI interactions (Multi-agent Orchestration), and ensuring accountability and security (Governance). He illustrates these concepts with a banking chatbot case study, emphasizing continuous evaluation, data quality, and a proactive incident playbook.

You Might Not Need 50 Diffusion Steps — Ziv Ilan, Nvidia

Ziv Ilan from NVIDIA details how latency in video diffusion models can be drastically reduced to achieve real-time generation. He presents a layered approach combining dynamic quantization for memory and speed, chunk-based caching to skip redundant denoising computations, and, most critically, step distillation—training models to achieve high-quality output in significantly fewer steps. These techniques, packaged in the open-source FastGen repository, offer additive performance gains, enabling real-time video on a single Blackwell B200 GPU.

Simulating Humans at Scale: Simile's Joon Sung Park

Joon Sung Park, founder and CEO of Simile and creator of Stanford's "Smallville" generative agents study, explains how Simile is building the "GPU of intelligence" to simulate human society, diverging from frontier models that act as the "CPU of intelligence." He details Simile's approach of grounding simulations with real human behavioral data, its diverse corporate applications, and its long-term vision to create a "CERN of human society" to solve fundamental societal challenges.

Context Engineering for Coding Agents

A deep dive into advanced engineering techniques for coding agents, focusing on effective context management in LLMs like Claude. The talk introduces a practical framework using a brain-inspired analogy, proposing a Markdown-based 'wiki' as a long-term memory system to augment the agent's limited context window. This approach is demonstrated through a real-world challenge of extracting structured data from technical drawings.
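The idea of a Markdown wiki as long-term memory can be sketched as follows. This is a minimal illustration, not the talk's implementation: the page names, the keyword-overlap scoring, and the `select_pages` helper are all assumptions made for the example. It picks only the wiki pages relevant to the current task and stops once a character budget (a stand-in for the context window) is used up.

```python
import re

def tokenize(text):
    """Lowercase word set; a deliberately crude relevance signal (assumption)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def select_pages(wiki, task, budget_chars):
    """Pick wiki pages that share words with the task, best match first,
    skipping any page that would exceed the character budget."""
    task_words = tokenize(task)
    scored = sorted(
        ((len(task_words & tokenize(body)), name) for name, body in wiki.items()),
        key=lambda pair: (-pair[0], pair[1]),
    )
    picked, used = [], 0
    for score, name in scored:
        size = len(wiki[name])
        if score == 0 or used + size > budget_chars:
            continue
        picked.append(name)
        used += size
    return picked

# Hypothetical wiki pages for the technical-drawing extraction task.
wiki = {
    "drawings.md": "# Drawings\nTitle block holds part number and revision.",
    "units.md": "# Units\nDimensions are in millimetres unless noted.",
    "style.md": "# Style\nUse snake_case for field names.",
}
context_pages = select_pages(
    wiki, "extract part number and revision from title block", budget_chars=120
)
print(context_pages)
```

Only the selected pages would be loaded into the agent's prompt; the rest stay on disk until a later task needs them.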

Uncertainty-Guided Data Augmentation for Engineers | Deep Dive - Yongmin Kwon

This session details a data-efficient method for training engineering surrogate models by using uncertainty quantification (UQ) to guide geometric data augmentation. Instead of random deformations, the approach lets the deep ensemble model identify its own knowledge gaps (epistemic uncertainty), then uses Free-Form Deformation (FFD) to generate new shapes specifically in those uncertain regions. This ensures every expensive simulation run yields maximally informative data, significantly improving model accuracy for a fixed computational budget across domains like structural mechanics and aerodynamics.
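The selection step can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the session's code: ensemble disagreement is measured as the variance of member predictions, the candidate names stand in for FFD-generated shapes, and `select_for_simulation` is a hypothetical helper. The FFD deformation and the simulation itself are out of scope here.

```python
from statistics import pvariance

def epistemic_uncertainty(member_predictions):
    """Variance across ensemble members, one value per candidate shape.

    member_predictions[m][c] is member m's prediction for candidate c.
    """
    return [pvariance(per_candidate) for per_candidate in zip(*member_predictions)]

def select_for_simulation(candidates, member_predictions, k):
    """Return the k candidates on which the ensemble disagrees most."""
    uncertainty = epistemic_uncertainty(member_predictions)
    ranked = sorted(
        range(len(candidates)), key=lambda i: uncertainty[i], reverse=True
    )
    return [candidates[i] for i in ranked[:k]]

# Hypothetical data: 3 ensemble members score 4 candidate deformations.
candidates = ["ffd_a", "ffd_b", "ffd_c", "ffd_d"]
member_predictions = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.5, 3.0, 6.0],
    [1.0, 1.5, 3.0, 2.0],
]
chosen = select_for_simulation(candidates, member_predictions, k=2)
print(chosen)
```

Candidates where all members agree (`ffd_a`, `ffd_c`) are never sent to the simulator, which is how the simulation budget ends up concentrated on the model's knowledge gaps.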

Stop Making Models Bigger, Make Them Behave — Kobie Crawdord, Snorkel

Snorkel.ai's research demonstrates how a 4-billion-parameter model, fine-tuned with reinforcement learning for under $500, significantly outperformed a 235-billion-parameter model on financial analysis tool-use tasks. The key was cultivating 'tool discipline' and error-correction capabilities rather than relying on sheer model size or deeper reasoning. Training on single-table problems generalized effectively to harder multi-table problems, which underscores the value of targeted behavioral fixes identified through detailed evaluation rubrics.