LLM

Sep 04, 2025

Catastrophic agent failure and how to avoid it // Edward Upton // Agents in Production 2025

Edward, a founding engineer at Asteroid, discusses the critical challenge of managing catastrophic failures in agentic browser solutions, particularly in high-stakes domains like healthcare and insurance. He shares real-world examples of agent failures and outlines a practical framework for building more reliable, predictable, and accountable agents by scoping their capabilities, implementing robust human-in-the-loop tooling, and employing independent evaluation systems.

Sep 01, 2025

Advancing the Cost-Quality Frontier in Agentic AI // Krista Opsahl-Ong // Agents in Production 2025

Krista Opsahl-Ong from Databricks introduces Agent Bricks, a platform designed to overcome the key challenges of productionizing enterprise AI agents. The talk covers common use cases, the difficult trade-offs between cost and quality, and how Agent Bricks uses automated evaluation and advanced optimization techniques to build cost-effective, high-performance agents.

Sep 01, 2025

Small Language Models are the Future of Agentic AI // Reading Group

This paper challenges the prevailing "bigger is better" narrative in AI, arguing that Small Language Models (SLMs) are not just sufficient but often superior for agentic AI tasks due to their efficiency, speed, and specialization. The discussion explores the paper's core arguments, counterarguments, and the practical implications of adopting a hybrid LLM-SLM approach.

Aug 27, 2025

Distilling 200+ Hours of NeurIPS: What’s Next for AI // Nikolaos Vasiloglou // MLOps Podcast #336

Nikolaos Vasiloglou, VP of Research ML at RelationalAI, shares his extensive analysis of the 2023 NeurIPS conference, distilling over 200 hours of content. Key themes include the dominance and evolution of agentic AI, the state of open-source vs. frontier LLMs, the first signs of deep learning models outperforming XGBoost on tabular data, and the critical rise of verification systems. He also explores the future of AI with data attribution for monetization and the concept of composable, LEGO-like language models.

Aug 27, 2025

The Top 100 Most Used AI Apps in 2025

The fifth edition of the a16z Consumer AI 100, an analysis of the most-used AI-native products, reveals a market that is beginning to stabilize after a period of chaotic growth. Key trends include the continued dominance of AI companionship and creative tools, the significant market entry of major players such as Google and xAI (with Grok), the rise of Chinese AI companies on the global stage, and the emergence of a powerful new category: "vibe coding." The data suggests a future of increased verticalization, prosumer tool adoption, and the development of more sophisticated network effects beyond simple data acquisition.

Aug 27, 2025

Too much lock-in for too little gain: agent frameworks are a dead-end // Valliappa Lakshmanan

Lak Lakshmanan presents a robust architecture for building production-quality, framework-agnostic agentic systems. He advocates for using simple, composable GenAI patterns, off-the-shelf tools for governance, and a strong emphasis on a human-in-the-loop design to create continuously learning systems that avoid vendor lock-in.
