LLM

Jul 29, 2025

Scaling Enterprise-Grade RAG: Lessons from Legal Frontier - Calvin Qi (Harvey), Chang She (Lance)

A summary of the talk by Harvey and LanceDB on building a highly optimized retrieval architecture for the legal profession. It covers challenges like query complexity and data scale, the importance of evaluation, and how LanceDB's multimodal lakehouse architecture provides the necessary foundation.

Jul 29, 2025

Layering every technique in RAG, one query at a time - David Karam, Pi Labs (fmr. Google Search)

David Karam, formerly of Google Search, presents a pragmatic framework for enhancing RAG systems, advocating a "quality engineering" approach. The talk progresses through a ladder of techniques, from in-memory retrieval and BM25 to custom embeddings, re-ranking, and advanced orchestration, emphasizing that the choice of technique should be driven by empirical analysis of system failures ("loss analysis") and balanced by a "complexity-adjusted impact" mindset.

Jul 29, 2025

Building a Smarter AI Agent with Neural RAG - Will Bryk, Exa.ai

Will Bryk, CEO of Exa, explains why traditional keyword-based search is insufficient for AI agents and introduces a new paradigm of neural, semantic search. He demonstrates how a hybrid approach, combining neural search for discovery and keyword search for precision, enables AI agents to perform complex, multi-step information retrieval tasks that were previously impossible.

Jul 28, 2025

Balaji Srinivasan: How AI Will Change Politics, War, and Money

Technologist Balaji Srinivasan joins a16z's Erik Torenberg and Martin Casado to discuss the limitations and societal impact of AI, framing the conversation around the concept of "Polytheistic AGI"—multiple, culturally-specific AIs—versus a singular, god-like intelligence. They explore the practical system-level constraints on AI, its surprising evolution, the critical role of cryptography in grounding AI in reality, and the future of work and security in an AI-driven world.

Jul 27, 2025

Ship Agents that Ship: A Hands-On Workshop - Kyle Penfound, Jeremy Adams, Dagger

A detailed summary of a workshop on building and deploying production-minded AI coding agents using Dagger. The session covers creating controlled, observable, and test-driven agent workflows and integrating them into CI/CD systems like GitHub Actions for automated, reliable software development.

Jul 27, 2025

Strategies for LLM Evals (GuideLLM, lm-eval-harness, OpenAI Evals Workshop) — Taylor Jordan Smith

Traditional benchmarks and leaderboards are insufficient for production AI. This summary details a practical, multi-layered evaluation strategy, moving from foundational system performance to factual accuracy and finally to safety and bias, using open-source tools like GuideLLM, lm-eval-harness, and Promptfoo.
