LLM

Context Engineering for Engineers

Jeff Huber of Chroma argues that building reliable AI systems hinges on 'Context Engineering'—the deliberate curation of information within the context window. He challenges the efficacy of long-context models, presenting a 'Gather and Glean' framework to maximize recall and precision, and discusses specific challenges and techniques for AI agents, such as intelligent compaction.

Context Engineering: Lessons Learned from Scaling CoCounsel

Jake Heller, founder of Casetext, shares a pragmatic framework for turning powerful large language models like GPT-4 into reliable, professional-grade products. He details a rigorous, evaluation-driven approach to prompt and context engineering, emphasizing iterative testing, the critical role of high-quality context, and advanced techniques like reinforcement fine-tuning and strategic model selection.

Iterating on Your AI Evals // Mariana Prazeres // Agents in Production 2025

Moving an AI agent from a promising demo to a reliable product is challenging. This talk presents a startup-friendly, iterative process for building robust evaluation frameworks, emphasizing that you must iterate on the evaluations themselves—the metrics and the data—not just the prompts and models. It outlines a practical "crawl, walk, run" approach, starting with simple heuristics and scaling to an advanced system with automated checks and human-in-the-loop validation.

Building an Agentic Platform — Ben Kus, CTO Box

Ben Kus, CTO of Box, outlines the technical evolution of their AI platform, detailing the transition from a promising but fragile LLM-based metadata extraction system to a robust, scalable agentic architecture. He explains why this shift was necessary to handle enterprise-level complexity and the key lessons learned.

Five hard earned lessons about Evals — Ankur Goyal, Braintrust

Building successful AI applications requires a sophisticated engineering approach that goes beyond prompt engineering. This involves creating intentionally engineered evaluations (evals) that reflect user feedback, focusing on "context engineering" to optimize tool definitions and outputs, and maintaining a flexible, model-agnostic architecture to adapt to the rapidly evolving AI landscape.

How BlackRock Builds Custom Knowledge Apps at Scale — Vaibhav Page & Infant Vasanth, BlackRock

BlackRock engineers Vaibhav Page and Infant Vasanth introduce a modular, Kubernetes-native AI framework designed to accelerate the development of custom knowledge applications for investment operations, reducing deployment time from months to days.