Build Hour: AgentKit
A deep dive into OpenAI's AgentKit, demonstrating how to visually build, deploy, and optimize multi-step, tool-calling agents using Agent Builder, ChatKit, and the integrated Evals platform.
Chip Huyen, an AI expert and author of 'AI Engineering', explains the realities of building successful AI applications. She covers the nuances of model training, the critical role of data quality in RAG systems, the mechanics of RLHF, and why the future of AI improvement lies in post-training, system-level thinking, and solving UX problems rather than just chasing the newest models.
Nishikant Dhanuka from Prosus Group shares practical lessons on building effective AI agents for e-commerce and productivity. He covers why context engineering is more crucial than prompt tweaking, how to build a modern search pipeline, the failures of pure-chat interfaces, and why a robust evaluation framework is the real competitive advantage.
A deep dive into building sophisticated voice agents using OpenAI's Realtime API and Agents SDK. The session covers architectural patterns like chained vs. end-to-end models, the use of multi-agent systems with handoffs for specialized tasks, and best practices for production including debugging with traces, implementing guardrails, and creating robust evaluations.
A deep dive into Reinforcement Fine-Tuning (RFT), covering how to set up tasks, design effective graders, and run efficient training loops to improve model reasoning, based on a live demonstration from OpenAI's Build Hours.
KREA.ai cofounder Diego Rodriguez discusses how current AI evaluation metrics fail to capture human perception and aesthetics, and advocates for a new paradigm of personalized, perceptually aware evals.