Fine-tuning

"Vibe Coding is a Slot Machine" - Jeremy Howard

"Vibe Coding is a Slot Machine" - Jeremy Howard

fast.ai founder Jeremy Howard critiques the 'vibe coding' illusion, arguing that AI-assisted tools create a slot-machine-like experience that erodes true software engineering skills. He revisits the origins of ULMFiT, champions interactive programming for building intuition, and reframes AI risk from existential threats to the dangers of power centralization and human enfeeblement.

How A Team Of 7 Keeps Breaking AI Benchmark Records

Poetiq, a startup founded by former DeepMind researchers, has developed a recursive self-improvement meta-system that builds "reasoning harnesses" on top of existing LLMs. This approach avoids the costly "fine-tuning trap" and has achieved state-of-the-art results on benchmarks like ARC-AGI and Humanity's Last Exam by automatically optimizing prompts and discovering novel reasoning strategies.
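
The episode doesn't publish Poetiq's code, but a minimal sketch can show what a prompt-optimizing reasoning harness looks like in principle: search over prompts against a small training set while the underlying model stays frozen. Everything here (call_llm, score, mutate, and the toy task) is an invented stand-in, not Poetiq's actual system.

```python
import random

def call_llm(prompt: str, question: str) -> str:
    # Toy stand-in for a frozen model: it only "solves" the task when the
    # prompt asks for verification, so the search has something to find.
    return question[::-1] if "Verify" in prompt else "no idea"

def score(answer: str, reference: str) -> float:
    return float(answer == reference)  # exact-match grader

def mutate(prompt: str) -> str:
    # Real harnesses might ask the LLM itself to propose prompt variants.
    tweak = random.choice(["Think step by step. ", "Verify your answer. ",
                           "List constraints first. "])
    return tweak + prompt

def evaluate(prompt, train_set):
    return sum(score(call_llm(prompt, q), ref) for q, ref in train_set) / len(train_set)

def optimize_harness(base_prompt, train_set, iterations=50):
    # Hill-climb over prompts; the model's weights are never touched, which
    # is how the approach sidesteps the "fine-tuning trap".
    best, best_score = base_prompt, evaluate(base_prompt, train_set)
    for _ in range(iterations):
        candidate = mutate(best)
        s = evaluate(candidate, train_set)
        if s > best_score:
            best, best_score = candidate, s
    return best

task = [("abc", "cba"), ("hello", "olleh")]  # toy reverse-string benchmark
print(optimize_harness("Reverse the string.", task))
```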

Simple AI Upsells 30% Better Than Trained Reps

Founders of Simple AI, Catheryn Li & Zach Kamran, discuss their journey from building consumer apps to creating an AI sales agent that handles inbound calls for major brands. They cover their pivot, the technical challenges of integrating with legacy systems, and how their AI outperforms human reps by leveraging hyper-personalization and rapid A/B testing.
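
To make the rapid A/B testing idea concrete, here is a minimal epsilon-greedy sketch for choosing an upsell script per call. The episode shares no code; the variant names, bandit strategy, and numbers are all assumptions for illustration.

```python
import random

class EpsilonGreedyAB:
    """Pick an upsell script variant for each incoming call; favor the best
    performer so far while still exploring alternatives occasionally."""
    def __init__(self, variants, epsilon=0.1):
        self.epsilon = epsilon
        self.stats = {v: {"calls": 0, "conversions": 0} for v in variants}

    def choose(self):
        if random.random() < self.epsilon:
            return random.choice(list(self.stats))  # explore a random variant
        # Exploit: pick the variant with the best observed conversion rate.
        return max(self.stats,
                   key=lambda v: self.stats[v]["conversions"] / max(1, self.stats[v]["calls"]))

    def record(self, variant, converted):
        self.stats[variant]["calls"] += 1
        self.stats[variant]["conversions"] += int(converted)

ab = EpsilonGreedyAB(["bundle_offer", "free_shipping", "loyalty_discount"])
v = ab.choose()               # pick a script for the incoming call
ab.record(v, converted=True)  # log the outcome after the call ends
```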

Lessons from Building Open Source Libraries

Thomas Wolf, co-founder of Hugging Face, discusses his journey from physics to AI, the power of open-source models to accelerate innovation, the practical challenges of turning AI demos into production systems, and why the biggest opportunities for founders now lie in the application layer on top of powerful foundation models.

Post-training best-in-class models in 2025

An expert overview of post-training techniques for language models, covering the full workflow from data generation and curation to algorithms such as Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning (RL), along with practical advice on evaluation and iteration.
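
To make the DPO step concrete, here is a minimal PyTorch sketch of the standard DPO objective (Rafailov et al., 2023). It assumes the per-sequence log-probabilities under the trainable policy and a frozen reference model have already been computed; batching and data loading are omitted.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss.

    Each argument is a tensor of summed token log-probs for a batch of
    (prompt, chosen, rejected) preference triples. beta controls how far
    the policy may drift from the frozen reference model.
    """
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected via a logistic loss.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy example: 4 preference pairs with random stand-in log-probs.
loss = dpo_loss(torch.randn(4), torch.randn(4), torch.randn(4), torch.randn(4))
print(loss.item())
```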

Build Hour: Agent RFT

Will Hang and Theophile Sautory from OpenAI provide a deep dive into Agent RFT, a powerful method for fine-tuning large language models into more effective tool-using agents. They explain how Agent RFT enables models to learn directly from their interactions with custom tools and reward signals, leading to significant improvements in performance, latency, and efficiency on specialized tasks. The session includes a detailed code demo, best practices, and success stories from companies like Cognition, Ambience, and Rogo.
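
OpenAI's actual Agent RFT interface isn't reproduced here, but the core idea of a custom reward signal can be sketched. The code below only illustrates the shape of a grader that scores an agent rollout on correctness with small penalties for excessive tool use and slow episodes; the Rollout type, budgets, and weights are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Rollout:
    """Hypothetical record of one agent episode (not OpenAI's schema)."""
    final_answer: str
    tool_calls: list = field(default_factory=list)  # names of tools invoked
    latency_s: float = 0.0

def grade(rollout: Rollout, reference_answer: str) -> float:
    """Return a scalar reward in [0, 1]: mostly correctness, with penalties
    for tool overuse and slowness, so the fine-tuned agent learns to be
    both accurate and efficient."""
    correctness = float(rollout.final_answer.strip() == reference_answer.strip())
    tool_penalty = 0.02 * max(0, len(rollout.tool_calls) - 3)    # budget: 3 calls
    latency_penalty = 0.01 * max(0.0, rollout.latency_s - 10.0)  # budget: 10 s
    return max(0.0, correctness - tool_penalty - latency_penalty)

reward = grade(Rollout("42", ["search", "calculator"], 6.2), "42")
print(reward)  # 1.0: correct answer, within both budgets
```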