Feature

Why Language Models Need a Lesson in Education

Why Language Models Need a Lesson in Education

Stephanie Kirmer, a staff machine learning engineer at DataGrail, adapts her experience as a former professor to address the challenge of evaluating LLMs in production. She proposes a robust methodology using LLM-based evaluators guided by rigorous, human-calibrated rubrics to bring objectivity and scalability to the subjective task of assessing text generation quality.

How Reinforcement Learning can Improve your Agent

How Reinforcement Learning can Improve your Agent

This talk addresses the unreliability of current AI agents, arguing that prompting is insufficient. It posits that Reinforcement Learning (RL) is the most promising solution, delving into the mechanisms of RLHF and RLVR. The core challenge identified is 'reward hacking', and the discussion explores future directions to overcome it, such as RLAIF, data augmentation, and the development of interactive, online models that can learn in real-time.

AI Changed Stack Overflow for the Better

AI Changed Stack Overflow for the Better

Stack Overflow CEO Prashanth Chandrashekar discusses the platform's evolution in the AI era, focusing on licensing its trusted Q&A corpus to major AI labs, expanding beyond Q&A to include discussions and live chat, and the critical role of its enterprise solution in powering internal AI agents. A key insight from their upcoming developer survey reveals that while AI adoption for coding is rising, developer trust in AI-generated output is declining, reinforcing Stack Overflow's position as a vital source of human-curated, reliable knowledge.

Building Bridges: From Developer to Developer Advocate • David Edoh-Bedi & James Beswick

Building Bridges: From Developer to Developer Advocate • David Edoh-Bedi & James Beswick

David Edoh-Bedi, a Developer Advocate at Stripe, shares his journey from growing up in Togo to working on large-scale systems like Windows at Microsoft and eventually transitioning into developer relations. The conversation covers the essential skills for DevRel, the hidden complexities of global payment systems, and the evolution of software development from libraries to API-centric architectures.

Dylan Patel on GPT-5’s Router Moment, GPUs vs TPUs, Monetization

Dylan Patel on GPT-5’s Router Moment, GPUs vs TPUs, Monetization

A deep dive into the AI hardware landscape, exploring NVIDIA's dominance, the challenges for competitors like custom silicon and startups, and the critical infrastructure bottlenecks of power and data centers that define the next phase of the AI race.

Lets See What We Can do! with F# Computation Expressions • Andrew Browne • YOW! 2015

Lets See What We Can do! with F# Computation Expressions • Andrew Browne • YOW! 2015

Andrew Browne demystifies F# Computation Expressions, showing how this powerful feature is a simple syntactic transformation. He demonstrates building custom expressions for handling optional values (maybe), asynchronous sequences, and even creating a Domain-Specific Language (DSL) with Free Monads for robust testing.