Reinforcement learning

Marc Andreessen & Amjad Masad on “Good Enough” AI, AGI, and the End of Coding

Amjad Masad, founder of Replit, joins a16z to discuss the rise of AI agents that can now plan, reason, and code for hours. He explains how reinforcement learning and verification loops unlocked long-horizon reasoning, why AI is advancing fastest in verifiable domains like code, and debates whether "good enough" AI might be a local maximum that blocks the path to AGI.

Machine Learning Explained: A Guide to ML, AI, & Deep Learning

A breakdown of Machine Learning (ML), its relationship with AI and Deep Learning, and its core paradigms: supervised, unsupervised, and reinforcement learning. The summary explores classic models and connects them to modern applications like Large Language Models (LLMs) and Reinforcement Learning from Human Feedback (RLHF).

Scale AI CEO on Meta’s $14B deal, scaling Uber Eats to $80B, & what frontier labs are building next

Jason Droege, CEO of Scale AI, discusses the evolution of AI training from simple labeling to complex, expert-driven tasks. He shares insights on the future of AI agents, the reality of enterprise AI adoption, and crucial business lessons learned from building Uber Eats from zero to a multibillion-dollar business.

Some thoughts on the Sutton interview

A reflection on Richard Sutton's "Bitter Lesson," arguing that while his critique of LLMs' inefficiency and lack of continual learning is valid, imitation learning is a complementary and necessary precursor to true reinforcement learning, much like fossil fuels were to renewable energy.

Building an AI Physicist: ChatGPT Co-Creator’s Next Venture

Former researchers from OpenAI and Google DeepMind, Liam Fedus and Ekin Dogus Cubuk, discuss their new venture, Periodic Labs. They aim to create an 'AI physicist' by integrating large language models with real-world, iterative experiments, moving beyond simulation to solve fundamental challenges in physics and chemistry, starting with high-temperature superconductivity.

Richard Sutton – Father of RL thinks LLMs are a dead end

Richard Sutton, a foundational figure in reinforcement learning, argues that Large Language Models (LLMs) are a flawed paradigm for achieving true intelligence. He posits that LLMs are mimics of human-generated text, lacking genuine goals, world models, and the ability to learn continually from experience. Sutton advocates for a return to the principles of reinforcement learning, where an agent learns from the consequences of its actions in the real world, a method he believes is truly scalable and fundamental to all animal and human intelligence.