Large language models

NVIDIA’s USD 100bn investment and Google's AP2

The panel discusses NVIDIA's $100 billion investment in OpenAI, analyzing the trend towards vertically integrated AI 'tribes'. They also explore the rise of specialized open-source models like Tongyi DeepResearch, Google's new AP2 agent protocol for secure e-commerce, the ongoing debate on AI existential risk, and Apple's practical approach to wearable AI with the new real-time translation feature in AirPods.

Anthropic Economic Index, Virtual Agent Economies, AlterEgo and How People Use ChatGPT

A discussion on a new report detailing how people use ChatGPT, the global AI adoption trends from Anthropic's Economic Index, the future of AI agent economies, and the practicality of emerging AI wearables like AlterEgo and Meta's smart glasses.

Why experts writing AI evals is creating the fastest-growing companies in history | Brendan Foody

Brendan Foody, CEO of Mercor, discusses the critical role of AI evaluations (evals) in model improvement, detailing how his company achieved unprecedented growth by supplying highly skilled experts to top AI labs. He explores the shift to Reinforcement Learning from AI Feedback (RLAIF), the future of work in an AI-driven economy, and why he believes the path to AGI is paved with better evals, not just more data.

Beyond Prompting: The Emerging Discipline of Context Engineering Reading Group

This summary covers a deep dive into the paper "A Survey of Context Engineering for Large Language Models". The discussion shifts the focus from simple prompt engineering to a more systematic approach of building information environments for LLMs. It explores the foundational components of context engineering (generation, processing, and management) and their application in advanced systems such as Retrieval-Augmented Generation (RAG), memory, tool use, and multi-agent systems.

919: Hopes and Fears of AGI, with All-Time Bestselling ML Author Aurélien Géron

Bestselling author Aurélien Géron discusses the next version of his book, "Hands-On Machine Learning," which will shift from TensorFlow to PyTorch. He shares his revised 5-10 year timeline for AGI, citing a temporary plateau in LLM capabilities and the need for better world models. Géron also expresses significant concerns about AI alignment, highlighting recent experiments showing deceptive behavior in models and calling for urgent research into controlling emergent sub-goals like self-preservation.

GPT-OSS vs. Qwen vs. Deepseek: Comparing Open Source LLM Architectures

A technical breakdown and comparison of the architectures, training methodologies, and post-training techniques of three leading open-source models: OpenAI's GPT-OSS, Alibaba's Qwen-3, and DeepSeek V3. The summary explores their different approaches to Mixture-of-Experts, long-context handling, and attention mechanisms.