Posts

How Cursor Trained Composer on Fireworks: Distributed Infrastructure for High-Performance RL

Cursor's Federico Cassano and Fireworks' Dmytro Dzhulgakov detail their collaboration on Composer 2, a specialized foundation model for software engineering. They discuss their top-down training strategy, the infrastructure challenges of large-scale distributed reinforcement learning on sparse models, and how model specialization achieves frontier performance with superior efficiency.

End-to-End Foundation Models for the Energy Industry — with Jazmia Henry

Jazmia Henry details the end-to-end process of building specialized foundation models for the energy industry. She covers the four key stages from data curation of unstructured, handwritten documents to optimizing inference, and introduces her Grounded Continuous Evaluation (GCE) framework to combat reward hacking in reinforcement learning.

The Four Types of Memory Every AI Agent Needs

AI agents utilize four distinct types of memory, analogous to human cognition, to move beyond simple chatbot responses. This summary explores the CoALA framework, detailing working, semantic, procedural, and episodic memory and how they enable agents to learn, recall skills, and leverage past experiences.

Agentic Evaluations at Scale, For Everybody — Nicholas Kang & Michael Aaron, Google DeepMind

Nicholas Kang and Michael Aaron from Google DeepMind's Kaggle team discuss the broken state of AI evaluations: scattered, non-transparent, and created by a homogeneous group. They present their solutions: a community-driven benchmarks platform, a PvP Game Arena for non-saturating Elo ratings, standardized agent exams, and hackathons to crowdsource novel evals and address the limitations of current benchmarking practices.

Scaling Meta's Multi-Agent Systems to a Billion Videos

Meta tackles modality misalignment and content theft in short-form video with a multi-agent system of smaller, specialized models instead of a single large LLM. The talk covers the architecture (Perceiver, Retriever, Reasoner), the evaluation stack, and key cost-saving optimizations.

Graph Neural Networks Explained: A Clear Guide to GNN Basics & Models

An introduction to Graph Neural Networks (GNNs), covering fundamental concepts like nodes, edges, and embeddings. This post delves into the core message-passing mechanism and provides a detailed overview of key architectures including GCN, GraphSAGE, GAT, GIN, and Graph Transformers, explaining their unique approaches and mathematical formulations.