ARC-AGI

How A Team Of 7 Keeps Breaking AI Benchmark Records

Poetiq, a startup founded by former DeepMind researchers, has developed a recursive self-improvement meta-system that builds "reasoning harnesses" on top of existing LLMs. This approach avoids the costly "fine-tuning trap" and has achieved state-of-the-art results on benchmarks such as ARC-AGI and Humanity's Last Exam by automatically optimizing prompts and discovering novel reasoning strategies.

This Startup Beat Gemini 3 on ARC-AGI — at Half the Cost

Poetiq, a startup founded by ex-DeepMind researchers, has significantly improved performance on the ARC-AGI benchmark by applying its recursive self-improvement system to Gemini 3. Co-founder Ian Fischer discusses how automating prompt and system engineering delivers a substantial performance boost without access to model weights, and explores the approach's potential as a path toward AGI.

How Intelligent Is AI, Really?

Greg Kamradt of the ARC Prize Foundation explains how the ARC-AGI benchmark is shifting the focus of AI evaluation from memorization to true intelligence, defined as the ability to generalize and learn new skills efficiently. He discusses the history of ARC-AGI, how it revealed the limits of early LLMs and highlighted the recent "reasoning breakthrough," and details the upcoming interactive ARC-AGI v3, which will measure AI performance against a human baseline with zero instructions.