LLMs

How A Team Of 7 Keeps Breaking AI Benchmark Records

Poetiq, a startup by former DeepMind researchers, has developed a recursive self-improvement meta-system that builds "reasoning harnesses" on top of existing LLMs. This approach avoids the costly "fine-tuning trap" and has achieved state-of-the-art results on benchmarks like ARC-AGI and Humanity's Last Exam by automatically optimizing prompts and discovering novel reasoning strategies.

Mainframe modernization explained: COBOL and AI

Experts from IBM discuss the nuanced role of AI in mainframe modernization, the immense infrastructural and product challenges behind global AI adoption, and the critical need for a multi-layered, security-by-design framework for the safe deployment of AI agents.

The Future of Coding: AI Agents & the Next Tech Revolution // Ricky Doar

Ricky Doar, VP of Solutions at Cursor, shares best practices for leveraging AI in software development, focusing on effective problem decomposition, context management, and navigating both new and legacy codebases. He highlights common anti-patterns, such as over-reliance on AI, and offers strategies for debugging, model steerability, and building effective agent harnesses.

Boris Cherny: How We Built Claude Code

Boris Cherny, creator of Claude Code, shares the development philosophy behind the AI coding tool, emphasizing building for future models, leveraging latent user demand, and the surprising longevity of the terminal interface.

Handling AI-Generated Code: Challenges & Best Practices • Roman Zhukov & Damian Brady

Roman Zhukov (Red Hat) and Damian Brady (GitHub) explore the evolving landscape of AI-assisted software development, discussing its impact on developer workflows, code quality, security, and the future of developer roles. They emphasize that while AI tools are powerful amplifiers, human oversight remains essential for quality, security, and legal compliance.

Context Engineering Our Way to Long-Horizon Agents: LangChain’s Harrison Chase

Harrison Chase, co-founder of LangChain, explains the evolution of AI agents from early, rigid scaffolding to modern, flexible "harnesses." He argues that "context engineering" (managing what an LLM sees) is the key to building effective long-horizon agents. Chase also explores how agent development differs from traditional software, highlighting the critical role of traces as the new source of truth and memory systems that enable agents to improve themselves over time.