System prompts

Continual System Prompt Learning for Code Agents – Aparna Dhinakaran, Arize

The talk by Aparna Dhinakaran introduces "system prompt learning" as an efficient alternative to traditional reinforcement learning for improving LLM-based coding agents. LLM-as-a-judge evaluations produce English feedback and explanations for code failures, and that feedback is used to automatically refine the agent's system prompt and rules. The method, demonstrated on Claude and Cline, significantly boosts performance on benchmarks such as SWE-bench with minimal data, and it highlights the critical role of high-quality evaluation prompts.
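The loop described in this summary can be sketched roughly as follows. This is a minimal illustration, not the implementation from the talk: the names `Feedback`, `render_prompt`, `learn_system_prompt`, and the three callables (`run_agent`, `judge`, `propose_rules`) are all hypothetical, and in practice each callable would wrap an LLM call rather than plain Python logic.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Feedback:
    """Hypothetical judge verdict: pass/fail plus an English explanation."""
    passed: bool
    explanation: str


def render_prompt(base_prompt: str, rules: List[str]) -> str:
    """Append the learned rules to the base system prompt."""
    if not rules:
        return base_prompt
    return base_prompt + "\n\nRules:\n" + "\n".join(f"- {r}" for r in rules)


def learn_system_prompt(
    base_prompt: str,
    tasks: List[str],
    run_agent: Callable[[str, str], str],       # (system prompt, task) -> agent output
    judge: Callable[[str, str], Feedback],      # (task, output) -> verdict (assumed LLM-as-a-judge)
    propose_rules: Callable[[List[str], List[str]], List[str]],  # (rules, explanations) -> new rules
    rounds: int = 3,
) -> str:
    """Refine the rule list from judge explanations for a fixed number of rounds."""
    rules: List[str] = []
    for _ in range(rounds):
        prompt = render_prompt(base_prompt, rules)
        failures: List[str] = []
        for task in tasks:
            output = run_agent(prompt, task)
            feedback = judge(task, output)
            if not feedback.passed:
                failures.append(feedback.explanation)
        if not failures:
            break  # every task passed; nothing left to learn this run
        for rule in propose_rules(rules, failures):
            if rule not in rules:  # avoid duplicate rules
                rules.append(rule)
    return render_prompt(base_prompt, rules)
```

No model weights change in this sketch; only the prompt text does, which is what distinguishes the approach from reinforcement learning. The quality of the `judge` explanations bounds how useful the proposed rules can be, which matches the summary's emphasis on evaluation prompts.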

Gen AI pilots fail, GPT-5's hidden prompt revealed, reasoning model flaws and Claude closing chats

A deep dive into why most enterprise GenAI pilots are failing, the debate around hidden system prompts in models like GPT-5, new research questioning the reliability of "chain of thought" reasoning, and the controversy over Anthropic's "AI welfare" justification for shutting down conversations.