Browser automation

Harnesses in AI: A Deep Dive — Tejas Kumar, IBM

A deep dive into AI harnesses: how to build a programmatic environment around an LLM agent so that it behaves reliably without relying on prompt engineering. The talk demonstrates building a harness that lets a browser agent reliably log in and upvote a post on Hacker News using GPT-3.5 Turbo.

Inside Garry Tan's Claude Code Setup

Garry Tan, President & CEO of Y Combinator, introduces GStack, an open-source toolkit that structures Claude into a complete AI engineering team. He demonstrates how GStack's skills—like 'Office Hours' for idea validation, 'Design Shotgun' for UI mockups, and browser-based QA—streamline the development process from concept to code.

Catastrophic agent failure and how to avoid it // Edward Upton // Agents in Production 2025

Edward Upton, a founding engineer at Asteroid, discusses the critical challenge of managing catastrophic failures in agentic browser solutions, particularly in high-stakes domains like healthcare and insurance. He shares real-world examples of agent failures and outlines a practical framework for building more reliable, predictable, and accountable agents by scoping their capabilities, implementing robust human-in-the-loop tooling, and employing independent evaluation systems.