Red Teaming

Feb 23, 2026

Time to become a hacker // Matt Sharp

In this talk, Matt Sharp explains that while 2025 is the year of AI agents, it's also the year of cybercrime. The rush to create frictionless, user-friendly agents has led to a neglect of fundamental security principles, creating a perfect environment for hackers who are now using these same powerful AI tools to innovate and scale their attacks.

Jan 08, 2026

Hacking AI Systems: How to (Still) Trick Artificial Intelligence • Katharine Jarmul • GOTO 2025

To build secure AI systems, we must first learn to break them. Katharine Jarmul explores the landscape of adversarial AI, detailing how attackers exploit fundamental weaknesses in deep learning models—from poisoned training data and overparameterization to the attention mechanism itself. This talk provides a practical taxonomy of attacks and a primer on building robust defenses.

Oct 10, 2025

Evals Aren't Useful? Really?

A deep dive into the critical importance of robust evaluation for building reliable AI agents. The summary covers bootstrapping evaluation sets, advanced testing techniques like multi-turn simulations and red teaming, and the necessity of integrating traditional software engineering and MLOps practices into the agent development lifecycle.

Oct 06, 2025

Ideas: More AI-resilient biosecurity with the Paraphrase Project

Microsoft’s Eric Horvitz and guests discuss the Paraphrase Project, a two-year red-teaming effort that uncovered and patched a significant biosecurity vulnerability, demonstrating a model for responsibly managing the dual-use risks of generative AI in protein design.

Oct 02, 2025

How to Become an Ethical Hacker: Skills, Certifications, & Advice

Cybersecurity experts Jeff Crume and Patrick Fussell outline the essential skills, mindset, certifications, and career paths for aspiring ethical hackers, offering practical advice for breaking into the field of penetration testing and red teaming.

Sep 25, 2025

When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs

Hanna Kim from KAIST explores the significant cybersecurity risks posed by web-enabled Large Language Model (LLM) agents. The research investigates how these agents, equipped with web search and navigation tools, can be misused to automate and scale cyberattacks involving personal data, such as PII collection, impersonation, and spear-phishing, while easily bypassing existing safety measures.