Red teaming

Evals Aren't Useful? Really?

Evals Aren't Useful? Really?

A deep dive into the critical importance of robust evaluation for building reliable AI agents. The summary covers bootstrapping evaluation sets, advanced testing techniques like multi-turn simulations and red teaming, and the necessity of integrating traditional software engineering and MLOps practices into the agent development lifecycle.

Ideas: More AI-resilient biosecurity with the Paraphrase Project

Ideas: More AI-resilient biosecurity with the Paraphrase Project

Microsoft’s Eric Horvitz and guests discuss the Paraphrase Project, a two-year red-teaming effort that uncovered and patched a significant biosecurity vulnerability, demonstrating a model for responsibly managing the dual-use risks of generative AI in protein design.

How to Become an Ethical Hacker: Skills, Certifications, & Advice

How to Become an Ethical Hacker: Skills, Certifications, & Advice

Cybersecurity experts Jeff Crume and Patrick Fussell outline the essential skills, mindset, certifications, and career paths for aspiring ethical hackers, offering practical advice for breaking into the field of penetration testing and red teaming.

When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs

When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs

Hanna Kim from KAIST explores the significant cybersecurity risks posed by web-enabled Large Language Model (LLM) agents. The research investigates how these agents, equipped with web search and navigation tools, can be misused to automate and scale cyberattacks involving personal data, such as PII collection, impersonation, and spear-phishing, while easily bypassing existing safety measures.

Ethical Hacking in Action: Red Teaming, Pen Testing, & Cybersecurity

Ethical Hacking in Action: Red Teaming, Pen Testing, & Cybersecurity

Explore the core tasks of ethical hacking, from vulnerability scanning to red teaming. This guide covers engagement structure, hacker methodologies, key frameworks like MITRE ATT&CK, and the essential tools for cybersecurity professionals.

915: How to Jailbreak LLMs (and How to Prevent It) — with Michelle Yi

915: How to Jailbreak LLMs (and How to Prevent It) — with Michelle Yi

Tech leader and investor Michelle Yi discusses the critical technical aspects of building trustworthy AI systems. She delves into adversarial attack and defense mechanisms, including red teaming, data poisoning, prompt stealing, and "slop squatting," and explores how advanced concepts like Constitutional AI and World Models can create safer, more reliable AI.