Generationship

915: How to Jailbreak LLMs (and How to Prevent It) — with Michelle Yi

Tech leader and investor Michelle Yi discusses the technical foundations of building trustworthy AI systems. She delves into adversarial attack and defense mechanisms, including red teaming, data poisoning, prompt stealing, and "slop squatting," and explores how advanced approaches like Constitutional AI and World Models can help create safer, more reliable AI.