Machine learning

Interpretability: Understanding how AI models think

Members of Anthropic's interpretability team discuss their research into the inner workings of large language models. They explore the analogy of studying AI as a biological system, the surprising discovery of internal "features" or concepts, and why this research is critical for understanding behaviors such as hallucination, sycophancy, and long-term planning, with the ultimate aim of ensuring AI safety.
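
To give a concrete handle on what a "feature" might look like in practice, below is a minimal sketch of one technique the team has published on for surfacing them: training a sparse autoencoder on a model's internal activations so that each learned dictionary direction tends to fire for a narrow concept. The activation data here is random, and every dimension, coefficient, and variable name is an illustrative assumption rather than the team's actual setup.

```python
import torch
import torch.nn as nn

d_model, d_features = 64, 512          # activation width and dictionary size (assumed)
acts = torch.randn(10_000, d_model)    # stand-in for activations collected from an LLM

class SparseAutoencoder(nn.Module):
    def __init__(self, d_in, d_hidden):
        super().__init__()
        self.encoder = nn.Linear(d_in, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_in)

    def forward(self, x):
        f = torch.relu(self.encoder(x))      # sparse, non-negative feature activations
        return self.decoder(f), f

sae = SparseAutoencoder(d_model, d_features)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
l1_coeff = 1e-3                              # sparsity penalty weight (assumed)

for step in range(200):
    recon, feats = sae(acts)
    # Reconstruct the activations while penalizing how many features are active.
    loss = ((recon - acts) ** 2).mean() + l1_coeff * feats.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Each row of decoder.weight.T is a candidate "feature" direction in activation space.
feature_directions = sae.decoder.weight.T.detach()   # shape: (d_features, d_model)
```

In practice the autoencoder is trained on activations gathered from real prompts, and candidate features are given human labels by inspecting which inputs activate each direction most strongly.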

Encrypted Computation: What if Decryption Wasn’t Needed? • Katharine Jarmul • GOTO 2024

An exploration of encrypted computation, detailing how techniques like homomorphic encryption and multi-party computation can enable machine learning on encrypted data. The talk covers the core mathematical principles, real-world use cases, and open-source libraries for building more private and trustworthy AI systems.
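
To make the core idea tangible, here is a hedged sketch of additive homomorphic encryption using a toy Paillier cryptosystem: two values are encrypted, their ciphertexts are multiplied, and decrypting the result yields their sum, so the computation happens without ever seeing the inputs. The primes and values are illustrative assumptions, far too small for real security, and this is not the specific scheme or library from the talk.

```python
import math
import random

def keygen(p=293, q=433):                    # tiny demo primes: NOT secure
    n = p * q
    lam = math.lcm(p - 1, q - 1)             # Carmichael lambda(n)
    mu = pow(lam, -1, n)                     # lambda^-1 mod n (valid since g = n + 1)
    return n, (lam, mu, n)                   # public key n, private key (lam, mu, n)

def encrypt(n, m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:               # random r coprime to n
        r = random.randrange(2, n)
    n2 = n * n
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2   # c = g^m * r^n mod n^2

def decrypt(priv, c):
    lam, mu, n = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n           # m = L(c^lam mod n^2) * mu mod n

n, priv = keygen()
c1, c2 = encrypt(n, 17), encrypt(n, 25)
c_sum = (c1 * c2) % (n * n)                  # multiplying ciphertexts adds plaintexts
assert decrypt(priv, c_sum) == 17 + 25       # 42, computed on encrypted values only
print("decrypted sum:", decrypt(priv, c_sum))
```

Production systems, whether fully homomorphic schemes such as CKKS or multi-party computation protocols, follow the same pattern of computing directly on encrypted values, but support richer operations and parameters large enough to be secure.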

909: Causal AI — with Dr. Robert Osazuwa Ness

Researcher Robert Ness discusses the practical implementation of Causal AI, distinguishing it from correlation-based machine learning. He covers the essential role of assumptions about the data-generating process, key Python libraries like DoWhy and Pyro, the intersection with LLMs, and a step-by-step workflow for tackling causal problems.
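
As a rough illustration of the step-by-step workflow described in the episode, here is a minimal sketch using DoWhy's four-step API (model, identify, estimate, refute) on simulated data with a single confounder. The variable names, the simulated data-generating process, and the chosen estimator and refuter are assumptions made for the example, not details taken from the episode.

```python
import numpy as np
import pandas as pd
from dowhy import CausalModel

# Simulated data: w confounds both treatment t and outcome y; the true effect of t is 2.0.
rng = np.random.default_rng(0)
n = 5_000
w = rng.normal(size=n)
t = (w + rng.normal(size=n) > 0).astype(int)
y = 2.0 * t + 1.5 * w + rng.normal(size=n)
df = pd.DataFrame({"w": w, "t": t, "y": y})

# 1. Model: encode assumptions about the data-generating process.
model = CausalModel(data=df, treatment="t", outcome="y", common_causes=["w"])

# 2. Identify: derive an estimand from those assumptions (backdoor adjustment on w).
estimand = model.identify_effect(proceed_when_unidentifiable=True)

# 3. Estimate: fit the estimand with a concrete statistical method.
estimate = model.estimate_effect(estimand, method_name="backdoor.linear_regression")
print("estimated ATE:", estimate.value)      # should land near the true value of 2.0

# 4. Refute: stress-test the estimate by swapping t for a placebo treatment.
refutation = model.refute_estimate(estimand, estimate,
                                   method_name="placebo_treatment_refuter")
print(refutation)
```

Pyro, also mentioned in the episode, sits at a different layer: it is a probabilistic programming library that can express the structural causal models such estimates ultimately rest on.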