Mapping the Mind of a Neural Net: Goodfire’s Eric Ho on the Future of Interpretability
Eric Ho, founder of Goodfire, discusses the critical challenge of AI interpretability. He shares how his team is developing techniques to understand, audit, and edit neural networks at the feature level, pointing to breakthrough results in resolving superposition with sparse autoencoders, successful model editing demonstrations, and real-world applications in genomics with Arc Institute's DNA foundation models. Ho argues that these white-box approaches are essential for building safe, reliable, and intentionally designed AI systems.
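For readers unfamiliar with the sparse autoencoder technique mentioned above, the sketch below illustrates the general idea: a model's hidden activations are re-expressed in an overcomplete, sparsity-penalized feature basis so that individual features become easier to interpret and intervene on. This is a minimal, hypothetical example, not Goodfire's or Arc Institute's code; the dimensions and the `l1_coeff` penalty are illustrative assumptions.

```python
# Minimal sparse autoencoder sketch (illustrative only, not Goodfire's implementation).
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_features: int):
        super().__init__()
        # Project activations into an overcomplete feature basis, then reconstruct them.
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, x: torch.Tensor):
        features = torch.relu(self.encoder(x))   # sparse, non-negative feature activations
        reconstruction = self.decoder(features)
        return reconstruction, features

def sae_loss(x, reconstruction, features, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 penalty that keeps most features inactive,
    # encouraging each active feature to capture a single, interpretable direction.
    return ((reconstruction - x) ** 2).mean() + l1_coeff * features.abs().mean()

# Example usage with assumed dimensions: decompose 768-d activations into 16,384 candidate features.
sae = SparseAutoencoder(d_model=768, d_features=16384)
activations = torch.randn(32, 768)               # stand-in for a batch of hidden activations
recon, feats = sae(activations)
loss = sae_loss(activations, recon, feats)
loss.backward()
```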