Information theory

The Mathematical Foundations of Intelligence [Professor Yi Ma]

Professor Yi Ma challenges conventional views of intelligence, proposing a unified mathematical theory built on two principles: parsimony and self-consistency. He argues that current large models merely memorize statistical patterns in already-compressed human knowledge (such as text) rather than achieving true understanding. The framework recasts deep learning as a process of compression and denoising, allowing Transformer architectures such as CRATE to be derived from first principles and paving the way for a more interpretable, white-box approach to AI.
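
The "parsimony" principle is made concrete in Yi Ma's published rate-reduction work through a lossy coding-rate measure and the Maximal Coding Rate Reduction (MCR²) objective, the quantity from which CRATE's layers are later derived. Below is a minimal NumPy sketch of those two quantities only, not of the CRATE architecture itself; the distortion parameter `eps` and the random test data are illustrative assumptions.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """Lossy coding rate R(Z, eps) = 1/2 * logdet(I + d/(n*eps^2) * Z Z^T):
    the coding length (in nats) of the n feature vectors in Z (shape d x n)
    up to distortion eps."""
    d, n = Z.shape
    scale = d / (n * eps ** 2)
    return 0.5 * np.linalg.slogdet(np.eye(d) + scale * Z @ Z.T)[1]

def rate_reduction(Z, labels, eps=0.5):
    """MCR^2 objective: Delta R = R(Z) - sum_j (n_j / n) * R(Z_j),
    i.e. the rate of all features minus the rate after coding each class
    separately. Larger Delta R means classes are more spread out from each
    other while each class is more compressed internally."""
    labels = np.asarray(labels)
    _, n = Z.shape
    whole = coding_rate(Z, eps)
    per_class = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]
        per_class += (Zc.shape[1] / n) * coding_rate(Zc, eps)
    return whole - per_class

# Toy check with random features and two classes (illustrative only).
rng = np.random.default_rng(0)
Z = rng.standard_normal((16, 200))
y = rng.integers(0, 2, size=200)
print(rate_reduction(Z, y))
```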

Columbia CS Professor: Why LLMs Can’t Discover New Science

Professor Vishal Misra of Columbia University introduces a formal model for understanding Large Language Models (LLMs) based on information theory. He describes how LLMs reason by navigating "Bayesian manifolds", uses token entropy to explain the mechanics of chain-of-thought, and defines true AGI as the ability to create new manifolds rather than merely explore existing ones.
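
As a concrete reading of "token entropy": it is commonly taken to mean the Shannon entropy of the model's next-token distribution, which is low when the model is confident about the next token and high when it is uncertain; in the chain-of-thought picture, each emitted token can shrink this uncertainty for later predictions. A minimal sketch under that assumption follows; the logit values are made up for illustration.

```python
import numpy as np

def token_entropy(logits):
    """Shannon entropy (in bits) of the next-token distribution implied by
    raw logits: H = -sum_i p_i * log2(p_i). Low entropy = confident model."""
    logits = np.asarray(logits, dtype=np.float64)
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return float(-np.sum(probs * np.log2(probs + 1e-12)))

# A peaked distribution (model nearly certain of the next token) ...
print(token_entropy([9.0, 1.0, 0.5, 0.2]))   # close to 0 bits
# ... versus a flat one over 4 candidate tokens (maximal uncertainty).
print(token_entropy([1.0, 1.0, 1.0, 1.0]))   # log2(4) = 2 bits
```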