Attention mechanism

Attention, World Models and the Future of AI — with Prof. Kyunghyun Cho

Professor Kyunghyun Cho, a co-author of the first paper on attention, discusses the future of AI. He argues that today’s models have already captured most of the correlations in passive data, so the real challenge now is actively choosing which data to collect. He also explores the open debate around world models, the surprisingly low adoption of coding agents among his students, and the foundational work that led to Retrieval-Augmented Generation (RAG).

Sparse Activation is the Future of AI (with Adrian Kosowski)

Adrian Kosowski from Pathway explains their research on sparse activation in AI, which moves beyond the dense architectures of transformers. Their model, Baby Dragon Hatchling (BDH), mimics the brain's efficiency by activating only a small fraction of its artificial neurons, enabling a more scalable, compositional approach to reasoning that isn't confined by the vector space limitations of current models.

Inside GPT – The Maths Behind the Magic • Alan Smith • GOTO 2024

A deep dive into the internal workings of Large Language Models like GPT, following a text prompt through tokenization, embeddings, and the attention mechanism to the generated response.
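
The path from tokens to attention output can be illustrated with a toy example. This is a minimal sketch, not the talk's own code: the three-word vocabulary, the 2-dimensional embedding values, and the function names `softmax` and `attention` are all hypothetical, chosen only for illustration. It also omits the learned query/key/value projections, causal masking, and multiple heads that a real GPT-style model would use.

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product attention: softmax(q . k / sqrt(d_k)) weighted sum of values.
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Hypothetical toy vocabulary and embedding table (assumed values).
vocab = {"the": 0, "cat": 1, "sat": 2}
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# 1. Tokenization: split on whitespace and map each word to an id.
tokens = [vocab[w] for w in "the cat sat".split()]
# 2. Embedding lookup: one vector per token.
x = [embeddings[t] for t in tokens]
# 3. Self-attention: each token's output is a weighted mix of all token vectors.
out = attention(x, x, x)
```

Each row of `out` is a weighted average of the input vectors, with weights determined by how strongly that token's vector aligns with the others.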