Memory in LLMs: Weights and Activations - Jack Morris, Cornell
This talk explores the limitations of current methods for providing knowledge to LLMs, such as large context windows and Retrieval-Augmented Generation (RAG). The speaker argues that the future lies in training knowledge directly into the model's weights. This is achieved through a combination of generating large synthetic datasets from small amounts of source material and using parameter-efficient fine-tuning (PEFT) techniques like LoRA to avoid catastrophic forgetting. The goal is to create more capable, personalized, and efficient models by fundamentally altering how they store and access information.