Vector Quantization

May 19, 2026

Personalization in the Era of LLMs - Shivam Verma, Spotify

Spotify is personalizing open-weight LLMs without full fine-tuning by combining three key components: foundational user embeddings from streaming history, 'Semantic IDs' that tokenize its 100M+ item catalog, and a 'soft tokenization' layer that projects a user's embedding directly into the LLM's context. This allows the model to autoregressively generate the next song or podcast as the next token in a sequence.
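To make the idea of a 'Semantic ID' concrete, here is a minimal sketch of one common way to turn an item embedding into a short sequence of discrete codes: residual vector quantization. The summary above does not say how Spotify builds its Semantic IDs, so the technique, the function names (`nearest`, `semantic_id`) and the toy codebooks are all assumptions made for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest(codebook: List[Vector], x: Vector) -> int:
    """Return the index of the codeword closest to x (squared Euclidean distance)."""
    def dist2(c: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(range(len(codebook)), key=lambda i: dist2(codebook[i]))


def semantic_id(embedding: Vector, codebooks: List[List[Vector]]) -> List[int]:
    """Quantize an embedding level by level; each level encodes what the previous one missed."""
    residual = list(embedding)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # Subtract the chosen codeword so the next level sees only the remaining error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


# Hypothetical, hand-written codebooks; real ones would be learned from item embeddings.
COARSE = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
FINE = [[0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [-0.2, -0.2]]

item_embedding = [1.15, 0.95]
print(semantic_id(item_embedding, [COARSE, FINE]))  # → [1, 1]
```

Each code can then be mapped to a new vocabulary entry, so that an item becomes a short token sequence a language model can generate. Items with nearby embeddings share their leading codes, which is what makes the IDs "semantic" in this sketch.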

Jul 28, 2025

Make some noise: Teaching the language of audio to an LLM using sound tokens

Shivam Mehta from KTH presents a method for teaching Large Language Models (LLMs) to understand and generate audio by treating it as a discrete language. The approach involves a two-step process: first, creating an ultra-low bitrate (0.293 kbps) audio representation using a causal variational autoencoder, and second, fine-tuning a Llama 7B model with these audio tokens using LoRA.
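As a rough guide to what a bitrate such as 0.293 kbps means for a token stream, the sketch below relates bitrate to token rate and codebook size. The summary does not give the talk's actual token rate or codebook size, so the numbers used here are hypothetical and the functions (`token_bitrate_kbps`, `tokens_per_second_at`) are illustrative helpers, not part of the presented method.

```python
import math


def token_bitrate_kbps(tokens_per_second: float, codebook_size: int,
                       codebooks_per_step: int = 1) -> float:
    """Bitrate of a discrete token stream: each token carries log2(codebook_size) bits."""
    bits_per_token = math.log2(codebook_size)
    return tokens_per_second * codebooks_per_step * bits_per_token / 1000.0


def tokens_per_second_at(bitrate_kbps: float, codebook_size: int) -> float:
    """Token rate that a given bitrate allows with a single codebook of this size."""
    return bitrate_kbps * 1000.0 / math.log2(codebook_size)


# Hypothetical configuration: 50 tokens per second drawn from a 1024-entry codebook.
print(token_bitrate_kbps(50, 1024))  # → 0.5

# Hypothetical reading of the quoted figure: token rate at 0.293 kbps
# if a single 1024-entry codebook were used (about 29.3 tokens per second).
print(round(tokens_per_second_at(0.293, 1024), 1))
```

The lower the bitrate, the fewer tokens the LLM has to model per second of audio, which keeps sequences short enough for fine-tuning with a method like LoRA.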