Preference alignment

Everything I Learned Training Frontier Small Models — Maxime Labonne, Liquid AI

Maxime Labonne of Liquid AI shares a playbook for post-training frontier small models (under 1GB) for on-device deployment. The talk breaks down the LFM2.5 recipe, which includes on-policy preference alignment and agentic reinforcement learning. It also addresses challenges specific to the 1B scale, such as capability interference and 'doom loops', and offers concrete solutions for building efficient models for tasks like data extraction and tool use.