LoRA

Post-training best-in-class models in 2025

An expert overview of post-training techniques for language models, covering the entire workflow from data generation and curation to advanced algorithms like Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning (RL), along with practical advice on evaluation and iteration.
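
For a sense of how one of these algorithms looks in practice, here is a minimal sketch of the DPO loss on a batch of preference pairs, assuming per-sequence log-probabilities under the trainable policy and a frozen reference model are already computed; the tensor names and toy values are illustrative, not taken from the talk.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss for a batch of preference pairs.

    Each argument is a tensor of summed log-probabilities (one value per
    sequence) under either the trainable policy or the frozen reference model.
    `beta` controls how strongly the policy is pulled away from the reference.
    """
    # Implicit rewards: log-ratio of policy vs. reference, scaled by beta.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)

    # The policy should assign a higher implicit reward to the chosen answer.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy example: four preference pairs with made-up log-probabilities.
loss = dpo_loss(torch.tensor([-12.0, -9.5, -11.0, -8.0]),
                torch.tensor([-14.0, -10.0, -13.5, -9.0]),
                torch.tensor([-12.5, -9.8, -11.2, -8.3]),
                torch.tensor([-13.0, -9.9, -12.8, -8.7]))
print(loss.item())
```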

Streamline evaluation, monitoring, optimization of AI data flywheel with NVIDIA and Weights & Biases

A walkthrough of the NVIDIA Data Flywheel Blueprint, demonstrating how to use production data and Weights & Biases to systematically fine-tune AI agents. This process enhances model accuracy and efficiency by creating a continuous improvement cycle, moving beyond the limitations of prompt engineering.
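
As a rough illustration of the monitoring side of such a flywheel, the sketch below logs per-iteration evaluation metrics to Weights & Biases; the project name, metric names, and the `evaluate_candidate` helper are hypothetical stand-ins, not part of the NVIDIA blueprint.

```python
import random
import wandb

def evaluate_candidate(model_id: str) -> dict:
    """Hypothetical stand-in for running an eval suite on a fine-tuned candidate."""
    return {"accuracy": random.uniform(0.7, 0.95), "latency_ms": random.uniform(80, 200)}

run = wandb.init(project="data-flywheel-demo", config={"base_model": "llama-3.1-8b"})

# Each flywheel iteration: fine-tune on fresh production traffic, then evaluate
# the new candidate and log the metrics so accuracy or latency regressions show up early.
for iteration in range(5):
    candidate_id = f"candidate-{iteration}"   # produced by the fine-tuning step
    metrics = evaluate_candidate(candidate_id)
    wandb.log({"iteration": iteration, **metrics})

run.finish()
```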

Serving Voice AI at $1/hr: Open-source, LoRAs, Latency, Load Balancing - Neil Dwyer, Gabber

An in-depth look at Gabber's experience deploying the Orpheus text-to-speech model to production, covering latency optimization, high-fidelity LoRA-based voice cloning, and a cost-effective inference stack using vLLM and a consistent hash ring for load balancing.
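
The consistent hash ring is a standard routing technique for pinning each voice (and its LoRA adapter) to a stable backend so the adapter stays warm in GPU memory; the sketch below is a generic implementation, not Gabber's actual code.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Map keys (e.g. voice/LoRA IDs) to backends so each key lands on a stable server."""

    def __init__(self, backends, vnodes=100):
        self.ring = []  # (hash, backend) points on the ring
        for backend in backends:
            for i in range(vnodes):  # virtual nodes smooth out the key distribution
                self.ring.append((self._hash(f"{backend}#{i}"), backend))
        self.ring.sort()
        self._keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def get(self, key: str) -> str:
        # Walk clockwise to the first ring point at or after the key's hash.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["gpu-node-0", "gpu-node-1", "gpu-node-2"])
print(ring.get("voice-lora-alice"))  # the same voice always routes to the same node
```

Routing a given voice to the same node means its LoRA weights only need to be loaded once, which is what keeps per-request latency and serving cost low.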

Make some noise: Teaching the language of audio to an LLM using sound tokens

Shivam Mehta from KTH presents a method for teaching Large Language Models (LLMs) to understand and generate audio by treating it as a discrete language. The approach involves a two-step process: first, creating an ultra-low bitrate (0.293 kbps) audio representation using a causal variational autoencoder, and second, fine-tuning a Llama 7B model with these audio tokens using LoRA.
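
To make the second step more concrete, here is a minimal sketch of attaching LoRA adapters to a Llama-style model with Hugging Face `peft` after extending the vocabulary with discrete audio tokens; the checkpoint name, codebook size, rank, and target modules are illustrative choices, not the settings reported in the talk.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Base checkpoint is a placeholder for an arbitrary Llama-style 7B model.
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Add discrete audio tokens to the vocabulary and resize the embedding table.
audio_tokens = [f"<audio_{i}>" for i in range(1024)]  # codebook size is an assumption
tokenizer.add_tokens(audio_tokens)
base_model.resize_token_embeddings(len(tokenizer))

lora_config = LoraConfig(
    r=16,                                   # adapter rank (illustrative)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],    # attention projections to adapt
    modules_to_save=["embed_tokens", "lm_head"],  # keep the new audio-token embeddings trainable
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only adapters and the saved embedding/output layers train
```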