Fine tuning

Introducing Our Approach to Design Document Review Using Business-Specific Large Language Models

Hitachi's Financial Business Unit developed a specialized LLM to automate the review of system design documents, addressing the inadequacy of general-purpose AI for mission-critical systems. This presentation details the model's development using continued pre-training and LoRA on proprietary data, its integration into a multi-agent architecture, and the use of Weights & Biases for MLOps. The resulting system reduced manual review workload by 70%.

Post-training best-in-class models in 2025

An expert overview of post-training techniques for language models, covering the entire workflow from data generation and curation to advanced algorithms like Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning (RL), along with practical advice on evaluation and iteration.

Memory in LLMs: Weights and Activations - Jack Morris, Cornell

This talk explores the limitations of current methods for providing knowledge to LLMs, such as large context windows and Retrieval-Augmented Generation (RAG). The speaker argues that the future lies in training knowledge directly into the model's weights. This is achieved through a combination of generating large synthetic datasets from small amounts of source material and using parameter-efficient fine-tuning (PEFT) techniques like LoRA to avoid catastrophic forgetting. The goal is to create more capable, personalized, and efficient models by fundamentally altering how they store and access information.

Build Hour: Agent RFT

Will Hang and Theophile Sautory from OpenAI provide a deep dive into Agent RFT (reinforcement fine-tuning), a method for fine-tuning large language models to become more effective tool-using agents. They explain how Agent RFT enables models to learn directly from their interactions with custom tools and reward signals, leading to significant improvements in performance, latency, and efficiency on specialized tasks. The session includes a detailed code demo, best practices, and success stories from companies like Cognition, Ambience, and Rogo.

Fine-Tuned Models Are Getting Out of Hand

A deep dive into how fine-tuned Small Language Models (SLMs) and RAG systems can be combined to create personalized AI agents that learn user-specific workflows, emulate decision-making, and collaborate with humans, moving beyond conversational interfaces to direct action within enterprise environments.

Introducing serverless reinforcement learning: Train reliable AI agents without worrying about GPUs

Kyle Corbitt and Daniel from CoreWeave (formerly OpenPipe) discuss the practical advantages of Reinforcement Learning (RL) over Supervised Fine-Tuning (SFT) for building reliable and efficient AI agents. They introduce Serverless RL, a new platform designed to eliminate the infrastructure complexities of RL training, and share a playbook for teams looking to get started.