Weights & Biases

Power agents with full context of your experiments and traces with W&B MCP server

The W&B Model Context Protocol (MCP) server is a hosted endpoint that lets AI agents query Weights & Biases data, including runs, traces, evaluations, and reports. It provides discovery tools for building queries, automated analysis for comparing experiments and identifying regressions, and integration with IDEs, coding agents, and chat interfaces like Mistral AI for ML workflows and on-the-go reporting.

Large-scale agentic quant research with Weights & Biases

Explore how Weights & Biases (W&B) enhances reliability, reproducibility, and explainability in large-scale, agent-driven quantitative research. This video demonstrates two core applications: debugging multi-agent alpha research pipelines with W&B Weave to identify root causes and iterate on forecasts, and automating strategy optimization using W&B Models to tune agent weights and gain insights from performance convergence and parallel coordinate plots.

Migrating from Neptune to Weights & Biases

A technical guide on migrating ML experiments from Neptune to Weights & Biases, covering the migration script, API-level code changes, and best practices for organizing projects and analyzing results in the W&B platform before the Neptune sunset.

Rethinking Notebooks Powered by AI

Vincent Warmerdam from marimo discusses the recent acquisition by Weights & Biases and the future of Python notebooks. He argues that notebooks should evolve from static scratchpads into dynamic, AI-powered applications, highlighting marimo's features for LLM integration, agentic workflows, and creating interactive, reproducible development environments.

Introducing Our Approach to Design Document Review Using Business-Specific Large Language Models

Hitachi's Financial Business Unit developed a specialized LLM to automate the review of system design documents, addressing the inadequacy of general-purpose AI for mission-critical systems. This presentation details the model's development using continued pre-training and LoRA on proprietary data, its integration into a multi-agent architecture, and the use of Weights & Biases for MLOps, which led to a 70% reduction in manual review workload.

Streamline evaluation, monitoring, optimization of AI data flywheel with NVIDIA and Weights & Biases

A walkthrough of the NVIDIA Data Flywheel Blueprint, demonstrating how to use production data and Weights & Biases to systematically fine-tune AI agents. This process enhances model accuracy and efficiency by creating a continuous improvement cycle, moving beyond the limitations of prompt engineering.