W&B Weave

Large-scale agentic quant research with Weights & Biases

Explore how Weights & Biases (W&B) enhances reliability, reproducibility, and explainability in large-scale, agent-driven quantitative research. This video demonstrates two core applications: debugging multi-agent alpha research pipelines with W&B Weave to identify root causes and iterate on forecasts, and automating strategy optimization using W&B Models to tune agent weights and gain insights from performance convergence and parallel coordinate plots.

Fully Connected Tokyo: [Hands-on workshop] Automation of document workflows in financial industry

This workshop by Upstage demonstrates how to automate financial document workflows using their specialized Document AI (Document Parse) together with large language models (LLMs). The session covers building robust information extraction pipelines, handling challenges such as varied templates and data formatting, and evaluating results systematically with Weights & Biases Weave. It also presents real-world case studies from the insurance industry that show significant improvements in efficiency and data utilization.

Build and monitor multi-agent contact centers using Weights & Biases

This post explores the shift from costly legacy contact center software to multi-agent AI systems. It demonstrates how to build, monitor, and evaluate these complex agentic systems using the Weights & Biases AI Developer Platform, with a focus on tracing, quality assessment, and ensuring consistent customer support.

Production monitoring for AI applications using W&B Weave

Learn how W&B Weave's online evaluations enable real-time monitoring of AI applications in production, allowing teams to track performance, catch failures, and iterate on quality over time using LLM-as-a-judge scores.