MLOps

Introduction to LLM serving with SGLang - Philip Kiely and Yineng Zhang, Baseten

A deep dive into SGLang, an open-source serving framework for LLMs. This summary covers its core features, its history, performance optimization techniques such as CUDA graphs and EAGLE-3 speculative decoding, and how to contribute to the project.

Real-time Feature Generation at Lyft // Rakesh Kumar // MLOps Podcast #334

Rakesh Kumar from Lyft details the evolution of their real-time feature generation platform, from cron jobs to a sophisticated streaming architecture using Apache Beam and Flink. Key discussions include solving the 'hot shard' problem with geohashes, building a custom geospatial feature store, and optimizing pipelines with YAML-based configurations.

The Quantum Advantage Is Real—But Where's the Infrastructure?

While general-purpose quantum computers are a decade away, specialized quantum accelerators are already tackling high-speed inference for AI problems in finance and pharma. This summary explores the practical use cases, the immense DataOps and MLOps challenges posed by the no-cloning theorem, and the need for a new modeling paradigm based on topological data analysis.

MLflow 3.0: The Future of AI Agents

Eric Peter from Databricks outlines the evolution from the traditional MLOps lifecycle to the more complex Agent Ops lifecycle. He details the five essential components of a successful agent development platform and introduces MLflow 3.0, a new release designed to provide a comprehensive, open-standard solution for building, evaluating, and deploying AI agents.