VLM

Your Agent Can Now Train Models — Merve Noyan, Hugging Face

Merve Noyan of Hugging Face argues that open-source models have reached parity with their closed-source counterparts and presents the Hugging Face ecosystem built to support this shift. She covers tools for model selection, local agent deployment, and "Hugging Face Skills," which let agents automate complex ML engineering tasks such as fine-tuning a model from a single prompt.

Chelsea Finn: Building Robots That Can Do Anything

Developing general-purpose robots requires a shift from specialized, single-task systems to broad foundation models. This is achieved through a combination of large-scale, diverse, real-world data collection and a specific training methodology: pre-training on all available data and then fine-tuning on a curated, high-quality subset of demonstrations. This recipe, combined with architectural innovations to preserve the capabilities of Vision-Language Model (VLM) backbones, enables robots to perform complex, long-horizon tasks, generalize to unseen environments, and respond to open-ended human instructions.
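
The two-stage recipe described above (pre-train on everything, then fine-tune on a curated subset) can be illustrated with a toy sketch. This is not the method from the talk: it assumes a 1-D linear model trained with plain SGD in place of a robot policy, and the datasets, noise levels, learning rates, and the helper names `make_data`, `sgd`, and `mse` are all hypothetical choices made for illustration.

```python
import random


def make_data(rng, n, noise):
    """Hypothetical demonstrations: y = 2x + 1 plus Gaussian label noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, noise)))
    return data


def sgd(weights, data, lr, epochs):
    """Plain SGD on squared error for the linear model y = w * x + b."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b


def mse(weights, data):
    """Mean squared error of the model on a dataset."""
    w, b = weights
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)
all_data = make_data(rng, n=200, noise=0.5)   # large, diverse, noisy (assumed)
curated = make_data(rng, n=20, noise=0.01)    # small, high quality (assumed)

# Stage 1: pre-train on all available data.
pretrained = sgd((0.0, 0.0), all_data, lr=0.05, epochs=5)

# Stage 2: fine-tune on the curated subset, starting from the pre-trained
# weights and using a smaller learning rate.
finetuned = sgd(pretrained, curated, lr=0.01, epochs=20)

print("untrained  MSE on curated:", mse((0.0, 0.0), curated))
print("pretrained MSE on curated:", mse(pretrained, curated))
print("finetuned  MSE on curated:", mse(finetuned, curated))
```

The point of the sketch is only the ordering: the second stage starts from the weights the first stage produced instead of from scratch, so the broad data sets the starting point and the curated data refines it.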