Computer vision

Physical AI Forum | Builders Reveal the New Moat & Playbook | Creator & Founder's Cut | Mar 2026 |4K

In a live panel at the Physical AI Builders Forum, founders and operators in computer vision, robotics, and multimodal AI share their 2026 playbooks. The discussion covers the architectural differences between physical and generative AI, the strategic shift from frame AI to scene AI for enterprise value, and the critical skills needed to build and scale a modern AI business.

Computer use in Codex

Ari Weinstein discusses how Codex's 'computer use' feature allows the AI agent to operate local Mac applications in the background by combining multimodal vision with accessibility data, enabling non-intrusive, parallel task execution.

How Transformers Finally Ate Vision – Isaac Robinson, Roboflow

Isaac Robinson from Roboflow explains why Vision Transformers (ViTs), despite initial disadvantages in computational complexity and a lack of inductive bias, ultimately surpassed Convolutional Neural Networks (CNNs) for computer vision tasks. The talk covers the critical role of massive, ViT-specific pre-training methods such as MAE and DINO, the architectural evolution through models such as Swin, ConvNeXt, and Hiera, and optimizations borrowed from the LLM ecosystem. It closes with a discussion of the practical challenges of deploying large foundation models such as SAM and how Neural Architecture Search can bridge that gap.

Waymo's Dmitri Dolgov: 20 Million Rides and the Road to Full Autonomy

Dmitri Dolgov, co-CEO of Waymo, discusses the 20-year journey from the DARPA challenge to full autonomy. He explains the Waymo Foundation Model—a multimodal world action model powering the driver, simulator, and critic—and how their "end-to-end plus" architecture enables superhuman safety and exponential scaling.

Robots Don't Need More Compute. They Need This.

Encord co-founders Eric and Ulrich discuss their $60M Series C, the company's origins before the AI hype, and their focus on building the essential data infrastructure for physical AI and robotics—the next frontier after LLMs.

Why Uber, Nissan, and Mercedes Chose This Self-Driving Startup | Alex Kendall, Wayve

Wayve CEO Alex Kendall discusses the company's contrarian, AI-first approach to autonomous driving. He traces its journey from a garage prototype using reinforcement learning to a generalizable AI driver that has driven zero-shot in over 500 cities. Kendall emphasizes a strategy of licensing this embodied AI for mass-market consumer vehicles, a 100-million-unit-per-year opportunity, rather than building bespoke robotaxis, arguing that the future is an AI that can drive any car, anywhere.