Scaling the Next Paradigm of Heterogeneous Intelligence — Adrian Bertagnoli, Callosum

Adrian Bertagnoli of Callosum argues that the era of scaling monolithic models on homogeneous GPU clusters is ending. He introduces "heterogeneous intelligence," a paradigm in which model architectures, chip types, and workflows are optimized together, with each subtask routed to the model and hardware that handle it most efficiently. He supports the claim with two results: a 7x cost reduction on recursive reasoning tasks using Cerebras, and state-of-the-art performance on the Video Web Arena benchmark, where the approach outperforms leading GPT and Gemini models at a fraction of the cost and time.
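
To make the routing idea concrete, here is a minimal sketch in Python. It is not Callosum's system: the backend names, cost figures, and skill sets are all hypothetical, and the selection rule (pick the cheapest backend that claims to handle a subtask) is just one simple policy assumed for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """A hypothetical model + chip pairing (all values are illustrative)."""
    name: str
    cost_per_call: float   # made-up relative cost units
    skills: frozenset      # subtask kinds this backend is assumed to handle


def route(subtask_kind, backends):
    """Return the cheapest backend that lists subtask_kind among its skills."""
    candidates = [b for b in backends if subtask_kind in b.skills]
    if not candidates:
        raise ValueError(f"no backend can handle {subtask_kind!r}")
    return min(candidates, key=lambda b: b.cost_per_call)


# Hypothetical fleet: one expensive generalist, one cheap specialist.
BACKENDS = [
    Backend("large-model-on-gpu", 10.0, frozenset({"plan", "reason", "extract"})),
    Backend("small-model-on-fast-chip", 1.0, frozenset({"reason", "extract"})),
]


def plan_workflow(subtasks, backends=BACKENDS):
    """Map each subtask kind in a workflow to the name of its chosen backend."""
    return [(kind, route(kind, backends).name) for kind in subtasks]
```

In this toy setup, `plan_workflow(["plan", "reason"])` sends the planning step to the generalist and the reasoning step to the cheaper specialist. A real router would also have to weigh quality, latency, and data movement between chips, none of which this sketch models.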