Model capabilities

Why Tejal Patwardhan stopped underestimating the models - Episode 21

Why Tejal Patwardhan stopped underestimating the models - Episode 21

Tejal Patwardhan, head of OpenAI's frontier evals team, discusses the critical evolution of AI evaluations. She explains why traditional benchmarks fail as models become more capable, how OpenAI develops realistic, long-horizon tests (including groundbreaking wet lab experiments), and the implications of rapidly advancing multimodal and reasoning models for scientific discovery and the future of human work.