Data quality

The Startup Powering The Data Behind AGI

Edwin Chen, founder and CEO of Surge AI, shares the company's origin story, its rapid, bootstrapped growth, and its research-driven philosophy on data. He critiques traditional data labeling, explains why metrics like inter-annotator agreement fail for complex tasks, and offers a sharp analysis of benchmark hacking. Chen also details the future of data, from multimodal and agentic reasoning in rich RL environments to the need for hyper-specialized expertise for scientific discovery.

Building Advanced Agents Over Complex Data // Jerry Liu

Jerry Liu of LlamaIndex explains why naive Retrieval-Augmented Generation (RAG) fails in production and covers the data quality techniques needed to build robust LLM applications: parsing complex documents, hierarchical indexing, and chunking best practices.

Open AI Researchers Breakdown GPT-5

OpenAI researchers discuss the step change in capabilities in GPT-5, from coding and reasoning to creative writing. They detail the data-centric training processes, the shift toward asynchronous agentic workflows, and the future of AI development and its impact on the startup ecosystem.