Unlocking Unstructured Data with LLMs

Shreya Shankar of UC Berkeley discusses DocETL, a MapReduce-style framework that leverages LLMs to extract, analyze, and structure insights from unstructured enterprise data. The conversation covers practical architecture patterns, the role of non-determinism, strategies for model selection (including fine-tuning and multi-LLM pipelines), and the importance of user experience in this emerging field.
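To make the MapReduce-style idea concrete, here is a minimal sketch in plain Python. It is not DocETL's API: `fake_llm_extract` is a hypothetical stand-in for an LLM call, and `map_reduce`, the `topic` key, and the sample documents are all assumptions made for illustration. The map step turns each unstructured document into a structured record, and the reduce step aggregates the records that share a key.

```python
from collections import defaultdict
from typing import Any, Callable


def fake_llm_extract(doc: str) -> dict[str, str]:
    """Hypothetical stand-in for an LLM call that tags a document with a topic.

    A real pipeline would prompt a model here; a keyword check keeps the
    sketch deterministic and runnable offline.
    """
    topic = "billing" if "invoice" in doc.lower() else "other"
    return {"topic": topic, "text": doc}


def map_reduce(
    docs: list[str],
    map_fn: Callable[[str], dict[str, str]],
    key: str,
    reduce_fn: Callable[[list[dict[str, str]]], Any],
) -> dict[str, Any]:
    """Map each document to a record, group records by `key`, reduce each group."""
    groups: dict[str, list[dict[str, str]]] = defaultdict(list)
    for doc in docs:
        record = map_fn(doc)          # map: unstructured text -> structured record
        groups[record[key]].append(record)
    # reduce: collapse every group of records into one value
    return {group_key: reduce_fn(records) for group_key, records in groups.items()}


docs = [
    "Invoice 17 is overdue.",
    "Team offsite notes.",
    "Second invoice sent.",
]
summary = map_reduce(docs, fake_llm_extract, key="topic", reduce_fn=len)
print(summary)
```

Swapping `fake_llm_extract` for a real model call is where the non-determinism mentioned above enters: the same document may map to different records across runs, so the reduce step sees different groups.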