
Introducing SCORE-Bench: An Open Benchmark for Document Parsing

Blog post from Unstructured

Post Details
Company
Date Published
Author
Unstructured
Word Count: 1,154
Language: English
Hacker News Points: -
Summary

In document parsing, a lack of transparency hinders fair comparison of system accuracy, and traditional evaluation metrics are poorly suited to modern vision-language models, which can produce many different but equally valid outputs for the same document. To address this, SCORE-Bench, a newly open-sourced benchmark dataset, offers a diverse collection of real-world documents with expert annotations, supporting fair comparison and independent validation of document parsing systems. The dataset covers complex and varied formats, such as handwritten forms and technical manuals, along with real-world difficulties such as poor scan quality and mixed languages, with the aim of separating robust, production-ready systems from research prototypes. The accompanying Structural and Content Robust Evaluation (SCORE) framework is designed to mitigate the biases of traditional metrics by evaluating systems on content fidelity, hallucination control, and table extraction. The benchmark is particularly challenging for systems because of skewed text, dense layouts, and semantic ambiguity. According to the post, the Unstructured pipelines lead on several metrics, including Adjusted Clean Concatenated Text (CCT) for content fidelity, while maintaining low hallucination rates, which the post presents as evidence that they are the most production-ready solutions. The dataset and evaluation code are available on Hugging Face and GitHub, and the community is invited to test and benchmark their own systems with this methodology.
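
The summary names content fidelity and hallucination control as evaluation dimensions but does not give their definitions. As a rough illustration only, the sketch below shows one way such ideas could be approximated with order-insensitive token counts, so that a parser is not penalized for emitting the same content in a different order. The function names (`tokenize`, `token_scores`), the metric names (`recall`, `extra_token_rate`), and the sample strings are all hypothetical; this is not the SCORE framework's actual computation and not the Adjusted CCT metric.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical helper)."""
    return re.findall(r"\w+", text.lower())


def token_scores(reference, predicted):
    """Compare two texts as bags of tokens, ignoring order and whitespace.

    Illustrative only: these are NOT the SCORE metric definitions.
    - recall: share of reference tokens that appear in the prediction
    - extra_token_rate: share of predicted tokens absent from the reference,
      a crude stand-in for a hallucination rate
    """
    ref = Counter(tokenize(reference))
    pred = Counter(tokenize(predicted))
    overlap = sum((ref & pred).values())  # multiset intersection
    ref_total = sum(ref.values())
    pred_total = sum(pred.values())
    recall = overlap / ref_total if ref_total else 1.0
    extra = (pred_total - overlap) / pred_total if pred_total else 0.0
    return {"recall": recall, "extra_token_rate": extra}


# Made-up sample texts, not taken from SCORE-Bench.
reference = "Invoice 42 Total due: 100 USD"
same_content_reordered = "Total due: 100 USD\nInvoice 42"
with_invented_text = "Invoice 42 Total due: 100 USD Paid in full"

print(token_scores(reference, same_content_reordered))
print(token_scores(reference, with_invented_text))
```

In this toy setup the reordered output scores the same as an exact match, while the output with three invented words keeps full recall but shows a non-zero extra-token rate. A strict character-level edit distance would instead penalize the reordered output, which is the kind of bias the summary attributes to traditional metrics.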