Introducing SCORE-Bench: An Open Benchmark for Document Parsing
Blog post from Unstructured
In the document parsing field, a lack of transparency makes it hard to compare system accuracy fairly, and traditional evaluation metrics are ill-suited to modern vision-language models, which can produce many different yet equally valid outputs for the same document.

To address this, SCORE-Bench, a newly open-sourced benchmark dataset, offers a diverse collection of real-world documents with expert annotations, enabling fair comparisons and independent validation of document parsing systems. SCORE-Bench includes complex and varied formats, such as handwritten forms and technical manuals, and covers real-world challenges like poor scan quality and mixed languages, helping to distinguish robust, production-ready systems from research prototypes.

The accompanying Structural and Content Robust Evaluation (SCORE) framework mitigates the biases of traditional metrics by evaluating systems on content fidelity, hallucination control, and table extraction. Documents with skewed text, dense layouts, and semantic ambiguity prove particularly challenging for parsing systems under this evaluation.

The Unstructured pipelines demonstrate leading performance across several metrics, including Adjusted Clean Concatenated Text (CCT) for content fidelity, while maintaining low hallucination rates, establishing themselves as the most production-ready solutions. The dataset and evaluation code are available on Hugging Face and GitHub, inviting the community to test and benchmark their systems using this comprehensive methodology.
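The post does not spell out the Adjusted CCT formula, but the idea of a content-fidelity score over cleaned, concatenated text can be sketched with a normalized edit distance. This is a minimal illustration, not SCORE's actual implementation; the function names and the whitespace-normalization choice are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance (insert/delete/substitute),
    # using a rolling row to keep memory at O(len(b)).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def content_fidelity(parsed: str, reference: str) -> float:
    # Hypothetical CCT-style score: 1.0 = perfect content match, 0.0 = none.
    # Whitespace is collapsed before comparison so purely layout-level
    # differences are not penalized, echoing the "clean concatenated text"
    # idea of scoring content rather than formatting.
    p = " ".join(parsed.split())
    r = " ".join(reference.split())
    if not p and not r:
        return 1.0
    return 1.0 - levenshtein(p, r) / max(len(p), len(r))
```

Under this sketch, a parser that reflows line breaks but preserves every character scores 1.0, while hallucinated or dropped text lowers the score in proportion to how much of the reference it corrupts.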