
Introducing RolmOCR: A Faster, Lighter Open Source Document Model Built on olmOCR

Blog post from Reducto

Post Details
Company:
Date Published:
Author: -
Word Count: 629
Language: English
Hacker News Points: -
Summary

Earlier this year, the Allen Institute for AI released olmOCR, an open-source OCR model for parsing complex documents. Reducto's RolmOCR builds on that work as a faster, more memory-efficient alternative that remains robust across a range of document types. RolmOCR uses the newer Qwen2.5-VL-7B as its base model and omits the metadata that olmOCR relies on. This shortens the prompt and reduces resource consumption without significantly affecting accuracy in most cases, although the model may do worse where metadata supplies essential context. It is trained on the same dataset as olmOCR, with rotated data added to improve robustness. In the examples described, RolmOCR matches or improves on olmOCR, for instance with better character recognition in handwritten notes and more accurate information extraction from low-contrast images, though without metadata it sometimes misses structured elements such as subtitles. RolmOCR is released under the Apache 2.0 license and is open for exploration and further development, including enhancements tailored to specific needs. The developers welcome feedback and comparisons with other models.
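The rotated-data idea can be illustrated with a minimal sketch. This is not Reducto's training code: the function names `rotate90` and `augment_with_rotation`, the `fraction` parameter, and the restriction to 90-degree turns are all assumptions made for illustration, and a page is stood in for by a small grid of pixel values. The point is only that a share of the training pages is rotated while the target transcription stays the same, so the model sees text at more than one orientation.

```python
import random


def rotate90(grid, turns):
    """Rotate a 2D grid (list of rows) clockwise by 90 degrees, `turns` times."""
    for _ in range(turns % 4):
        # Reversing the row order and then transposing gives one clockwise quarter turn.
        grid = [list(row) for row in zip(*grid[::-1])]
    return grid


def augment_with_rotation(pages, fraction, seed=0):
    """Return copies of `pages`, rotating roughly `fraction` of them.

    `fraction` is a hypothetical knob; the source does not state what share
    of the data was rotated or which angles were used.
    """
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    augmented = []
    for page in pages:
        if rng.random() < fraction:
            # Assumption: pick a quarter, half, or three-quarter turn at random.
            augmented.append(rotate90(page, rng.choice([1, 2, 3])))
        else:
            augmented.append([list(row) for row in page])
    return augmented


if __name__ == "__main__":
    page = [[1, 2], [3, 4]]  # toy stand-in for a page image
    print(rotate90(page, 1))
    print(augment_with_rotation([page, page, page], fraction=0.5))
```

In a real pipeline the grid would be an image and the rotation would be done by an imaging library, but the labeling logic would be the same: only the input changes, not the expected text.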