LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family
Blog post from Hugging Face
LightOnOCR-2-1B is a second-generation, lightweight optical character recognition (OCR) model developed by LightOn. It converts document pages into clean text end to end, without relying on a multi-stage pipeline. Released under the Apache 2.0 license, it offers improved transcription and can output bounding boxes for figures and images, which makes it usable for layout-aware workflows as well as plain transcription. The model is reported to significantly outperform its predecessor and competing models in both accuracy and speed: at roughly 1B parameters it is far smaller than Chandra-9B, and it runs faster than models such as Chandra-9B and PaddleOCR-VL-0.9B. LightOnOCR-2-1B is supported by two open annotation datasets totaling over 16 million annotated pages, with a focus on European languages and robustness to image degradation, and it is integrated into the Hugging Face Transformers ecosystem for ease of use. The release includes several checkpoints for fine-tuning and layout-oriented applications, so users can pick a model according to whether they prioritize transcription quality or image localization.
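To make the bounding-box output more concrete, here is a minimal Python sketch of how a downstream workflow might map a predicted figure box onto the page image. The summary above does not specify the coordinate convention, so this assumes boxes arrive as `(x0, y0, x1, y1)` normalized to a 0 to 1000 range; that range, the `PixelBox` type, and the `to_pixel_box` helper are all hypothetical and should be checked against the model card before use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelBox:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    left: int
    top: int
    right: int
    bottom: int


def to_pixel_box(norm_box, page_width, page_height, scale=1000):
    """Map a normalized (x0, y0, x1, y1) box to pixel coordinates.

    ASSUMPTION: coordinates lie in [0, scale]. The real output format of
    the model may differ; adjust `scale` or the parsing accordingly.
    """
    x0, y0, x1, y1 = norm_box
    # Tolerate boxes whose corners arrive in the wrong order.
    x0, x1 = sorted((x0, x1))
    y0, y1 = sorted((y0, y1))

    def clamp(value, upper):
        # Keep the coordinate inside the page bounds.
        return max(0, min(upper, value))

    return PixelBox(
        left=clamp(round(x0 / scale * page_width), page_width),
        top=clamp(round(y0 / scale * page_height), page_height),
        right=clamp(round(x1 / scale * page_width), page_width),
        bottom=clamp(round(y1 / scale * page_height), page_height),
    )
```

The resulting `(left, top, right, bottom)` ordering matches what common imaging libraries such as Pillow expect for cropping, so a figure region could then be cut out of the page image for separate handling.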