
Speeding Up Text Generation with Non-Autoregressive Language Models

Blog post from Unstructured

Post Details
Company: Unstructured
Word Count: 852
Language: English
Summary

Unstructured is optimizing Vision Transformers (ViTs) to convert PDFs and images into structured formats such as JSON more quickly for industrial applications. The team is tackling the high computational cost of autoregressive language models, which generate text one token at a time and incur cost that grows quadratically with sequence length. They are exploring non-autoregressive methods, which generate tokens without conditioning on previously generated ones and therefore cut generation time. Two notable methods, ELMER and CALM, aim to accelerate text generation through techniques such as early exit and bi-directional generation, with little loss in accuracy. These approaches are being evaluated to improve document-understanding models for real-world use, with ongoing research shared on LinkedIn, Hugging Face, and GitHub.
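The early-exit idea mentioned above can be sketched in a few lines: instead of always running every decoder layer, the model checks an intermediate prediction after each layer and stops as soon as it is confident enough. The sketch below is a toy illustration with random weights, not Unstructured's or CALM's actual implementation; the layer count, threshold, and weight shapes are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, HIDDEN, LAYERS = 50, 16, 6
# Hypothetical toy weights standing in for a trained decoder's layers.
layer_weights = [rng.normal(scale=0.3, size=(HIDDEN, HIDDEN)) for _ in range(LAYERS)]
lm_head = rng.normal(scale=0.3, size=(HIDDEN, VOCAB))  # shared output head

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def predict_with_early_exit(hidden, threshold=0.9):
    """Run decoder layers one at a time; return (token, depth_used).

    Exits as soon as the intermediate prediction's top probability
    crosses the confidence threshold -- a simplified CALM-style rule.
    """
    for depth, w in enumerate(layer_weights, start=1):
        hidden = np.tanh(hidden @ w)        # one (toy) decoder layer
        probs = softmax(hidden @ lm_head)   # probe the shared LM head
        if probs.max() >= threshold:        # confident enough: stop here
            return int(probs.argmax()), depth
    return int(probs.argmax()), LAYERS      # fall through: used all layers

token, depth_used = predict_with_early_exit(rng.normal(size=HIDDEN), threshold=0.2)
print(f"predicted token {token} after {depth_used}/{LAYERS} layers")
```

With a low threshold the loop exits after few layers; with a high threshold it behaves like the full model. In practice the saved per-token compute is what lets confidence-based early exit trade a tunable amount of accuracy for generation speed.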