Introducing /parse: Turn any document into LLM-ready data

Blog post from Firecrawl

Post Details
Company
Date Published
Author: Eric Ciarla
Word Count: 635
Language: English
Hacker News Points: -
Summary

Firecrawl has introduced /parse, a feature that lets users upload local files and receive the same kind of clean, structured output that Firecrawl produces for web pages. It supports PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, and HTML files, with a size limit of 50 MB per file. /parse runs on a Rust-based engine that classifies pages and uses GPU resources only when necessary, which keeps processing fast while preserving the layout, tables, and reading order of each document. Output can be requested as markdown or structured JSON, with optional summaries and structured extraction driven by a user-provided JSON schema. Web pages and local files can therefore be processed through the same service, and features such as Zero Data Retention address data-security requirements for enterprise users.
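The limits described above can be checked on the client side before any upload is attempted. The following is a minimal sketch and not Firecrawl's SDK or API: the function names `check_parse_input` and `build_parse_options`, the option keys `formats` and `schema`, and the reading of "50 MB" as 50 * 1024 * 1024 bytes are all assumptions made for illustration. No network request is made.

```python
from pathlib import Path

# File formats listed in the summary above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".doc", ".odt", ".rtf", ".xlsx", ".xls", ".html",
}

# Assumption: "50 MB" is read as 50 * 1024 * 1024 bytes.
# The service may count megabytes differently.
MAX_FILE_BYTES = 50 * 1024 * 1024


def check_parse_input(path: str) -> Path:
    """Return the path if it looks acceptable for upload, else raise ValueError."""
    file_path = Path(path)
    if not file_path.is_file():
        raise ValueError(f"not a file: {file_path}")
    suffix = file_path.suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported format: {suffix or '(none)'}")
    size = file_path.stat().st_size
    if size > MAX_FILE_BYTES:
        raise ValueError(f"file is {size} bytes, over the {MAX_FILE_BYTES}-byte limit")
    return file_path


def build_parse_options(formats=("markdown",), schema=None) -> dict:
    """Assemble a hypothetical options payload; the key names are illustrative only."""
    options = {"formats": list(formats)}
    if schema is not None:
        # Hypothetical key for schema-driven structured extraction.
        options["schema"] = schema
    return options
```

A caller would run `check_parse_input("report.pdf")` first and pass the result, together with `build_parse_options(("markdown", "json"), schema=my_schema)`, to whatever upload mechanism the official documentation specifies. Validating locally avoids spending a request on a file that would be rejected for its size or format.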