Building Harvey-style tabular review from scratch, but better

Blog post from HuggingFace

Post Details
Company: HuggingFace
Author: Abdur-Rahman Butler
Word Count: 4,508
Summary

The guide walks through building a state-of-the-art tabular review application from scratch, with the stated aim of surpassing existing tools from Harvey and Legora on functionality, cost efficiency, accuracy, and performance. It relies on specialized legal enrichment, embedding, and extraction models from Isaacus rather than generative models, which are prone to hallucination. The guide shows how to transform unstructured documents into structured entities using models such as Kanon 2 Enricher and Embedder, and how to create and extend knowledge graphs for legal document review. It covers setting up a FastAPI server and a vector database, with Qdrant handling vector search, to support efficient span-level classification and relationship extraction. The resulting application lets users navigate documents as knowledge graphs, interactively linking entities and sections, and it is open source, so it can be adapted and commercialized.
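
To make the "documents as knowledge graphs" idea concrete, here is a minimal sketch in plain Python. It is not the post's implementation and does not call the Isaacus, FastAPI, or Qdrant APIs. Every name in it (`Entity`, `DocumentGraph`, `review_row`, the entity types, and the sample values) is a hypothetical chosen for illustration. It assumes extraction has already produced typed spans tied to sections, and shows how those spans could be linked and then flattened into one row of a review table.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A typed span extracted from a document (hypothetical schema)."""
    id: str
    type: str        # e.g. "party" or "governing_law" (illustrative labels)
    text: str
    section_id: str  # the section the span was found in


@dataclass
class DocumentGraph:
    """Entities plus directed, labelled links between them."""
    entities: dict[str, Entity] = field(default_factory=dict)
    # Each edge is (source_id, relation, target_id).
    edges: list[tuple[str, str, str]] = field(default_factory=list)

    def add(self, entity: Entity) -> None:
        self.entities[entity.id] = entity

    def link(self, source_id: str, relation: str, target_id: str) -> None:
        # Refuse edges to unknown nodes so the graph stays navigable.
        if source_id not in self.entities or target_id not in self.entities:
            raise KeyError("both endpoints must be added before linking")
        self.edges.append((source_id, relation, target_id))

    def neighbours(self, entity_id: str) -> list[tuple[str, Entity]]:
        """Follow outgoing edges from one entity."""
        return [
            (rel, self.entities[dst])
            for src, rel, dst in self.edges
            if src == entity_id
        ]

    def column(self, entity_type: str) -> list[str]:
        """All values of one type, i.e. one cell of a review-table row."""
        return [e.text for e in self.entities.values() if e.type == entity_type]


def review_row(graph: DocumentGraph, columns: list[str]) -> dict[str, list[str]]:
    """Build one table row (one document) from the requested column types."""
    return {col: graph.column(col) for col in columns}


# Usage with made-up sample data.
graph = DocumentGraph()
graph.add(Entity("e1", "party", "Acme Pty Ltd", "s1"))
graph.add(Entity("e2", "party", "Beta LLC", "s1"))
graph.add(Entity("e3", "governing_law", "New South Wales", "s9"))
graph.link("e1", "contracts_with", "e2")

row = review_row(graph, ["party", "governing_law"])
linked = graph.neighbours("e1")
```

Because each entity keeps its `section_id`, a table cell can link back to the place in the document it came from, which is the navigation behaviour the summary describes. In a fuller system the in-memory dictionaries would presumably be replaced by a database, and the entities would come from an extraction model rather than being added by hand.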