
HHEM v2: A New and Improved Factual Consistency Scoring Model

Blog post from Vectara

Post Details
Company: Vectara
Date Published:
Author: Forrest Bao, Miaoran Li and Rogger Luo
Word Count: 1,765
Language: English
Hacker News Points: -
Summary

Hallucinations in generative AI, particularly in Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems, are a significant challenge: models produce outputs that are not grounded in the input data, undermining the reliability of these technologies. Traditional detection approaches, such as using LLMs as judges, are often costly, slow, and inaccurate. Vectara's open-source Hughes Hallucination Evaluation Model (HHEM), which has been widely downloaded, addresses this by providing a factual consistency score that is efficient to compute and multilingual, supporting English, German, and French. HHEM v2 improves on the original by producing calibrated scores with a probabilistic interpretation, detecting factual inconsistencies more accurately while keeping latency low, making it far more efficient than much larger models such as GPT-3.5. Evaluated against established benchmarks like AggreFact and RAGTruth, which highlight how difficult accurate hallucination detection is, HHEM v2 delivers superior performance in assessing factual consistency, strengthening trust in generative AI outputs and giving enterprises a practical tool for reliable AI.
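
To make the scoring idea concrete, here is a minimal sketch of how (source, generated text) pairs could be scored with the openly released HHEM checkpoint on Hugging Face (vectara/hallucination_evaluation_model). The custom predict(pairs) interface loaded via trust_remote_code, the example texts, and the 0.5 cut-off are assumptions drawn from the open model card for illustration; they are not a definitive description of Vectara's hosted HHEM v2.

```python
# Sketch: score (source, generated claim) pairs for factual consistency.
# Assumes the open HHEM checkpoint and its custom `predict` method
# (loaded with trust_remote_code=True); the hosted HHEM v2 API may differ.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "vectara/hallucination_evaluation_model", trust_remote_code=True
)

# Each pair is (evidence/source text, model-generated claim to check).
pairs = [
    ("The capital of France is Paris.", "Paris is the capital of France."),
    ("The capital of France is Paris.", "Berlin is the capital of France."),
]

# Calibrated scores read roughly as probabilities: values near 1.0 suggest
# the claim is consistent with the source, values near 0.0 suggest a
# hallucination. The 0.5 threshold below is illustrative only; the right
# cut-off is application-specific.
scores = model.predict(pairs)
for (source, claim), score in zip(pairs, scores):
    score = float(score)
    label = "consistent" if score >= 0.5 else "possible hallucination"
    print(f"{score:.3f}  {label}: {claim}")
```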