Content Deep Dive

DeepSeek-R1 hallucinates more than DeepSeek-V3

Blog post from Vectara

Post Details
Company
Date Published
Author
Forrest Bao, Chenyu Xu and Ofer Mendelevitch
Word Count
960
Language
English
Hacker News Points
-
Summary

DeepSeek's newly released reasoning model, DeepSeek-R1, has attracted widespread attention for its impressive reasoning capabilities and cost-effectiveness relative to OpenAI's o1 model, despite debate around its reported $5.5 million development cost. Open-sourced under an MIT license, DeepSeek-R1 nevertheless exhibits a significantly higher hallucination rate, 14.3%, than its predecessor, DeepSeek-V3, as demonstrated through evaluations using Vectara's HHEM and Google's FACTS methodologies. The analysis shows that while DeepSeek-R1 is consistent on most samples, it produces more borderline hallucinations, leading to higher variability in its scores. Comparisons with the GPT series suggest that reasoning-enhanced models may trade off some faithfulness for reasoning ability, although the GPT series appears to balance the two better than the DeepSeek models do. The findings underscore the importance of careful training to mitigate hallucination risk and the ongoing need for advances in reasoning-model development.
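To make the headline figure concrete, a hallucination rate like 14.3% comes from scoring each generated summary for factual consistency against its source document and counting the fraction of summaries that fall below a threshold. The sketch below is a minimal illustration of that aggregation step; the per-summary scores and the 0.5 cutoff are assumptions for demonstration, not Vectara's actual HHEM data or threshold.

```python
# Sketch: turning per-summary factual-consistency scores into a
# hallucination rate, in the style of an HHEM-type evaluation.
# Scores and the 0.5 threshold are illustrative assumptions only.

def hallucination_rate(scores, threshold=0.5):
    """Fraction of summaries whose consistency score is below threshold.

    A score near 1.0 means the summary is judged faithful to its source;
    a low score flags a likely hallucination.
    """
    flagged = sum(1 for s in scores if s < threshold)
    return flagged / len(scores)

# Hypothetical per-summary scores from a judge model.
sample_scores = [0.92, 0.31, 0.88, 0.97, 0.45, 0.79, 0.66]
print(f"{hallucination_rate(sample_scores):.1%}")  # 2 of 7 flagged
```

Under this framing, the observation about "borderline hallucinations" corresponds to many scores clustering near the threshold, which is exactly where small scoring differences produce high run-to-run variability in the final rate.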