Reducing Hallucinations in LLMs
Blog post from Vectara
Large Language Models (LLMs) often hallucinate: they generate incorrect information because the knowledge embedded in their weights is outdated or inaccurate. Retrieval-Augmented Generation (RAG) addresses this by pulling relevant context for each query from an external knowledge base, although it does not completely eliminate hallucinations. The Hughes Hallucination Evaluation Model (HHEM) Leaderboard measures factual consistency in LLMs and shows that even top models such as GPT-4-Turbo still hallucinate at a rate of 2.5%.

Researchers have explored several techniques to mitigate hallucinations: decoding strategies such as beam search and DoLa, factuality alignment with Direct Preference Optimization (DPO), and post-editing methods such as FAVA, which corrects factual errors in an initial response. While these methods improve over the greedy-decoding baseline, each has its own limitations and dataset dependencies, and some are unsuitable for streaming applications. The study suggests that combining these approaches could reduce LLM hallucination rates further, and it encourages continued research in this area.
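To make the RAG step concrete, here is a minimal sketch of retrieval-grounded prompting. It is not Vectara's implementation: the in-memory knowledge base, the toy keyword retriever, and the prompt wording are all illustrative assumptions; a real system would use a vector store or a managed retrieval platform.

```python
# Minimal RAG sketch: retrieve supporting passages, then ground the answer in them.
# The knowledge base and retriever below are toy placeholders (an assumption for
# illustration), not the retrieval stack described in the post.

from typing import List

KNOWLEDGE_BASE = [
    "HHEM scores the factual consistency of a generated summary against its source.",
    "Retrieval-Augmented Generation supplies external documents as context for the LLM.",
]

def retrieve(query: str, k: int = 2) -> List[str]:
    """Toy retriever: rank passages by naive keyword overlap with the query."""
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(set(query.lower().split()) & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_rag_prompt(query: str) -> str:
    """Insert retrieved passages into the prompt so the LLM answers from them."""
    context = "\n".join(f"- {passage}" for passage in retrieve(query))
    return (
        "Answer the question using ONLY the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

print(build_rag_prompt("What does HHEM measure?"))
```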
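The decoding strategies are easiest to see in code. The sketch below contrasts the greedy baseline with beam search using the Hugging Face transformers `generate` API; the model name is a placeholder, and the commented-out DoLa call assumes a recent transformers release that exposes the `dola_layers` argument.

```python
# Sketch comparing decoding strategies with Hugging Face transformers.
# The model choice is an illustrative assumption; any causal LM works the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("The capital of Australia is", return_tensors="pt")
pad_id = tokenizer.eos_token_id  # gpt2 has no pad token, so reuse EOS

# Greedy decoding: the baseline the post compares against.
greedy = model.generate(**inputs, max_new_tokens=20, do_sample=False, pad_token_id=pad_id)

# Beam search: keeps several candidate continuations and returns the highest-scoring one.
beam = model.generate(**inputs, max_new_tokens=20, num_beams=5, do_sample=False, pad_token_id=pad_id)

# DoLa (Decoding by Contrasting Layers): recent transformers versions expose it via
# `dola_layers`; whether your installed version supports it is an assumption here.
# dola = model.generate(**inputs, max_new_tokens=20, dola_layers="high", do_sample=False, pad_token_id=pad_id)

for name, output in [("greedy", greedy), ("beam", beam)]:
    print(name, tokenizer.decode(output[0], skip_special_tokens=True))
```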
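Post-editing methods such as FAVA follow a draft-then-revise pattern: generate an initial answer, then let an editor model correct spans that conflict with retrieved evidence. The snippet below shows only that pattern in schematic form; both functions are hypothetical stand-ins, not FAVA's actual interface, and the example numbers come from the leaderboard figures mentioned above.

```python
# Schematic of the draft-then-edit pattern behind post-editing methods like FAVA.
# Both functions are toy stand-ins (an assumption for illustration): a real setup
# would call a base LLM for the draft and an editor model for the revision.

def generate_draft(question: str) -> str:
    """Stand-in for the base LLM's first answer (may contain a factual error)."""
    return "GPT-4-Turbo shows a 5% hallucination rate on the HHEM Leaderboard."

def edit_with_evidence(draft: str, evidence: str) -> str:
    """Stand-in for the editor model: fix spans that contradict the evidence."""
    if "5%" in draft and "2.5%" in evidence:
        return draft.replace("5%", "2.5%")
    return draft

evidence = "The HHEM Leaderboard reports a 2.5% hallucination rate for GPT-4-Turbo."
draft = generate_draft("How often does GPT-4-Turbo hallucinate?")
print("draft: ", draft)
print("edited:", edit_with_evidence(draft, evidence))
```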