Company:
Date Published:
Author: Hui Wen Goh, Jay Zhang, Ulyana Tkachenko, Jonas Mueller
Word count: 1890
Language: English
Hacker News points: None

Summary

The Trustworthy Language Model (TLM) improves the accuracy of responses from base large language models (LLMs) such as GPT-4, GPT-3.5, and Claude 3 by scoring the trustworthiness of candidate responses, reducing errors without altering the prompts or relying on any additional models. Across benchmark datasets including TriviaQA, ARC, SVAMP, and GSM8k, TLM significantly lowers error rates relative to the base models. It works by sampling multiple candidate responses, scoring each one's trustworthiness, and returning the most reliable candidate. Because this extra sampling and scoring adds runtime overhead, TLM is better suited to data-processing workloads than to latency-sensitive applications.
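
The selection procedure described above amounts to a best-of-k loop: generate several candidates, score each, keep the highest-scoring one. Below is a minimal sketch of that loop, assuming a hypothetical `generate_response` function that wraps the base LLM and a hypothetical `score_trustworthiness` function standing in for TLM's trustworthiness scoring; neither name is the actual Cleanlab API, and the real system may differ in detail.

```python
from typing import Callable, Tuple


def select_most_trustworthy(
    prompt: str,
    generate_response: Callable[[str], str],             # hypothetical: samples one response from the base LLM
    score_trustworthiness: Callable[[str, str], float],  # hypothetical: returns a trustworthiness score in [0, 1]
    num_candidates: int = 4,
) -> Tuple[str, float]:
    """Sample several candidate responses, score each, and return the most trustworthy one."""
    best_response, best_score = "", float("-inf")
    for _ in range(num_candidates):
        candidate = generate_response(prompt)             # sample one candidate response
        score = score_trustworthiness(prompt, candidate)  # score it without modifying the prompt
        if score > best_score:
            best_response, best_score = candidate, score
    return best_response, best_score
```

The returned score can also serve as a flag: responses whose best trustworthiness score falls below a chosen threshold can be routed to human review rather than used directly.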