Company
Date Published
Author
-
Word count
1350
Language
English
Hacker News points
None

Summary

The auto-evaluator tool aims to improve question-answering (QA) systems by grading answer quality and guiding better choices of QA chains, components, and models. It combines elements of recent work on model-written evaluation sets and model-graded evaluation, and lets users configure QA pipelines from modular components for testing. The project is now releasing an open-source, free-to-use hosted app and API, which broadens usability and opens opportunities for improvement in file handling, retrieval approaches, and prompt refinement. The grader currently uses GPT-3.5-turbo, but switching to GPT-4, which OpenAI discussions suggest is better suited to grading, could improve results. Contributions related to file handling, prompts, models, or retrievers are considered high-impact areas for further improvement.
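
The post itself does not include code, but the model-graded evaluation step it describes maps onto LangChain's QAEvalChain. The sketch below is only an illustration of how a grader configured with GPT-3.5-turbo could be swapped for GPT-4; the example questions, answers, and predictions are made up, and an OPENAI_API_KEY is assumed to be set.

```python
# Minimal sketch of model-graded QA evaluation, assuming LangChain's
# QAEvalChain and ChatOpenAI (2023-era API). Example data is hypothetical.
from langchain.chat_models import ChatOpenAI
from langchain.evaluation.qa import QAEvalChain

# Ground-truth eval set (could be model-written, as in the auto-evaluator).
examples = [
    {"question": "What is the auto-evaluator used for?",
     "answer": "Grading the quality of answers produced by a QA chain."},
]

# Predictions produced by the QA chain under test (hard-coded here for brevity).
predictions = [
    {"result": "It evaluates QA chain answers against a reference answer."},
]

# GPT-3.5-turbo is the current grader; replacing it with "gpt-4" is the
# upgrade path the post suggests.
grader_llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
eval_chain = QAEvalChain.from_llm(grader_llm)

graded = eval_chain.evaluate(
    examples,
    predictions,
    question_key="question",
    answer_key="answer",
    prediction_key="result",
)
for grade in graded:
    print(grade)  # each entry holds the grader's verdict for one QA pair
```

Because the grader is just another LLM passed into the evaluation chain, switching models is a one-line change, which is what makes the GPT-3.5-turbo to GPT-4 comparison discussed in the post straightforward to test.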