Why Does a Trained Question Answering Model Need Data?

Company

deepset

Date Published

May 18, 2022

Author

Tuana Çelik

Word count

1659

Language

English

Hacker News points

None

URL

www.deepset.ai/blog/why-trained-qa-model-needs-data

Summary

A question answering model does not contain any answers itself; it needs data from which to extract them. The data a trained model is given to search is referred to as context, and the model extracts the relevant information from it. Pre-trained general-purpose models are sufficient for many use cases, but fine-tuning with domain-specific data may be necessary in some scenarios. Providing data to a question answering model involves storing it in a database or other data storage solution that can efficiently retrieve small pieces of text and pass them to the model one after the other, while also handling formatting and efficiency requirements. Solutions such as Haystack provide functionality for this journey from data to answers, allowing developers to define pipelines that combine multiple components, such as databases, retrievers, and readers, to supply data to a question answering model efficiently.
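
The flow from stored data to an answer can be illustrated with a minimal, self-contained sketch. It does not use Haystack's API: `InMemoryStore`, `KeywordRetriever`, `SentenceReader`, and `answer` are hypothetical names invented for this illustration, and the "reader" is a simple word-overlap heuristic standing in for a trained model. The point is only to show the shape of a store, retriever, and reader chained together.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Document:
    text: str


def tokenize(text: str) -> set[str]:
    # Lowercased alphanumeric tokens; a deliberately crude tokenizer.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class InMemoryStore:
    """Hypothetical stand-in for a database holding small pieces of text."""

    def __init__(self) -> None:
        self.documents: list[Document] = []

    def write(self, texts: list[str]) -> None:
        self.documents.extend(Document(t) for t in texts)


class KeywordRetriever:
    """Selects the few documents that share the most words with the query."""

    def __init__(self, store: InMemoryStore) -> None:
        self.store = store

    def retrieve(self, query: str, top_k: int = 2) -> list[Document]:
        q = tokenize(query)
        scored = [(len(q & tokenize(d.text)), d) for d in self.store.documents]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:top_k]]


class SentenceReader:
    """Assumption: a toy heuristic in place of a trained reader model."""

    def read(self, query: str, context: Document) -> tuple[int, str]:
        q = tokenize(query)
        sentences = re.split(r"(?<=[.!?])\s+", context.text)
        best = max(sentences, key=lambda s: len(q & tokenize(s)))
        return len(q & tokenize(best)), best


def answer(query: str, retriever: KeywordRetriever, reader: SentenceReader) -> str | None:
    # The reader sees one retrieved context after the other, never the whole store.
    candidates = [reader.read(query, doc) for doc in retriever.retrieve(query)]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[0])[1]
```

In this sketch the retriever narrows the store down to a handful of contexts so that the (normally expensive) reader only processes those; with no matching context, `answer` returns `None`, which mirrors the idea that the model itself holds no answers.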