
Precision and Recall in Information Retrieval

Last Updated : 26 Aug, 2022

An information retrieval system can be evaluated with two metrics: precision and recall. When a user searches for information on a topic, the items in the database can be divided into four categories with respect to the query:

  1. Relevant and Retrieved
  2. Relevant and Not Retrieved
  3. Non-Relevant and Retrieved
  4. Non-Relevant and Not Retrieved
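
The four categories can be sketched with ordinary set operations. This is only an illustrative sketch: the function name `partition`, the document IDs, and the relevance judgments are hypothetical, and it assumes the relevant and retrieved sets are both subsets of the collection.

```python
def partition(collection, relevant, retrieved):
    """Split a collection of document IDs into the four categories.

    Assumes `relevant` and `retrieved` are subsets of `collection`.
    """
    collection = set(collection)
    relevant = set(relevant)
    retrieved = set(retrieved)
    return {
        "relevant_retrieved": relevant & retrieved,
        "relevant_not_retrieved": relevant - retrieved,
        "nonrelevant_retrieved": retrieved - relevant,
        "nonrelevant_not_retrieved": collection - relevant - retrieved,
    }


# Hypothetical database of 10 documents, identified by the IDs 1..10.
collection = range(1, 11)
relevant = {1, 2, 3, 4}        # documents that answer the user's question
retrieved = {3, 4, 5, 6, 7}    # documents returned by the query

groups = partition(collection, relevant, retrieved)
for name, docs in groups.items():
    print(name, sorted(docs))
```

Every document falls into exactly one of the four groups, so the group sizes add up to the size of the collection.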

Relevant items are the documents that help the user answer the question behind the search. Non-relevant items are those that provide no useful information. Each item, relevant or not, is either retrieved or not retrieved by the user's query.

Precision is defined as the ratio of the number of relevant and retrieved documents (the retrieved items that are actually useful to the user and match the search need) to the total number of documents retrieved by the query. Recall is defined as the ratio of the number of relevant and retrieved documents to the total number of relevant documents in the database.

Precision measures one aspect of the information retrieval overhead a user incurs for a particular search. If a search has 85 percent precision, then 15 percent (100 - 85) of the user's effort is overhead spent reviewing non-relevant items. Recall measures the extent to which a system processing a particular query retrieves the relevant items the user is interested in seeing. Recall is a very useful concept, but its denominator, the total number of relevant items in the database, is generally unknown in operational systems, so recall usually cannot be calculated there. Recall becomes calculable only when the total set of relevant items in the database is known to the system.
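
The two definitions can be illustrated with a short sketch. The function name `precision_recall` and the example document IDs are hypothetical, and returning 0.0 when a denominator is empty is an assumed convention rather than part of the definitions. The sketch also assumes the full set of relevant documents is known, which is the condition under which recall is calculable.

```python
def precision_recall(relevant, retrieved):
    """Return (precision, recall) for one query.

    relevant  -- IDs of all relevant documents in the database (assumed known)
    retrieved -- IDs of the documents returned by the query
    """
    relevant = set(relevant)
    retrieved = set(retrieved)
    hits = relevant & retrieved  # relevant and retrieved

    # Assumed convention: report 0.0 when the denominator is empty.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical judgments: 4 relevant documents, 5 retrieved, 2 in common.
relevant = {1, 2, 3, 4}
retrieved = {3, 4, 5, 6, 7}

precision, recall = precision_recall(relevant, retrieved)
print(f"precision={precision:.2f}, recall={recall:.2f}")
# prints: precision=0.40, recall=0.50
```

Here 2 of the 5 retrieved documents are relevant, so precision is 0.40 and 60 percent of the reviewing effort is overhead. 2 of the 4 relevant documents were found, so recall is 0.50.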

