Tasks and Functionalities of Data Mining

Data mining functionalities specify the kinds of patterns, such as trends or correlations, that data mining tasks are meant to find.

Data mining tasks can be broadly divided into two categories:

  1. Descriptive Data Mining:
    It summarizes what is happening within the data without any prior hypothesis. The general properties of the data set are highlighted.
    For example: count, average, etc.

  2. Predictive Data Mining:
    It uses known, labeled data to estimate attribute values that are missing or unknown. Based on previous observations, the model predicts the characteristics that are absent.
    For example: judging from the findings of a patient’s medical examinations whether the patient is suffering from a particular disease.
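The difference between the two categories can be sketched in a few lines of Python. This is only an illustrative sketch: the temperature readings, the labels, and the helper names `describe` and `predict_1nn` are hypothetical, and the nearest-neighbour rule stands in for whatever predictive model would be used in practice.

```python
from statistics import mean

# Hypothetical labelled history: (body temperature in degrees Celsius, known label).
labelled = [(36.6, "healthy"), (37.0, "healthy"), (38.4, "fever"), (39.1, "fever")]


def describe(values):
    """Descriptive step: summarize what is already present in the data."""
    return {"count": len(values), "average": mean(values)}


def predict_1nn(history, value):
    """Predictive step: give an unlabelled value the label of its closest labelled example."""
    nearest = min(history, key=lambda pair: abs(pair[0] - value))
    return nearest[1]


summary = describe([temperature for temperature, _ in labelled])
prediction = predict_1nn(labelled, 38.0)
print(summary)
print(prediction)
```

The descriptive step only reports on existing values, while the predictive step fills in a label that was not in the data.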

Data Mining Functionalities:

1. Class/Concept Descriptions:
Data can be associated with classes or concepts. It can be helpful to describe individual classes and concepts in summarized, concise, and yet accurate terms.
Such descriptions of a class or concept are referred to as class/concept descriptions.

  • Data Characterization:
    This refers to the summary of the general characteristics or features of the class under study. For example, to study the characteristics of software products whose sales increased by 15% two years ago, one can collect the data related to such products by running SQL queries.

  • Data Discrimination:
    It compares the general features of the class under study (the target class) with those of one or more contrasting classes. The output of this process can be represented in many forms, e.g., bar charts, curves, and pie charts.
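As a rough sketch of both ideas, the Python below summarizes one class and then compares it with a contrasting class. The records, the category names, and the helpers `characterize` and `discriminate` are made up for illustration only.

```python
from statistics import mean

# Hypothetical records: (product category, yearly sales growth in percent).
records = [
    ("software", 15.0),
    ("software", 12.0),
    ("hardware", 3.0),
    ("hardware", 5.0),
    ("software", 18.0),
]


def characterize(rows, target):
    """Characterization: summarize the general features of one class."""
    values = [growth for category, growth in rows if category == target]
    return {
        "count": len(values),
        "mean_growth": mean(values),
        "min_growth": min(values),
        "max_growth": max(values),
    }


def discriminate(rows, target, contrast):
    """Discrimination: compare the target class with a contrasting class."""
    target_mean = characterize(rows, target)["mean_growth"]
    contrast_mean = characterize(rows, contrast)["mean_growth"]
    return {
        "target_mean": target_mean,
        "contrast_mean": contrast_mean,
        "difference": target_mean - contrast_mean,
    }


software_profile = characterize(records, "software")
comparison = discriminate(records, "software", "hardware")
print(software_profile)
print(comparison)
```

The resulting dictionaries are the kind of values that would then be drawn as bar charts or pie charts.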

2. Mining Frequent Patterns, Associations, and Correlations:
Frequent patterns are patterns that occur frequently in the data.

Several kinds of frequent patterns can be observed in a dataset.

  • Frequent Itemset:
    This refers to a set of items that frequently appear together, e.g., milk and sugar.
  • Frequent Subsequence:
    This refers to a sequence of events that occurs frequently, such as purchasing a phone followed by a back cover.
  • Frequent Substructure:
    It refers to structural forms, such as trees and graphs, that occur frequently and may be combined with itemsets or subsequences.
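Finding frequent itemsets can be sketched as counting how often each combination of items occurs across transactions. The baskets and the `min_count` threshold below are hypothetical, and this brute-force count is only a sketch, not an optimized algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
transactions = [
    {"milk", "sugar", "bread"},
    {"milk", "sugar"},
    {"bread", "butter"},
    {"milk", "sugar", "butter"},
]


def frequent_itemsets(baskets, size, min_count):
    """Count every itemset of the given size and keep those seen at least min_count times."""
    counts = Counter()
    for basket in baskets:
        # Sort so that the same itemset always produces the same tuple key.
        for combo in combinations(sorted(basket), size):
            counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_count}


pairs = frequent_itemsets(transactions, size=2, min_count=3)
print(pairs)
```

With these baskets, only milk and sugar appear together in at least three transactions.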

Association Analysis:
This process uncovers relationships among items in the data and derives association rules from them. For example, it can be used to identify items that are frequently purchased together.
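Association rules are commonly scored with support (how often the items occur together) and confidence (how often the consequent appears when the antecedent does). A minimal sketch, assuming made-up transactions and hypothetical helper names `support` and `confidence`:

```python
# Hypothetical purchase transactions.
transactions = [
    {"phone", "back cover"},
    {"phone", "back cover", "charger"},
    {"phone"},
    {"charger"},
]


def support(baskets, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in baskets if items <= basket) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)


rule_support = support(transactions, {"phone", "back cover"})
rule_confidence = confidence(transactions, {"phone"}, {"back cover"})
print(rule_support, rule_confidence)
```

Here the rule "phone implies back cover" holds in half of all transactions and in two out of three transactions that contain a phone.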

Correlation Analysis:
Correlation analysis is a mathematical technique that can show whether and how strongly pairs of attributes are related to each other. For example, taller people tend to weigh more.
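One common way to measure this strength is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it by hand; the height and weight values are invented for illustration.

```python
from math import sqrt

# Hypothetical paired measurements.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 58, 66, 77, 85]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


r = pearson(heights_cm, weights_kg)
print(round(r, 3))
```

A value close to 1 indicates a strong positive relationship, a value close to -1 a strong negative one, and a value near 0 little or no linear relationship.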
