Basis of Classification of Data
To perform statistical analysis, an investigator or analyst gathers various kinds of data. The information gathered is usually in raw form, which is difficult to analyze. To make the analysis meaningful and easy, the raw data is classified into different categories based on its characteristics. This grouping of data into categories or classes with similar or homogeneous characteristics is known as the Classification of Data. Each division of the gathered data is known as a Class. The bases of classification of statistical information are Geographical, Chronological, Qualitative (Simple and Manifold), and Quantitative or Numerical.
For example, an investigator who wants to determine the poverty level of a state can gather information about the people of that state and then classify them on the basis of their income, education, etc.
According to Conner, “Classification is the process of arranging things (either actually or notionally) in groups or classes according to their resemblances and affinities, and gives expression to the unity of attributes that may exist amongst a diversity of individuals.”
The main objectives of Classification of Data are as follows:
- Explain similarities and differences of data
- Simplify and condense the mass of data
- Facilitate comparisons
- Study the relationships within the data
- Prepare data for tabular presentation
- Present a mental picture of the data
Basis of Classification of Data
The classification of statistical data is done after considering the scope, nature, and purpose of an investigation. It is generally done on four bases: geographical location, chronology, qualitative characteristics, and quantitative characteristics.
1. Geographical Classification
The classification of data on the basis of geographical location or region is known as Geographical or Spatial Classification. For example, presenting the population of a country state by state is a geographical classification.
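As a minimal sketch of geographical classification, the snippet below groups raw records by state and counts how many fall into each class. The records, the state names, and the helper `classify_by_region` are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical raw records: (person_id, state). The data is invented.
records = [
    (1, "Kerala"),
    (2, "Punjab"),
    (3, "Kerala"),
    (4, "Goa"),
    (5, "Punjab"),
    (6, "Kerala"),
]

def classify_by_region(rows):
    """Count the records that fall into each geographical class (the state)."""
    counts = defaultdict(int)
    for _person_id, state in rows:
        counts[state] += 1
    return dict(counts)

geo_classes = classify_by_region(records)
print(geo_classes)
```

Each distinct state acts as one class, and the count per state is the condensed form of the raw records.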
2. Chronological Classification
The classification of data with respect to different time periods is known as Chronological or Temporal Classification. For example, presenting the number of students in a school year by year is a chronological classification.
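A chronological classification can be sketched as arranging observations by their time period. The enrolment figures and the helper `classify_chronologically` below are assumptions made up for this example.

```python
# Hypothetical enrolment observations recorded out of order: (year, students).
observations = [(2021, 540), (2019, 480), (2022, 575), (2020, 510)]

def classify_chronologically(rows):
    """Arrange observations in ascending order of their time period."""
    return sorted(rows, key=lambda row: row[0])

series = classify_chronologically(observations)
for year, students in series:
    print(year, students)
```

Once the observations are in time order, changes from one period to the next are easier to compare.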
3. Qualitative Classification
The classification of data on the basis of descriptive or qualitative characteristics such as region, caste, sex, gender, education, etc., is known as Qualitative Classification. Such characteristics, called attributes, cannot be quantified. Qualitative classification can be of two types: Simple Classification and Manifold Classification.
When the given data is classified into two classes on the basis of only one attribute, it is known as Simple Classification. For example, dividing the population into literate and illiterate is a simple classification.
When the given data is classified on the basis of more than one attribute, so that the classes are further divided into sub-classes, it is known as Manifold Classification. For example, dividing the population into literate and illiterate, then sub-dividing each class into male and female, and further sub-dividing those into married and unmarried, is a manifold classification.
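The difference between simple and manifold classification can be sketched by counting the same records first on one attribute and then on several. The population records, the attribute names, and the helper `classify` are hypothetical.

```python
from collections import Counter

# Hypothetical population records; all values are invented for illustration.
people = [
    {"literate": True, "sex": "male", "married": True},
    {"literate": True, "sex": "female", "married": False},
    {"literate": False, "sex": "female", "married": True},
    {"literate": True, "sex": "female", "married": False},
    {"literate": False, "sex": "male", "married": False},
]

def classify(rows, attributes):
    """Count rows in each class formed by the given attributes, in order."""
    return Counter(tuple(row[a] for a in attributes) for row in rows)

# Simple classification: one attribute, two classes (literate / illiterate).
simple = classify(people, ["literate"])

# Manifold classification: several attributes, classes split into sub-classes.
manifold = classify(people, ["literate", "sex", "married"])

print(simple)
print(manifold)
```

With a single attribute there are only two classes, while each additional attribute splits every existing class into further sub-classes.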
4. Quantitative Classification
The classification of data on the basis of characteristics that can be measured in quantity, such as age, height, weight, income, etc., is known as Quantitative Classification. For example, grouping the students in a class according to their weight is a quantitative classification.
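One common way to carry out a quantitative classification is to group measurements into class intervals and count the frequency of each. The sketch below assumes made-up weights, a starting value of 30 kg, and a class width of 10 kg; the helper `class_interval` is hypothetical, and the upper limit of each interval is treated as exclusive.

```python
from collections import Counter

# Hypothetical weights (in kg) of students in a class.
weights = [42, 55, 48, 61, 50, 39, 58, 47]

def class_interval(value, start=30, width=10):
    """Return the (lower, upper) interval containing value; upper is exclusive."""
    lower = start + ((value - start) // width) * width
    return (lower, lower + width)

# Frequency of students in each class interval.
frequency = Counter(class_interval(w) for w in weights)
for interval in sorted(frequency):
    print(interval, frequency[interval])
```

The choice of starting value and class width is a judgment call that depends on the range of the data and the purpose of the investigation.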