Classification of Data

In this article, we discuss the classification of data, covering structured, unstructured, and semi-structured data, along with the features of data classification. Let's discuss them one by one.

Data Classification : 

Data classification is the process of organizing data into relevant categories so that it can be used or applied more efficiently. Classifying data makes it easy for the user to retrieve it. Data classification is important for data security and compliance, and also for meeting different types of business or personal objectives. It is also a major requirement because data must be easily retrievable within a specific period of time.

Types of Data Classification : 

Data can be broadly classified into 3 types.



1. Structured Data :

Structured data is created using a fixed schema and is maintained in a tabular format. The elements in structured data are addressable for effective analysis. It covers all data that can be stored in a SQL database in tabular form, with rows and columns. Today, it is the most widely processed form of data and the simplest way to manage information.

Examples –

Relational data, Geo-location, credit card numbers, addresses, etc. 

Consider relational data as an example: a university has to maintain a record of its students, including the name, ID, address, and email of each student. To store these records, the following relational schema and table can be used.

S_ID    S_Name    S_Address    S_Email
1001    A         Delhi        A@gmail.com
1002    B         Mumbai       B@gmail.com
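
As a minimal sketch, the student table above could be stored and queried with Python's built-in sqlite3 module. The table name `student`, the column types, and the helper names `build_student_db` and `students_in_city` are assumptions made for illustration; only the column names and the two rows come from the table above.

```python
import sqlite3


def build_student_db():
    """Create an in-memory database holding the student table shown above."""
    conn = sqlite3.connect(":memory:")
    # The fixed schema is what makes this data "structured":
    # every row has the same four addressable columns.
    conn.execute(
        """
        CREATE TABLE student (
            S_ID      INTEGER PRIMARY KEY,
            S_Name    TEXT NOT NULL,
            S_Address TEXT,
            S_Email   TEXT
        )
        """
    )
    rows = [
        (1001, "A", "Delhi", "A@gmail.com"),
        (1002, "B", "Mumbai", "B@gmail.com"),
    ]
    conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def students_in_city(conn, city):
    """Return (S_ID, S_Name) pairs for students whose address matches city."""
    cur = conn.execute(
        "SELECT S_ID, S_Name FROM student WHERE S_Address = ? ORDER BY S_ID",
        (city,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_student_db()
    print(students_in_city(conn, "Delhi"))
```

Because each element is addressable by column name, a question such as "which students live in Delhi" becomes a single query rather than a search through free text.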

 

2. Unstructured Data :

Unstructured data is data that does not follow a predefined schema or any organized format. This kind of data is not a good fit for a relational database, because a relational database expects data in a predefined, organized form. Unstructured data is very important in the big data domain, and there are many platforms for storing and managing it, such as NoSQL databases.



Examples –

Word documents, PDF files, text files, media logs, etc.
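
To illustrate why unstructured data is harder to work with, here is a small sketch that pulls email addresses out of free text. The sample sentence, the simplified pattern `EMAIL_PATTERN`, and the helper `extract_emails` are all assumptions for illustration; real email validation is considerably more involved.

```python
import re

# A deliberately simple pattern, good enough for this illustration only.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return every substring of text that looks like an email address."""
    return EMAIL_PATTERN.findall(text)


if __name__ == "__main__":
    # Hypothetical free-text note: there are no columns to address,
    # so the emails have to be found by pattern matching.
    note = (
        "Student A from Delhi can be reached at A@gmail.com. "
        "B lives in Mumbai; write to B@gmail.com for details."
    )
    print(extract_emails(note))
```

With no schema to rely on, the program has to guess at the shape of the data, which is the extra effort unstructured data demands.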

3. Semi-Structured Data : 

Semi-structured data is information that does not reside in a relational database but has some organizational properties, such as tags or keys, that make it easier to analyze. With some processing, it can be stored in a relational database, although this is very hard for some kinds of semi-structured data. The semi-structured form exists to ease storage.

Example –

XML data.  
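
As a rough sketch, XML can be read with Python's standard xml.etree.ElementTree module and flattened into rows. The document in `XML_TEXT`, its tag names, and the helper `students_to_rows` are hypothetical and chosen only to mirror the kind of student record used for relational data; the second record omits the email on purpose to show that the structure is not fixed.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML: tags give the data some organization,
# but records are not forced to contain the same fields.
XML_TEXT = """
<students>
  <student id="1001">
    <name>A</name>
    <address>Delhi</address>
    <email>A@gmail.com</email>
  </student>
  <student id="1002">
    <name>B</name>
    <address>Mumbai</address>
  </student>
</students>
"""


def students_to_rows(xml_text):
    """Flatten each <student> element into an (id, name, address, email) tuple."""
    root = ET.fromstring(xml_text)
    rows = []
    for node in root.findall("student"):
        rows.append(
            (
                int(node.get("id")),
                node.findtext("name"),
                node.findtext("address"),
                node.findtext("email"),  # None when the tag is missing
            )
        )
    return rows


if __name__ == "__main__":
    for row in students_to_rows(XML_TEXT):
        print(row)
```

The missing email comes back as `None`, which is the "some processing" needed before such data can be loaded into a fixed relational schema.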

Features of Data Classification : 

The main goal of organizing data is to arrange it in such a form that it becomes readily available to users. Its basic features are as follows.

  • Homogeneity – The data items in a particular group should be similar to each other.
  • Clarity – There must be no confusion in the positioning of any data item in a particular group.
  • Stability – The set of data items must be stable, i.e. an investigation should not change the classification of the same set of data.
  • Elasticity – One should be able to change the basis of classification as the purpose of classification changes.
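
The features above can be sketched with a small grouping function. The helper `classify`, the `students` list, and the third record (1003, C) are assumptions added for illustration, not part of the original example.

```python
from collections import defaultdict


def classify(items, basis):
    """Group items by the key that the basis function returns for each item."""
    groups = defaultdict(list)
    for item in items:
        # Clarity: each item lands in exactly one group.
        groups[basis(item)].append(item)
    return dict(groups)


# Hypothetical records; 1003 is added so that one group has two members.
students = [
    {"S_ID": 1001, "S_Name": "A", "S_Address": "Delhi"},
    {"S_ID": 1002, "S_Name": "B", "S_Address": "Mumbai"},
    {"S_ID": 1003, "S_Name": "C", "S_Address": "Delhi"},
]

# Homogeneity: every member of a group shares the same city.
by_city = classify(students, lambda s: s["S_Address"])

# Elasticity: the basis can be swapped without touching the data.
by_name = classify(students, lambda s: s["S_Name"])

if __name__ == "__main__":
    print(sorted(by_city))
    print(sorted(by_name))
    # Stability: classifying the same data again gives the same groups.
    print(classify(students, lambda s: s["S_Address"]) == by_city)
```

Passing the basis as a function is one possible design choice; it keeps the grouping logic unchanged when the purpose of classification changes.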
