Difference between Structured, Semi-structured and Unstructured data

Big Data is characterized by huge volume, high velocity, and a wide variety of data. That data falls into three types: structured data, semi-structured data, and unstructured data.

  1. Structured data –
    Structured data is data whose elements are addressable for effective analysis. It is organized into a formatted repository, typically a database. It covers all data that can be stored in a SQL database in tables with rows and columns. Such data has relational keys and can easily be mapped into pre-designed fields. Today, it is the most commonly processed kind of data in development and the simplest to manage. Example: relational data.
  2. Semi-structured data –
    Semi-structured data is information that does not reside in a relational database but has some organizational properties that make it easier to analyze. With some processing, it can be stored in a relational database (though this can be very hard for some kinds of semi-structured data), but the semi-structured form exists to save space. Example: XML data.
  3. Unstructured data –
    Unstructured data is data that is not organized in a pre-defined manner or does not have a pre-defined data model, so it is not a good fit for a mainstream relational database. Alternative platforms exist for storing and managing unstructured data. It is increasingly prevalent in IT systems and is used by organizations in a variety of business intelligence and analytics applications. Examples: Word documents, PDFs, text, media logs.
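
To make the three categories concrete, here is a minimal Python sketch that holds the same fact (a customer's city) in each form and retrieves it. The customer record, the table and tag names, and the free-text note are all invented for illustration; only the standard library is used.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET

# Structured: fixed columns in a relational table, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Asha', 'Pune')")
structured_city = conn.execute(
    "SELECT city FROM customer WHERE id = 1"
).fetchone()[0]
conn.close()

# Semi-structured: no table, but the tags give the data a self-describing shape.
xml_doc = "<customer id='1'><name>Asha</name><city>Pune</city></customer>"
root = ET.fromstring(xml_doc)
semi_structured_city = root.findtext("city")

# Unstructured: free text with no data model, so we fall back to pattern matching.
note = "Met Asha today. She said she recently moved to Pune for work."
match = re.search(r"moved to (\w+)", note)
unstructured_city = match.group(1) if match else None

print(structured_city, semi_structured_city, unstructured_city)
```

The structured and semi-structured lookups address the value by name (a column, a tag), while the unstructured lookup depends on a hand-written pattern that breaks as soon as the wording of the note changes.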

Differences between Structured, Semi-structured and Unstructured data:

| Properties | Structured data | Semi-structured data | Unstructured data |
|---|---|---|---|
| Technology | Based on relational database tables | Based on XML/RDF | Based on character and binary data |
| Transaction management | Mature transaction management and various concurrency techniques | Transactions are adapted from DBMS and are not mature | No transaction management and no concurrency |
| Version management | Versioning over tuples, rows, and tables | Versioning over tuples or graphs is possible | Versioned as a whole |
| Flexibility | Schema dependent and less flexible | More flexible than structured data but less flexible than unstructured data | Very flexible, with no schema |
| Scalability | Very difficult to scale the DB schema | Scaling is simpler than for structured data | Very scalable |
| Robustness | Very robust | New technology, not very widespread | |
| Query performance | Structured queries allow complex joins | Queries over anonymous nodes are possible | Only textual queries are possible |
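
The Flexibility row can be illustrated with a short Python sketch. It assumes a hypothetical `customer` table and XML record (both invented for this example) and uses SQLite as a stand-in for a relational database; other database systems may report the error differently.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

# Structured: a field outside the schema is rejected until the schema changes.
try:
    conn.execute(
        "INSERT INTO customer (id, name, phone) VALUES (1, 'Asha', '555-0100')"
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# The schema must be altered before the new field can be stored.
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")
conn.execute(
    "INSERT INTO customer (id, name, phone) VALUES (1, 'Asha', '555-0100')"
)
phone_from_table = conn.execute(
    "SELECT phone FROM customer WHERE id = 1"
).fetchone()[0]
conn.close()

# Semi-structured: a new element can be attached to one record directly.
customer = ET.fromstring("<customer id='1'><name>Asha</name></customer>")
ET.SubElement(customer, "phone").text = "555-0100"
phone_from_xml = customer.findtext("phone")

print(rejected, phone_from_table, phone_from_xml)
```

In the relational case the change applies to every row of the table, whereas the XML record gains the field on its own, which is one way to read "schema dependent" versus "more flexible" in the comparison.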


Improved By : agentkirkwood
