
Difference between Structured, Semi-structured and Unstructured data


Big Data is characterized by huge volume, high velocity, and a wide variety of data. That data falls into three types: structured data, semi-structured data, and unstructured data.

  1. Structured data – 
    Structured data is data whose elements are individually addressable, which makes it straightforward to analyze. It is organized into a formatted repository, typically a database, and covers any data that can be stored in a SQL database as tables with rows and columns. Records have relational keys and map easily onto pre-designed fields. Today it is the most commonly processed kind of data and the simplest to manage. Example: Relational data.
  2. Semi-Structured data – 
    Semi-structured data is information that does not reside in a relational database but has some organizational properties that make it easier to analyze. With some processing it can be stored in a relational database (though this can be very hard for some kinds of semi-structured data), but semi-structured formats exist to save space. Example: XML data.
  3. Unstructured data – 
    Unstructured data is data that is not organized in a predefined manner and has no predefined data model, so it is not a good fit for a mainstream relational database. Alternative platforms exist for storing and managing it. It is increasingly prevalent in IT systems, and organizations use it in a variety of business intelligence and analytics applications. Example: Word, PDF, Text, Media logs.
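
To make the three definitions above concrete, here is a minimal sketch in Python using only the standard library. The `employee` table, the XML document, and the free-text note are hypothetical sample data invented for illustration. They show one way the same kind of information might look in each form, not a canonical representation.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Structured: a fixed schema with a key and named columns (hypothetical table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, dept TEXT NOT NULL)"
)
conn.execute("INSERT INTO employee VALUES (?, ?, ?)", (1, "Asha", "Sales"))
structured_row = conn.execute(
    "SELECT name, dept FROM employee WHERE id = 1"
).fetchone()

# Semi-structured: tags describe each value, but records need not share
# the same fields (the second employee has a phone and no dept).
xml_doc = """<employees>
  <employee id="1"><name>Asha</name><dept>Sales</dept></employee>
  <employee id="2"><name>Ravi</name><phone>555-0100</phone></employee>
</employees>"""
root = ET.fromstring(xml_doc)
semi_structured = [
    {child.tag: child.text for child in emp}
    for emp in root.findall("employee")
]

# Unstructured: free text with no data model; only text matching is available.
note = "Asha joined the Sales team last month."
mentions_sales = "Sales" in note

print(structured_row)
print(semi_structured)
print(mentions_sales)
```

The table rejects a row that lacks a `dept` value, while the XML document accepts records with different fields, and the note carries no field information at all.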

Differences between Structured, Semi-structured and Unstructured data: 

| Properties | Structured data | Semi-structured data | Unstructured data |
|---|---|---|---|
| Technology | It is based on relational database tables | It is based on XML/RDF (Resource Description Framework) | It is based on character and binary data |
| Transaction management | Matured transaction and various concurrency techniques | Transaction is adapted from DBMS, not matured | No transaction management and no concurrency |
| Version management | Versioning over tuples, rows, tables | Versioning over tuples or graph is possible | Versioned as a whole |
| Flexibility | It is schema dependent and less flexible | It is more flexible than structured data but less flexible than unstructured data | It is more flexible and there is absence of schema |
| Scalability | It is very difficult to scale DB schema | Its scaling is simpler than structured data | It is more scalable |
| Robustness | Very robust | New technology, not very widespread | |
| Query performance | Structured queries allow complex joins | Queries over anonymous nodes are possible | Only textual queries are possible |
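
The query row of the comparison can be sketched as follows. This is an illustrative Python example under stated assumptions: the `dept` and `employee` tables, the XML snippet, and the `docs` list are hypothetical sample data, and the XML path query stands in for semi-structured querying in general (it does not demonstrate RDF anonymous nodes).

```python
import sqlite3
import xml.etree.ElementTree as ET

# Structured: a join across two related tables (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employee (
    id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER REFERENCES dept(id));
INSERT INTO dept VALUES (1, 'Sales'), (2, 'Support');
INSERT INTO employee VALUES (1, 'Asha', 1), (2, 'Ravi', 2), (3, 'Mina', 1);
""")
sales_staff = [
    row[0]
    for row in conn.execute(
        "SELECT e.name FROM employee e "
        "JOIN dept d ON e.dept_id = d.id "
        "WHERE d.name = ? ORDER BY e.name",
        ("Sales",),
    )
]

# Semi-structured: a path query that tolerates records with missing fields.
root = ET.fromstring(
    "<org>"
    "<employee><name>Asha</name><dept>Sales</dept></employee>"
    "<employee><name>Ravi</name></employee>"
    "</org>"
)
xml_sales = [n.text for n in root.findall("./employee[dept='Sales']/name")]

# Unstructured: only textual matching is possible.
docs = [
    "Asha closed a large Sales deal.",
    "Ravi reset a customer password.",
    "Quarterly sales figures are attached.",
]
text_hits = [i for i, doc in enumerate(docs) if "sales" in doc.lower()]

print(sales_staff, xml_sales, text_hits)
```

Note that the text search returns the third document as well, even though it is not about the Sales department: without a schema, the query cannot tell a department name from an ordinary word.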

Last Updated : 06 Mar, 2023