Difference Between Hadoop and SQL

  • Last Updated : 30 Apr, 2020

Hadoop: A framework that stores big data across distributed systems and processes it in parallel. Its four main components are the Hadoop Distributed File System (HDFS), YARN, MapReduce, and Hadoop Common (the shared libraries). It handles not only large volumes of data but also a mixture of structured, semi-structured, and unstructured information. Amazon, IBM, Microsoft, Cloudera, ScienceSoft, Pivotal, and Hortonworks are some of the companies using Hadoop technology.
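To make the "processes it in parallel" idea concrete, here is a minimal sketch of the MapReduce pattern as a word count. It is plain Python running in a single process, not the Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for illustration. On a real cluster, the map and reduce steps would run on many nodes over blocks stored in HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big clusters", "data in clusters"]
    print(word_count(sample))
```

Because each call to map_phase depends only on its own line, and each call to reduce_phase only on its own key, both steps can in principle be spread across machines, which is the property the framework relies on.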

SQL: Structured Query Language is a domain-specific language used to manage data in relational database management systems and to process data streams in relational data stream management systems. In a nutshell, SQL is a standard database language for creating, storing, and extracting data in relational databases such as MySQL, Oracle, and SQL Server.
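As a small illustration of creating, storing, and extracting data with SQL, the sketch below uses SQLite through Python's built-in sqlite3 module. SQLite is an assumption made so the example runs without a server; it is not one of the systems named above. The table name, columns, and sample rows are hypothetical.

```python
import sqlite3


def department_headcount():
    """Create a table, insert rows, and query them back with SQL."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # Create: a fixed schema is declared before any data is stored.
        conn.execute(
            "CREATE TABLE employees ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, dept TEXT NOT NULL)"
        )
        # Store: parameterized inserts keep values separate from the SQL text.
        conn.executemany(
            "INSERT INTO employees (name, dept) VALUES (?, ?)",
            [("Asha", "Sales"), ("Ben", "Engineering"), ("Chen", "Engineering")],
        )
        conn.commit()
        # Extract: a declarative query describes the result, not the steps.
        cursor = conn.execute(
            "SELECT dept, COUNT(*) FROM employees GROUP BY dept ORDER BY dept"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(department_headcount())
```

The same statements would look much the same on other relational systems, although data types and connection details differ between products.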

Below is a table of differences between Hadoop and SQL:

Feature | Hadoop | SQL
Technology | Modern | Traditional
Volume | Usually in petabytes | Usually in gigabytes
Operations | Storage, processing, retrieval, and pattern extraction from data | Storage, processing, retrieval, and pattern mining of data
Fault Tolerance | Highly fault tolerant | Good fault tolerance
Storage | Stores data as key-value pairs, tables, hash maps, etc. across distributed systems | Stores structured data in tabular format with a fixed schema
Scaling | Linear | Non-linear
Providers | Cloudera, Hortonworks, AWS, etc. provide Hadoop systems | Well-known industry leaders in SQL systems include Microsoft, SAP, and Oracle
Data Access | Batch-oriented data access | Interactive and batch-oriented data access
Cost | Open source; systems can be scaled cost-effectively | Commercial SQL servers are licensed and can be expensive; running out of storage incurs additional charges
Time | Large batch jobs run quickly because the work is spread across the cluster | Queries can be slow when run over millions of rows
Optimization | Stores data in HDFS and processes it through MapReduce with extensive optimization techniques | Lacks comparable optimization techniques for distributed processing at this scale
Structure | Dynamic schema; can store and process log data, real-time data, images, videos, sensor data, etc. (both structured and unstructured) | Static schema; stores data in tabular format only (structured)
Data Update | Write data once, read it many times | Read and write data many times
Integrity | Low | High
Interaction | Uses JDBC (Java Database Connectivity) to communicate with SQL systems to send and receive data | SQL systems can read and write data to Hadoop systems
Hardware | Uses commodity hardware | Uses proprietary hardware
Training | Moderately hard to learn for entry-level and seasoned professionals alike | Easy to learn, even for entry-level professionals
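The Structure row's contrast between a dynamic and a static schema can be sketched as "schema on read": raw records are stored as they arrive, and a schema is applied only when the data is read. This is one common interpretation of the dynamic-schema claim rather than a description of any specific Hadoop component; the sample log lines and the read_with_schema helper are hypothetical.

```python
import json

# Raw records stored as-is: differing fields, and one line that is not JSON.
RAW_LOG = [
    '{"user": "a", "bytes": 120}',
    '{"user": "b", "bytes": 300, "agent": "curl"}',
    "not json at all",
]


def read_with_schema(lines, fields):
    """Apply a schema at read time: keep the requested fields, skip bad lines."""
    records = []
    for line in lines:
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # unparseable input is skipped, not rejected at write time
        if not isinstance(obj, dict):
            continue
        # Missing fields become None; extra fields are ignored.
        records.append({field: obj.get(field) for field in fields})
    return records


if __name__ == "__main__":
    print(read_with_schema(RAW_LOG, ["user", "bytes"]))
```

A relational table with a fixed schema would instead reject or require reshaping of the second and third records at insert time, which is the trade-off behind the Integrity row as well.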