Difference Between Apache Hadoop and Apache Storm
Apache Hadoop: It is a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model.
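To make the MapReduce programming model concrete, here is a minimal single-process sketch of a word count in plain Python. It does not use Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and only illustrate the map, shuffle, and reduce steps that Hadoop would distribute across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases over a complete, finite input (a batch)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data big cluster", "data node"])
print(counts)
```

The whole input must be available before the reduce step can produce a result, which is why this model suits batch jobs rather than continuous streams.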
Apache Storm: It is a distributed stream processing computation framework written predominantly in the Clojure programming language. Originally created by Nathan Marz and the team at BackType, the project was open-sourced after BackType was acquired by Twitter.
Below is a table of differences between Apache Hadoop and Apache Storm:
|Features||Apache Hadoop||Apache Storm|
|Processing||Distributed batch processing using the MapReduce model||Distributed real-time stream processing using topologies, which are directed acyclic graphs (DAGs) of spouts and bolts|
|Latency||High latency, i.e. results are available only after a batch job finishes||Low latency, i.e. each tuple is processed as it arrives|
|Implementation language||The framework is written mainly in Java||The framework is written in Clojure and Java|
|State handling||Stateful: input, intermediate, and final results are persisted to disk||Stateless by default: core stream processing keeps no durable state between tuples|
|Setup||Easy to set up, but operating the cluster is hard||Easy to set up and use|
|Data streaming||Data is static and nonvolatile, i.e. the data is persistent||Data is dynamic and continuously streamed|
|Use cases||It is used for black box data, search engine data, etc.||It is used at Twitter, NaviSite, Wego, etc.|
|Architecture||Hadoop comprises HDFS (used for data storage) and MapReduce (used for computation) as its architectural units.||Storm comprises streams, spouts, and bolts as its architectural units.|
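
The stream, spout, and bolt units from the Architecture row can be sketched with plain Python generators. This is only an illustration of the data flow, not Storm's API: the names `sentence_spout`, `split_bolt`, and `count_bolt` are hypothetical, and a real topology would run these components in parallel across a cluster.

```python
from collections import Counter


def sentence_spout(sentences):
    """Spout: the source of the stream; emits one tuple at a time."""
    for sentence in sentences:
        yield sentence


def split_bolt(stream):
    """Bolt: consumes sentence tuples and emits one word tuple per word."""
    for sentence in stream:
        for word in sentence.split():
            yield word.lower()


def count_bolt(stream):
    """Bolt: emits an updated running count as soon as each word arrives."""
    counts = Counter()  # in-memory only; nothing is persisted
    for word in stream:
        counts[word] += 1
        yield word, counts[word]


# Wire the components into a linear topology: spout -> split -> count.
topology = count_bolt(split_bolt(sentence_spout(["storm hadoop", "storm"])))
updates = list(topology)
print(updates)
```

Unlike a batch job, each stage here produces output per tuple without waiting for the input to end, which is the source of the low latency noted in the table.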