Architecture and Working of Hive
- User Interface (UI) –
As the name suggests, the user interface sits between the user and Hive. It lets users submit queries and other operations to the system. The supported interfaces are the Hive web UI, the Hive command line, and Hive HDInsight (on Windows Server).
- Hive Server – Also referred to as the Apache Thrift Server. It accepts requests from different clients and passes them to the Hive driver.
- Driver –
The driver receives the user's queries from the interface. It implements the notion of session handles and provides execute and fetch APIs modelled on the JDBC/ODBC interfaces.
- Compiler –
The compiler parses the query and performs semantic analysis on the different query blocks and query expressions. It eventually generates an execution plan with the help of the table and partition metadata looked up from the metastore.
- Metastore –
The metastore stores all the structural information about the tables and partitions in the warehouse, including column and column type information, the serializers and deserializers needed to read and write data, and the corresponding HDFS files where the data is stored. Hive uses a relational database server to store this schema or metadata: databases, tables, the columns in a table, their data types, and the HDFS mapping.
- Execution Engine –
The execution engine executes the plan created by the compiler. The plan is a DAG of stages. The execution engine manages the dependencies between these stages and executes each stage on the appropriate system component.
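The way an execution engine can respect the dependencies in a DAG of stages can be sketched in Python. This is a minimal illustration, not Hive's implementation: the stage names and the `run_plan` helper are hypothetical, and the standard-library `graphlib.TopologicalSorter` stands in for whatever scheduling Hive performs internally.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_plan(plan):
    """Run every stage of a plan only after the stages it depends on.

    `plan` maps a stage name to the set of stage names it depends on.
    Returns the stage names in the order they were executed.
    """
    executed = []
    sorter = TopologicalSorter(plan)
    sorter.prepare()  # raises graphlib.CycleError if the plan is not a DAG
    while sorter.is_active():
        # get_ready() yields stages whose dependencies have all finished.
        for stage in sorter.get_ready():
            executed.append(stage)  # a real engine would submit a job here
            sorter.done(stage)
    return executed


# Hypothetical four-stage plan: two scans feed a join, which feeds a write.
example_plan = {
    "scan_orders": set(),
    "scan_customers": set(),
    "join": {"scan_orders", "scan_customers"},
    "write_result": {"join"},
}

if __name__ == "__main__":
    print(run_plan(example_plan))
```

The two scan stages have no dependencies, so they become ready together. The join only becomes ready once both scans are marked done, and the write stage follows the join.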
Diagram – Architecture of Hive that is built on the top of Hadoop
Along with the architecture, the above diagram shows the job execution flow in Hive with Hadoop, step by step.
- Step-1: Execute Query –
A Hive interface such as the command line or web UI sends the query to the driver for execution. To do this, the UI calls the driver's execute interface, for example over ODBC or JDBC.
- Step-2: Get Plan –
The driver creates a session handle for the query and passes the query to the compiler, which generates an execution plan.
- Step-3: Get Metadata –
The compiler sends a metadata request to the metastore to get the metadata it needs.
- Step-4: Send Metadata –
The metastore sends the requested metadata back to the compiler as its response.
- Step-5: Send Plan –
The compiler sends the execution plan it has generated back to the driver so that the query can be executed.
- Step-6: Execute Plan –
The driver sends the execution plan to the execution engine.
- Execute Job
- Job Done
- DFS operation (Metadata Operation)
- Step-7: Fetch Results –
The user interface (UI) asks the driver to fetch the results of the query.
- Step-8: Send Results –
The driver forwards the fetch request to the execution engine. Once the execution engine has retrieved the results from the data nodes, it returns them to the driver, which passes them on to the user interface (UI).
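The eight steps above can be traced with a toy model in Python. Every class, method, table name, and path below is hypothetical and chosen only to mirror the component names in this article. None of it is Hive's actual API, and the "compiler" finds the table with a trivial string split rather than a real parser.

```python
class Metastore:
    """Holds table metadata (steps 3 and 4)."""

    def __init__(self, tables):
        self.tables = tables  # table name -> storage location (illustrative)

    def get_metadata(self, table):
        return {"table": table, "location": self.tables[table]}


class Compiler:
    """Turns a query into an execution plan (steps 3 to 5)."""

    def __init__(self, metastore, trace):
        self.metastore = metastore
        self.trace = trace

    def compile(self, query):
        table = query.split()[-1]  # toy "parser": the last word is the table
        self.trace.append("3: get metadata")
        meta = self.metastore.get_metadata(table)
        self.trace.append("4: send metadata")
        self.trace.append("5: send plan")
        return {"stages": [{"op": "scan", "location": meta["location"]}]}


class ExecutionEngine:
    """Runs the plan and hands back rows (steps 6 and 8)."""

    def __init__(self, data, trace):
        self.data = data  # location -> rows, standing in for the data nodes
        self.trace = trace
        self.rows = []

    def execute(self, plan):
        for stage in plan["stages"]:
            self.rows.extend(self.data[stage["location"]])

    def fetch(self):
        self.trace.append("8: send results")
        return self.rows


class Driver:
    """Coordinates the compiler and the execution engine."""

    def __init__(self, compiler, engine, trace):
        self.compiler = compiler
        self.engine = engine
        self.trace = trace

    def execute(self, query):
        self.trace.append("2: get plan")
        plan = self.compiler.compile(query)
        self.trace.append("6: execute plan")
        self.engine.execute(plan)

    def fetch(self):
        self.trace.append("7: fetch results")
        return self.engine.fetch()


def run_query(query):
    """Play the role of the UI: submit a query, then fetch its results."""
    trace = []
    metastore = Metastore({"sales": "/warehouse/sales"})
    engine = ExecutionEngine(
        {"/warehouse/sales": [("apple", 3), ("pear", 5)]}, trace
    )
    driver = Driver(Compiler(metastore, trace), engine, trace)
    trace.append("1: execute query")
    driver.execute(query)
    rows = driver.fetch()
    return rows, trace


if __name__ == "__main__":
    rows, trace = run_query("SELECT * FROM sales")
    print("\n".join(trace))
    print(rows)
```

The `trace` list records each step as it happens, so printing it shows the order in which the components talk to one another. The in-memory dictionaries stand in for the metastore database and the data nodes.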