Pre-requisite: Hive 3.1.2 Installation, Hadoop 3.1.2 Installation
HiveQL or HQL is a Hive query language that we used to process or query structured data on Hive. HQL syntaxes are very much similar to MySQL but have some significant differences. We will use the hive command, which is a bash shell script to complete our hive demo using CLI(Command Line Interface). We can easily start hive shell by simply typing hive in the terminal. Make sure that the /bin directory of your hive installation is mentioned in the .basrc file. The .bashrc file executes automatically when the user logs into the system and all necessary commands mentioned in this script file will run. We can simply check whether the /bin directory is available or not by simply opening it with the command as shown below.
sudo gedit ~/.bashrc
In case if the path is not added then add it so that we can directly run the hive shell from the terminal without moving to the hive directory. Otherwise, we can start hive manually by moving to apache-hive-3.1.2/bin/ directory and by pressing the hive command.
Before performing hive make sure that all of your Hadoop daemons are started and working. We can simply start all the Hadoop daemon with the below command.
start-dfs.sh # this will start namenode, datanode and secondary namenode start-yarn.sh # this will start node manager and resource manager jps # To check running daemons
Databases In Apache Hive
The Database is a storage schema that contains multiple tables. The Hive Databases refer to the namespace of tables. If you don’t specify the database name by default Hive uses its default database for table creation and other purposes. Creating a Database allows multiple users to create tables with a similar name in different schemas so that their names don’t match.
So, let’s start our hive shell for performing our tasks with the below command.
See the already existing databases using the below command.
show databases; # this will show the existing databases
Create Database Syntax:
We can create a database with the help of the below command but if the database already exists then, in that case, Hive will throw an error.
CREATE DATABASE|SCHEMA <database name> # we can use DATABASE or SCHEMA for creation of DB
CREATE DATABASE Test; # create databse with name Test show databases; # this will show the existing databases
If we again try to create a Test database hive will throw an error/warning that the database with the name Test already exists. In general, we don’t want to get an error if the database exists. So we use the create database command with [IF NOT EXIST] clause. This will do not throw any error.
CREATE DATABASE|SCHEMA [IF NOT EXISTS] <database name>
CREATE SCHEMA IF NOT EXISTS Test1; SHOW DATABASES;
Syntax To Drop Existing Databases:
DROP DATABASE <db_name>; or DROP DATABASE IF EXIST <db_name> # The IF EXIST clause again is used to suppress error
DROP DATABASE IF EXISTS Test; DROP DATABASE Test1;
Now quit hive shell with quit command.