In Apache Hive we can create tables to store structured data so that later on we can process it. The table in the hive is consists of multiple columns and records. The table we create in any database will be stored in the sub-directory of that database. The default location where the database is stored on HDFS is /user/hive/warehouse. The way of creating tables in the hive is very much similar to the way we create tables in SQL. We can perform the various operations with these tables like Joins, Filtering, etc.
To perform the below operation make sure your hive is running. Below are the steps to launch a hive on your local system.
Step 1: Start all your Hadoop Daemon
start-dfs.sh # this will start namenode, datanode and secondary namenode start-yarn.sh # this will start node manager and resource manager jps # To check running daemons
Step 2: Launch hive from terminal
Creating Table in Hive
Let’s create a database first so that we can create tables inside it. The command for creating a database is shown below.
Syntax To Make Database:
CREATE DATABASE <database-name>;
CREATE DATABASE student_detail; # this will create database student_detail SHOW DATABASES; # list down all the available databases
Now, to have access to this database we have to use it.
Syntax To Create Table in Hive
CREATE TABLE [IF NOT EXISTS] <table-name> ( <column-name> <data-type>, <column-name> <data-type> COMMENT 'Your Comment', <column-name> <data-type>, . . . <column-name> <data-type> ) COMMENT 'Add if you want' LOCATION 'Location On HDFS' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
1. We can add a comment to the table as well as to each individual column.
2. ROW FORMAT DELIMITED shows that whenever a new line is encountered the new record entry will start.
3. FIELDS TERMINATED BY ‘,’ shows that we are using ‘,’ delimiter to separate each column.
4. We can also override the default database location with the LOCATION option.
So let’s create the table student_data in our student_detail database with the help of the command shown below.
CREATE TABLE IF NOT EXISTS student_data( Student_Name STRING COMMENT 'This col. Store the name of student', Student_Rollno INT COMMENT 'This col. Stores the rollno of student', Student_Marks FLOAT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
We have successfully created the table student_data in our student_detail database with 3 different fields Student_Name, Student_Rollno, Student_Marks as STRING, INT, FLOAT respectively.
We can list down the table available in our database with the help of the command explained below.
SHOW TABLES [IN <database_name>];
SHOW TABLES IN student_detail;
Now, Finally, let’s check the location on HDFS where our student_detail database and student_data table is made. Move to localhost:50070/ for Hadoop 2 and to localhost:9870/ for Hadoop 3. Then Utilities -> Browse the file system and go to /user/hive/warehouse which is a default location where hive databases are created.
In the above image, we can observe that the student_data table is available in HDFS inside student_detail.db.