Apache hive is a data-warehousing tool built on top of Hadoop. The structured data can be handled with the Hive query language. In this article, we are going to see options that are available with databases in the Hive.
The database is used for storing information. The hive will create a directory for each of its created databases. All the tables that are created inside the database will be stored inside the sub-directories of the database directory. We can find the location on HDFS(Hadoop Distributed File System) where the directories for the database are made by checking hive.metastore.warehouse.dir property in /conf/hive-site.xml file.
/user/hive/warehouse is the default directory location set in hive.metastore.warehouse.dir property where all database and table directories are made. The location is configurable and we can change it as per our requirement. For example, if we have created the database with the name Test then Hive will create the directory /user/hive/warehouse/Test.db. Let’s perform a quick demo on this.
Step 1: Start all the Hadoop daemons.
Step 2: Start Hive shell.
Step 3: Create a database with the name Test.
CREATE DATABASE <database-name>;
create database Test;
Step 4: Check the location /user/hive/warehouse on HDFS to see whether the database directory is made or not. For Hadoop 3 go to http://localhost:9870 and For Hadoop 2 go to http://localhost:50070 to browse the name node. Click on Utilities -> Browse the file system then move to /user/hive/warehouse.
In the above image, we can see that the Test.db database is available.
Options Available with Database in Hive
The Location option helps the user to override the default location where the database directory is made. As we know the default directory where the databases made is /user/hive/warehouse. so we can change this directory with this option.
Let’s create a directory with the name hive_db on HDFS with the help of the below command.
hdfs dfs -mkdir /hive_db
Now, the syntax to use the Location option with the create database command is shown below.
CREATE DATABASE <database_name> LOCATION '/<directory_path_on_HDFS>';
Create the database with the name Temp in /hive_db directory on HDFS. Here, the LOCATION will override the default location where the database directory is made. Now the tables you make for this database will be created inside /hive_db in HDFS.
CREATE DATABASE Temp LOCATION '/hive_db';
We can add comments with the database we have created. We can add a few reasons why we have created that database etc.
CREATE DATABASE <database_name> COMMENT '<comment you are adding>';
CREATE DATABASE student COMMENT 'The DB stores data of students';
We can use DESCRIBE to describe our database. It is used with databases, tables, and view in the hive. The option will show the database location and the other information regarding that database.
DESCRIBE DATABASE <database_name>;
DESCRIBE DATABASE Temp; DESCRIBE DATABASE student;
4. WITH DBPROPERTIES
We can add some properties or information to our database in the form of a key-value pair with this option. The properties added with this option can only be viewed by using EXTENDED option with DESCRIBE DATABASE command.
CREATE DATABASE <database_name> WITH DBPROPERTIES ('<key-name>' = '<value>');
CREATE DATABASE employee WITH DBPROPERTIES ('made by' = 'GFG', 'date' = '2020-10-10', 'company' = 'GeeksForGeeks');
Now, let’s see these values with the Describe database.
DESCRIBE DATABASE employee; # does not show the property added with WITH DBPRROPERTIES DESCRIBE DATABASE EXTENDED employee; # shows the property added with WITH DBPROPERTIES
The USE command is used to use databases to work on it. Since multiple databases are available so we can select or choose the database to use with the USE command or option.
DROP is used to drop the existing database.
DROP DATABASE <database_name>;
DROP DATABASE employee;
SHOW is used to show the existing available database list.