In this article, we will create a database and a table in Hive, and cover basic database operations in Hive using the Cloudera virtual machine on VMware Workstation. Let’s go through them one by one.
- Hive is a data warehouse tool, often used for ETL, that provides an SQL-like interface between the user and data stored in the Hadoop Distributed File System (HDFS).
- It is built on top of Hadoop.
- It facilitates reading, writing, and managing large datasets that are stored in distributed storage and queried using Structured Query Language (SQL) syntax.
- You need VMware Workstation installed, along with the Cloudera virtual machine.
- Download link for Windows: https://www.cloudera.com/downloads/cdh.html
Cloudera enables you to deploy and manage Apache Hadoop, manipulate and analyze your data, and keep that data secure and protected.
Steps to Open Cloudera after Installation
Step 1: Open VMware Workstation from your desktop.
Step 2: An interface appears. Click on open a virtual device.
Step 3: Select path: browse to the location where you downloaded the Cloudera file and select it.
Step 4: Your virtual environment is now being created.
Step 5: You can view your virtual machine details in this path.
Step 6: Open the terminal to get started with Hive commands.
Step 7: Type hive in the terminal. The output looks like this:
[cloudera@quickstart ~]$ hive
2020-12-09 20:59:24,314 WARN [main] mapreduce.TableMapReduceUtil: The hbase-prefix-tree module jar containing PrefixTreeCodec is not present. Continuing without it.
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive>
Step 8: Now, you are all set and ready to start typing your hive commands.
Database Operations in HIVE
1. Create a database
create database database_name;
create database geeksportal;
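A slightly more defensive variant of the statement above can be sketched as follows. The IF NOT EXISTS clause avoids an error when the database already exists, and USE switches the session to it. The comment text is only an illustrative assumption, not something the setup requires.

```sql
-- Create the database only if it is not already present,
-- so running the statement a second time does not fail.
CREATE DATABASE IF NOT EXISTS geeksportal
COMMENT 'Practice database for this walkthrough';

-- Make geeksportal the current database for later statements.
USE geeksportal;
```

After USE geeksportal; tables can be referenced without the geeksportal. prefix.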
2. Creating a table
create table geeksportal.geekdata(id int,name string);
Here, id and name are the two columns; id is of type int and name is of type string.
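To check that the table behaves as expected, a short sketch like the one below may help. It assumes a Hive version that supports INSERT ... VALUES (0.14 or later), and the two sample rows are made-up values used purely for illustration.

```sql
-- Work inside the database so the table name needs no prefix.
USE geeksportal;

-- Create the table if it is missing (same columns as above).
CREATE TABLE IF NOT EXISTS geekdata (id INT, name STRING);

-- Insert two hypothetical sample rows.
INSERT INTO TABLE geekdata VALUES (1, 'alice'), (2, 'bob');

-- Read the rows back.
SELECT id, name FROM geekdata;

-- Show the column names and types of the table.
DESCRIBE geekdata;
```

The INSERT statement typically launches a job on the cluster, so it can take noticeably longer than the other statements on a single virtual machine.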
3. Display Database
show databases;
Output: Displays the databases that have been created.
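When many databases exist, the listing can be narrowed. The sketch below assumes the geeksportal database from the earlier step exists; the pattern 'geek*' is just an example filter.

```sql
-- List only the databases whose names start with "geek".
-- In Hive patterns, * matches any sequence of characters.
SHOW DATABASES LIKE 'geek*';

-- List the tables that belong to the geeksportal database.
SHOW TABLES IN geeksportal;
```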
4. Describe Database
describe database database_name;
describe database geeksportal;
Output: Displays the HDFS path of the specified database.
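If more detail is needed, the EXTENDED keyword also prints any properties attached to the database. The following is a minimal sketch; the property name purpose and its value are hypothetical and are set only so that there is something to display.

```sql
-- Attach a hypothetical key/value property to the database.
ALTER DATABASE geeksportal SET DBPROPERTIES ('purpose' = 'demo');

-- Show the database details, including its properties.
DESCRIBE DATABASE EXTENDED geeksportal;
```

On a default installation the reported location is usually under the Hive warehouse directory in HDFS, but the exact path depends on how the cluster is configured.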