In this article, we will create a database and a table in Hive, and cover basic database operations in Hive using Cloudera on VMware Workstation. Let's go through them one by one.
- Hive is a data warehouse and ETL tool that provides an SQL-like interface (HiveQL) between the user and the Hadoop Distributed File System (HDFS).
- It is built on top of Hadoop.
- It facilitates reading, writing, and managing large datasets that are stored in distributed storage and queried using Structured Query Language (SQL) syntax.
Cloudera enables you to deploy and manage Apache Hadoop, manipulate and analyze your data, and keep that data secure and protected.
Steps to Open Cloudera after Installation
Step 1: Open VMware Workstation from your desktop.
Step 2: In the interface that appears, click Open a Virtual Machine.
Step 3: Select path: browse to the location where you downloaded the Cloudera virtual machine file and select it.
Step 4: Your virtual environment is now being created.
Step 5: Once it is created, you can view the details of your virtual machine.
Step 6: Open the terminal to get started with Hive commands.
Step 7: Type hive in the terminal. The output will look like the following.
[cloudera@quickstart ~]$ hive
2020-12-09 20:59:24,314 WARN [main] mapreduce.TableMapReduceUtil:
The hbase-prefix-tree module jar containing PrefixTreeCodec is not present. Continuing without it.
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
Step 8: You are now all set to start typing your Hive commands.
Database Operations in HIVE
1. Create a database
Syntax: create database database_name;
Example: create database geeksportal;
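Running the create statement a second time fails because the database already exists. A minimal sketch of a more defensive variant follows; the comment text is an illustrative placeholder and not part of the original example:

```sql
-- Skip creation (instead of raising an error) if geeksportal already exists.
-- The COMMENT text is a hypothetical description added for illustration.
CREATE DATABASE IF NOT EXISTS geeksportal
COMMENT 'Practice database for Hive examples';
```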
2. Create a table
create table geeksportal.geekdata(id int,name string);
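Here id (of type int) and name (of type string) are the two columns of the geekdata table, which is created inside the geeksportal database.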
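To confirm that the table was created with the expected columns, something like the following can be run in the same Hive session. This is a sketch that reuses the geeksportal database and geekdata table from the statement above:

```sql
-- Switch to the database so the table can be referenced without a prefix.
USE geeksportal;

-- List the tables in the current database; geekdata should appear.
SHOW TABLES;

-- Show the column names and types (id int, name string).
DESCRIBE geekdata;
```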
3. Display databases
show databases;
Output: Displays the databases created.
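When many databases exist, the listing can be narrowed with a pattern. A small sketch, assuming the geeksportal database created earlier exists; the pattern itself is only an illustration:

```sql
-- List only databases whose names start with "geeks".
-- In Hive's SHOW ... LIKE patterns, * matches any sequence of characters.
SHOW DATABASES LIKE 'geeks*';
```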
4. Describe Database
Syntax: describe database database_name;
Example: describe database geeksportal;
Output: Displays the HDFS path of the specified database.
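If more detail is needed, Hive also offers an extended form of the same command. A sketch, again using the geeksportal database; the exact columns printed can vary between Hive versions:

```sql
-- Show the database's description, including its HDFS location
-- and any database properties (DBPROPERTIES) that have been set.
DESCRIBE DATABASE EXTENDED geeksportal;
```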