
Hadoop – Python Snakebite CLI Client, Its Usage and Command References

Last Updated : 14 Oct, 2020

Python Snakebite comes with a CLI (Command Line Interface) client built on top of its HDFS client library. To use the Snakebite CLI, we must know the hostname or IP address of the NameNode and the NameNode's RPC port. We can store these details in our own configuration file, which holds the NameNode hostname (localhost here) and RPC (Remote Procedure Call) port. In our demonstration, we will take the simpler route of passing the host and port values directly in the command itself. Remote Procedure Call, or RPC, is a protocol that lets a program invoke a procedure on another machine over the network; the NameNode RPC port is the port on which the NameNode listens for such client requests.

The hostname and port values we are using here can be found in the fs.defaultFS property (named fs.default.name in older Hadoop versions) of the hadoop/etc/hadoop/core-site.xml file on your system. The Snakebite CLI documentation has more information about Snakebite CLI configuration.
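As a rough illustration of the configuration-file route, the sketch below generates the text of a Snakebite config file for a single NameNode. The JSON layout (`config_version`, `namenodes`, and the per-NameNode `host`, `port`, and `version` keys) is an assumption based on the format described in the Snakebite documentation for `~/.snakebiterc`, and `build_snakebiterc` is a hypothetical helper name; check the documentation for your installed version before relying on it.

```python
import json


def build_snakebiterc(host, port, version=9):
    """Return config-file text with a single NameNode entry.

    The key names follow the ~/.snakebiterc layout as assumed here;
    `version` is the Hadoop RPC protocol version.
    """
    config = {
        "config_version": 2,
        "namenodes": [{"host": host, "port": port, "version": version}],
    }
    return json.dumps(config, indent=2)


# Print the text that would be saved as ~/.snakebiterc
print(build_snakebiterc("localhost", 9000))
```

With such a file in place, the host and port would no longer need to be repeated in every command.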

We can also check this property value with the command below.

hdfs getconf -confKey fs.defaultFS      # fs.default.name also works, but it is the deprecated name for fs.defaultFS

Let’s look at the fs.default.name property value manually in the core-site.xml file on our system to find the host and port.

We can see that our default host is localhost and the port is 9000.
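The same lookup can be done programmatically. The sketch below parses a core-site.xml document with the Python standard library and extracts the host and port from the fs.defaultFS (or fs.default.name) value. The embedded XML is a made-up sample matching the values used in this article, and `namenode_address` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Sample content standing in for hadoop/etc/hadoop/core-site.xml (illustrative only)
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
"""


def namenode_address(core_site_xml):
    """Return (host, port) from fs.defaultFS or fs.default.name, or None if absent."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") in ("fs.defaultFS", "fs.default.name"):
            uri = urlparse(prop.findtext("value").strip())
            return uri.hostname, uri.port
    return None


print(namenode_address(SAMPLE_CORE_SITE))
```

To use it against a real installation, read the file's text first and pass it to `namenode_address`.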

Usage Of Snakebite CLI

With the help of the Python Snakebite CLI, we can run most of the commands that we use with hdfs dfs, such as ls, mv, rm, get, du, df, etc. Commands that write file data into HDFS, such as put, are not supported by snakebite. So let’s perform some basic operations to understand how the Snakebite CLI works.

Using Snakebite CLI via path in command line – eg: hdfs://namenode_host:port/path

1. Listing all the directories available in the root directory of HDFS

Syntax:

snakebite ls hdfs://localhost:9000/<path>

Example:

snakebite ls hdfs://localhost:9000/


2. Removing a file from HDFS

Syntax:

snakebite rm hdfs://localhost:9000/<file_path_with_name>

Example:

snakebite rm hdfs://localhost:9000/data.txt

3. Creating a directory (named /sample in this example)

Syntax:

snakebite mkdir hdfs://localhost:9000/<path_with_directory_name>

Example:

snakebite mkdir hdfs://localhost:9000/sample

4. Removing a directory (named /sample in this example)

snakebite rmdir hdfs://localhost:9000/sample
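All four commands above share one shape: `snakebite <action> hdfs://<host>:<port><path>`. The sketch below captures that pattern in a small Python helper that builds the argument list and, optionally, runs it with `subprocess`. The helper names `snakebite_command` and `run_snakebite` are hypothetical, and actually running a command assumes the snakebite executable is on the PATH and the NameNode is reachable.

```python
import subprocess


def snakebite_command(action, path, host="localhost", port=9000, extra_args=()):
    """Build the argument list for one snakebite CLI call using a full hdfs:// URI."""
    if not path.startswith("/"):
        path = "/" + path
    uri = f"hdfs://{host}:{port}{path}"
    # extra_args go between the action and the URI, e.g. ("5",) for setrep
    return ["snakebite", action, *extra_args, uri]


def run_snakebite(action, path, **kwargs):
    """Run the command and return its stdout.

    Assumes snakebite is installed and HDFS is running; raises
    CalledProcessError if the command exits with a non-zero status.
    """
    result = subprocess.run(
        snakebite_command(action, path, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Only build the commands here; nothing is executed against HDFS
print(snakebite_command("ls", "/"))
print(snakebite_command("mkdir", "sample"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a path contains spaces.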

The examples above give us an idea of how to use the snakebite command-line interface. The important difference between the snakebite CLI and hdfs dfs is that snakebite is a pure Python client library and does not use any Java library to communicate with HDFS. Because it does not need to start a JVM for each command, snakebite commands typically interact with HDFS faster than hdfs dfs.
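Since the CLI is a thin layer over the Python library, the same operations can also be called from Python code. The sketch below assumes the `snakebite.client.Client` class with its `ls` and `mkdir` methods, which take a list of paths and return lazy generators of dictionaries; the wrapper names `make_client`, `list_paths`, and `make_directory` are hypothetical. Note that the original snakebite package targets Python 2, so a Python 3 port may be needed.

```python
def make_client(host="localhost", port=9000):
    """Create a snakebite client for the given NameNode host and RPC port."""
    # Imported lazily so the helpers below can be used with any object
    # that offers the same ls/mkdir methods
    from snakebite.client import Client
    return Client(host, port)


def list_paths(client, path="/"):
    """Return the paths of the entries directly under `path`."""
    # client.ls yields one dictionary per entry; "path" is one of its keys
    return [entry["path"] for entry in client.ls([path])]


def make_directory(client, path):
    """Create `path` (and any missing parents) and return the per-path results."""
    # mkdir returns a generator; the request is only carried out once the
    # generator is consumed, so wrap it in list()
    return list(client.mkdir([path], create_parent=True))
```

A typical call would be `list_paths(make_client())`, which corresponds to `snakebite ls hdfs://localhost:9000/`. Forgetting to consume the generator returned by methods such as `mkdir` is a common reason for a call that appears to do nothing.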

CLI Command Reference

The Python Snakebite library provides many facilities for working with HDFS. All the available switches and commands can be listed by running the snakebite command with no arguments.

snakebite

We can observe that most of the commands available in hdfs dfs have counterparts in the snakebite command-line interface. Let’s try a few more to get a better insight into the snakebite CLI.

Check the snakebite version with the below command

snakebite --ver

1. cat: It is used to print the file data

Example:

snakebite cat hdfs://localhost:9000/test.txt

2. copyToLocal (or) get: It copies files/folders from HDFS to the local file system.

Syntax:

snakebite copyToLocal <source> <destination>

Example:

snakebite copyToLocal  hdfs://localhost:9000/test.txt /home/dikshant/Pictures

3. touchz: It creates an empty file.

Syntax:

snakebite touchz hdfs://localhost:9000/<file_path_with_name>

Example:

snakebite touchz  hdfs://localhost:9000/demo_file

4. du: It displays disk usage statistics.

snakebite du  hdfs://localhost:9000/    # show disk usage of root directory

snakebite du  hdfs://localhost:9000/Hadoop_File   # show disk usage of the /Hadoop_File directory, which must already exist

5. stat: It displays the status information of a file or directory, including its last modified time.

snakebite stat  hdfs://localhost:9000/

snakebite stat  hdfs://localhost:9000/Hadoop_File

6. setrep: This command is used to change the replication factor of a file/directory in HDFS. By default, it is 3 for anything stored in HDFS (as set by the dfs.replication property in hdfs-site.xml).

snakebite setrep 5  hdfs://localhost:9000/test.txt

In this example, we have changed the replication factor of the test.txt file from 1 to 5.

Similarly, we can perform many other operations on HDFS using the Python Snakebite CLI.


