How to Install Apache Pig in Linux?

Apache Pig is a high-level platform for processing large datasets. It provides a high level of abstraction over MapReduce, along with a high-level scripting language known as Pig Latin, which is used to write data analysis programs.

In order to install Apache Pig, you must have Hadoop and Java installed on your system.
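
Before starting, it can help to confirm that both prerequisites are reachable from your shell. The snippet below is a minimal sketch: check_tool is a hypothetical helper name, and it only checks that the java and hadoop commands are on your PATH, not that their versions are suitable for Pig.

```shell
# Report whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: NOT found"
    fi
}

# For version details you can also run `java -version` and `hadoop version`.
check_tool java
check_tool hadoop
```

If either tool is reported as not found, install it or fix your PATH before continuing.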

Step 1: Download the latest release of Apache Pig from the official Apache Pig download page. In my case I downloaded pig-0.17.0.tar.gz, which was the latest version at the time of writing and is about 220MB in size.

Step 2: Now move the downloaded Pig tar file to your desired location. In my case I am moving it to my /Documents folder.

Step 3: Now extract the tar file with the command below (make sure to use the name of your own tar file):

tar -xvf pig-0.17.0.tar.gz
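
If you are unsure which folder the archive will create, you can list its contents before extracting. This is a small sketch: top_level_dir is a hypothetical helper, and the archive name is assumed to match the one used above.

```shell
# Print the top-level folder stored in a tar archive without extracting it.
top_level_dir() {
    tar -tf "$1" | head -n 1 | cut -d/ -f1
}

archive="pig-0.17.0.tar.gz"   # assumed file name; change it to match your download
if [ -f "$archive" ]; then
    top_level_dir "$archive"
else
    echo "$archive not found in $(pwd)"
fi
```

The later steps expect the extracted folder to be named pig-0.17.0, so adjust the commands if the helper prints something else.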

Step 4: Once the archive is extracted, switch to your Hadoop user. In my case it is hadoopusr. If you have not created a separate dedicated user for Hadoop, you do not need to move the folder; just set the path in the .bashrc file according to where your Pig folder is located. To switch users, you can use the command below or switch manually through the switch user settings.

su - hadoopusr

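Step 5: Now we need to move the extracted folder to /usr/local/ so that the hadoopusr user can use it. Run the command below from the directory that contains the extracted folder, since su - starts you in the new user's home directory (make sure the name of your extracted folder is pig-0.17.0; otherwise change it accordingly):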

sudo mv pig-0.17.0 /usr/local/

Step 6: Now that the folder has been moved, we need to set the environment variables for Pig's location. To do that, open the .bashrc file with the command below.

gedit ~/.bashrc

Once the file is open, add the lines below to it and save the file.

# Pig location
export PIG_INSTALL=/usr/local/pig-0.17.0
export PATH=$PATH:$PIG_INSTALL/bin
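
If you prefer not to use a graphical editor, the same settings can be appended from the terminal. This is a sketch under the assumption that Pig lives in /usr/local/pig-0.17.0; add_pig_env is a hypothetical helper that skips the append when the file already defines PIG_INSTALL, so running it twice does not duplicate the entries.

```shell
# Append the Pig environment settings to a shell startup file, once.
add_pig_env() {
    rc_file="$1"
    # Do nothing if the file already defines PIG_INSTALL.
    if grep -q 'PIG_INSTALL=' "$rc_file" 2>/dev/null; then
        echo "Pig settings already present in $rc_file"
        return 0
    fi
    # The quoted 'EOF' keeps $PATH and $PIG_INSTALL unexpanded in the file.
    cat >> "$rc_file" <<'EOF'

# Pig location
export PIG_INSTALL=/usr/local/pig-0.17.0
export PATH=$PATH:$PIG_INSTALL/bin
EOF
    echo "Pig settings added to $rc_file"
}

# Example call (uncomment to modify your own startup file):
# add_pig_env "$HOME/.bashrc"
```

Here $PIG_INSTALL/bin resolves to /usr/local/pig-0.17.0/bin, the same directory used in the lines above.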

Step 7: Then reload the .bashrc file so that the new settings take effect in the current terminal session. If the command below returns without errors, the file was read successfully:

source ~/.bashrc
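
To confirm that the settings took effect after reloading, you can inspect the environment. The following is a sketch: path_has_dir is a hypothetical helper, and the install path is assumed from the earlier steps.

```shell
# Succeed if the given directory is one of the entries in PATH.
path_has_dir() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

pig_bin="/usr/local/pig-0.17.0/bin"   # assumed install location
if path_has_dir "$pig_bin"; then
    echo "PATH contains $pig_bin"
else
    echo "PATH does not contain $pig_bin"
fi
```

Wrapping PATH in colons lets the pattern match a whole entry rather than a partial directory name.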

Step 8: If the previous step completed without errors, Pig is now installed on our single-node Hadoop setup. Start Pig with the pig command below, which opens Pig's interactive Grunt shell.

pig
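
To confirm that Pig can actually run a job, you can try a tiny script in local mode, which does not need a running Hadoop cluster. This is only a sketch: the file names and sample data are made up for illustration, and the script is executed only when a pig command is found on your PATH.

```shell
# Create a scratch directory with a small input file and a Pig Latin script.
work_dir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$work_dir/words.txt"

# Unquoted EOF so that $work_dir is expanded into the script.
cat > "$work_dir/smoke.pig" <<EOF
-- Load each line of the sample file and print it.
words = LOAD '$work_dir/words.txt' AS (word:chararray);
DUMP words;
EOF

# Run the script in local mode only if Pig is installed.
if command -v pig >/dev/null 2>&1; then
    pig -x local "$work_dir/smoke.pig"
else
    echo "pig not found on PATH; script saved at $work_dir/smoke.pig"
fi
```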

Step 9: You can check your Pig version with the command below.

pig -version
