How to Install Apache Pig in Linux?

Last Updated : 05 Oct, 2021

Apache Pig is a high-level platform for processing large datasets. It provides an abstraction over MapReduce, along with a high-level scripting language known as Pig Latin, which is used to write data analysis programs.

In order to install Apache Pig, you must have Hadoop and Java installed on your system.
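Before going further, it can help to confirm that both prerequisites are reachable from your shell. The snippet below is only a minimal sketch: the helper name check_cmd is made up for illustration and is not part of Pig or Hadoop.

```shell
# check_cmd is a hypothetical helper: report whether a command is on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

check_cmd java
check_cmd hadoop
```

If either line reports "missing", install that prerequisite (or fix your PATH) before continuing.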

Step 1: Download the latest release of Apache Pig from the official Apache Pig download page. In my case I downloaded pig-0.17.0.tar.gz, which was the latest version at the time of writing and is about 220MB in size.

Step 2: Move the downloaded Pig tar file to your desired location. In my case I am moving it to my Documents folder.

Step 3: Extract the tar file with the command below (adjust the filename if yours differs):

tar -xvf pig-0.17.0.tar.gz
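If you are unsure what folder name the archive will produce, you can list its contents without extracting it. The following is a small sketch under the assumption that the archive has a single top-level folder; top_dir is a hypothetical helper name, and the filename is the one used in this article.

```shell
# top_dir is a hypothetical helper: print the top-level folder inside a tar archive.
top_dir() {
    tar -tf "$1" | head -n 1 | cut -d/ -f1
}

archive=pig-0.17.0.tar.gz   # adjust to your filename
if [ -f "$archive" ]; then
    echo "Extracts to: $(top_dir "$archive")"
else
    echo "$archive not found in $(pwd)"
fi
```

The folder name it prints is the one to use in the later steps.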

Step 4: Once the archive is extracted, switch to your Hadoop user. In my case it is hadoopusr. If you have not created a separate dedicated user for Hadoop, there is no need to move the folder; just set the Pig path in your .bashrc file to wherever you extracted it. To switch user, use the command below, or switch manually through your desktop's switch-user settings.

su - hadoopusr
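If you are not sure whether a dedicated Hadoop account exists on your machine, a quick check like the one below can tell you which path to follow. This is a sketch only: user_exists is a hypothetical helper, and hadoopusr is simply the user name from this article.

```shell
# user_exists is a hypothetical helper: succeed if the named account exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

hadoop_user=hadoopusr   # the user name used in this article; replace with yours
if user_exists "$hadoop_user"; then
    echo "$hadoop_user exists; switch with: su - $hadoop_user"
else
    echo "$hadoop_user not found; stay on your current user"
fi
```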

Step 5: Now move the extracted folder to /usr/local so that the hadoopusr user can use it. Run the command below from the directory that contains the extracted folder (make sure the folder is named pig-0.17.0; otherwise change the name accordingly):

sudo mv pig-0.17.0 /usr/local/

Step 6: After moving the folder, set the environment variables for Pig’s location. To do that, open the .bashrc file with the command below.

sudo gedit ~/.bashrc

Once the file opens, add the lines below to it and save the file.

#Pig location
export PIG_INSTALL=/usr/local/pig-0.17.0
export PATH=$PATH:$PIG_INSTALL/bin
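If you prefer not to edit the file by hand, the same lines can be appended from the shell. The sketch below assumes the install location used in this article; add_pig_env is a hypothetical helper that skips the append when a PIG_INSTALL line is already present, so running it twice does not duplicate the entries.

```shell
# add_pig_env is a hypothetical helper: append the Pig settings to a shell
# startup file ($1) for the install directory ($2), unless already present.
add_pig_env() {
    rc_file=$1
    pig_home=$2
    if grep -q 'PIG_INSTALL=' "$rc_file" 2>/dev/null; then
        echo "PIG_INSTALL already set in $rc_file"
        return 0
    fi
    {
        echo '#Pig location'
        echo "export PIG_INSTALL=$pig_home"
        echo 'export PATH=$PATH:$PIG_INSTALL/bin'
    } >> "$rc_file"
}

# Example call (commented out so that nothing is modified by accident):
# add_pig_env ~/.bashrc /usr/local/pig-0.17.0
```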

Step 7: Apply the changes to the current shell by reloading the .bashrc file with the command below. If it prints no errors, the file loaded cleanly:

source ~/.bashrc
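Reloading the file does not by itself prove that Pig is reachable. One way to confirm it is to check that the Pig bin directory is on PATH and, if so, ask Pig for its version. This is a sketch assuming the install path from this article; path_has_dir is a hypothetical helper.

```shell
# path_has_dir is a hypothetical helper: succeed if directory $1 is an entry in
# the colon-separated list $2 (defaults to the current PATH).
path_has_dir() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

pig_bin=/usr/local/pig-0.17.0/bin
if path_has_dir "$pig_bin"; then
    echo "$pig_bin is on PATH"
    # Print the installed version if the launcher is reachable.
    if command -v pig >/dev/null 2>&1; then pig -version; fi
else
    echo "$pig_bin is not on PATH; re-check ~/.bashrc"
fi
```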

Step 8: If everything went well, that’s it: we have successfully installed Pig on our single-node Hadoop setup. Now start Pig by running the pig command.
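As a quick smoke test of the installation, you could run a tiny Pig Latin script in local mode, which reads from the local filesystem instead of HDFS. The snippet below is a sketch, not part of the original steps: the file names and the two-line input are invented for illustration, and it only invokes pig if the launcher is found on PATH.

```shell
# Create a scratch directory with a small input file and a two-line Pig script.
workdir=$(mktemp -d)
printf 'hello\npig\n' > "$workdir/input.txt"
cat > "$workdir/smoke.pig" <<EOF
lines = LOAD '$workdir/input.txt' AS (line:chararray);
DUMP lines;
EOF

# Run the script in local mode only if the pig launcher is available.
if command -v pig >/dev/null 2>&1; then
    pig -x local "$workdir/smoke.pig"
else
    echo "pig not found on PATH"
fi
```

Running pig with no arguments instead opens the interactive Grunt shell in the default MapReduce mode, which expects your Hadoop setup to be running.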