Open In App

Installing and Setting Up Hadoop in Pseudo-Distributed Mode in Windows 10

To Perform setting up and installing Hadoop in the pseudo-distributed mode in Windows 10 using the following steps given below as follows. Let’s discuss one by one.

Step 1: Download Binary Package :



Download the latest binary from the following site as follows.

http://hadoop.apache.org/releases.html



For reference, you can check the file save to the folder as follows.

C:\BigData

Step 2: Unzip the binary package

Open Git Bash, and change directory (cd) to the folder where you save the binary package and then unzip as follows.

$ cd C:\BigData
MINGW64: C:\BigData
$ tar -xvzf  hadoop-3.1.2.tar.gz

For my situation, the Hadoop twofold is extricated to C:\BigData\hadoop-3.1.2.  

Next, go to this GitHub Repo and download the receptacle organizer as a speed as demonstrated as follows. Concentrate the compress and duplicate all the documents present under the receptacle envelope to C:\BigData\hadoop-3.1.2\bin. Supplant the current records too. 

Step 3: Create folders for datanode and namenode :

HADOOP_HOME=” C:\BigData\hadoop-3.1.2”
HADOOP_BIN=”C:\BigData\hadoop-3.1.2\bin”
JAVA_HOME=<Root of your JDK installation>”
Right-click -> Properties -> Advanced System settings -> Environment variables. 

Step 4: To make Short Name of Java Home path

echo %HADOOP_HOME%
echo %HADOOP_BIN%
echo %PATH%

Step 5: Configure Hadoop

Once environment variables are set up, we need to configure Hadoop by editing the following configuration files.

hadoop-env.cmd
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml
hadoop-env.cmd

First, let’s configure the Hadoop environment file. Open C:\BigData\hadoop-3.1.2\etc\hadoop\hadoop-env.cmd and add below content at the bottom

set HADOOP_PREFIX=%HADOOP_HOME%
set HADOOP_CONF_DIR=%HADOOP_PREFIX%\etc\hadoop
set YARN_CONF_DIR=%HADOOP_CONF_DIR%
set PATH=%PATH%;%HADOOP_PREFIX%\bin

Step 6: Edit hdfs-site.xml 

After editing core-site.xml, you need to set the replication factor and the location of namenode and datanodes. Open C:\BigData\hadoop-3.1.2\etc\hadoop\hdfs-site.xml and below content within <configuration> </configuration> tags.

<configuration>
 <property>
    <name>dfs.replication</name>
    <value>1</value>
 </property>
 <property>
    <name>dfs.namenode.name.dir</name>
    <value>C:\BigData\hadoop-3.2.1\data\namenode</value>
 </property>
 <property>
    <name>dfs.datanode.data.dir</name>
    <value>C:\BigData\hadoop-3.1.2\data\datanode</value>
 </property>
</configuration>

Step 7: Edit core-site.xml

Now, configure Hadoop Core’s settings. Open C:\BigData\hadoop-3.1.2\etc\hadoop\core-site.xml and below content within <configuration> </configuration> tags.

<configuration>
 <property>
   <name>fs.default.name</name>
   <value>hdfs://0.0.0.0:19000</value>
 </property>  
</configuration>

Step 8: YARN configurations

Edit file yarn-site.xml      

Make sure the following entries are existing as follows.

<configuration> <property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value> </property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>

Step 9: Edit mapred-site.xml

At last, how about we arrange properties for the Map-Reduce system. Open C:\BigData\hadoop-3.1.2\etc\hadoop\mapred-site.xml and beneath content inside <configuration> </configuration> labels. In the event that you don’t see mapred-site.xml, at that point open mapred-site.xml.template record and rename it to mapred-site.xml

<configuration>
 <property>
    <name>mapreduce.job.user.name</name>    <value>%USERNAME%</value>
 </property>
 <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
 </property>
 <property>
    <name>yarn.apps.stagingDir</name>    <value>/user/%USERNAME%/staging</value>
 </property>
 <property>
    <name>mapreduce.jobtracker.address</name>
    <value>local</value>
 </property>
</configuration>

Check if C:\BigData\hadoop-3.1.2\etc\hadoop\slaves file is present, if it’s not then created one and add localhost in it and save it.

Step 10: Format Name Node :

To organize the Name Node, open another Windows Command Prompt and run the beneath order. It might give you a few admonitions, disregard them.

                       Format Hadoop Name Node

Step 11: Launch Hadoop :

Open another Windows Command brief, make a point to run it as an Administrator to maintain a strategic distance from authorization mistakes. When opened, execute the beginning all.cmd order. Since we have added %HADOOP_HOME%\sbin to the PATH variable, you can run this order from any envelope. In the event that you haven’t done as such, at that point go to the %HADOOP_HOME%\sbin organizer and run the order.

You can check the given below screenshot for your reference 4 new windows will open and cmd terminals for 4 daemon processes like as follows. 

Don’t close these windows, minimize them. Closing the windows will terminate the daemons. You can run them in the background if you don’t like to see these windows.

Step 12: Hadoop Web UI

In conclusion, how about we screen to perceive how are Hadoop daemons are getting along. Also you can utilize the Web UI for a wide range of authoritative and observing purposes. Open your program and begin.

Step 13: Resource Manager

Open localhost:8088 to open Resource Manager

Step 14: Node Manager

Open localhost:8042 to open Node Manager

Step 15: Name Node :

Open localhost:9870 to check out the health of Name Node

Step 16: Data Node :

Open localhost:9864 to check out Data Node


Article Tags :