Install Hadoop

How to Install Hadoop with Step by Step Configuration on Ubuntu?

In this tutorial, we are going to install Apache Hadoop on an Ubuntu system and then configure the installation.

Open the terminal in Ubuntu by pressing Ctrl+Shift+T.

Step 1) Add a new user.

To keep things clean, we will create a new group named “hadoop” and a dedicated user inside it. First, create the group.

sudo addgroup hadoop

You will be asked for your sudo password. Enter the password and press Enter.

Then execute the following command to create the user.

sudo adduser --ingroup hadoop supper_user

Then enter the requested information, including a password for the new user.

Note: Remember the password that you entered.

Step 2) Configure SSH

If SSH is not installed on your system, enter the command below in the terminal.

sudo apt-get install openssh-server

Now add the supper_user to the sudo group.

sudo adduser supper_user sudo

Next, switch to the supper_user account that you created above using the command below.

su supper_user

Enter the password which you set above while creating the user.

Now create the authentication key pair for SSH by executing the command below.

ssh-keygen -t rsa -P ""

Remember to run this command as the supper_user account that we created.

Now enable key-based SSH access by appending the public key to the authorized_keys file.

cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

Let’s now test our SSH using the following command.

ssh localhost

If the login succeeds without asking for a password, congratulations, you are on your way.
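
If ssh localhost still asks for a password, one common cause is that the .ssh directory or the authorized_keys file is writable by other users, which the SSH server rejects by default. The following is a minimal sketch of tightening those permissions. The name fix_ssh_perms is a hypothetical helper used only for illustration; it is not part of OpenSSH.

```shell
# fix_ssh_perms is a hypothetical helper, not part of OpenSSH.
# $1: path of the SSH directory (defaults to $HOME/.ssh).
fix_ssh_perms() {
  ssh_dir="${1:-$HOME/.ssh}"
  # Only the owner may read, write, or enter the directory.
  chmod 700 "$ssh_dir"
  # Only the owner may read or write the list of accepted public keys.
  if [ -f "$ssh_dir/authorized_keys" ]; then
    chmod 600 "$ssh_dir/authorized_keys"
  fi
}
```

For example, run fix_ssh_perms "$HOME/.ssh" as supper_user and then try ssh localhost again.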

Step 3) Download Hadoop

Download the Hadoop release archive from the Hadoop website. This tutorial uses hadoop-3.3.0.tar.gz.

I downloaded Hadoop to the Downloads folder. Now move to the Downloads folder using the command below.

cd /home/ahmed/Downloads/

Extract the archive.

sudo tar xzf hadoop-3.3.0.tar.gz

Change the owner of the extracted directory to supper_user.

sudo chown -R supper_user:hadoop hadoop-3.3.0
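
The chown command above assumes that the archive unpacks into a directory named hadoop-3.3.0. A quick way to confirm the directory name without extracting anything is to list the archive. This is only a sketch, and archive_top_dir is a hypothetical helper name, not part of tar or Hadoop.

```shell
# archive_top_dir is a hypothetical helper, not part of tar or Hadoop.
# It lists a gzip-compressed tar archive without extracting it and prints
# the first path component of the first entry, which is normally the
# top-level directory.
archive_top_dir() {
  tar tzf "$1" | awk -F/ 'NR == 1 { print $1 }'
}
```

For example, archive_top_dir hadoop-3.3.0.tar.gz prints the directory name to pass to chown.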

Step 4) Configuration of Hadoop

We are going to modify the following files.

  1. .bashrc
  2. hadoop-env.sh
  3. core-site.xml
  4. hdfs-site.xml
  5. mapred-site.xml
  6. yarn-site.xml

File 1:

Open the .bashrc file of supper_user in an editor.

vi ~/.bashrc

Add the following lines at the end of the file. HADOOP_HOME must point to the directory where you extracted Hadoop, so adjust the path if you extracted it somewhere else.

export HADOOP_HOME=/home/ahmed/Downloads/hadoop-3.3.0
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

Once you have added the variables above to the file, save it and exit the editor.

Now apply the changes using the command below.

source ~/.bashrc
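
After reloading .bashrc, it can be worth checking that every variable actually reached the current shell. The sketch below assumes the variable names from the list above; check_hadoop_env is a hypothetical helper name, not something Hadoop provides.

```shell
# check_hadoop_env is a hypothetical helper, not part of Hadoop.
# It prints one line for each expected variable that is unset or empty,
# and prints nothing when all of them are set.
check_hadoop_env() {
  for name in HADOOP_HOME HADOOP_INSTALL HADOOP_MAPRED_HOME HADOOP_COMMON_HOME \
              HADOOP_HDFS_HOME YARN_HOME HADOOP_COMMON_LIB_NATIVE_DIR; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set"
    fi
  done
}
```

Running check_hadoop_env with no output means all of the variables are present in the current shell.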

File 2:

Edit the hadoop-env.sh file using the command below.

sudo vi /home/ahmed/Downloads/hadoop-3.3.0/etc/hadoop/hadoop-env.sh

Uncomment the export JAVA_HOME statement and set it to the path of your Java installation (Java must already be installed on the system).

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
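
The JAVA_HOME path above is specific to one machine. Assuming a java binary is already on the PATH, a sketch like the following can derive a candidate value. The name java_home_from_binary is a hypothetical helper, and the result should be checked before it is copied into hadoop-env.sh.

```shell
# java_home_from_binary is a hypothetical helper, not part of Java or Hadoop.
# Given the full path of a java executable, print the installation directory
# by stripping the trailing "/bin/java".
java_home_from_binary() {
  printf '%s\n' "${1%/bin/java}"
}

# Usage on a real system (resolves symlinks such as /usr/bin/java first):
#   java_home_from_binary "$(readlink -f "$(command -v java)")"
```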

File 3:

Now let’s edit the core-site.xml file.

sudo vi $HADOOP_HOME/etc/hadoop/core-site.xml

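Add the properties below inside the configuration tag of core-site.xml. hadoop.tmp.dir points to the /app/hadoop/tmp directory, which is created in the next step, and fs.defaultFS is the current name of the deprecated fs.default.name key.

<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
</property>
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://127.0.0.1:9000</value>
</property>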
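
As an alternative to editing the file by hand, the same two properties can be written from the shell with a here-document. This is only a sketch under stated assumptions: write_core_site is a hypothetical helper name, it overwrites any existing core-site.xml in the target directory, and it uses fs.defaultFS, the current name of the deprecated fs.default.name key.

```shell
# write_core_site is a hypothetical helper, not part of Hadoop.
# $1: Hadoop configuration directory
# $2: directory to use for hadoop.tmp.dir
# $3: NameNode URI (defaults to the value used in this tutorial)
# Warning: this overwrites an existing core-site.xml in $1.
write_core_site() {
  conf_dir="$1"
  tmp_dir="$2"
  fs_uri="${3:-hdfs://127.0.0.1:9000}"
  cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$tmp_dir</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>$fs_uri</value>
  </property>
</configuration>
EOF
}
```

For example, write_core_site "$HADOOP_HOME/etc/hadoop" /app/hadoop/tmp writes the file into the Hadoop configuration directory.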

Now move to the configuration directory, then create the /app/hadoop/tmp directory for Hadoop’s temporary data, make supper_user its owner, and restrict its permissions.

cd $HADOOP_HOME/etc/hadoop
sudo mkdir -p /app/hadoop/tmp
sudo chown -R supper_user:hadoop /app/hadoop/tmp
sudo chmod 750 /app/hadoop/tmp

File 4:

Now let’s edit the hdfs-site.xml file.

sudo vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml

Add the following properties so that the configuration tag looks like this.

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/supper_user/hdfs</value>
</property>
</configuration>

Save and exit.
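
To double-check a value after saving, the file can be read back from the shell. The sketch below is a deliberately simple lookup, not a real XML parser: get_property is a hypothetical helper name, and it assumes that each name element and its value element sit on their own lines, as in the listing above.

```shell
# get_property is a hypothetical helper, not part of Hadoop.
# $1: path of a Hadoop XML configuration file
# $2: property name to look up
# Prints the first <value> found after the matching <name> line.
get_property() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { want = 1; next }
    want && match($0, /<value>.*<\/value>/) {
      # Skip the 7 characters of <value> and drop the 8 of </value>.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}
```

For example, get_property "$HADOOP_HOME/etc/hadoop/hdfs-site.xml" dfs.replication should print the replication factor that was just saved.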

File 5:

Now edit the mapred-site.xml file.

sudo vi $HADOOP_HOME/etc/hadoop/mapred-site.xml

Add the following data in the mapred-site.xml file.

<configuration>
<property>
<name>mapreduce.jobtracker.address</name>
<value>localhost:54311</value>
<description>MapReduce job tracker runs at this host and port.
</description>
</property>
</configuration>

Note: To run MapReduce jobs on YARN rather than in local mode, the mapreduce.framework.name property is normally also set to yarn in this file.

File 6:

Now we are going to edit the yarn-site.xml file.

sudo vi $HADOOP_HOME/etc/hadoop/yarn-site.xml

Add the following properties inside the configuration tag.

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>127.0.0.1</value>
</property>
<property>
  <name>yarn.acl.enable</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.env-whitelist</name>
  <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>

Save and exit.

Now let’s format the HDFS NameNode using the command below.

$HADOOP_HOME/bin/hdfs namenode -format

Next, start the HDFS daemons.

$HADOOP_HOME/sbin/start-dfs.sh

Then start the YARN daemons.

$HADOOP_HOME/sbin/start-yarn.sh

Type “jps” to check that the Hadoop processes are running.
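
Reading the jps listing by eye works, but a small filter makes a missing daemon easier to spot. The sketch below assumes a single-node setup in which the NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager should all be running; missing_daemons is a hypothetical helper name, not part of the JDK or Hadoop.

```shell
# missing_daemons is a hypothetical helper, not part of the JDK or Hadoop.
# It reads jps output on standard input and prints every expected Hadoop
# daemon that does not appear in it.
missing_daemons() {
  jps_output="$(cat)"
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # jps prints "<pid> <class name>", so compare the second field exactly.
    if ! printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}
```

Run it as jps | missing_daemons; no output means that all five daemons were found.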
