Installation of Apache Hadoop on Ubuntu
Jul 8, 2021
We are going to learn how to install Apache Hadoop on Ubuntu from scratch.
There are two prerequisites: you should have the Java (JDK) and Hadoop archive files available on your Ubuntu machine.
Paste the tar files for the JDK and Hadoop onto your Ubuntu desktop.
Open a terminal by right-clicking on the desktop.
Change the working directory to the Desktop.
Use tar to extract both archives.
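The change of directory and the extraction could be sketched as below. The helper name extract_archives is hypothetical, and the sketch assumes both downloads are gzip-compressed .tar.gz files; it is one way to do it, not necessarily the exact commands used in this walkthrough.

```shell
# Hypothetical helper: unpack every .tar.gz archive found in a folder.
# Assumption: the JDK and Hadoop downloads are gzip-compressed tarballs.
extract_archives() {
  dir="${1:-$HOME/Desktop}"          # default to the Desktop, as in the steps above
  cd "$dir" || return 1
  for archive in *.tar.gz; do
    [ -e "$archive" ] || continue    # skip when the pattern matches nothing
    tar -xzf "$archive"              # x = extract, z = gunzip, f = read from this file
  done
}
```

Running extract_archives ~/Desktop leaves the terminal in the Desktop folder, with one extracted folder per archive.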
Now go to Desktop → Hadoop folder → etc → hadoop and open core-site.xml.
Make the following entry in core-site.xml, inside the <configuration> element:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
Now open hdfs-site.xml and make the following entries in that file, replacing username with your own Ubuntu user name:
<property>
<name>dfs.name.dir</name>
<value>/home/username/metadata_nn/dfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/username/metadata_nn/dfs/name/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
Now open yarn-site.xml and make the following entry in it:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
Go back to the Desktop, go inside the JDK folder, start a terminal there, and type the command pwd.
Copy the output by selecting it and then right-clicking.
Now go back to Desktop → Hadoop folder → etc → hadoop and open hadoop-env.sh.
Change the JAVA_HOME variable there, replacing its value with the copied path.
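As an illustration, the edited line in hadoop-env.sh might end up looking like this. The path is a made-up placeholder; use whatever pwd printed in your JDK folder.

```shell
# Placeholder path (an assumption): replace it with the output of pwd from your JDK folder.
export JAVA_HOME=/home/username/Desktop/jdk-folder
```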
If this is your first installation, you need to install OpenSSH first.
If you get a connection error on port 22, make sure the SSH server is installed and running.
After this step completes, try the previous step again.
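The commands for the steps above are not shown here, so what follows is only a plausible sketch. It assumes the openssh-server package from Ubuntu's repositories and the standard ssh service name; the helper name ensure_ssh_server is hypothetical. A refused connection on port 22 usually means the SSH server is missing or not running, which is what the helper addresses.

```shell
# Hypothetical helper: install the OpenSSH server and make sure it is running.
# Needs sudo rights and a network connection.
ensure_ssh_server() {
  # Refresh the package index so apt can find the package.
  sudo apt-get update || return 1
  # openssh-server provides the SSH server that listens on port 22.
  sudo apt-get install -y openssh-server || return 1
  # Start the service in case it is not running yet.
  sudo service ssh start
}
```

After ensure_ssh_server finishes, ssh localhost should connect instead of reporting a refused connection on port 22.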
Once done, generate an SSH key pair with ssh-keygen.
Press Enter when it asks for the file in which to save the key, and answer y to any y/n question.
Append the generated public key to the ~/.ssh/authorized_keys file.
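The key steps above can also be done without prompts. The sketch below is an assumption rather than the exact procedure followed here: it uses an RSA key with an empty passphrase stored in the default ~/.ssh/id_rsa location, and the helper name setup_ssh_key is hypothetical.

```shell
# Hypothetical helper: create a passphrase-less RSA key and authorize it for localhost logins.
setup_ssh_key() {
  mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
  # Generate the key only if none exists, so an existing key is never overwritten.
  # -N "" sets an empty passphrase; -f names the key file; -q keeps the output quiet.
  [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -q -t rsa -N "" -f "$HOME/.ssh/id_rsa" || return 1
  # Authorize the public key so that logins to this machine need no password.
  cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
  chmod 600 "$HOME/.ssh/authorized_keys"
}
```

After calling setup_ssh_key, ssh localhost should log in without asking for a password, which Hadoop's start scripts rely on.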
Go to Desktop → Hadoop folder → bin and format the NameNode.
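Formatting is typically done with the hdfs tool that ships in Hadoop's bin folder. The sketch below assumes a Hadoop 2 or later layout; the helper name format_namenode and its folder argument are hypothetical.

```shell
# Hypothetical helper: format the NameNode of the Hadoop folder passed as the first argument.
# Formatting wipes existing HDFS metadata, so run it only before the very first start.
format_namenode() {
  hadoop_dir="$1"
  "$hadoop_dir/bin/hdfs" namenode -format
}
```

For example, format_namenode ~/Desktop/hadoop would format a Hadoop folder named hadoop on the Desktop; the folder name is only an example.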
Once the NameNode is formatted, set the Java and Hadoop home paths in .bashrc.
Go to the Desktop, start the terminal, and open ~/.bashrc in a text editor (for example, with gedit ~/.bashrc).
It takes some time; look for the editor icon on the left to find the .bashrc window.
Once it opens, add the following entries at the end of the .bashrc file:
export HADOOP_HOME=Path of your Hadoop Home
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export JAVA_HOME=Path of your Java Home
export PATH=$PATH:$JAVA_HOME/bin
Once done editing, run the following command to apply the changes:
source ~/.bashrc
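Before starting Hadoop, it may help to confirm that the two home variables really point at the extracted folders. The helper below is a hypothetical sanity check, not part of Hadoop; it only assumes that a JDK folder contains bin/java and a Hadoop folder contains bin/hadoop.

```shell
# Hypothetical helper: verify that JAVA_HOME and HADOOP_HOME point at real installations.
check_hadoop_env() {
  status=0
  # A JDK folder contains an executable bin/java.
  [ -x "$JAVA_HOME/bin/java" ] || { echo "JAVA_HOME is not a JDK folder: $JAVA_HOME"; status=1; }
  # A Hadoop folder contains an executable bin/hadoop.
  [ -x "$HADOOP_HOME/bin/hadoop" ] || { echo "HADOOP_HOME is not a Hadoop folder: $HADOOP_HOME"; status=1; }
  return "$status"
}
```

If check_hadoop_env prints nothing, both paths look right, and hadoop version should then work from any folder.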
Once the changes are applied, Hadoop can be started from anywhere by running start-all.sh.
You can check which daemons are running with the jps command.
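The check can be made a little more explicit. The sketch below assumes that Hadoop was started with start-all.sh (the counterpart of the stop-all.sh used below) and that the JDK's jps tool is used to list running Java processes. On a single-node setup the five daemons named in the loop are normally expected; the helper name missing_daemons is hypothetical.

```shell
# Hypothetical helper: read jps output on standard input and print each expected daemon that is absent.
missing_daemons() {
  # The second column of jps output is the process (class) name.
  running="$(awk '{print $2}')"
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # -x matches whole lines only, so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$running" | grep -qx "$daemon" || echo "$daemon"
  done
}
```

Running jps | missing_daemons prints nothing when all five daemons are up; any name it prints is a daemon that did not start.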
When you want to stop Hadoop, run the following command from anywhere:
stop-all.sh
You can check the whole procedure in the following video