Hadoop single node installation on Linux
Purpose: Install Hadoop on a single machine, then put files into the Hadoop file system and get them back.
Steps:
1. Download and install Hadoop
2. Configuration
3. Use basic commands like ls, put, get
Detailed Steps:
A. Hadoop install
Prerequisites:
1. Install Java
2. Create an hduser account in the OS
3. Enable SSH (a setup sketch follows this list)
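The following is a minimal sketch of the hduser and SSH prerequisites, assuming a Debian/Ubuntu-style system; the group and user names are only examples. Passwordless SSH to localhost is optional, but it avoids the repeated password prompts seen in the start-all.sh output later in this post.
# create a dedicated group and user for Hadoop (example names)
sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
# switch to hduser and set up passwordless SSH to localhost
su - hduser
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost   # should now log in without a password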
Steps:
1. Download hadoop-2.6.4.tar.gz: http://hadoop.apache.org/releases.html
2. Copy it to /opt/app
3. tar -xzf hadoop-2.6.4.tar.gz
4. mv hadoop-2.6.4 hadoop
5. chown -R hduser:hduser hadoop
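Put together, the install steps look roughly like this (a sketch, assuming the tarball sits in the current directory and that /opt/app may need to be created with root privileges):
sudo mkdir -p /opt/app
sudo cp hadoop-2.6.4.tar.gz /opt/app
cd /opt/app
sudo tar -xzf hadoop-2.6.4.tar.gz
sudo mv hadoop-2.6.4 hadoop
sudo chown -R hduser:hduser hadoop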
B. Start and Verify
1. Edit hadoop-env.sh and add an export for JAVA_HOME, or set JAVA_HOME in your bash profile (see the example just below).
export JAVA_HOME=<path to your JDK>
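For example, the ps output later in this post shows the JDK living at /home/sterin/Public/jdk1.8.0_45, so on that machine the line would be (adjust the path to your own JDK):
export JAVA_HOME=/home/sterin/Public/jdk1.8.0_45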
2. Go to /opt/app/hadoop/sbin
3. ./start-all.sh (provide the password when prompted)
[hadoop@sterinlap sbin]$ ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
Starting namenodes on []
hadoop@localhost's password:
localhost: starting namenode, logging to /opt/app/hadoop/logs/hadoop-hadoop-namenode-sterinlap.out
hadoop@localhost's password:
localhost: starting datanode, logging to /opt/app/hadoop/logs/hadoop-hadoop-datanode-sterinlap.out
Starting secondary namenodes [0.0.0.0]
hadoop@0.0.0.0's password:
0.0.0.0: starting secondarynamenode, logging to /opt/app/hadoop/logs/hadoop-hadoop-secondarynamenode-sterinlap.out
starting yarn daemons
starting resourcemanager, logging to /opt/app/hadoop/logs/yarn-hadoop-resourcemanager-sterinlap.out
hadoop@localhost's password:
localhost: starting nodemanager, logging to /opt/app/hadoop/logs/yarn-hadoop-nodemanager-sterinlap.out
4. Verify the processes with ps -ef | grep hadoop
[hduser@sterinlap sbin]$ ps -ef | grep hadoop
hduser 9745  1 20 11:22 pts/5 00:00:06 /home/sterin/Public/jdk1.8.0_45/bin/java -Dproc_resourcemanager -Xmx1000m -Dhadoop.log.di
hduser 10077 1 23 11:22 ?     00:00:06 /home/sterin/Public/jdk1.8.0_45/bin/java -Dproc_nodemanager
Total: 2 processes shown (output truncated)
5. Verify the hadoop command with "hadoop fs -ls"
Location of commands: /opt/app/hadoop/bin
/opt/app/hadoop/bin/hadoop fs -ls
Found 11 items
-rwxr-xr-x   1 oracle oracle     159223 2016-02-12 15:27 container-executor
(Even though this only lists the current local path, it is enough to verify the installation.)
6. Stop all Hadoop processes
/opt/app/hadoop/sbin/stop-all.sh
C. Configuration
1. Create a tmp directory for Hadoop under /opt/app/hadoop: /opt/app/hadoop/tmp (see the one-liner below)
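For example (a sketch; run it as hduser, assuming hduser owns /opt/app/hadoop as set up earlier):
mkdir -p /opt/app/hadoop/tmp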
2. Edit core-site.xml
/opt/app/hadoop/etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>
3. Create mapred-site.xml
Location: /opt/app/hadoop/etc/hadoop/
cp mapred-site.xml.template mapred-site.xml
4. Edit mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
  </property>
</configuration>
5. Create the namenode and datanode folders
Location: /opt/app/hadoop/
mkdir -p hadoop_store/hdfs/namenode
mkdir -p hadoop_store/hdfs/datanode
6. Edit hdfs-site.xml
Location: /opt/app/hadoop/etc/hadoop/
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/opt/app/hadoop/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/opt/app/hadoop/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>
7. Add the path in .bashrc (the bash profile under your home directory); a fuller sketch follows.
Edit .bashrc:
PATH=$PATH:/opt/app/hadoop/bin
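A slightly fuller .bashrc sketch, assuming the paths used in this post; the HADOOP_HOME variable and the sbin entry are optional additions beyond what the steps above require:
export JAVA_HOME=/home/sterin/Public/jdk1.8.0_45   # adjust to your own JDK
export HADOOP_HOME=/opt/app/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin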
8. Format the hadoop file system
hadoop namenode -format
16/04/25 12:11:03 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = -10-184-37-177.in..com/10.xx.37.177
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.4
16/04/25 12:11:05 INFO util.ExitUtil: Exiting with status 0
16/04/25 12:11:05 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at l-10-184-37-177..com/10.xx.37.177
9. Start Hadoop
Location: /opt/app/hadoop/sbin
./start-all.sh (as in section B, steps 2-3)
10. jps
12882 NameNode
13189 DataNode
14152 NodeManager
13816 ResourceManager
13529 SecondaryNameNode
14394 Jps
11. Check the file system (list the files)
hadoop fs -ls /
It will be blank.
12. Create a new folder
hadoop fs -mkdir /new
13. Verify it:
hadoop fs -ls /
Found 1 items
drwxr-xr-x   - sterin supergroup          0 2016-04-25 12:18 /new
14. Put command
Syntax: hadoop fs -put <source> <target>
/tmp/sterin : my local file
/new : the target directory in Hadoop
hadoop fs -put /tmp/sterin /new
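To confirm the upload, you could list the target directory (a usage sketch; the file keeps its local name):
hadoop fs -ls /new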
15. Get command
hadoop fs -get /new/sterin /home/sterin/Downloads/Chrome
/new/sterin : the source in Hadoop
/home/sterin/Downloads/Chrome : the destination on the local file system
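Instead of copying it back, you can also read the file straight from HDFS (a sketch using the path from the put step above):
hadoop fs -cat /new/sterin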
URLs: http://localhost:50070/ - web UI of the NameNode daemon