Hadoop single node installation on Linux
Purpose: Install Hadoop on a single machine, then put files into the Hadoop file system and get them back.
Steps:
1. Download and install Hadoop
2. Configuration
3. Use basic commands like ls, put, get
Detailed Steps:
A. Hadoop install
Prerequisites:
1. Install Java
2. Create an hduser account in the OS
3. Enable SSH (a setup sketch follows this list)
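The following is a minimal sketch of the hduser and SSH prerequisites, assuming a Debian/Ubuntu-style system; the group and user names are only examples. Passwordless SSH to localhost is optional, but it avoids the repeated password prompts seen in the start-all.sh output later in this post.
# create a dedicated group and user for Hadoop (example names)
sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
# switch to hduser and set up passwordless SSH to localhost
su - hduser
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost   # should now log in without a password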
Steps:
1. Download hadoop-2.6.4.tar.gz: http://hadoop.apache.org/releases.html
2. Copy it to /opt/app
3. tar -xzf hadoop-2.6.4.tar.gz
4. mv hadoop-2.6.4 hadoop
5. chown -R hduser:hduser hadoop
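Put together, the install steps look roughly like this (a sketch, assuming the tarball sits in the current directory and that /opt/app may need to be created with root privileges):
sudo mkdir -p /opt/app
sudo cp hadoop-2.6.4.tar.gz /opt/app
cd /opt/app
sudo tar -xzf hadoop-2.6.4.tar.gz
sudo mv hadoop-2.6.4 hadoop
sudo chown -R hduser:hduser hadoop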
B. Start and Verify
1. Edit hadoop-env.sh and add an export for JAVA_HOME, or set JAVA_HOME in your bash profile (see the example just below).
export JAVA_HOME=<path to your JDK>
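For example, the ps output later in this post shows the JDK living at /home/sterin/Public/jdk1.8.0_45, so on that machine the line would be (adjust the path to your own JDK):
export JAVA_HOME=/home/sterin/Public/jdk1.8.0_45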
2. Go to /opt/app/hadoop/sbin
3. ./start-all.sh (provide the password when prompted)
[hadoop@sterinlap sbin]$ ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
Starting namenodes on []
hadoop@localhost's password:
localhost: starting namenode, logging to /opt/app/hadoop/logs/hadoop-hadoop-namenode-sterinlap.out
hadoop@localhost's password:
localhost: starting datanode, logging to /opt/app/hadoop/logs/hadoop-hadoop-datanode-sterinlap.out
Starting secondary namenodes [0.0.0.0]
hadoop@0.0.0.0's password:
0.0.0.0: starting secondarynamenode, logging to /opt/app/hadoop/logs/hadoop-hadoop-secondarynamenode-sterinlap.out
starting yarn daemons
starting resourcemanager, logging to /opt/app/hadoop/logs/yarn-hadoop-resourcemanager-sterinlap.out
hadoop@localhost's password:
localhost: starting nodemanager, logging to /opt/app/hadoop/logs/yarn-hadoop-nodemanager-sterinlap.out
4. Verify the processes with ps -ef | grep hadoop
[hduser@sterinlap sbin]$ ps -ef | grep hadoop
hduser 9745  1 20 11:22 pts/5 00:00:06 /home/sterin/Public/jdk1.8.0_45/bin/java -Dproc_resourcemanager -Xmx1000m -Dhadoop.log.di
hduser 10077 1 23 11:22 ?     00:00:06 /home/sterin/Public/jdk1.8.0_45/bin/java -Dproc_nodemanager
Total: 2 processes shown (output truncated)
5. Verify the hadoop command with "hadoop fs -ls"
Location of commands: /opt/app/hadoop/bin
/opt/app/hadoop/bin/hadoop fs -ls
Found 11 items
-rwxr-xr-x   1 oracle oracle     159223 2016-02-12 15:27 container-executor
(Even though this only lists the current local path, it is enough to verify the installation.)
6. Stop all Hadoop processes
/opt/app/hadoop/sbin/stop-all.sh
C. Configuration
1. Create a tmp directory for Hadoop under /opt/app/hadoop: /opt/app/hadoop/tmp (see the one-liner below)
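For example (a sketch; run it as hduser, assuming hduser owns /opt/app/hadoop as set up earlier):
mkdir -p /opt/app/hadoop/tmp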
2. Edit core-site.xml
/opt/app/hadoop/etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>
3. Create mapred-site.xml
Location: /opt/app/hadoop/etc/hadoop/
cp mapred-site.xml.template mapred-site.xml
4. Edit mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
  </property>
</configuration>
5. Create the namenode and datanode folders
Location: /opt/app/hadoop/
mkdir -p hadoop_store/hdfs/namenode
mkdir -p hadoop_store/hdfs/datanode
6. Edit hdfs-site.xml
Location: /opt/app/hadoop/etc/hadoop/
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/opt/app/hadoop/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/opt/app/hadoop/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>
7. Add the path in .bashrc (the bash profile under your home directory); a fuller sketch follows.
Edit .bashrc:
PATH=$PATH:/opt/app/hadoop/bin
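A slightly fuller .bashrc sketch, assuming the paths used in this post; the HADOOP_HOME variable and the sbin entry are optional additions beyond what the steps above require:
export JAVA_HOME=/home/sterin/Public/jdk1.8.0_45   # adjust to your own JDK
export HADOOP_HOME=/opt/app/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin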
8. Format the hadoop file system
hadoop namenode -format
16/04/25 12:11:03 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = -10-184-37-177.in..com/10.xx.37.177
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.4
16/04/25 12:11:05 INFO util.ExitUtil: Exiting with status 0
16/04/25 12:11:05 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at l-10-184-37-177..com/10.xx.37.177
9. Start Hadoop
Location: /opt/app/hadoop/sbin
./start-all.sh (as in section B, steps 2-3)
10. jps
12882 NameNode
13189 DataNode
14152 NodeManager
13816 ResourceManager
13529 SecondaryNameNode
14394 Jps
11. Check the file system (list the files)
hadoop fs -ls /
It will be blank.
12. Create a new folder
hadoop fs -mkdir /new
13. Verify it:
hadoop fs -ls /
Found 1 items
drwxr-xr-x   - sterin supergroup          0 2016-04-25 12:18 /new
14. Put command
Syntax: hadoop fs -put <source> <target>
/tmp/sterin : my local file
/new : the target directory in Hadoop
hadoop fs -put /tmp/sterin /new
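To confirm the upload, you could list the target directory (a usage sketch; the file keeps its local name):
hadoop fs -ls /new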
15. Get command
hadoop fs -get /new/sterin /home/sterin/Downloads/Chrome
/new/sterin : the source in Hadoop
/home/sterin/Downloads/Chrome : the destination on the local file system
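Instead of copying it back, you can also read the file straight from HDFS (a sketch using the path from the put step above):
hadoop fs -cat /new/sterin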
URLs: http://localhost:50070/ - web UI of the NameNode daemon