Setting Up Apache Hadoop on Nitrous.IO


Once you have set up any language box (I chose a Node.js box), run the following commands:


$ cd workspace

$ wget http://mirror.nus.edu.sg/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz

$ tar -xzf hadoop-1.2.1.tar.gz

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

$ chmod 600 ~/.ssh/authorized_keys
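To confirm that the key setup above took effect, a quick check along these lines may help. It is only a sketch: the function name check_passwordless_ssh is made up for illustration, and it assumes an SSH server is listening on localhost. BatchMode makes ssh fail instead of prompting, so a first connection whose host key has not been accepted yet will also report "failed".

```shell
# Hypothetical helper: report whether SSH to a host works without a password prompt.
# start-all.sh logs in over SSH to launch the daemons, so this should print "ok".
check_passwordless_ssh() {
  host="${1:-localhost}"
  # BatchMode=yes disables all interactive prompts; ConnectTimeout avoids hanging.
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true >/dev/null 2>&1; then
    echo "ok"
  else
    echo "failed"
  fi
}

check_passwordless_ssh localhost
```

If it prints "failed", running $ ssh localhost once by hand usually shows the reason.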

$ cd hadoop-1.2.1

$ vim conf/hadoop-env.sh

// Set the JVM path in this file; otherwise Hadoop fails with an error saying JAVA_HOME is not set. Check that Java is installed with $ java -version. The path below is the one on my box and may differ on yours.

// export JAVA_HOME=/usr/lib/jvm/java-7-oracle
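Since java -version does not print the install path, something like the following can help find a value for JAVA_HOME. This is a sketch under assumptions: java_home_from_binary is a made-up helper name, and readlink -f is the GNU coreutils form available on Linux boxes.

```shell
# Hypothetical helper: turn a resolved java binary path into a JAVA_HOME candidate
# by stripping the trailing /bin/java and, if present, a trailing /jre.
java_home_from_binary() {
  path="$1"
  path="${path%/bin/java}"
  path="${path%/jre}"
  printf '%s\n' "$path"
}

# Follow symlinks (for example /usr/bin/java -> /etc/alternatives/java -> the real JDK)
# and print the candidate, but only when java is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  java_home_from_binary "$(readlink -f "$(command -v java)")"
fi
```

The printed directory is what goes after export JAVA_HOME= in hadoop-env.sh.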

$ bin/hadoop namenode -format

$ bin/hadoop fs -mkdir input

// -put copies from the local disk; copying only the XML files keeps input free of subdirectories.

$ bin/hadoop fs -put conf/*.xml input

$ bin/start-all.sh
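After start-all.sh, it is worth checking that the five Hadoop 1.x daemons came up. The sketch below filters the output of jps (the process lister that ships with the JDK); missing_daemons is a made-up name for illustration.

```shell
# Hypothetical helper: read jps-style output ("<pid> <MainClass>") on stdin and
# print every expected Hadoop 1.x daemon that does not appear in it.
missing_daemons() {
  jps_output="$(cat)"
  for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # The leading space stops "NameNode" from matching "SecondaryNameNode".
    if ! printf '%s\n' "$jps_output" | grep -q " $d\$"; then
      echo "$d"
    fi
  done
}

# Run against the live process list when jps is available.
if command -v jps >/dev/null 2>&1; then
  jps | missing_daemons
fi
```

No output means all five are running; any name printed points at the log file to look at under logs/.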

$ bin/hadoop jar hadoop-examples-1.2.1.jar grep input output 'dfs[a-z.]+'

$ bin/hadoop fs -rmr output

$ bin/hadoop jar hadoop-examples-1.2.1.jar wordcount input output

$ bin/hadoop fs -rmr output

$ bin/stop-all.sh
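To get a feel for what the grep example job above computes, here is a rough local equivalent with plain Unix tools. As far as I understand the example, it counts how often each match of the regular expression occurs and sorts by count; local_grep_count is a made-up name, and this sketch does not touch Hadoop at all.

```shell
# Hypothetical helper: count each distinct regex match across the given files,
# most frequent first. -o prints only the matched text, -h drops file names.
local_grep_count() {
  pattern="$1"
  shift
  grep -ohE "$pattern" "$@" | sort | uniq -c | sort -rn
}

# Try it on the same config files that were copied into input.
if [ -d conf ]; then
  local_grep_count 'dfs[a-z.]+' conf/*.xml
fi
```

Comparing this with $ bin/hadoop fs -cat output/* (run before the -rmr step) is a simple sanity check of the job.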

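To run in pseudo-distributed mode, three files in conf/ need to be configured before you format the namenode and start the daemons. First, core-site.xml: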


<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/action/tmp</value>
</property>
</configuration>

Then hdfs-site.xml:


<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>

Lastly, mapred-site.xml:


<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
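A small check like the one below can confirm that the three files contain what you expect. It is a sketch, not a real XML parser: hadoop_property is a made-up helper that assumes one tag per line, exactly as the snippets above are laid out.

```shell
# Hypothetical helper: print the <value> that follows <name>NAME</name> in a
# Hadoop *-site.xml file. Assumes <name> and <value> sit on separate lines.
hadoop_property() {
  file="$1"
  name="$2"
  awk -v n="$name" '
    index($0, "<name>" n "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$file"
}

# Show the temp directory configured in core-site.xml, if the file is present.
if [ -f conf/core-site.xml ]; then
  hadoop_property conf/core-site.xml hadoop.tmp.dir
fi
```

For example, $ mkdir -p "$(hadoop_property conf/core-site.xml hadoop.tmp.dir)" makes sure the temp directory exists before the namenode is formatted.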

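That's all! With the daemons running (start them again with bin/start-all.sh), open http://localhost:50030 in a browser to see the Hadoop JobTracker page.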
