COMMAND | DESCRIPTION |
---|---|
sudo apt-get install sun-java6-jdk | Install Java |
If you don't have the Hadoop bundle, download it here: download hadoop | |
sudo tar xzf file_name.tar.gz | Extract the Hadoop bundle |
vi conf/hadoop-env.sh | Edit the configuration file hadoop-env.sh and set JAVA_HOME: export JAVA_HOME=path, where path is the root of your Java installation (e.g. /usr/lib/jvm/java-6-sun) |
Go to your Hadoop installation directory (HADOOP_HOME) and type: bin/hadoop | Displays the usage documentation for the hadoop script |
Congratulations, your Hadoop setup is complete. Now let's run some examples. | |
bin/hadoop jar hadoop-*-examples.jar pi 10 100 | Run the pi example |
mkdir input && cp conf/*.xml input && bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+' && cat output/* | Run the grep example |
mkdir inputwords && cp conf/*.xml inputwords && bin/hadoop jar hadoop-*-examples.jar wordcount inputwords outputwords | Run the word count example |
If you get any errors while running the examples, see Hadoop Troubleshooting | |
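
The pattern `'dfs[a-z.]+'` in the grep example matches the literal text `dfs` followed by one or more lowercase letters or dots. As a rough sanity check outside Hadoop, the same pattern can be tried with the ordinary `grep` tool. This is only a sketch: the sample line is made up for illustration, and the Hadoop job uses Java regular expressions and also counts how often each matched string occurs.

```shell
# Hypothetical sample line, loosely resembling content of conf/*.xml.
sample='<name>dfs.replication</name> <name>DFS_UPPER</name> <name>dfs9</name>'

# -o prints only the matched text; -E enables extended regex (needed for +).
matches=$(printf '%s\n' "$sample" | grep -oE 'dfs[a-z.]+')
printf '%s\n' "$matches"
```

Only `dfs.replication` is printed: `DFS_UPPER` fails because the match is case-sensitive, and `dfs9` fails because at least one lowercase letter or dot must follow `dfs`.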
Running Hadoop in Standalone Mode
This section contains instructions for installing Hadoop on Ubuntu. It is a quickstart tutorial, the shortest route to a working setup: it lists every command, with a description, needed to install Hadoop in standalone mode (single-node cluster).
Any feedback and suggestions are invited.
Thanks for visiting my blog. :)
I cannot edit the hadoop-env.sh file: permission denied.
Are you working as the root user?
If not, you must explicitly grant permission to that user.
Run this command as root:
chown -R YOUR-USER-NAME PATH-TO-HADOOP-DIR
Or you can run the command from the tutorial above with sudo:
sudo vi conf/hadoop-env.sh
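
Before choosing between the two fixes, it can help to confirm that the problem really is write permission. A minimal sketch, assuming a POSIX shell; the helper name `can_edit` is hypothetical and not part of Hadoop:

```shell
# can_edit FILE: report whether the current user may write to FILE.
# (Hypothetical helper name, for illustration only.)
can_edit() {
    if [ -w "$1" ]; then
        echo "writable"
    else
        echo "not writable"
    fi
}

# Example: run from the Hadoop installation directory.
can_edit conf/hadoop-env.sh
```

If this prints `not writable`, either change ownership with the `chown` command above or open the file with `sudo`. A missing file is also reported as `not writable`, so check the path first.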
Can you post the steps for Windows users? I am an amateur in Hadoop and would like to set up a single node and run the pi and word count examples. Thanks.
My Hadoop is running in pseudo-distributed mode. Could you please suggest the changes needed to make it run in standalone mode?
I think you should go from pseudo-distributed to distributed.
The simplest way is to deploy in standalone mode first, then pseudo-distributed, then distributed.
Anyway, if you want to go from pseudo-distributed back to standalone, just remove the entries from the configuration files (core-site.xml, hdfs-site.xml, mapred-site.xml).
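
One way to do this is to back up each file and replace it with an empty `<configuration>` element, so that Hadoop falls back to its defaults. This is a rough sketch: the helper name `reset_to_standalone` is hypothetical, it assumes these three files are the only place pseudo-distributed settings were added, and any running Hadoop daemons should be stopped first.

```shell
# reset_to_standalone CONF_DIR: back up each site file, then overwrite it
# with an empty <configuration> element.
# (Hypothetical helper; the file names are the ones listed in the reply above.)
reset_to_standalone() {
    conf_dir=$1
    for f in core-site.xml hdfs-site.xml mapred-site.xml; do
        # Keep a copy of the existing settings, if the file exists.
        if [ -f "$conf_dir/$f" ]; then
            cp "$conf_dir/$f" "$conf_dir/$f.bak"
        fi
        printf '%s\n' '<?xml version="1.0"?>' '<configuration>' '</configuration>' > "$conf_dir/$f"
    done
}

# Example, run from the Hadoop installation directory:
# reset_to_standalone conf
```

The `.bak` copies make it easy to return to pseudo-distributed mode later by restoring the original files.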