COMMAND | DESCRIPTION |
---|---|
sudo apt-get install sun-java6-jdk | Install Java (Sun JDK 6) |
Download the Hadoop bundle if you don't already have it | |
sudo tar xzf file_name.tar.gz | Extract hadoop bundle |
Go to your Hadoop installation directory (HADOOP_HOME) | |
vi conf/hadoop-env.sh | Edit configuration file hadoop-env.sh and set JAVA_HOME to the root of your Java installation, e.g. export JAVA_HOME=/usr/lib/jvm/java-6-sun |
vi conf/core-site.xml then type: <configuration> <property> <name>fs.default.name</name> <value>hdfs://localhost:9000</value> </property> </configuration> | Edit configuration file core-site.xml |
vi conf/hdfs-site.xml then type: <configuration> <property> <name>dfs.replication</name> <value>1</value> </property> </configuration> | Edit configuration file hdfs-site.xml |
vi conf/mapred-site.xml then type: <configuration> <property> <name>mapred.job.tracker</name> <value>localhost:9001</value> </property> </configuration> | Edit configuration file mapred-site.xml |
sudo apt-get install openssh-server openssh-client | install ssh |
ssh-keygen -t rsa -P "" ; cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys ; ssh localhost | Set up passwordless SSH |
bin/hadoop namenode -format | Format the new distributed filesystem. During this operation the NameNode starts, gets formatted, and then stops |
bin/start-all.sh | Start the hadoop daemons |
jps | It should give output like this: 14799 NameNode 14977 SecondaryNameNode 15183 DataNode 15596 JobTracker 15897 TaskTracker |
Congratulations, the Hadoop setup is complete | |
http://localhost:50070/ | Web-based interface for the NameNode |
http://localhost:50030/ | Web-based interface for the JobTracker |
Now let's run some examples | |
bin/hadoop jar hadoop-*-examples.jar pi 10 100 | Run the pi example |
bin/hadoop dfs -mkdir input ; bin/hadoop dfs -put conf input ; bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+' ; bin/hadoop dfs -cat output/* | Run the grep example |
bin/hadoop dfs -mkdir inputwords ; bin/hadoop dfs -put conf inputwords ; bin/hadoop jar hadoop-*-examples.jar wordcount inputwords outputwords ; bin/hadoop dfs -cat outputwords/* | Run the wordcount example |
bin/stop-all.sh | Stop the hadoop daemons |
Running Hadoop in Pseudo Distributed Mode
This section contains instructions for installing Hadoop on Ubuntu. It is a Hadoop quickstart: the shortest tutorial for a Hadoop installation, giving all the commands, with descriptions, required to install Hadoop in pseudo-distributed mode (a single-node cluster).
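For quick reference, the whole sequence from the table above looks roughly like this when run end to end; the archive name hadoop-0.20.2.tar.gz and the directory names are only examples, substitute whatever release you downloaded:

sudo apt-get install sun-java6-jdk
sudo tar xzf hadoop-0.20.2.tar.gz          # extract the bundle you downloaded
cd hadoop-0.20.2                           # this directory is HADOOP_HOME
vi conf/hadoop-env.sh                      # set JAVA_HOME as described in the table
vi conf/core-site.xml conf/hdfs-site.xml conf/mapred-site.xml   # add the properties shown above
sudo apt-get install openssh-server openssh-client
ssh-keygen -t rsa -P "" ; cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys ; ssh localhost
bin/hadoop namenode -format                # format HDFS once, before the first start
bin/start-all.sh                           # start the daemons
jps                                        # should list NameNode, SecondaryNameNode, DataNode, JobTracker, TaskTracker
bin/stop-all.sh                            # stop everything when you are done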
Hi,
I am trying to run hadoop in pseudo-distributed
mode using the Cloudera VM.
I have copied the files into HDFS via Hue and, using the job browser, I am trying to run it as a job.
But it dies with a permission denied error.
Could you help me with this?
Thanks,
Sayali
Hi Sayali,
When we run jobs some files are created,
and I think you have not given permission to create files at that location (hadoop.tmp.dir).
Please give all the permissions to the current user,
or you can also install as the root user.
For cloudera you can refer:
http://cloudera-tutorial.blogspot.com/
For Hue you can refer:
http://hivetutorial.wordpress.com/
Hi, I am getting the following errors. Can you please help?
11/02/21 00:23:23 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 0 time(s).
11/02/21 00:23:24 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 1 time(s).
11/02/21 00:23:25 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 2 time(s).
11/02/21 00:23:26 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 3 time(s).
Hi,
Please run the jps command to ensure all the daemons are running.
Also, if you run a job just after bin/start-all.sh you may get this error because the namenode is still in safe mode, so wait 10-30 seconds and try to run the same job again.
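To check safe mode explicitly rather than guessing, the dfsadmin safemode subcommands can be used from HADOOP_HOME before submitting the job:

bin/hadoop dfsadmin -safemode get     # prints whether safe mode is ON or OFF
bin/hadoop dfsadmin -safemode wait    # blocks until the namenode has left safe mode
bin/hadoop jar hadoop-*-examples.jar pi 10 100   # then submit the job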
We are trying to set up a database in Hive. Every time we create tables, insert data and end our session by stopping all hadoop daemons, we lose the previously inserted data in Hive, because we execute hadoop namenode -format every time and that seems to wipe out all data. Is there any way to retain the data created in Hive?
You do not need to execute "hadoop namenode -format" every time you start the cluster.
Just execute start-all.sh to start the hadoop daemons.
If you format the namenode, all the data will be lost.
This is a great point. I kept wondering why I could not see the namenode when everything else was working properly. Thanks pal!
Please post your namenode logs, so that I can help you with this.
Thanks Rahul for your prompt reply. What you said holds only while we keep Ubuntu running. When we restart our machine, we have to format the namenode, otherwise http://localhost:50070/ won't work. So maybe we need to start the namenode in some other manner, right?
When you restart your machine, just restart your hadoop daemons by executing "start-all.sh".
After starting the daemons, http://localhost:50070/ will work.
Hi, can you tell me how to install Hadoop and run it in pseudo-distributed mode on Windows 7?
Hi,
I would suggest you install hadoop on Linux.
If you want to install on Windows:
first install Cygwin from http://www.cygwin.com/ ; it provides a Linux-like environment.
Then you can follow the above tutorial for the hadoop installation.
Hi, I get the following errors when I run the wordcount example and many other examples provided by hadoop itself. Can you help me?
11/02/21 00:23:23 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 0 time(s).
11/02/21 00:23:24 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 1 time(s).
11/02/21 00:23:25 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 2 time(s).
11/02/21 00:23:26 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 3 time(s).
Please ensure all the hadoop daemons are running by running the jps command.
Also ensure that the namenode is not in safe mode.
Please check the logs in case of any error.
Hi, when I run the command bin/start-all.sh I get an error. Can you please help me sort it out? Here is the error message:
localhost: starting secondarynamenode, logging to /data/hadoop/hadoop-0.20.2/bin/../logs/hadoop-waqas-secondarynamenode-trinity.out
localhost: Exception in thread "main" java.lang.NumberFormatException: For input string: "localhost:9000"
localhost: at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
localhost: at java.lang.Integer.parseInt(Integer.java:492)
localhost: at java.lang.Integer.parseInt(Integer.java:527)
localhost: at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:146)
localhost: at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:156)
localhost: at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:160)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:131)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.(SecondaryNameNode.java:115)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:469)
There might be a mistake in your configuration files,
such as core-site.xml, hdfs-site.xml or mapred-site.xml.
Please post the contents of your configuration files.
I edited my configuration files exactly as you suggested up here in this blog.
Please post your configuration files so I can help you with this.
Ohh yeah, I rechecked and there was some error in my configuration files. Thanks for your time and suggestion.
ReplyDeleteHi,
I have configured Hadoop and Hive on Windows through Cygwin. But I am facing a problem: in the Hive terminal (CLI), when I enter a query at the hive> prompt, the query does not execute and the terminal remains busy.
If i enter the query like: bin/hive -e 'LOAD DATA INPATH 'kv1.txt' OVERWRITE INTO TABLE pokes;'
The Output is like this:
Hive history file=/tmp/Bhavesh.Shah/hive_job_log_Bhavesh.Shah_201111301549_1377455380.txt FAILED: Parse Error: line 1:17 mismatched input 'kv1' expecting StringLiteral near 'INPATH' in load statement
What could be the problem? Pls suggest me
You need to create the file "kv1.txt" and provide the path to it.
If the error persists, please post the contents of the log file (history file=/tmp/Bhavesh.Shah/hive_job_log_Bhavesh.Shah......)
I have already put that file in the same directory; that's why I have written kv1.txt.
ReplyDeleteThe error String Lateral actually creating the problem, I dont no know why?
And I one more thing is that I am not finding that particular directory i.e. /tmp/Bhavesh.Shah/...
Now what to do?......:(
hi,
I just found the error log file.
CONTENT IS:
---------
SessionStart SESSION_ID="Bhavesh.Shah_201111301549" TIME="1322648344557"
Sorry for the multiple posts.
Change the outer single quotes to double quotes,
and also add the LOCAL keyword.
The correct query would be:
bin/hive -e "LOAD DATA LOCAL INPATH 'kv1.txt' OVERWRITE INTO TABLE pokes;"
following link would be useful:
http://hivebasic.blogspot.com/
I have one more doubt that,
When I enter the query in the Hive CLI, I get this error:
$ bin/hive -e "insert overwrite table pokes select a.* from invites a where a.ds='2008-08-15';"
bin/hive -e "insert overwrite table pokes select a.* from invites a where a.ds='2008-08-15';"
Hive history file=/tmp/Bhavesh.Shah/hive_job_log_Bhavesh.Shah_201112021007_2120318983.txt
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201112011620_0004, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201112011620_0004
Kill Command = C:\cygwin\home\Bhavesh.Shah\hadoop-0.20.2\/bin/hadoop job -Dmapred.job.tracker=localhost:9101 -kill job_201112011620_0004
2011-12-02 10:07:30,777 Stage-1 map = 0%, reduce = 0%
2011-12-02 10:07:57,796 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201112011620_0004 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
I think that map-reduce job is not started and hence it is not executed?
So what could be solution?
Thanks.
Please scan your error logs and post detailed exception
2011-12-02 12:29:19,275 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
ReplyDelete2011-12-02 12:29:19,275 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
2011-12-02 12:29:19,275 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
2011-12-02 12:29:19,275 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
2011-12-02 12:29:19,275 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
2011-12-02 12:29:19,275 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
2011-12-02 12:29:23,011 WARN mapred.JobClient (JobClient.java:configureCommandLineOptions(539)) - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2011-12-02 12:29:58,749 ERROR exec.MapRedTask (SessionState.java:printError(343)) - Ended Job = job_201112011620_0006 with errors
2011-12-02 12:29:58,858 ERROR ql.Driver (SessionState.java:printError(343)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
Hello Rahul,
Now this time I have configured Hadoop on Linux (Ubuntu) and am trying Hive. But there is one problem while configuring Hive:
after successfully building the package through ant, when I try to launch the Hive CLI from the Hive directory I get the error:
"Missing Hive Builtins Jar: /home/hadoop/hive-0.7.1/hive/lib/hive-builtins-*.jar"
What could be the problem in configuration? Pls suggest me as soon as possible.
To configure Hive you don't need to build it;
I didn't build it.
You can follow this approach (a shell sketch of these steps follows the list):
1. Install hadoop
2. set HADOOP_HOME
3. Untar Hive*.tar.gz
4. go to HIVE_HOME and type bin/hive
hive shell should be open
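A minimal sketch of those four steps as shell commands; the tarball name hive-0.7.1.tar.gz and the /usr/local/hadoop path are only placeholders for this example:

export HADOOP_HOME=/usr/local/hadoop   # 2. point at your existing Hadoop install
tar xzf hive-0.7.1.tar.gz              # 3. untar the Hive release
cd hive-0.7.1                          # this directory is HIVE_HOME
bin/hive                               # 4. the hive> shell should open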
I am getting this error:
hdfs://127.0.0.1:9100/tmp/hadoop-DEFTeam-N5/mapred/system/jobtracker.info is missing!
...my namenode is running, however the JobTracker is not running.
On what operation are you getting this error?
Please check your JobTracker logs and post the error.
When I searched the /tmp folder, there is no directory called mapred/system/jobtracker.info.
It is giving the error in HDFS, not in the local filesystem;
please scan and post the logs.
INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG: host = DEFTeam-N5-PC/192.168.2.104
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.0
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504; compiled by 'ndaley' on Thu Apr 9 05:18:40 UTC 2009
************************************************************/
INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=JobTracker, port=9101
INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50030
INFO org.mortbay.log: jetty-6.1.14
INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50030
INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 9101
INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
WARN org.apache.hadoop.mapred.JobTracker: Failed to initialize recovery manager. The Recovery manager failed to access the system files in the system dir (hdfs://127.0.0.1:9100/tmp/hadoop-DEFTeam-N5/mapred/system).
WARN org.apache.hadoop.mapred.JobTracker: It might be because the JobTracker failed to read/write system files (hdfs://127.0.0.1:9100/tmp/hadoop-DEFTeam-N5/mapred/system/jobtracker.info / hdfs://127.0.0.1:9100/tmp/hadoop-DEFTeam-N5/mapred/system/jobtracker.info.recover) or the system file hdfs://127.0.0.1:9100/tmp/hadoop-DEFTeam-N5/mapred/system/jobtracker.info is missing!
WARN org.apache.hadoop.mapred.JobTracker: Bailing out...
WARN org.apache.hadoop.mapred.JobTracker: Error starting tracker: org.apache.hadoop.ipc.RemoteException: java.io.IOException: failed to create file /tmp/hadoop-DEFTeam-N5/mapred/system/jobtracker.info on client 127.0.0.1.
Requested replication 0 is less than the required minimum 1
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:238)
at org.apache.hadoop.mapred.JobTracker$RecoveryManager.updateRestartCount(JobTracker.java:1168)
at org.apache.hadoop.mapred.JobTracker.(JobTracker.java:1657)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:174)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3528)
2012-04-16 18:48:39,832 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding to /127.0.0.1:9101 : Address already in use: bind
There are multiple errors, so I would suggest you start from scratch: remove your old installation and install it again.
Talking about the above error,
one exception is that the required port is already in use,
and your datanode daemon might not be running.
After installation, run the jps command to ensure that all services are running.
Thanks Rahul, I re-installed and it is working fine now :)
Hi,
When I run the jps command in Cygwin it says "command not found". Any idea how to run the jps command in Cygwin?
Thanks,
Nathan
I found it. It should be executed like /cygdrive/c/Program\ Files/Java/jdk1.6.0_29/bin/jps.exe
It works...
Thanks,
Nathan
Hi,
A few days back I installed Hadoop 0.20.2 on Windows 7 through Cygwin and it's working fine; I ran the wordcount example from the command prompt and everything works.
I would like to know about Cloudera. I am going through the Cloudera videos; is there any difference between the Cloudera installation and installing hadoop through Cygwin? If I want to learn Cloudera hadoop in detail, shall I set up hadoop through Cloudera? Can we install it on Windows? Please help me.
Installation of both Apache and Cloudera Hadoop through the tarball is the same.
Yes, you can install hadoop on Windows through Cygwin, but it is not recommended.
To install Cloudera Hadoop you need to download the tarball from cloudera.com.
There are also Debian and RPM packages of hadoop on both Cloudera and Apache whose installation steps are different, but I recommend downloading the tarball and installing from that.
Thanks for information,
I am going to use the Pentaho BI tool with Apache hadoop. I found a document on Pentaho which says to create a virtual operating system by installing VMware Player and then Ubuntu for the hadoop installation... will that be useful?
Hello. I am a newcomer to Hadoop. I followed the instructions on http://alans.se/blog/2010/hadoop-hbase-cygwin-windows-7-x64/. When I run the test
ReplyDelete"bin/hadoop jar hadoop-*examples*.jar grep input output 'dfs[a-z.]+'"
I get an exception following a set of errors
"Retrying connect to server: /127.0.0.1:9101. Already tried"
I checked that Safe Mode is off. Any thoughts on what I can try to make sure I can get this working?
One of the Hadoop daemons (namenode, secondary namenode, jobtracker, datanode, tasktracker) is not running; please check the error logs.
Hi Rahul, can we have common storage for both HBase and Hive? I want to retrieve data from both HBase and Hive, so I would like to know whether I can make common storage for data coming from both. Please help.
HBase and Hive already share the same storage: the data of both is saved in HDFS.
If you want to query HBase's data using Hive (using its SQL), you need to integrate Hive and HBase.
Hi Rahul,
I am trying to configure hadoop 1.0. When I run the command for the pi example I get the following error:
Number of Maps = 10
Samples per Map = 100
java.lang.RuntimeException: java.io.IOException: Call to localhost/127.0.0.1:9000 failed on local exception: java.io.EOFException
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:546)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:318)
at org.apache.hadoop.examples.PiEstimator.estimate(PiEstimator.java:265)
at org.apache.hadoop.examples.PiEstimator.run(PiEstimator.java:342)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.examples.PiEstimator.main(PiEstimator.java:351)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.io.IOException: Call to localhost/127.0.0.1:9000 failed on local exception: java.io.EOFException
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)
at org.apache.hadoop.ipc.Client.call(Client.java:1071)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:203)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:542)
... 17 more
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:800)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:745)
What might be the problem? Thanks,
waqas
Please verify:
Are all the hadoop daemons running?
Did you format the namenode?
Yes, I formatted it, but the log says it is not formatted.
org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2012-05-14 15:39:39,586 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: NameNode is not formatted.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:315)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:97)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:386)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.(FSNamesystem.java:360)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:496)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)
2012-05-14 15:39:39,587 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException: NameNode is not formatted.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:315)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:97)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:386)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.(FSNamesystem.java:360)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:496)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)
2012-05-14 15:39:39,588 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at test/127.0.0.1
************************************************************/
I got rid of that problem and now I am getting this error with the pi example.
Error message part 1 (due to the 4096-character comment limit):
Number of Maps = 10
Samples per Map = 100
12/05/14 16:21:47 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/waqas/PiEstimator_TMP_3_141592654/in/part0 could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1556)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
at org.apache.hadoop.ipc.Client.call(Client.java:1066)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy1.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy1.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3507)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3370)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2700(DFSClient.java:2586)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2826)
What was the exact problem in the previous case?
This error comes due to the following reasons (see the checks sketched after this list):
1. The datanode daemon might not be running (check by running the jps command).
2. The number of live nodes is zero (it should not be); check http://localhost:50070
3. The namenode is in safe mode; check by running bin/hadoop dfsadmin -safemode get
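Roughly, those three checks from the shell; the grep patterns are just one convenient way to pick the relevant lines out of the output:

jps | grep -i datanode                                         # 1. the DataNode process should be listed
bin/hadoop dfsadmin -report | grep -i "datanodes available"    # 2. live nodes should be more than zero
bin/hadoop dfsadmin -safemode get                              # 3. safe mode should be OFF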
It looks like datanode is not running.
Here is the output of datanode.out:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGFPE (0x8) at pc=0x00002ab1d93d368f, pid=14357, tid=1074792768
#
# JRE version: 7.0_01-b08
# Java VM: Java HotSpot(TM) 64-Bit Server VM (21.1-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [ld-linux-x86-64.so.2+0x868f] do_lookup_x+0xcf
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /data/hadoop/hadoop-1.0.0/hs_err_pid14357.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
I also want to mention that I already have hadoop 0.20 running on this system but am having problems with hadoop 1.0... Can you please suggest how I can solve this problem? Thanks
waqas
also when i try to run ultimate -c unlimited then it says no such command. I am not use to all this command line stuff so please guide me in this as well
DeleteWhy are you running "ultimate -c unlimited"
Deleteit looks like you are using java 7 (with hadoop java 6 is recommended)
If Hadoop(0.20) is already running on your machine then first stop it otherwise you will get port related issues
I am not running both simultaneously. Also, I used ports 8020 and 8021 for hadoop 1.0 instead of 9000 and 9001, which I assigned to hadoop 0.20 as mentioned in your blog.
Hadoop 0.20 is also running with Java 7. I tried ulimit -c unlimited because it was mentioned in the datanode.out file that I posted.
Hi Rahul, I am using hadoop 0.20.2 and it's working fine; however, sometimes the namenode stops working. To resolve the issue I need to delete the hadoop image stored in the tmp directory and repeat the namenode format step. That resolves it, but all the data is lost. Is there any other way to resolve the issue?
Please scan your error logs and post the error log, so that I can find the root cause.
There is no error log. I did the reformat and it is working fine now... thanks.
ReplyDeleteI have 1 more doubt :
I want to retrieve data from HBase and Hive: my dimension tables are stored in HBase and my fact tables in Hive, so how do I integrate them and retrieve data from HBase using Hive? I am using Hive 0.9.0 and HBase 0.92.0. I heard that from Hive 0.9.0 onwards we can retrieve existing data from HBase, but I don't know how to. Please help.
Hi Karry, you need to integrate HBase and Hive; with that you can fulfill your requirements. You can find all the details here (https://cwiki.apache.org/Hive/hbaseintegration.html#HBaseIntegration-HiveHBaseIntegration).
For a quicker response please post comments on http://www.technology-mania.com/
My issue is: taking a simple example, if the dimension table is in HBase, e.g. (prodid, prodname, date), and the fact table is in Hive, e.g. (prodid, sales), then I would like to know how to do the integration if I want to print the output (prodname, sales, date).
The link provided, "https://cwiki.apache.org/Hive/hbaseintegration", says that the HBase table is to be created through Hive. However, in my case the HBase table is not created through Hive. I am using Hive 0.9 and HBase 0.92.1. Please help.
The stuff for your requirement is also mentioned on that page.
If you want to give Hive access to an existing HBase table, use CREATE EXTERNAL TABLE (a hypothetical join over such a table is sketched right after the statement below):
CREATE EXTERNAL TABLE hbase_table_2(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = "cf1:val")
TBLPROPERTIES("hbase.table.name" = "some_existing_table");
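Once the external table is mapped, the kind of dimension/fact join asked about above can be issued through Hive like any other query. The table and column names here (prod_dim_hbase, sales_fact, prodid, prodname, dt, sales) are hypothetical placeholders, not anything defined earlier in this thread:

bin/hive -e "SELECT d.prodname, f.sales, d.dt FROM sales_fact f JOIN prod_dim_hbase d ON (f.prodid = d.prodid);"

Here prod_dim_hbase would be the EXTERNAL table created over the HBase dimension table as shown above, and sales_fact an ordinary Hive table.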
Thnx for help...
Hi Rahul, I have installed hadoop-0.20 using Cygwin on my Windows 7 machine, and the namenode, secondarynamenode, datanode, jobtracker and tasktracker are working fine.
I have set up the configuration in the Eclipse IDE and am running the wordcount example; it is also working fine.
The problem is: if I stop all the daemons with stop-all.sh and run my wordcount example,
it runs without any error and produces the output file...
I don't know how it is working... any ideas please?
thanks,
Nitai
After you have stopped all the services, the wordcount example is probably running in standalone (local) mode; search for the output directory on your local filesystem.
One more check: when you run the wordcount example after stopping the services, it should not print the % of map completion (a quick check is sketched below).
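A quick way to tell the two cases apart, assuming the output directory is called output as in the examples above:

ls -d output                 # present on the local filesystem means the job ran in standalone mode
bin/hadoop dfs -ls output    # only succeeds if the job really wrote to HDFS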
Hi Rahul, I am new to hadoop and I built my first multi-node cluster, but the problem I am facing is that everything worked except the jps command. It shows the error "-bash: jps: command not found". Please tell me where I am going wrong; I am using CentOS 6.
On CentOS the jps command doesn't work;
the jps command mentioned above is for Ubuntu.
If there is no error in the logs then everything is fine.
Hi, I have changed JAVA_HOME in the hadoop-env.sh file to usr/lib/jvm/java-6-openjdk,
but the terminal shows the error that JAVA_HOME is not set.
What should I do?
Hi,
You should set the complete path; I think the leading / is missing.
In my current case it looks like this:
JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64/
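So the uncommented line in conf/hadoop-env.sh would read as below; it is worth listing /usr/lib/jvm first to confirm the exact directory name on your machine, since the -amd64 suffix here is specific to this 64-bit example:

ls /usr/lib/jvm/                                     # confirm the real directory name
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64/  # the line to put in conf/hadoop-env.sh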
Thanks for answering... but the terminal still shows the same error.
May I know whether hadoop 1.0.4 works on OpenJDK or not?
And I have java-6-openjdk, not java-6-openjdk-amd64... is that the 64-bit one?
Yes, Hadoop 1.x works with OpenJDK.
Your path might not be correct, please check;
in my case it is "java-6-openjdk-amd64" as I am working on a 64-bit machine.
Thanks for answering... I will check the path.
Deletehi
I have downloaded hadoop 1.0.4 and extracted it into a folder.
How does it have to be installed?
The steps above explain how to install Hadoop in pseudo-distributed mode.
How do I go to the hadoop installation directory (HADOOP_HOME)?
It's the directory where you extracted hadoop; in this directory you will find all the other directories like bin, conf, src, etc.
Deletewhat is the next step after that?
vi conf/hadoop-env.sh is the next step?
Is there nothing to do before that?
ReplyDeletehi
When I run bin/hadoop jar hadoop-*-examples.jar pi 10 100 it gives an error:
cannot unzip the zip file... please help.
When executing start-all.sh some of the daemons are not started:
for the datanode I see this backtrace:
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [ld-linux-x86-64.so.2+0x868f] double+0xcf
C [ld-linux-x86-64.so.2+0xa028] _dl_relocate_object+0x588
C [ld-linux-x86-64.so.2+0x102d5] double+0x3d5
C [ld-linux-x86-64.so.2+0xc1f6] _dl_catch_error+0x66
C [libdl.so.2+0x11fa] double+0x6a
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j java.lang.ClassLoader$NativeLibrary.load(Ljava/lang/String;)V+0
j java.lang.ClassLoader.loadLibrary0(Ljava/lang/Class;Ljava/io/File;)Z+300
j java.lang.ClassLoader.loadLibrary(Ljava/lang/Class;Ljava/lang/String;Z)V+347
j java.lang.Runtime.loadLibrary0(Ljava/lang/Class;Ljava/lang/String;)V+54
j java.lang.System.loadLibrary(Ljava/lang/String;)V+7
j org.apache.hadoop.util.NativeCodeLoader.()V+25
v ~StubRoutines::call_stub
j org.apache.hadoop.io.nativeio.NativeIO.()V+13
v ~StubRoutines::call_stub
j org.apache.hadoop.fs.FileUtil.setPermission(Ljava/io/File;Lorg/apache/hadoop/fs/permission/FsPermission;)V+22
j org.apache.hadoop.fs.RawLocalFileSystem.setPermission(Lorg/apache/hadoop/fs/Path;Lorg/apache/hadoop/fs/permission/FsPermission;)V+6
j org.apache.hadoop.fs.FilterFileSystem.setPermission(Lorg/apache/hadoop/fs/Path;Lorg/apache/hadoop/fs/permission/FsPermission;)V+6
j org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(Lorg/apache/hadoop/fs/LocalFileSystem;Lorg/apache/hadoop/fs/Path;Lorg/apache/hadoop/fs/permission/FsPermission;)Z+40
j org.apache.hadoop.util.DiskChecker.checkDir(Lorg/apache/hadoop/fs/LocalFileSystem;Lorg/apache/hadoop/fs/Path;Lorg/apache/hadoop/fs/permission/FsPermission;)V+3
j org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance([Ljava/lang/String;Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/hdfs/server/datanode/SecureDataNodeStarter$SecureResources;)Lorg/apache/hadoop/hdfs/server/datanode/DataNode;+74
j org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode([Ljava/lang/String;Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/hdfs/server/datanode/SecureDataNodeStarter$SecureResources;)Lorg/apache/hadoop/hdfs/server/datanode/DataNode;+99
j org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode([Ljava/lang/String;Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/hdfs/server/datanode/SecureDataNodeStarter$SecureResources;)Lorg/apache/hadoop/hdfs/server/datanode/DataNode;+3
Any idea on this?
I used the 1.0.4 RPM file to install hadoop.
I am trying to connect from a client machine to the Hive server. I installed Hive and hadoop on the client and I am able to run Hive. I have copied the hive-site.xml from the server. But whenever I run any query it gives me this error:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGFPE (0x8) at pc=0x00002aaaaaab368f, pid=25697, tid=1076017472
#
# JRE version: 6.0_31-b04
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.6-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [ld-linux-x86-64.so.2+0x868f] double+0xcf
#
# An error report file with more information is saved as:
# /usr/local/hcat/hs_err_pid25697.log
#
# If you would like to submit a bug report, please visit:
# http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
I have a problem when I run the wordcount example. I have checked /etc/hosts and the config files (core-site.xml, hdfs-site.xml and mapred-site.xml). Could you please check it for me?
hadoop@Hadoop hadoop]$ bin/hadoop jar hadoop-examples-1.1.1.jar wordcount input output
Warning: $HADOOP_HOME is deprecated.
13/01/06 13:27:18 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:19 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 1 time(s); retry policy is
13/01/06 13:27:22 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:23 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:24 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:25 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:26 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:27 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:27 ERROR security.UserGroupInformation: PriviledgedActionException as:hadoop cause:java.net.ConnectException: Call to Hadoop/10.57.250.186:6868 failed on connection exception: java.net.ConnectException: Connection refused
java.net.ConnectException: Call to Hadoop/10.57.250.186:6868 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1136)
at org.apache.hadoop.ipc.Client.call(Client.java:1112)
...
run jps command to ensure all hadoop daemons are running...
mtech11@cse-desktop:~/hadoop/bin$ hadoop jar hadoop-*-examples.jar pi 10 100
Exception in thread "main" java.io.IOException: Error opening job jar: hadoop-*-examples.jar
at org.apache.hadoop.util.RunJar.main(RunJar.java:130)
Caused by: java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.(ZipFile.java:127)
at java.util.jar.JarFile.(JarFile.java:135)
at java.util.jar.JarFile.(JarFile.java:72)
at org.apache.hadoop.util.RunJar.main(RunJar.java:128)
This is the result
I get when I run an example MapReduce job. I have tried almost all the solutions, but there is no result.
Kindly help me.
Please correct the name of the jar file; a sketch of running it with the explicit jar name follows below.
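The "error in opening zip file" here most likely means the hadoop-*-examples.jar wildcard did not expand, because the command was run from ~/hadoop/bin where no such jar exists. Running from HADOOP_HOME with the jar's real name avoids that; hadoop-examples-1.1.1.jar is only an example name, use whatever ls shows in your installation:

cd ~/hadoop                                           # run from HADOOP_HOME, not from bin/
ls *examples*.jar                                     # find the examples jar's exact name
bin/hadoop jar hadoop-examples-1.1.1.jar pi 10 100    # then use that exact name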