Friday, January 2, 2015

YARN not starting

Here’s the error I saw (from “Recent Log Entries” in Cloudera Manager, after clicking Details for the failed YARN startup step while restarting the cluster):


Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 13 missing files; e.g.: /tmp/hadoop-yarn/yarn-nm-recovery/nm-aux-services/mapreduce_shuffle/mapreduce_shuffle_state/000005.sst


A bug already filed in the YARN Jira seems to cover this error, and it also appears to include a workaround:


http://ift.tt/141Wo6x


Fix:


In short, remove or rename the CURRENT file in each of these two paths, then restart YARN. Deleting the files outright also works, and I think simply rebooting each affected node might do it too, since the /tmp folder may be cleared out on reboot:


/tmp/hadoop-yarn/yarn-nm-recovery/yarn-nm-state


/tmp/hadoop-yarn/yarn-nm-recovery/nm-aux-services/mapreduce_shuffle/mapreduce_shuffle_state
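For anyone who has to do this on several nodes, here is a minimal Python sketch of the rename step. It is not from the Jira ticket. The function name set_aside_current and the .bak suffix are my own placeholders, and I'm assuming it runs on each affected NodeManager host, while YARN is stopped, as a user allowed to modify those directories.

```python
import time
from pathlib import Path

# The two LevelDB state directories named above.
STATE_DIRS = [
    "/tmp/hadoop-yarn/yarn-nm-recovery/yarn-nm-state",
    "/tmp/hadoop-yarn/yarn-nm-recovery/nm-aux-services/mapreduce_shuffle/mapreduce_shuffle_state",
]


def set_aside_current(state_dirs, suffix=None):
    """Rename the CURRENT file in each directory and return the new paths.

    Directories without a CURRENT file are skipped. The suffix is a
    hypothetical naming choice; any name other than CURRENT would do.
    """
    suffix = suffix or time.strftime(".bak-%Y%m%d%H%M%S")
    renamed = []
    for directory in state_dirs:
        current = Path(directory) / "CURRENT"
        if current.is_file():
            target = current.with_name("CURRENT" + suffix)
            current.rename(target)  # keep a copy rather than deleting
            renamed.append(target)
    return renamed
```

Calling set_aside_current(STATE_DIRS) and then restarting YARN from Cloudera Manager would correspond to the fix described above. Renaming rather than deleting keeps the old file around in case you want to look at it later.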




