Important Hadoop Commands
All Hadoop commands are invoked through the $HADOOP_HOME/bin/hadoop script. Running the script without any arguments prints a description of all commands.
Usage − hadoop [--config confdir] COMMAND
The table below lists the available commands along with their descriptions.
Command | Description |
---|---|
namenode -format | Formats the DFS filesystem. |
secondarynamenode | Runs the DFS secondary namenode. |
namenode | Runs the DFS namenode. |
datanode | Runs a DFS datanode. |
dfsadmin | Runs a DFS admin client. |
mradmin | Runs a Map-Reduce admin client. |
fsck | Runs a DFS filesystem checking utility. |
fs | Runs a generic filesystem user client. |
balancer | Runs a cluster balancing utility. |
oiv | Applies the offline fsimage viewer to an fsimage. |
fetchdt | Fetches a delegation token from the NameNode. |
jobtracker | Runs the MapReduce job Tracker node. |
pipes | Runs a Pipes job. |
tasktracker | Runs a MapReduce task Tracker node. |
historyserver | Runs job history servers as a standalone daemon. |
job | Manipulates the MapReduce jobs. |
queue | Gets information regarding JobQueues. |
version | Prints the version. |
jar <jar> | Runs a jar file. |
distcp <srcurl> <desturl> | Copies files or directories recursively. |
distcp2 <srcurl> <desturl> | DistCp version 2. |
archive -archiveName NAME -p <parent path> <src>* <dest> | Creates a hadoop archive. |
classpath | Prints the class path needed to get the Hadoop jar and the required libraries. |
daemonlog | Gets or sets the log level for each daemon. |
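To illustrate the usage line and the command table above, here is a minimal shell sketch that wraps the launcher and optionally passes a configuration directory with --config. The function name run_hadoop, the CONF_DIR and DRY_RUN variables, and the /usr/local/hadoop fallback path are assumptions made for this example; they are not part of Hadoop. By default the sketch only prints the command it would run, so it can be tried on a machine without a cluster.

```shell
#!/bin/sh
# Sketch only: run_hadoop, CONF_DIR and DRY_RUN are hypothetical names,
# and /usr/local/hadoop is an assumed fallback install location.

run_hadoop() {
  launcher="${HADOOP_HOME:-/usr/local/hadoop}/bin/hadoop"
  # Prepend "--config <confdir>" when an alternate configuration directory is set.
  if [ -n "${CONF_DIR:-}" ]; then
    set -- --config "$CONF_DIR" "$@"
  fi
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run (the default in this sketch): print the command instead of running it.
    echo "$launcher $*"
  else
    "$launcher" "$@"
  fi
}

# Each call maps to one row of the command table.
run_hadoop version     # prints the Hadoop version
run_hadoop fs -ls /    # generic filesystem client: list the root directory
run_hadoop fsck /      # filesystem check starting at the root path
```

Setting DRY_RUN=0 makes the same calls execute against a real installation; setting CONF_DIR selects the configuration directory, matching the [--config confdir] part of the usage line.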
How to Interact with MapReduce Jobs
Usage − hadoop job [GENERIC_OPTIONS]
The following command options are available for hadoop job.
COMMAND | Description |
---|---|
-submit <job-file> | Submits the job. |
-status <job-id> | Prints the map and reduce completion percentage and all job counters. |
-counter <job-id> <group-name> <countername> | Prints the counter value. |
-kill <job-id> | Kills the job. |
-events <job-id> <fromevent-#> <#-of-events> | Prints the details of the events received by the jobtracker for the given range. |
-history [all] <jobOutputDir> | Prints job details, along with details of failed and killed tips (tasks in progress). Specify the [all] option to also view successful tasks and the task attempts made for each task. |
-list [all] | -list all displays all jobs. -list alone displays only jobs that are yet to complete. |
-kill-task <task-id> | Kills the task. Killed tasks are NOT counted against failed attempts. |
-fail-task <task-id> | Fails the task. Failed tasks are counted against failed attempts. |
-set-priority <job-id> <priority> | Changes the priority of the job. Allowed priority values are VERY_HIGH, HIGH, NORMAL, LOW, and VERY_LOW. |
$HADOOP_HOME/bin/hadoop job -kill <JOB-ID> | Kills the job with the given ID. |
$HADOOP_HOME/bin/hadoop job -history <DIR-NAME> | Shows the history of the job for the given output directory. |
$HADOOP_HOME/bin/hadoop job -status <JOB-ID> | Shows the status of the job with the given ID. |
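The -set-priority option in the table above accepts only five priority names, so a script may want to check the value before calling hadoop job. The following shell sketch shows one way to do that. The function names is_valid_priority and set_job_priority, the DRY_RUN variable, the /usr/local/hadoop fallback path, and the job ID used in the calls are all hypothetical and serve only as an illustration. As written, it prints the command rather than executing it.

```shell
#!/bin/sh
# Sketch only: the function names, DRY_RUN, the fallback path and the
# job ID below are hypothetical, chosen for illustration.

# Succeeds only for the priority names listed in the options table.
is_valid_priority() {
  case "$1" in
    VERY_HIGH|HIGH|NORMAL|LOW|VERY_LOW) return 0 ;;
    *) return 1 ;;
  esac
}

# Builds "hadoop job -set-priority <job-id> <priority>" after validation.
set_job_priority() {
  job_id="$1"
  priority="$2"
  if ! is_valid_priority "$priority"; then
    echo "invalid priority: $priority" >&2
    return 1
  fi
  launcher="${HADOOP_HOME:-/usr/local/hadoop}/bin/hadoop"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run (the default in this sketch): print the command only.
    echo "$launcher job -set-priority $job_id $priority"
  else
    "$launcher" job -set-priority "$job_id" "$priority"
  fi
}

# A valid priority prints the command; an unknown one is rejected.
set_job_priority job_201310191043_0004 HIGH
set_job_priority job_201310191043_0004 URGENT || echo "rejected"
```

Validating locally keeps a typo such as URGENT from reaching the cluster at all; with DRY_RUN=0 the validated command is executed instead of printed.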