Hadoop Important Commands
All Hadoop commands are executed using the $HADOOP_HOME/bin/hadoop command. Running the Hadoop script without any arguments displays a description of all commands.
Usage − hadoop [--config confdir] COMMAND
The table below outlines the available options along with their descriptions.
| Option | Description |
|---|---|
| namenode -format | Formats the DFS filesystem. |
| secondarynamenode | Runs the DFS secondary namenode. |
| namenode | Runs the DFS namenode. |
| datanode | Runs a DFS datanode. |
| dfsadmin | Runs a DFS admin client. |
| mradmin | Runs a Map-Reduce admin client. |
| fsck | Runs a DFS filesystem checking utility. |
| fs | Runs a generic filesystem user client. |
| balancer | Runs a cluster balancing utility. |
| oiv | Applies the offline fsimage viewer to an fsimage. |
| fetchdt | Fetches a delegation token from the NameNode. |
| jobtracker | Runs the MapReduce JobTracker node. |
| pipes | Runs a Pipes job. |
| tasktracker | Runs a MapReduce TaskTracker node. |
| historyserver | Runs the job history server as a standalone daemon. |
| job | Manipulates the MapReduce jobs. |
| queue | Gets information regarding JobQueues. |
| version | Prints the version. |
| jar <jar> | Runs a jar file. |
| distcp <srcurl> <desturl> | Copies files or directories recursively. |
| distcp2 <srcurl> <desturl> | DistCp version 2. |
| archive -archiveName NAME -p <parent path> <src>* <dest> | Creates a hadoop archive. |
| classpath | Prints the class path needed to get the Hadoop jar and the required libraries. |
| daemonlog | Gets/sets the log level for each daemon. |
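As a quick illustration, a few of these subcommands can be invoked directly from the shell. The paths below are examples only, and the commands assume `$HADOOP_HOME/bin` is on the `PATH` and a cluster is already configured and running:

```shell
# Print the installed Hadoop version
hadoop version

# List the HDFS root directory with the generic filesystem client
hadoop fs -ls /

# Check the health of the DFS filesystem starting from the root
hadoop fsck /

# Print a summary report of cluster status via the DFS admin client
hadoop dfsadmin -report
```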
How to Interact with MapReduce Jobs
Usage − hadoop job [GENERIC_OPTIONS]
The following generic options are available for a Hadoop job.
| COMMAND | Description |
|---|---|
| -submit <job-file> | Submits the job. |
| -status <job-id> | Prints the map and reduce completion percentage and all job counters. |
| -counter <job-id> <group-name> <countername> | Prints the counter value. |
| -kill <job-id> | Kills the job. |
| -events <job-id> <fromevent-#> <#-of-events> | Prints the details of events received by the jobtracker for the given range. |
| -history [all] <jobOutputDir> | Prints job details, and details of failed and killed tasks. More details about the job, such as successful tasks and the task attempts made for each task, can be viewed by specifying the [all] option. |
| -list [all] | Displays jobs. With the all option, displays all jobs; otherwise, displays only jobs which are yet to complete. |
| -kill-task <task-id> | Kills the task. Killed tasks are NOT counted against failed attempts. |
| -fail-task <task-id> | Fails the task. Failed tasks are counted against failed attempts. |
| -set-priority <job-id> <priority> | Changes the priority of the job. Allowed priority values are VERY_HIGH, HIGH, NORMAL, LOW, and VERY_LOW. |
For example:

- To kill a job − `$HADOOP_HOME/bin/hadoop job -kill <JOB-ID>`
- To see the history of a job's output directory − `$HADOOP_HOME/bin/hadoop job -history <DIR-NAME>`
- To see the status of a job − `$HADOOP_HOME/bin/hadoop job -status <JOB-ID>`
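Putting these options together, a typical session might look like the following sketch. The jar name, main class, input/output paths, and job ID are all illustrative placeholders, not values from a real cluster:

```shell
# Submit a jar-based MapReduce job (example jar, class, and HDFS paths)
hadoop jar wordcount.jar WordCount /input /output

# List jobs that are yet to complete, then inspect one of them
hadoop job -list
hadoop job -status job_201403191111_0001

# Lower the job's priority, or kill it outright if it is no longer needed
hadoop job -set-priority job_201403191111_0001 LOW
hadoop job -kill job_201403191111_0001
```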