
How a Job Runs on MapReduce

A MapReduce job can be run with a single method call: submit() on a Job object. You can also call waitForCompletion(), which submits the job if it has not been submitted already and then waits for it to finish.
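The relationship between the two calls can be sketched with a small toy model. This is not Hadoop code: ToyJob, its State enum, and the submissions counter are hypothetical names used only to illustrate the "submit once, then wait" behaviour described above, and the toy job finishes immediately instead of running on a cluster.

```java
/** Toy model (not a Hadoop class) of how waitForCompletion() relates to submit(). */
class ToyJob {
    enum State { DEFINE, RUNNING, SUCCEEDED }

    private State state = State.DEFINE;
    private int submissions = 0; // how many times the job was handed to the cluster

    /** Submits the job and returns immediately; allowed only before submission. */
    public void submit() {
        if (state != State.DEFINE) {
            throw new IllegalStateException("Job already submitted, state=" + state);
        }
        submissions++;
        state = State.RUNNING;
    }

    /** Submits the job if that has not happened yet, then waits for it to finish. */
    public boolean waitForCompletion() {
        if (state == State.DEFINE) {
            submit();
        }
        // A real client would poll the cluster here; the toy job finishes at once.
        state = State.SUCCEEDED;
        return true;
    }

    public int getSubmissions() { return submissions; }
    public State getState() { return state; }

    public static void main(String[] args) {
        ToyJob job = new ToyJob();
        job.submit();                         // explicit submission, returns at once
        boolean ok = job.waitForCompletion(); // does not submit a second time
        System.out.println("succeeded=" + ok + ", submissions=" + job.getSubmissions());
        // prints "succeeded=true, submissions=1"
    }
}
```

Calling waitForCompletion() on a fresh ToyJob without calling submit() first gives the same result, since the method performs the submission itself when needed.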

Let’s look at the components involved:



  1. Client: Submits the MapReduce job.
  2. YARN node manager: Launches and monitors the compute containers on machines in the cluster.
  3. YARN resource manager: Coordinates the allocation of computing resources on the cluster.
  4. MapReduce application master: Coordinates the tasks running the MapReduce job.
  5. Distributed filesystem: Used for sharing job files between the other entities.

 

How to submit a Job?



 The submit() method on Job creates an internal JobSubmitter instance and calls submitJobInternal() on it. Having submitted the job, waitForCompletion() polls the job’s progress once per second and reports the progress to the console if it has changed since the last report. When the job completes successfully, the job counters are displayed. Otherwise, the error that caused the job to fail is logged to the console.
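The polling behaviour can be illustrated with a minimal sketch. It is a toy model, not Hadoop's implementation: the JobView interface, the monitor() and scripted() helpers, and the sample counter and error strings are all assumptions made up for this example. The loop polls at a fixed interval, prints a progress line only when it differs from the previous one, and finishes with either the counters or the failure reason.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Hadoop code) of a client loop that polls a job and reports changes. */
class ProgressMonitorSketch {

    /** Hypothetical view of a submitted job; the method names are illustrative only. */
    interface JobView {
        boolean isComplete();
        boolean isSuccessful();
        String progressReport();
        String counters();
        String failureInfo();
    }

    /** Polls until the job completes and returns every line that was printed. */
    static List<String> monitor(JobView job, long pollIntervalMillis) {
        List<String> printed = new ArrayList<>();
        String lastReport = null;
        while (!job.isComplete()) {
            pause(pollIntervalMillis);
            String report = job.progressReport();
            if (!report.equals(lastReport)) {
                // Progress changed since the last report, so show it.
                System.out.println(report);
                printed.add(report);
                lastReport = report;
            }
        }
        // On success show the counters, otherwise show what went wrong.
        String summary = job.isSuccessful()
                ? "Counters: " + job.counters()
                : "Job failed: " + job.failureInfo();
        System.out.println(summary);
        printed.add(summary);
        return printed;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the job", e);
        }
    }

    /** Scripted fake job: each call to progressReport() consumes one snapshot. */
    static JobView scripted(String[] reports, boolean success) {
        return new JobView() {
            private int polls = 0;
            public boolean isComplete() { return polls >= reports.length; }
            public boolean isSuccessful() { return success; }
            public String progressReport() {
                return reports[Math.min(polls++, reports.length - 1)];
            }
            public String counters() { return "records=42"; }    // made-up value
            public String failureInfo() { return "task failed"; } // made-up value
        };
    }

    public static void main(String[] args) {
        String[] reports = {
            "map 0% reduce 0%", "map 0% reduce 0%",
            "map 100% reduce 0%", "map 100% reduce 100%"
        };
        // 10 ms keeps the demo quick; the text above describes a one-second interval.
        monitor(scripted(reports, true), 10);
    }
}
```

With the scripted reports in main(), the repeated "map 0% reduce 0%" snapshot is printed only once, which is the "report only on change" behaviour described above.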

Processes implemented by JobSubmitter for submitting the job:
