Hadoop MapReduce is a framework for processing very large data sets in parallel across the nodes of a cluster. A job passes through five phases: Map, Partition, Shuffle, Sort, and Reduce. Of these, Map and Reduce play the central role.
MapReduce Phases
Map : The map phase reads its input file from the Hadoop Distributed File System (HDFS) and applies the map function to each record.
Partition : Each mapper determines which reducer will receive each of its outputs.
Shuffle/Sort : Each reduce task fetches, from every map task, the portion of the map output that belongs to its own bucket. The sort/merge step then combines these pieces into a single sorted run.
Reduce : Applies the user-defined reduce function to the merged run and writes the output to HDFS.
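
To make the flow of the phases concrete, here is a minimal single-process sketch in plain Python. It does not use Hadoop itself; it only imitates the phases in memory with a word count. The names map_fn, partition, reduce_fn and run_job are hypothetical, the input "splits" are ordinary lists of lines rather than HDFS blocks, and the CRC32-based partitioner is just a stand-in for hashing the key modulo the number of reducers.

```python
import zlib
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def map_fn(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.split():
        yield word.lower(), 1


def partition(key, num_reducers):
    """Partition: choose the reducer for a key (stand-in hash, an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reduce_fn(key, values):
    """Reduce: combine all values seen for one key."""
    return key, sum(values)


def run_job(splits, num_reducers=2):
    """Run the phases in memory; each split plays the role of one map task."""
    # Map + Partition: every mapper buckets its own output by target reducer.
    map_outputs = []
    for split in splits:
        buckets = defaultdict(list)
        for record in split:
            for key, value in map_fn(record):
                buckets[partition(key, num_reducers)].append((key, value))
        map_outputs.append(buckets)

    results = {}
    for reducer_id in range(num_reducers):
        # Shuffle: fetch this reducer's bucket from every map task.
        fetched = [
            pair
            for buckets in map_outputs
            for pair in buckets.get(reducer_id, [])
        ]
        # Sort: merge the fetched pieces into a single run ordered by key.
        fetched.sort(key=itemgetter(0))
        # Reduce: apply reduce_fn once per key over the merged run.
        for key, group in groupby(fetched, key=itemgetter(0)):
            out_key, out_value = reduce_fn(key, [v for _, v in group])
            results[out_key] = out_value  # a real job would write to HDFS
    return results


if __name__ == "__main__":
    splits = [["the quick brown fox", "the lazy dog"], ["the fox"]]
    print(run_job(splits))
```

Because the partitioner sends every occurrence of a key to the same reducer, each word is counted in exactly one place, which is why the per-reducer results can simply be collected at the end.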