
# Introduction to MapReduce

• Hello Tavish,

My question is about “the Quick brown fox” example near the end of the article. Why isn’t the word “Quick” included in the Shuffle & Sort stage?

Thanks,

Vikram Chinmulgund

• Tavish Srivastava says:

Hi Vikram,

The word “Quick” is included in the Shuffle & Sort stage. It just goes to Reducer 2 in this case. Hope this clarifies your doubt.

Tavish

• Thanks, Tavish.

Are you saying that (a) since there is only one instance of “Quick”, it doesn’t need to be shuffled, and (b) hence “Quick” can go to any available reducer depending on load distribution?

regards

• Tavish Srivastava says:

Vikram,

The main objective of shuffling is to make sure that all occurrences of the same map key end up on the same reducer. As “Quick” has only one occurrence, it can go to either of the two reducers. But to your point, the shuffler still needs to process “Quick”. Notice that the shuffler cannot simply swap “the” and “quick” from the first mapper: if it did, every “the” from the other mappers would also have to move to Reducer 2.
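As a rough illustration of that rule, here is a minimal sketch that assumes a hash-based partitioner, i.e. the reducer is picked from a hash of the key modulo the number of reducers. The helper names `partition` and `shuffle` and the sample sentences are hypothetical, and CRC32 is used only because it gives the same value on every run; a real framework may partition differently.

```python
import zlib
from collections import defaultdict


def partition(key: str, num_reducers: int) -> int:
    """Pick a reducer index for a key (hypothetical helper, CRC32-based)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(mapper_outputs, num_reducers):
    """Route every (key, value) pair to the reducer chosen for its key."""
    reducers = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in mapper_outputs:
        for key, value in pairs:
            reducers[partition(key, num_reducers)][key].append(value)
    return reducers


# Hypothetical mapper outputs, one list per mapper.
mapper_outputs = [
    [("the", 1), ("quick", 1), ("brown", 1), ("fox", 1)],
    [("the", 1), ("fox", 1), ("ate", 1), ("the", 1), ("mouse", 1)],
]

reducers = shuffle(mapper_outputs, num_reducers=2)
for index, grouped in enumerate(reducers):
    print(f"Reducer {index}: {dict(grouped)}")
```

Because the reducer depends only on the key, every “the” lands on one reducer regardless of which mapper emitted it, and a single-occurrence key such as “quick” is still routed through the same step.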

Tavish

• Sankhe says:

Hi Tavish,

I have a question about the number of reducers. From what I have read online about map and reduce, my understanding is that after a mapper has finished its job, its output (optionally passed through a combiner) goes to the reducers such that each key goes to a different reducer.
So each reducer will have just one key and its related values.
For example, Mapper 1 output: (abc, 1) (def, 1)
Mapper 2 output: (abc, 1). So the input to the reducers will be
Reducer 1: (abc, 1) (abc, 1) and Reducer 2: (def, 1), so the final output is (abc, 2) (def, 1).

• Tavish Srivastava says:

Sankhe,

The number of mappers and reducers can be chosen independently. And you are right, the reducers' work starts only after all the mappers have finished theirs.
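To make the first point concrete, here is a small sketch using the mapper outputs from the question. It assumes keys are assigned to reducers by a hash modulo the reducer count; `run_job` is a hypothetical name, not part of any framework. The final counts come out the same whether one, two, or five reducers are used, and a single reducer may end up handling more than one key.

```python
import zlib
from collections import defaultdict


def run_job(mapper_outputs, num_reducers):
    """Shuffle mapper output to reducers by key, then sum the counts per key."""
    # Shuffle: each reducer gets a dict of key -> list of values.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in mapper_outputs:
        for key, value in pairs:
            index = zlib.crc32(key.encode("utf-8")) % num_reducers
            buckets[index][key].append(value)

    # Reduce: runs only after every mapper's output has been shuffled.
    result = {}
    for bucket in buckets:
        for key, values in bucket.items():
            result[key] = sum(values)
    return result


# Mapper outputs taken from the example in the question.
mapper_outputs = [
    [("abc", 1), ("def", 1)],  # Mapper 1
    [("abc", 1)],              # Mapper 2
]

for num_reducers in (1, 2, 5):
    print(num_reducers, sorted(run_job(mapper_outputs, num_reducers).items()))
```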

I have been trying to understand big data from very simple concepts. This is the best article I have read on Hadoop, simply explained, together with the other one on support vector machines. Thank you very much.

• Tavish says: