Friday, September 30, 2011

Resources for NextGen MapReduce

Edit: For easier access, I have moved this list to the pages section just below the blog header and am no longer maintaining this entry.

'Next Generation MR', 'NextGen MR', 'MRv2', or 'MR2' is a major revamp of the MapReduce engine and will be part of the 0.23 release. MRv1, the old MapReduce engine, will not be supported in the 0.23 release. Although the underlying engine has been revamped, the API for interfacing with it remains the same, so existing MapReduce code written for the MRv1 engine should run without modification on MRv2.

The architecture of MRv2 and the information for building and running it are spread across several places; this blog entry tries to consolidate everything available on MRv2. I will keep updating this entry as I get more information about MRv2 instead of creating a new one, so bookmark it and check back often :).

Current Status - 27th September, 2011 - 15th November, 2011 - 16th November, 2011

Home Page


The Hadoop Map-Reduce Capacity Scheduler

The Next Generation of Apache Hadoop MapReduce

Next Generation of Apache Hadoop MapReduce – The Scheduler

Detailed document on MRv2


Quick view of MRv2



Next Generation Hadoop MapReduce by Arun C. Murthy


Building from code and executing and running a sample
