Big Data and Cloud Tips: Passing parameters to Mappers and Reducers

Wednesday, November 30, 2011

Passing parameters to Mappers and Reducers

There might be a requirement to pass additional parameters to the mapper and reducers, besides the the inputs which they process. Lets say we are interested in Matrix multiplication and there are multiple ways/algorithms of doing it. We could send an input parameter to the mapper and reducers, based on which the appropriate way/algorithm is picked. There are multiple ways of doing this

Setting the parameter:

1. Use the -D command line option to set the parameter while running the job.

2. Before launching the job using the old MR API

JobConf job = (JobConf) getConf();
job.set("test", "123");

3. Before launching the job using the new MR API

Configuration conf = new Configuration();
conf.set("test", "123");
Job job = new Job(conf);

Getting the parameter:

1. Using the old API in the Mapper and Reducer. The JobConfigurable#configure has to be implemented in the Mapper and Reducer class.

private static Long N;
public void configure(JobConf job) {
    N = Long.parseLong(job.get("test"));
}

The variable N can then be used with the map and reduce functions.

2. Using the new API in the Mapper and Reducer. The context is passed to the setup, map, reduce and cleanup functions.

Configuration conf = context.getConfiguration();
String param = conf.get("test");

20 comments:

Tim MattisonFebruary 3, 2012 at 8:24 AM
Perfect, couldn't be easier. I was up and running with mappers and reducers taking parameters from the JobConf class in minutes with this information. Thanks!
ReplyDelete
Replies
arun akJune 28, 2012 at 1:38 AM
Praveen : Is there any means by which I can pass certain parameters from main to the partitioner function (my custom partitioner) ?
ReplyDelete
Replies
IPJuly 19, 2012 at 9:23 PM
Any idea on how I can pass the ArrayList to the mapper. The very inefficient workaround I can think of is converting it to String. Also if you could suggest as how to I can an ArrayList to the driver method.
Thank you!
ReplyDelete
Replies
haritAugust 4, 2012 at 2:48 AM
great help, thanks Pravin
ReplyDelete
Replies
UnknownApril 19, 2013 at 7:15 PM
Thank you very much! You solved my problem. ^^ Thankyou Thankyou~
ReplyDelete
Replies
UnknownJune 17, 2013 at 4:56 PM
Dear Praveen,

Thanks for your post, what would you do if you have many parameters? Is there a way to put the parameters in a settings file and make them available to the mapper/reducer?
ReplyDelete
Replies
scharalaSeptember 11, 2013 at 6:41 PM
Thanks for the post! Only the last solution worked for me in the new api. I would add that by using the getInt,setInt methods it would be slightly more efficient
ReplyDelete
Replies
JoroSeptember 12, 2013 at 8:32 PM
It is important to know that Configuration object is cloned at some point, so the order is important. i.e.:
Configuration conf = getConf();
conf.set("mmsilist", mmsiList);
conf.set("msgidlist", msgidList);//set BEFORE Job instance is created
Job job = new Job(conf, "MyJob");
//
//If you try to set conf.set("mmsilist", mmsiList);
after the job is instantiated, it will not work .
ReplyDelete
Replies
vishnuNovember 27, 2013 at 7:19 PM
This comment has been removed by the author.
ReplyDelete
Replies
UnknownMarch 3, 2014 at 7:15 AM
Hey, Thank you for your post, however I'm having problems. I'm using hadoop version 0.20.205, but context.getConfiguration(), java says context cannot be resolved. Is there a particular library I should be using? Is there a different variable I need to initialize first?

Thanks!
ReplyDelete
Replies
RoccoMay 22, 2014 at 12:38 AM
thanks!! man :D
ReplyDelete
Replies
UnknownOctober 3, 2014 at 7:10 AM
Thanks Praveen, this is very helpful.
ReplyDelete
Replies
UnknownFebruary 28, 2017 at 11:47 PM
how to set an object in conf and hoe to get

ReplyDelete
Replies
UnknownMay 9, 2018 at 12:02 PM
what is the meaning of this in MAPPER Class?
Configuration conf=context.getConfiguration();
String newWord=conf.get("RunTimeArg");

what is the meaning of this in DRIVER Class?
Configuration conf = new Configuration(); conf.set("RunTimeArg",args[2]);
Job job = new Job(conf, "DynamicWordCount");
ReplyDelete
Replies

Add comment

Pages

Wednesday, November 30, 2011

Passing parameters to Mappers and Reducers

20 comments: