MapReduce is not a new programming model, but Google's paper on MapReduce made it popular. A map is usually used for transformation, while a reduce (or fold) is used for aggregation. Both are built-in primitives in functional programming languages such as Lisp and ML. More about the functional programming roots of the MapReduce paradigm can be found in Section 2.1 of the book Data-Intensive Text Processing with MapReduce.
Below is a simple Python 2 program using the map and reduce functions, both of which live in Python 2's `__builtin__` module. More about functional programming in Python here. For those using Python 3, reduce has been removed from the built-ins and now lives in the `functools` module. According to the What's New in Python 3.0 notes:
Removed reduce(). Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable.
The Python 2 program squares (transforms) the numbers 0 to 99 using `map` with `square`, and then sums (aggregates) them using `reduce` with `add`. Hadoop, which provides a runtime environment for executing MapReduce programs, does something similar, but in a distributed fashion in order to process huge amounts of data.
    #!/usr/bin/python

    def square(x):
        return x * x

    def add(x, y):
        return x + y

    def main():
        print reduce(add, map(square, range(100)))

    if __name__ == "__main__":
        main()
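For readers on Python 3, the same computation might look roughly like the sketch below. It imports `reduce` from `functools`, and it also includes the explicit for loop that the release notes recommend, so the two styles can be compared side by side. The function names `sum_of_squares_reduce` and `sum_of_squares_loop` are made up for this illustration and are not part of the original program.

```python
from functools import reduce  # reduce is no longer a built-in in Python 3


def square(x):
    return x * x


def add(x, y):
    return x + y


def sum_of_squares_reduce(n):
    """Map/reduce style: square each of 0..n-1, then fold the results with add."""
    # map() returns a lazy iterator in Python 3; reduce() consumes it directly.
    # The initial value 0 keeps the call valid even when n is 0.
    return reduce(add, map(square, range(n)), 0)


def sum_of_squares_loop(n):
    """Same result written as an explicit for loop."""
    total = 0
    for x in range(n):
        total += x * x
    return total


if __name__ == "__main__":
    print(sum_of_squares_reduce(100))  # 328350
    print(sum_of_squares_loop(100))    # 328350
```

Both functions return the same value. The map/reduce form is the one that mirrors how a MapReduce job is structured, which is why it is used here even though the loop is arguably easier to read.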