Friday, February 10, 2012

Apache Giraph 0.1 released

As I have mentioned again and again, Hadoop is not meant to solve every problem. But Hadoop is acting like a kernel on which a lot of things are being built (some on HDFS, some on MR, and some on both HDFS and MR). One of them is Giraph, which is still in the Apache Incubator and uses Hadoop MapReduce to do graph processing. Here is a nice introduction to Apache Giraph from LinkedIn. The blog also explains, at a very high level, why graphs should be processed with Giraph rather than directly on MapReduce.

http://engineering.linkedin.com/open-source/apache-giraph-framework-large-scale-graph-processing-hadoop-reaches-01-milestone

The 0.1 release binaries are not available on the mirrors, but the source code is. The Giraph home page has instructions on how to build from source and test it. I would recommend giving it a shot.

Lately I have been spending a good amount of my time on HBase, which depends largely on HDFS to store its data and can easily be used as a source/sink for MR jobs. I will follow up with a series of blog posts on the pros and cons, getting started, and some of the new features in HBase.
