Saturday, March 24, 2012

BSP (Hama) vs MR (Hadoop)

BSP (Bulk Synchronous Parallel) is a distributed computing model similar to MR (MapReduce). For some classes of problems BSP performs better than MR, and for others it is the other way around. This paper compares MR and BSP.
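To make the BSP model a bit more concrete, here is a minimal single-process sketch in Python. It is not Hama code and does not use the Hama API. The function name `bsp_max` and the neighbor lists are made up for illustration. It simulates the three phases of each BSP superstep (local computation, message sending, barrier synchronization) for a simple job in which every peer learns the largest value held by any peer.

```python
def bsp_max(values, neighbors):
    """Simulate BSP supersteps until every peer knows the global maximum.

    values[i]    -- initial value held by peer i
    neighbors[i] -- list of peers that peer i can send messages to
    (Both names are hypothetical; this is a toy model, not the Hama API.)
    """
    current = list(values)

    # Superstep 0: every peer sends its own value to its neighbors.
    inbox = [[] for _ in values]
    for peer, value in enumerate(current):
        for n in neighbors[peer]:
            inbox[n].append(value)
    supersteps = 1

    # Keep running supersteps while any peer still has messages to process.
    while any(inbox):
        outbox = [[] for _ in values]
        for peer, messages in enumerate(inbox):
            # Local computation: look only at messages from the previous superstep.
            best = max(messages, default=current[peer])
            if best > current[peer]:
                current[peer] = best
                # Communication: forward the improved value to the neighbors.
                for n in neighbors[peer]:
                    outbox[n].append(best)
        # Barrier: messages sent in this superstep become visible only in the next one.
        inbox = outbox
        supersteps += 1

    return current, supersteps


if __name__ == "__main__":
    # Four peers connected in a ring.
    ring = [[1, 3], [0, 2], [1, 3], [0, 2]]
    final_values, steps = bsp_max([3, 9, 1, 4], ring)
    print(final_values, steps)
```

The point of the sketch is the barrier: a peer never sees a message in the same superstep in which it was sent, which is the main structural difference from the map, shuffle and reduce phases of MR.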

While Hadoop implements MR, Hama implements BSP. The ecosystem around Hadoop (interest, tools, $$$, articles) is much larger than the one around Hama. Hama recently released 0.4.0, which includes examples for page rank, graph exploration, shortest path and others. Hama has many similarities with Hadoop (API, command line, design etc.), so it shouldn't take much time for those who are familiar with Hadoop to get started with Hama.

As of now Hama is in the Apache Incubator and is looking for contributors. It is still at an early stage and there is a lot of scope for improvement (performance, testing, documentation etc.). For someone who wants to start contributing to an Apache project, Hama is a good choice. Thomas and Edward have been actively blogging about Hama and are very responsive to requests for clarification in the Hama community.

I am planning to spend some time on Hama (learning and contributing) and will keep posting about the progress on this blog.

