Thursday, March 15, 2012

How easy is it to use Hadoop?

This article made me think about how easy it is to set up Hadoop. Setting up Hadoop on a single node or on multiple nodes and running MapReduce (MR) jobs is not a big challenge. But getting it to run efficiently and securely is a completely different game. There are too many Hadoop parameters to configure, some of which are not documented. Now add different hardware configurations, different Hadoop versions and the lack of documentation to the mix.

Hadoop production clusters generally have a separate team that is very familiar with the Hadoop code and knows it inside out. Hadoop is evolving at a very fast pace, and keeping up with the changes in the Hadoop code is like chasing a moving target. It's just not possible to download the Hadoop binaries and use them as-is in production efficiently.

I believe in the potential of Hadoop and don't want to deter those who want to get started with Hadoop. This blog is all about making it easy for those who are getting started with Hadoop. Things will change as more and more vendors get into the Hadoop space and contribute code to Apache. But for now Hadoop is a beast, and there is and will be huge demand for Hadoop professionals.
