Friday, January 20, 2012

Pre-packaged Hadoop software (HDP vs CDH)

Hortonworks and Cloudera are two of the companies actively working on Hadoop, alongside LinkedIn, Facebook and others.

Cloudera has been providing CDH, a well-integrated and tested package of different Apache frameworks around big data, for a couple of years. Cloudera also provides tools on top of the Apache frameworks (Cloudera Manager) for managing and troubleshooting Hadoop installations. These tools become very important as the size of a Hadoop cluster grows over time. As of this writing, Cloudera has released three versions of its flagship software: CDH1, CDH2 and CDH3. Cloudera plans to include MRv2, the next-generation MR architecture, in CDH4.

Hortonworks, although a bit late, has announced plans for the public release of the Hortonworks Data Platform (HDP) at the end of this quarter. HDP is along similar lines to CDH. HDP1 will include Hadoop 1.0.0 (from the 0.20.205 branch), and HDP2 will be based on the Hadoop 0.23 release (MRv2). Hortonworks will also be packaging management tools similar to Cloudera Manager.

Cloudera has partnered with different vendors (Dell, Oracle and others) to package CDH along with the vendors' offerings. Similar announcements can also be expected from Hortonworks, which has been actively working with Microsoft on porting Hadoop to Windows Server and Azure.

Getting started with Hadoop on a couple of machines for a POC is easy, but scaling to 10s, 100s or 1000s of machines in a production environment is a challenge and requires in-depth knowledge of Hadoop, its best practices and patterns, and the ecosystem surrounding it. This is where companies like Hortonworks and Cloudera, which have a lot of Apache committers on their payroll, come into play. Besides the packaged software, both Hortonworks and Cloudera will also be providing commercial support for HDP and CDH.

It will be interesting to see whether other players like Hortonworks and Cloudera enter the market, or whether some of the big companies acquire them.
