Monday, January 9, 2012

What is HDFS?

Here is a nice paper introducing HDFS, with a clear explanation of the features it implements. The paper doesn't go into much API-level detail.

Most HDFS features are covered, except for HDFS Federation, which was introduced in the 0.23 release, and HDFS High Availability, which will be included in the upcoming Hadoop 0.24 release.

The paper also includes benchmarking on the Yahoo cluster.

