Security

Some quick highlights

1. Security is implemented only in the 0.20.203, 0.20.204, 0.20.205 (now called 1.0.0), and 0.23 releases. Kerberos is used to authenticate the user. Once the user is authenticated, group resolution is done on the Hadoop master nodes (JobTracker and NameNode), not on the client. Cloudera has exhaustive documentation on setting up security in Hadoop.

2. Authorization is implemented within Hadoop itself (permissions on HDFS directories and files, service-level policies in hadoop-policy.xml, etc.).

3. Delegation tokens are used to decrease the load on the Kerberos KDC (Key Distribution Center).

4. In other releases security is not present. The `whoami` command is run on the client and its output is sent to Hadoop as the user's identity. This is very easy to fake: create a shell script named `whoami` and put it ahead of the real one in the path. With these releases, the only way to secure Hadoop is to restrict access to the cluster and authenticate users at a gateway.

5. Alfredo (Hadoop Auth) can be used to protect/access HTTP resources (including JobTracker/TaskTracker/NameNode/DataNode) using Kerberos HTTP SPNEGO.
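
To make the weakness described in point 4 concrete, here is a minimal sketch of how a lookup of `whoami` can be shadowed through the path. It does not touch Hadoop at all. It only shows that whichever `whoami` comes first in the path decides the answer. The function name `spoofed_whoami` and the user name `hdfs` are made up for illustration, and the sketch assumes a POSIX system with `/bin/sh` and a temporary directory that allows executing files.

```python
import os
import shlex
import shutil
import stat
import subprocess
import tempfile


def spoofed_whoami(fake_user: str) -> str:
    """Return what `whoami` reports once a fake script shadows it on the path.

    Hypothetical helper for illustration only; not part of Hadoop.
    """
    with tempfile.TemporaryDirectory() as tmp:
        fake = os.path.join(tmp, "whoami")
        # A one-line shell script that prints whatever name we choose.
        with open(fake, "w") as fh:
            fh.write("#!/bin/sh\necho " + shlex.quote(fake_user) + "\n")
        os.chmod(fake, stat.S_IRWXU)

        # Put the temporary directory first so the fake script wins the lookup.
        path = tmp + os.pathsep + os.environ.get("PATH", "")
        resolved = shutil.which("whoami", path=path)

        result = subprocess.run(
            [resolved], capture_output=True, text=True, check=True
        )
        return result.stdout.strip()


if __name__ == "__main__":
    # The reported identity is whatever the script was told to print.
    print(spoofed_whoami("hdfs"))
```

Because the client alone decides which name is reported, a release without Kerberos cannot tell a genuine user from one who has shadowed `whoami` this way. That is why access has to be controlled before requests reach the cluster.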

Introduction to Security in Hadoop


Alfredo (Hadoop Auth)


General