Tuesday, October 17, 2017

EMR: logging into the master instance

Once we spawn a cluster as mentioned here, we should see its instances in the EC2 management console. It is useful to log in to the master instance: the log files are generated on the master before being moved to S3, and the different Big Data processing jobs can be run from the master's command line interface.

In this post we will look at connecting to the master. The AWS documentation for this is here.

Step 1 : In the EC2 management console, click on the gear button at the top right. The columns shown on the page can be added or removed here.

Include the EMR-related keys as shown on the right of the above screenshot, and the EC2 instance roles (MASTER and CORE) will be displayed as shown below.

Select the master instance and get its DNS hostname.
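
The hostname can also be fetched from the command line instead of the console. This is only a minimal sketch, assuming the AWS CLI is installed and configured with credentials for the account; the function name emr_master_dns and the cluster ID in the usage line are placeholders and not part of the setup described in this post.

```shell
# Print the public DNS name of the master node of an EMR cluster.
# Assumption: the AWS CLI is installed and credentials are configured.
emr_master_dns() {
    # $1: EMR cluster ID (placeholder form: j-XXXXXXXXXXXXX)
    aws emr describe-cluster \
        --cluster-id "$1" \
        --query 'Cluster.MasterPublicDnsName' \
        --output text
}

# Usage (placeholder cluster ID, replace with your own):
#   emr_master_dns j-XXXXXXXXXXXXX
```

The cluster ID is shown in the EMR management console for each cluster.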

Step 2 : From the same EC2 management console, modify the Security Group associated with the master instance to allow inbound traffic on port 22 (SSH) as shown below.
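
For those who prefer the command line, the same inbound rule can probably be added with the AWS CLI. This is a sketch under assumptions: the function name allow_ssh_from, the security group ID and the CIDR range in the usage line are all placeholders. Restricting the rule to your own IP address instead of opening port 22 to everyone is a choice made here, not something the console steps require.

```shell
# Add an inbound rule for SSH (TCP port 22) to a security group.
# Assumption: the AWS CLI is installed and credentials are configured.
allow_ssh_from() {
    # $1: security group ID of the master instance (placeholder: sg-XXXXXXXX)
    # $2: source CIDR range allowed to connect (for example your own IP as /32)
    aws ec2 authorize-security-group-ingress \
        --group-id "$1" \
        --protocol tcp \
        --port 22 \
        --cidr "$2"
}

# Usage (placeholder values, replace with your own):
#   allow_ssh_from sg-XXXXXXXX 203.0.113.10/32
```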

Step 3 : Now ssh into the master as shown below. Note that the path to the key pair file and the DNS name of the master have to be changed to match your own cluster.
ssh -i /home/praveen/Documents/AWS-Keys/MyKeyPair.pem hadoop@ec2-54-147-238-2.compute-1.amazonaws.com
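
If ssh rejects the key with a warning about an unprotected private key file, the permissions on the .pem file are too open. A small sketch that tightens the permissions before connecting is given below; the function name connect_to_master is made up for illustration, and the user name hadoop is the one used in the command above.

```shell
# Restrict the key file to owner read-only, then ssh into the master
# as the hadoop user.
connect_to_master() {
    key_file=$1     # path to the .pem file of the key pair
    master_dns=$2   # public DNS name of the master instance
    chmod 400 "$key_file"
    ssh -i "$key_file" "hadoop@$master_dns"
}

# Usage (replace the DNS name with that of your own master):
#   connect_to_master /home/praveen/Documents/AWS-Keys/MyKeyPair.pem ec2-54-147-238-2.compute-1.amazonaws.com
```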

Step 4 : Go to '/mnt/var/log/' and check the different log files.
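
With many log files in that folder, it can help to list only the ones that changed recently. The helper below is a hypothetical convenience, not an EMR tool; it assumes a find that supports the -mmin option (GNU find does) and defaults to the last 60 minutes.

```shell
# List files under a directory that were modified in the last N minutes.
recent_logs() {
    log_dir=$1          # directory to search, e.g. /mnt/var/log
    minutes=${2:-60}    # look-back window in minutes (default: 60)
    find "$log_dir" -type f -mmin "-$minutes"
}

# Usage on the master:
#   recent_logs /mnt/var/log 30
```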

In the upcoming post, we will explore running a Hive script from the master itself after logging into it.
