Wednesday, August 12, 2020

Connecting to S3 Service via VPC Gateway Endpoint

Let's say we are building an image processing application using ML which gets images from S3 and identifies the action performed (sitting, standing, etc.) in those images. By default, the network data flows from the application to S3 over the internet, as shown in the left image, which is neither efficient nor secure. AWS provides the VPC Gateway Endpoint feature, and with it all the data stays within the AWS network.

 
VPC Endpoints come in two types: Gateway Endpoints for the S3 and DynamoDB services, and Interface Endpoints for the rest of the services. In this blog we will explore VPC Gateway Endpoints.
 

Step 1: Create a VPC as described in the previous blog and connect to the EC2 instance in the Private Subnet. By default, the route table for the Private Subnet has a route for 0.0.0.0/0, so any EC2 instance in the Private Subnet has internet connectivity.

 
The same can be verified by pinging google.com or some other host.

Step 2: In the PuTTY session, execute the commands below to install the AWS CLI.

sudo apt-get update
sudo apt-get install python3 python3-pip -y
pip3 install --user --upgrade awscli
export PATH="$PATH:$HOME/.local/bin"

Step 3: Create an IAM Role with AmazonS3ReadOnlyAccess and attach it to the EC2 instance in the Private Subnet.
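
For those who prefer the command line over the console, Step 3 can be sketched with the AWS CLI roughly as below. This is only a sketch under a few assumptions: the function name, the role name and the instance ID are placeholders I made up, the instance profile is given the same name as the role, and the commands are run from a machine whose credentials are allowed to manage IAM (not from the EC2 instance itself).

```shell
# Create a role with AmazonS3ReadOnlyAccess and attach it to an EC2 instance.
# $1: role name (hypothetical, pick your own)
# $2: ID of the EC2 instance in the Private Subnet
create_s3_readonly_role() {
  role_name="$1"
  instance_id="$2"

  # Trust policy that lets the EC2 service assume the role.
  trust='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

  aws iam create-role --role-name "$role_name" \
      --assume-role-policy-document "$trust" &&
  aws iam attach-role-policy --role-name "$role_name" \
      --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess &&
  # From the CLI the instance profile has to be created explicitly.
  aws iam create-instance-profile --instance-profile-name "$role_name" &&
  aws iam add-role-to-instance-profile --instance-profile-name "$role_name" \
      --role-name "$role_name" &&
  aws ec2 associate-iam-instance-profile --instance-id "$instance_id" \
      --iam-instance-profile Name="$role_name"
}

# Example (placeholder values, replace with your own):
# create_s3_readonly_role s3-readonly-ec2-role i-0123456789abcdef0
```

A newly created instance profile can take a few seconds to become usable, so the last command may need to be retried if it fails right away.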

Step 4: Let's remove the internet connection for the EC2 instance in the Private Subnet. To do this, select the Route Table for the Private Subnet and click on "Edit routes". Delete the route for 0.0.0.0/0 and click on "Save routes".
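
The same route removal can also be done from the AWS CLI. A minimal sketch, assuming the commands run from a machine with permission to modify the VPC; the function name and the route table ID in the example are placeholders, not values from this setup.

```shell
# Remove the default (internet-bound) route from a route table.
# $1: ID of the Route Table associated with the Private Subnet
delete_default_route() {
  aws ec2 delete-route \
    --route-table-id "$1" \
    --destination-cidr-block 0.0.0.0/0
}

# Example (placeholder ID, replace with your own):
# delete_default_route rtb-0123456789abcdef0
```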

 

The Route Table will be updated as shown below.

 
Step 5: There is no need for the NAT Gateway and the Elastic IP, as we have removed the internet connection for the EC2 instance in the Private Subnet. Make sure to remove them. You can also keep them, but there is a cost associated with doing so.


Test the internet connectivity (ping google.com) and also try to list the buckets in S3 (aws s3 ls). Both commands should fail. Press Ctrl+C to exit the commands.
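
If you would rather not wait and press Ctrl+C, the two checks can be wrapped with timeouts. The sketch below assumes the Linux ping that ships with Ubuntu (for the -c and -W options) and uses the AWS CLI's global --cli-connect-timeout and --cli-read-timeout options; the function name check_connectivity is just something I made up for illustration.

```shell
# Report whether the internet and S3 are reachable, without hanging.
check_connectivity() {
  # Send 3 pings, waiting at most 2 seconds for each reply.
  if ping -c 3 -W 2 google.com >/dev/null 2>&1; then
    echo "internet: reachable"
  else
    echo "internet: unreachable"
  fi

  # Short connect/read timeouts (in seconds) so each attempt gives up quickly.
  if aws s3 ls --cli-connect-timeout 5 --cli-read-timeout 5 >/dev/null 2>&1; then
    echo "s3: reachable"
  else
    echo "s3: unreachable"
  fi
}

# Usage:
# check_connectivity
```

At this point both lines should report "unreachable". The S3 check can still take a little while, since the CLI may retry a few times before giving up.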

Step 6: In the VPC Management Console, go to "Endpoints" and click on "Create Endpoint".

Search for S3 in the Service Name and select "com.amazonaws.us-east-1.s3". Make sure to select the VPC which was created in Step 1 and the Route Table of the Private Subnet, since a Gateway Endpoint is associated with route tables rather than with subnets.


The rest of the default options are good enough. Click on "Create endpoint" and the Endpoint will be created in a few minutes.
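
For reference, the endpoint can also be created with the AWS CLI. A minimal sketch, assuming the us-east-1 region used above and a machine with permission to manage the VPC. Gateway Endpoints are attached to route tables, so the sketch takes the ID of the Private Subnet's Route Table; the function name and the IDs in the example are placeholders.

```shell
# Create an S3 Gateway Endpoint and associate it with a route table.
# $1: VPC ID
# $2: ID of the Route Table associated with the Private Subnet
create_s3_gateway_endpoint() {
  aws ec2 create-vpc-endpoint \
    --vpc-id "$1" \
    --vpc-endpoint-type Gateway \
    --service-name com.amazonaws.us-east-1.s3 \
    --route-table-ids "$2"
}

# Example (placeholder IDs, replace with your own):
# create_s3_gateway_endpoint vpc-0123456789abcdef0 rtb-0123456789abcdef0
```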

Step 7: Go back to the Route Table of the Private Subnet and note that a route to the VPC Endpoint has been added automatically. Its destination is the prefix list for S3 and its target is the endpoint.
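
The new route can also be checked from the AWS CLI. The sketch below lists only the routes whose destination is a prefix list, which is how a Gateway Endpoint route shows up; the function name and the route table ID in the example are placeholders, and it assumes credentials that are allowed to describe route tables.

```shell
# Print the prefix-list routes (destination and target) of a route table.
# $1: ID of the Route Table associated with the Private Subnet
list_endpoint_routes() {
  aws ec2 describe-route-tables \
    --route-table-ids "$1" \
    --query 'RouteTables[0].Routes[?DestinationPrefixListId].[DestinationPrefixListId,GatewayId]' \
    --output text
}

# Example (placeholder ID, replace with your own):
# list_endpoint_routes rtb-0123456789abcdef0
```

Each output line should pair a prefix list ID with the ID of the endpoint it points to.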

 
Step 8: Go back to the PuTTY session and execute the commands below. Notice that there is no internet connection, but we are still able to get the number of buckets in S3. This is because we have set up the VPC Gateway Endpoint, so all the traffic remains within the AWS network.
 
ping google.com
aws s3 ls | wc -l 

Conclusion

By default, when we consume any AWS service from an EC2 instance, the network traffic goes through the internet, which is not really secure. There is also the additional cost of the NAT Gateway. By using the VPC Gateway Endpoint, we saw that the network traffic remains within the AWS network. This makes it easier to migrate applications to AWS and to keep them compliant.
