Tuesday, April 20, 2021

Installing K8S on AWS EC2 and connecting via Lens

There are tons of ways of setting up K8S on AWS. Today we will look at one of the easiest ways to get started. The good thing is that we will be using the t2.micro instance type, which falls under the AWS free tier. This configuration is good enough to get started with K8S, but not for a production setup. The assumption is that the reader is familiar with the basic concepts of AWS.

Step 1: Create a SecurityGroup which allows all the traffic inbound and outbound as shown below. This is not a good practice, but is OK for the sake of demo and practicing K8S. Also, make sure to create a KeyPair.

Step 2: Create 3 Ubuntu instances with t2.micro as the instance type. Make sure to attach the above created SecurityGroup and to attach the KeyPair for connecting to the EC2 instances later. Name the instances as ControlPlane, Worker1 and Worker2 to avoid any confusion.

Step 3: Create an Elastic IP and assign it to the Control Plane EC2 instance, or else the public IP address of the instance might change when it is stopped and started, and we won't be able to connect from our laptop, which is outside the VPC.

Step 4: Connect to the EC2 instances using Putty or some other SSH client. Here I have set up tmux panes for the 3 EC2 instances. The left pane is for the Control Plane and the right side panes are for the Worker EC2 instances. tmux has a cool feature, "synchronize-panes", as mentioned in the StackOverflow response here (1). Enter a command in one of the panes and it will automatically be replayed in the other panes. If not comfortable with tmux, simply open multiple Putty sessions to the EC2 instances.

Step 5: On the Control Plane and the Worker instances execute the below commands. This is where the above mentioned tmux feature comes in handy.

#Update Ubuntu
sudo su
apt-get update
apt-get dist-upgrade -y

#Install Docker
apt install docker.io -y
systemctl enable docker
usermod -a -G docker ubuntu

#Add Google K8S repo and install kubeadm
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main"
apt install kubeadm -y

#Pull K8S Docker images (makes the installation faster later)
kubeadm config images pull

Step 6: On the Control Plane instance execute the below commands.

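#Initialize the Control Plane. It will take a few minutes.
#Note down the complete "kubeadm join ....." command from the output.
#Replace <POD_CIDR> with the pod network range; the stock Flannel manifest expects 10.244.0.0/16.
#The NumCPU check is ignored because t2.micro has a single vCPU.
kubeadm init --pod-network-cidr=<POD_CIDR> --ignore-preflight-errors=NumCPU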

#Setup the K8S configuration for the Ubuntu user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

#Install the Flannel overlay network
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Step 7: On both the Worker EC2 instances execute the "kubeadm join ....." command for the Worker EC2 instances to be part of the K8S Cluster.

Step 8: Go back to the Control Plane and execute the below commands to make sure the Cluster is set up properly.

#Make sure all the nodes are in a Ready state
kubectl get nodes

#Make sure all the pods are in a running state
kubectl get pods --all-namespaces
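
The nodes usually take a minute or two to move to the Ready state after Flannel has been applied. Instead of re-running the above commands by hand, a small wrapper around "kubectl wait" can block until everything is up. This is only a sketch: the function name wait_for_cluster and the 300 second default timeout are my own choices for illustration.

```shell
#!/usr/bin/env bash
# Sketch: block until every node reports Ready, then list the pods.
# wait_for_cluster and the 300s default are illustrative assumptions.
wait_for_cluster() {
  local timeout="${1:-300}"
  # kubectl wait returns non-zero if the condition is not met in time
  kubectl wait --for=condition=Ready node --all --timeout="${timeout}s" || return 1
  kubectl get pods --all-namespaces
}
```

Paste the function into the shell on the Control Plane and call it as "wait_for_cluster 120" to wait for up to two minutes.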

Step 9: On the Control Plane create dep.yaml file with the below yaml content and create a deployment with the "kubectl apply -f dep.yaml" command. Get the status of the deployment/pods using the "kubectl get deployments" and "kubectl get pods" commands.

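apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80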

Step 10: Now let's try to connect Lens to the K8S Cluster. Connect to the Control Plane EC2 instance and execute the below commands to generate the API server certificates again, so that they also cover the Public IP. Make sure to replace the IP addresses with the Public and Private IP address of the Control Plane EC2 instance.

sudo su
rm /etc/kubernetes/pki/apiserver.*
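#<PRIVATE_IP> and <PUBLIC_IP> are placeholders for the Control Plane addresses
kubeadm init phase certs all --apiserver-advertise-address=<PRIVATE_IP> --apiserver-cert-extra-sans=<PUBLIC_IP>,<PRIVATE_IP>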
docker rm -f `docker ps -q -f 'name=k8s_kube-apiserver*'`
systemctl restart kubelet

Step 11: Copy the content of the ".kube/config" file from the Control Plane to the laptop and save it as a file. This file has all the details to connect to the K8S Cluster. Replace the server IP in it with the Public IP of the Control Plane EC2 instance.
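
The config file generated by kubeadm points at the address the API server advertises inside the VPC, which is why the server line has to be changed before Lens can reach the cluster from the laptop. If you would rather not edit the file by hand, the below sketch does the replacement with sed. The helper name set_kubeconfig_server is hypothetical, and it assumes GNU sed and a "server: https://host:port" line in the file.

```shell
#!/usr/bin/env bash
# Sketch: point a copied kubeconfig at the Control Plane's public IP.
# set_kubeconfig_server is a hypothetical helper; it assumes GNU sed and
# a "server: https://<host>:<port>" line in the file.
set_kubeconfig_server() {
  local file="$1" public_ip="$2"
  # Keep the scheme and the port, swap only the host part
  sed -i -E "s#(server: https://)[^:/]+(:[0-9]+)#\1${public_ip}\2#" "$file"
}
```

For example, "set_kubeconfig_server ./k8s-config 203.0.113.10" rewrites the saved copy in place, where the file name and the IP are placeholders for your own values.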

Step 12: Download Lens from here and install it.  Add a Cluster by pointing to the K8S config file created earlier. Go to Cluster properties in Lens and install Metrics.

In a few seconds, Lens will start gathering the details and metrics from the K8S Cluster on AWS. Note that as of now there is not much load on the EC2 instances.

(List of nodes)

(Deployment which was created earlier)

(Pods which were created earlier)

(kubectl commands via Lens on the Control Plane)


It's not that difficult to set up K8S on AWS using kubeadm. We haven't really considered security and performance, but this setup is good enough to get started with K8S on AWS. Since we are installing K8S manually, we are responsible for HA, scalability, upgrades etc. This is where managed services like AWS EKS come into play.

Friday, April 16, 2021

Bicycling - my new hobby

It has been quite some time since I blogged here. Lately I have been a bit busy with personal and professional work and didn't get much of a chance to post. This post is not about technology, but about a new passion which has bitten me lately.

With the pandemic it has been tough to hit the gym and get some exercise, and travelling has come to a standstill. And so, I got into the habit of bicycling with my son. I bought a BTWIN Riverside 120, a hybrid bike, three months back from Decathlon and am really getting a kick out of it. Initially it was about a 15 km round trip, but occasionally we have been riding about 40 km as well. We have been exploring new routes on a regular basis and I never knew that so many nice places existed around me. We start early before it gets too hot. We load up with lots of water to keep us hydrated and energy bars to keep us going.

Fortunately, we have Mahavir Harina Vanasthali National Park a few km drive from our house and we have been hitting it quite often. Also, for whatever reason, it has not become a concrete jungle as of now. Fingers crossed, we hope it remains the same forever.

For the first few days I used MapMyRide on my mobile to keep track of the route, time and distance bicycled. But it became more of a distraction than something useful. So, I stopped using it and started enjoying the places around me and being in the moment. Over time I have become averse to gadgets for some reason and try to keep things as simple as possible. I could have bought a GoPro or a Garmin device, but charging, transferring the data etc. nah nah nah.

It's a bit tough to carry a DSLR on the bicycle, but I have been using my mobile phone to take a few pictures here and there. Below are a few of the pictures from my trips. I am planning to buy a car rack so that we can explore places a bit farther away, but for now we are exploring places within a 10 to 20 km radius of where we stay.

Hope you like the pictures. I will try to keep the blog updated with pictures of any good locations I come across. Meanwhile, stay healthy and keep safe.

Wednesday, October 14, 2020

Applications around the intersection of Big Data / Machine Learning and AWS

As many of the readers of this blog know, I am a big fan of Big Data and the AWS Cloud, and I am especially interested in the intersection of the two. But Big Data processing requires a huge number of machines to process huge amounts of data and do some complex processing, as in the case of Machine Learning.

Cloud has democratized the usage of Big Data. There is no need to buy any machines; we can spin up a number of EC2 instances, do the Big Data processing and terminate the EC2 instances once done. AWS and other vendors are doing a lot of hardware and software innovation in this space; below are a few hardware innovations from AWS. They require a lot of investment in R&D and in building them, which is usually possible only at the scale at which the Cloud operates.

AWS Nitro System : Some of the virtualization responsibilities have been shifted from the CPU to dedicated hardware and software.

AWS Graviton Processor : The Graviton processor uses an ARM based architecture, similar to the ones used in mobile phones. Now we can spin up EC2 instances with the Graviton Processor.

AWS and Nvidia : They bring very high end GPUs to the Cloud through EC2 instances for Machine Learning modelling.

AWS Inferentia : Once the Machine Learning model has been created, the next step is inference which takes most of the CPU cycles. Inferentia is a custom chip from AWS for the same.

F1 Instances : Hardware acceleration on the EC2 using FPGA.

Coming back to the subject of this blog, AWS provides a few open data sets via S3 for free for us to do the processing in the Cloud and get some meaningful insights out of it. The data sets can be found here. For those who are familiar with either AWS or Big Data, the challenge is to figure out how the two work together. For this, AWS has published a bunch of blogs/articles here on the intersection of AWS and Big Data / Machine Learning for different domains. Below is a sample application around the intersection of Big Data and AWS around Genome data. Note that AWS has been highlighted, look out for more of them.


The intersection of Big Data / Machine Learning and AWS is very interesting. Cloud with its pricing democratizes the usage of Big Data / Machine Learning, but each one is a beast on its own to learn, there is a lot of innovation happening in this space and it's tough to keep pace. Here are a few applications around these to get started. Good Luck !!!

Thursday, October 8, 2020

Setting up additional EC2 users with username/password and Keypair authentication

When an Ubuntu EC2 instance is created in the AWS Cloud, we should be able to connect to it using either a username/password or a Keypair. In the case of the Ubuntu AMI provided by AWS, only Keypair authentication is enabled while username/password authentication is disabled. Very often I get the query "How to create additional users for the Ubuntu EC2 with Keypair for authentication", hence this blog. At the end of the day, Linux is Linux whether we run it in the Cloud, on a laptop or On-Premise, so the instructions apply everywhere.

Setting up an EC2 user with username/password authentication

Step 1: Create an Ubuntu EC2 instance and connect to it

Step 2: Add user "praveen" using the below command
#Enter the password and other details
sudo adduser praveen

Step 3: Open the "/etc/ssh/sshd_config" file and set "PasswordAuthentication" to yes

Step 4: Restart the ssh service
sudo service ssh restart
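
If the same change has to be made on a few instances, Steps 3 and 4 can be scripted rather than done in an editor. The below is a minimal sketch, assuming GNU sed and that the PasswordAuthentication line already exists in the file (commented out or not). The helper name enable_password_auth is made up for this example. Also note that some newer Ubuntu images may carry extra settings in drop-in files under /etc/ssh/sshd_config.d/, which this sketch does not touch.

```shell
#!/usr/bin/env bash
# Sketch: flip PasswordAuthentication to "yes" in an sshd_config file.
# enable_password_auth is a hypothetical helper; it assumes GNU sed and
# that the directive is already present (commented out or not).
enable_password_auth() {
  local file="${1:-/etc/ssh/sshd_config}"
  sed -i -E 's/^#?[[:space:]]*PasswordAuthentication[[:space:]]+.*/PasswordAuthentication yes/' "$file"
}
```

Run it as root, for example after "sudo su", and then restart the ssh service as in Step 4. Running "sshd -t" before the restart is a cheap way to check that the file still parses.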

Step 5: Connect to the EC2 as the user "praveen" via Putty or some other software by specifying the password

Setting up an EC2 user with Keypair authentication

Step 1: Add user "sripati" and disable password authentication
#as we would be using the Keypair for authentication
sudo adduser sripati --disabled-password

Step 2: Switch to the user
sudo su - sripati

Step 3: Generate the keys. They will be in the .ssh folder
#Press Enter to accept the default file location (~/.ssh/id_rsa)
ssh-keygen -t rsa

Step 4: Copy the public key to the authorized_keys file in the .ssh folder
cat .ssh/id_rsa.pub >> .ssh/authorized_keys
#sshd ignores the file if it is writable by group or others
chmod 600 .ssh/authorized_keys

Step 5: Copy the private key in the ~/.ssh/id_rsa to a file sripati.pem on your local machine
cat ~/.ssh/id_rsa

Step 6: Using PuttyGen convert the pem file to ppk. "Load" the pem file and "Save private key" in the ppk format.

Step 7: Now connect via Putty with the username "sripati", the public IP of the EC2 instance and the private key in the ppk format. There is no need to specify a password.

Tuesday, October 6, 2020

Provisioning AWS infrastructure using Ansible

Cloud infrastructure provisioning can be automated using code. The main advantage is that the process can be repeated with consistent output and the code can be version controlled in GitHub, Bitbucket or something else.

AWS comes with CloudFormation for automating the provisioning of AWS infrastructure. The main disadvantage is that a CloudFormation template (code) is very specific to AWS and takes a lot of effort to migrate to some other Cloud. In this blog we will look at Ansible, with which infrastructure can be provisioned on multiple Clouds; migrating the provisioning code to some other Cloud also doesn't take as much effort as with CloudFormation.

We will be installing Ansible on an Ubuntu EC2 instance for provisioning the AWS infrastructure. Ansible can be set up on Windows also, but as we install more and more software on Windows (host OS) directly, it becomes slow over time. So, I prefer to launch an EC2, try a few things and tear it down once done with it. Anyway, let's look at setting up Ansible and creating AWS infrastructure with it.

Step 1: Create an Ubuntu instance (t2.micro) and connect to it.

Step 2: Install Python and boto (AWS SDK for Python) on the EC2 instance using the below commands.

   sudo apt-get update
   sudo apt-get install python2.7 python-pip -y
   pip install boto

Step 3: Install Ansible using the below command.

   sudo apt install software-properties-common -y
   sudo apt-add-repository --yes --update ppa:ansible/ansible
   sudo apt install ansible -y

Step 4: Go to the IAM Management Console here (1) and create the Access Keys. Note them down.

Step 5: Export the Access Keys using the below commands. Make sure to replace 'ABC' and 'DEF' with the Access Keys which have been generated in the previous step.

   export AWS_ACCESS_KEY_ID='ABC'
   export AWS_SECRET_ACCESS_KEY='DEF'

Step 6: Create a file called "launch-ec2.yaml" with the below content. Make sure to replace the highlighted sections.

- name: Provision a set of instances
  hosts: localhost
  tasks:
    - name: Provision a set of instances
      ec2:
        key_name: my-keypair
        region: us-east-1
        group_id:
          - sg-0fa7df1dab4d7ebcb
          - sg-040f6c6ef9932dbb5
        instance_type: t2.micro
        image: ami-0bcc094591f354be2
        wait: yes
        instance_tags:
          Name: Demo
        exact_count: 1
        count_tag: Name
        assign_public_ip: yes
        vpc_subnet_id: subnet-59120577

Step 7: Execute the below command to launch an EC2 instance.

ansible-playbook launch-ec2.yaml

Step 8: Go to the EC2 Management Console and notice a new EC2 instance has been launched with the Name:Demo tag. Make sure to note down the "Instance ID" of the newly created EC2 instance.
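
If you prefer the command line over the EC2 Management Console, the Instance ID can also be looked up by the Name tag. The below sketch assumes the AWS CLI is installed on the machine and picks up the same Access Keys, which is not part of the setup in this blog. The helper name find_instance_ids is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the IDs of running instances tagged Name=<value>.
# Assumes the AWS CLI is installed and configured; find_instance_ids
# is a hypothetical helper name.
find_instance_ids() {
  local name_tag="$1" region="${2:-us-east-1}"
  aws ec2 describe-instances \
    --region "$region" \
    --filters "Name=tag:Name,Values=${name_tag}" \
              "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].InstanceId' \
    --output text
}
```

Calling "find_instance_ids Demo" should print the ID of the instance launched by the playbook, provided it is the only running instance with that tag in the region.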

Step 9: Create a file called "terminate-ec2.yaml" with the below content. Make sure to replace the highlighted section with the Instance ID of the EC2 instance from the previous step.

- name: Terminate instances
  hosts: localhost
  tasks:
    - name: Terminate instances
      ec2:
        state: "absent"
        instance_ids: "i-08ef0942aabbc45d7"
        region: us-east-1
        wait: true

Step 10: Execute the below command to terminate the EC2 instance.

ansible-playbook terminate-ec2.yaml

Step 11: Go back to the EC2 Management Console and notice that the EC2 which was created by Ansible will be in a terminated status within a few minutes.


By using YAML code, we were able to launch and terminate an instance. Ansible allows us to do a lot more complicated things than this; this is something to start with. As mentioned earlier, Ansible allows easier migration to some other Cloud vendor when compared to AWS CloudFormation. BTW, Ansible was bought by Red Hat, which in turn was bought by IBM. So, Ansible is part of IBM now.

For reference, here is the yaml code for launching and terminating the EC2 instances, the screen has been split horizontally using tmux.

Thursday, October 1, 2020

Automating EC2 or Linux tasks using "tmux"

A lot of times we create multiple EC2 instances and install the same software on each one of them manually. This can be for trying out a Load Balancer feature or to test routing with High Availability across different Regions and Availability Zones. One way to avoid this manual process is to create an AMI, but AMIs are immutable and a new AMI has to be created for even small changes. This is where tmux (Terminal Multiplexer) comes into play.

Here the assumption is that we want three EC2 instances as shown above, fronted by an ELB which will load balance the traffic across them. On each of these instances we would like to install Apache2 and create webpages. For this, we will use one of the EC2 instances as the jump or bastion box and connect to the other two EC2 instances from it as shown below.

Step 1: Start three EC2 Ubuntu instances and name them as "WS1/Jump/BastionBox", "WS2" and "WS3".

Step 2: Download pageant.exe from here (1), click on "Add Key" and point to the Private Key in the ppk format. Close the window.

Step 3: Connect to the EC2 instance named "WS1/Jump/BastionBox" via Putty. In the "Host Name (or IP address)" field specify the username and the IP as shown below.

Go to "Connection --> SSH --> Auth" and make sure to select "Allow agent forwarding". This makes it easy to connect to the EC2 instances, as there is no need to specify the Private Key; it will be picked up from pageant.exe. Click on "Open" to connect to the EC2 instance.

Step 4: Execute the tmux command to start it.

Step 5: Enter "Ctrl + B" and "%" to split the window into left and right panes (tmux calls this a horizontal split). Again enter "Ctrl + B" and the double quote key to split the active pane into top and bottom panes (a vertical split). Now we should see three panes as shown below. Use "Ctrl + B" and the arrow keys to navigate the panes.

Step 6: On the right side upper and bottom panes execute the "ssh ubuntu@ip" command to login to the EC2 instances. Make sure to replace "ip" in the command with the IP address of the WS2 and WS3 EC2 instances respectively.

Step 7: Now we are connected to three EC2 instances as shown below. Execute the "ifconfig" command on all the panes and note that the IP address should be different. This is to make sure we are connected to different EC2 instances.

Step 8: Now we will turn on synchronization across the panes, so that any command executed in one pane is automatically executed in the other panes as well. To turn it on, enter "Ctrl + B" and ":", then type "setw synchronize-panes on" and press Enter. Use the same setw command with the "off" option to turn off the synchronization.

Step 9: Navigate to one of the panes and notice that any command executed in it gets executed in the other panes. Ain't it neat !!!
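
Once the key strokes feel familiar, Steps 4 to 8 can also be driven from a script, as tmux exposes the same actions as commands. The below is a rough sketch and not the only way to do it. The function name start_synced_session and the session name "ec2" are my own, and the pane numbers assume tmux's default base-index and pane-base-index of 0.

```shell
#!/usr/bin/env bash
# Sketch: build the three-pane layout from Steps 4 to 8 in one go.
# start_synced_session and the session name "ec2" are illustrative;
# pane numbers assume the default base-index/pane-base-index of 0.
start_synced_session() {
  local ws2_ip="$1" ws3_ip="$2" session="ec2"
  tmux new-session -d -s "$session"      # pane 0: WS1 (this box)
  tmux split-window -h -t "$session"     # pane 1: right side
  tmux split-window -v -t "$session"     # pane 2: below pane 1
  tmux send-keys -t "$session:0.1" "ssh ubuntu@${ws2_ip}" C-m
  tmux send-keys -t "$session:0.2" "ssh ubuntu@${ws3_ip}" C-m
  # Turn on synchronization only after the ssh commands have been sent
  tmux set-window-option -t "$session" synchronize-panes on
  tmux attach-session -t "$session"
}
```

Run it on the jump box from outside tmux, for example "start_synced_session 10.0.1.11 10.0.1.12", where the two IPs stand for the WS2 and WS3 instances.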


When we want to automate tasks, AWS provides a few means like SSM, OpsWorks, AMI and so on. These are good for automating in the long run, but not when we want to try different things in an iterative approach or are not really sure what we want to do.

This is where tmux with the synchronization feature comes in handy. There is a lot more to tmux, but I hope this blog article helps you get started with tmux and builds some curiosity around it.

Tuesday, September 29, 2020

Using the same Keypair across AWS Regions

In one of the previous blogs (1), we looked at what happens behind the scenes when we use a Keypair for authentication against Linux. This blog post is more about productivity. I create and connect to EC2 instances quite often and so I have created Sessions in Putty for most of my regularly connected Linux instances. One of the Sessions is for AWS, which automatically populates the username and the keypair as shown below. When I would like to connect to an EC2 instance, all I need to specify is the Public IP address of the EC2 instance.

It all looks fine and dandy. The only problem is when I create EC2 instances in different AWS regions to test High Availability or some other features and try to connect to them. Since Keypairs have regional scope, I need to change the keypair in Putty when I connect to EC2 instances in a different region. It would be good to use the same Keypair across regions, so that the same Putty saved session works for EC2 instances in any region. Let's look at how to do it.

Step 1: Download putty.exe and puttygen.exe from here (1). There is no need to install it, just downloading should be good enough.

Step 2: Go to the EC2 Management Console and create a Keypair. Generate the Keypair by selecting the pem or ppk format. 

Step 3: When prompted store the private key.

The Keypair should be created as shown below.

Step 4: Start PuttyGen and click on Load.

Step 5: Point to the private key which has been downloaded earlier. If the file is not visible then remove the filter and select "All files (*.*)". Click on Open and click on OK.

Step 6: Click on "Save public key" and specify the same file name but with a pub extension as shown below.

Step 7: Go to the EC2 Management Console for some other region and navigate to the Keypair tab. Click on Actions and then "Import key pair".

Step 8: Click on "Choose file" and point to the pub file which was created earlier. Finally click on Import to create the Keypair.
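
The import in Steps 7 and 8 has to be repeated in the Console for every region. With the AWS CLI the same public key can be pushed to a list of regions in a loop. This is a sketch under a few assumptions: AWS CLI v2 is installed and configured, and the function name import_key_to_regions, the key name and the regions are placeholders to adapt.

```shell
#!/usr/bin/env bash
# Sketch: import one public key into several regions with the AWS CLI.
# Assumes AWS CLI v2 is configured; import_key_to_regions, the key name
# and the region list are placeholders to adapt.
import_key_to_regions() {
  local key_name="$1" pub_file="$2"
  shift 2
  local region
  for region in "$@"; do
    # fileb:// makes the CLI read the key file as raw bytes
    aws ec2 import-key-pair \
      --region "$region" \
      --key-name "$key_name" \
      --public-key-material "fileb://${pub_file}" || return 1
  done
}
```

A call like "import_key_to_regions my-keypair my-keypair.pub us-west-2 ap-south-1" would create the Keypair with the same name in both regions, using the pub file saved from PuttyGen.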


Now we have created a Keypair in two regions, and both have the same public/private key. So, we can use the same Putty session when connecting to EC2 instances in different regions. It's not a life saving hack, but it is something interesting to know and saves a few seconds/clicks here and there.

Note that this approach is not recommended for production and sensitive setups as we are using the same Keypair across regions, but it can definitely be used when we are trying to learn AWS.