Monday, October 10, 2016

What is Docker all about?

There has been a lot of noise about Docker, and a raft of announcements (1, 2, 3) from different companies about support for Docker in their products. In this blog, we will look at what Docker is all about, but before that we will look into virtualization and LXC. In fact, Docker-Big Data integration is also possible.

What is Virtualization?

Virtualization allows multiple operating systems to run on a single machine. For this to happen, virtualization software like Xen, KVM, Hyper-V, VMware vSphere, or VirtualBox has to be installed. Oracle VirtualBox is free and easy to set up. Once VirtualBox has been installed, multiple guest OSes can be run on top of it.
On top of each guest OS, applications can be installed. The main advantage of this configuration is that the applications are isolated from each other. Also, resources can be allocated to each guest OS, so a single application can't monopolize the underlying hardware and starve the other applications of resources. The main disadvantage is that each guest OS gets its own kernel and file system, and hence consumes a lot of resources.
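As a concrete sketch of the resource-allocation point above, here is how a guest VM could be created and capped with VirtualBox's `VBoxManage` command-line tool. The VM name, OS type, and resource sizes are illustrative assumptions, not taken from this post:

```shell
# Register a new VM entry (name and ostype are hypothetical examples)
VBoxManage createvm --name "guest-ubuntu" --ostype Ubuntu_64 --register

# Cap the resources this guest OS may use, so one VM cannot
# starve the others: 2 GB of RAM and 2 virtual CPUs
VBoxManage modifyvm "guest-ubuntu" --memory 2048 --cpus 2

# Boot the guest without a GUI window
VBoxManage startvm "guest-ubuntu" --type headless
```

Note that the guest still needs a full OS installed inside it, which is exactly the overhead the next section is about.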

What is LXC?

LXC (Linux Containers) provides OS-level virtualization and doesn't need a complete guest OS to be installed, unlike the virtualization described above; containers share the host's kernel.
The main advantage of LXC is that containers are lightweight: there is little overhead in running applications on top of LXC instead of directly on top of the host OS. LXC also provides isolation between the applications deployed in different containers, and resources can be allocated to them.

LXC containers can be thought of as lightweight machines which consume fewer resources, are quick to start, and on which applications can be deployed.
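The container lifecycle described above can be sketched with the `lxc-*` tools (assuming the LXC userspace package is installed; the container name, distribution, and memory limit are illustrative assumptions):

```shell
# Create a container from the 'download' template
# (the name "webapp" and the Ubuntu release are hypothetical)
sudo lxc-create -n webapp -t download -- -d ubuntu -r xenial -a amd64

# Start it in the background -- it shares the host kernel,
# so this takes seconds rather than a full OS boot
sudo lxc-start -n webapp

# Run a command inside the container's isolated environment
sudo lxc-attach -n webapp -- hostname

# Resources can be allocated to the container through cgroups
sudo lxc-cgroup -n webapp memory.limit_in_bytes 512M

# Stop and clean up
sudo lxc-stop -n webapp
```

Compared with the VirtualBox flow, note that there is no guest kernel and no disk image of a full OS, which is why containers are so much lighter.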

How does Docker fit into the entire thing?

LXCs are all interesting, but it's not that easy to migrate an application running on top of LXC from one environment to another (say from development to QA to production), as there are a lot of dependencies between the application and the underlying system.

Docker provides a platform so that applications can be built and packaged in a standardized way. The packaged application includes all of its dependencies, so there is less friction moving from one environment to another. Docker also abstracts the resources required by the application, so that the application can run in different environments without any changes.
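As a minimal sketch of this standardized packaging, a Dockerfile declares the application together with all of its dependencies; the Python web app shown here is an illustrative assumption, not something from this post:

```dockerfile
# Base image pins the OS-level and language runtime dependencies
FROM python:2.7

# Copy the application code and its declared library dependencies
COPY app.py requirements.txt /app/
WORKDIR /app

# Install the libraries into the image, making it self-contained
RUN pip install -r requirements.txt

# The same image now runs unchanged in dev, QA and production
CMD ["python", "app.py"]
```

The image is built once with `docker build -t myapp .` and run anywhere Docker is installed with `docker run myapp`, which is exactly the dev-to-QA-to-production migration that is painful with bare LXC.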

For those from a Java/JEE background, a Docker application can be considered similar to an EAR file, which can be migrated from one environment to another without any changes as long as proper standards are followed while creating it. Java applications can make JNDI calls to discover services by name and so are not tied to a particular resource.

BTW, the official Docker documentation is a nice place to dig deeper.

