Wednesday, December 4, 2013

New Virtual Machine to get started with Big Data

We offer a Virtual Machine (VM) so that those interested in the Big Data training can get started easily. More details about the VM here. The VM spares the developer the burden of installing, configuring, and integrating the different Big Data frameworks. Once downloaded, it can be easily installed on a Windows, Mac, or Linux machine. Documentation with instructions on how to use the different frameworks is also provided along with the VM.

The original VM was created almost a year ago and contained some outdated frameworks, so we created a new VM for those who want to dive into the exciting world of Big Data and related technologies.

The old VM already had Hadoop, Hive, Pig, HBase, Cassandra, Flume, and ZooKeeper, all of which have been upgraded to the latest releases from Apache. The new VM is built on Ubuntu 12.04 64-bit and newly includes the frameworks listed below.

- Oozie (for creating workflows and schedulers)
- Storm (for performing real-time processing)
- Neo4j (for storing and retrieving graphs)
- Hadoop 2.x (for executing MapReduce on YARN)
- Phoenix (for SQL on top of HBase)
- rmr2 and rhdfs (for quickly developing MapReduce jobs in R)

Because the VM is built using FOSS, there is no limit on how long it can be used. For more information about the Big Data Virtual Machine and the training, please send an email to
