In this blog althrough we had been exploring different frameworks like Hadoop, Pig, Hive and others which are batch oriented (some call it long time similar to real time). So, based on the size of the data and the cluster it might take a couple of hours to process the data.
Above mentioned frameworks might not meet all the user requirement. Lets take the use case of a Credit Card or an Insurance company, they would like to detect frauds happening as soon as possible to minimize the effects of frauds. This is where frameworks like Apache Storm, LinkedIn Samza, Amazon Kinesis help to fill the gap.
A social analytics company called BackType acquired by Twitter developed Storm. Twitter later released the code for Storm. This Twitter blog makes a good introduction to Storm architecture. Instead of the repeating the same here, we will look into how to install and configure on Apache Storm on a single node. It took me a couple of hours to figure it out, so this blog entry to make it easy for others to get started with Storm. 1, 2 had been really helpful to figure it out. Note that Storm has been submitted to the Apache Software Foundation and is in the Incubator phase.
Here are steps from 10k feet
- Download the binaries, Install and Configure - ZooKeeper.
- Download the code, build, install - zeromq and jzmq.
- Download the binaries, Install and Configure - Storm.
- Download the sample Storm code, build and execute them.
Here are the steps in detail. No need to install Hadoop etc, Storm works independent of Hadoop. Note that these steps have been tried on Ubuntu 12.04 and might change a bit with other OS.
Above mentioned frameworks might not meet all the user requirement. Lets take the use case of a Credit Card or an Insurance company, they would like to detect frauds happening as soon as possible to minimize the effects of frauds. This is where frameworks like Apache Storm, LinkedIn Samza, Amazon Kinesis help to fill the gap.
A social analytics company called BackType acquired by Twitter developed Storm. Twitter later released the code for Storm. This Twitter blog makes a good introduction to Storm architecture. Instead of the repeating the same here, we will look into how to install and configure on Apache Storm on a single node. It took me a couple of hours to figure it out, so this blog entry to make it easy for others to get started with Storm. 1, 2 had been really helpful to figure it out. Note that Storm has been submitted to the Apache Software Foundation and is in the Incubator phase.
Here are steps from 10k feet
- Download the binaries, Install and Configure - ZooKeeper.
- Download the code, build, install - zeromq and jzmq.
- Download the binaries, Install and Configure - Storm.
- Download the sample Storm code, build and execute them.
Here are the steps in detail. No need to install Hadoop etc, Storm works independent of Hadoop. Note that these steps have been tried on Ubuntu 12.04 and might change a bit with other OS.