Big Data and Cloud Tips: RDBMS vs NoSQL Data Flow Architecture

Saturday, December 1, 2012

RDBMS vs NoSQL Data Flow Architecture

Recently I had an interesting conversation with an Oracle Database expert about the difference between an RDBMS and a NoSQL database. There are a lot of differences, but one of the main ones is the data flow in those systems, described below.

In this simplified picture of a traditional RDBMS, the data is first written to the database, then to memory. When the memory reaches a certain threshold, it is written to the logs. The log files are used for recovery in case of a server crash. Before an RDBMS returns success on an insert/update to the client, the data has to be validated against the predefined schema, indexes have to be updated, and other work has to be done, which makes it a bit slower than the NoSQL approach discussed below.
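To make the extra work on the RDBMS side concrete, here is a minimal Python sketch of an insert that is validated against a predefined schema and indexed before success is returned. The `SchemaTable` class, its fields, and the sample columns are all hypothetical names made up for illustration; this is not how any particular RDBMS is implemented.

```python
class SchemaTable:
    """Toy table (hypothetical): validate and index a row before acknowledging it."""

    def __init__(self, schema, indexed_column):
        self.schema = schema                  # column name -> required Python type
        self.indexed_column = indexed_column
        self.rows = []                        # stands in for the database
        self.index = {}                       # indexed value -> list of row positions

    def insert(self, row):
        # 1. Validate the row against the predefined schema.
        if set(row) != set(self.schema):
            raise ValueError("columns do not match the schema")
        for column, expected_type in self.schema.items():
            if not isinstance(row[column], expected_type):
                raise TypeError(f"{column} must be {expected_type.__name__}")
        # 2. Store the row and maintain the index.
        self.rows.append(dict(row))
        position = len(self.rows) - 1
        self.index.setdefault(row[self.indexed_column], []).append(position)
        # 3. Only now is success reported to the client.
        return True


table = SchemaTable({"id": int, "name": str}, indexed_column="name")
table.insert({"id": 1, "name": "alice"})
```

Every insert pays for steps 1 and 2 before the client hears back, which is the overhead the paragraph above refers to.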

In a NoSQL database like HBase, the data is first written to the log (the write-ahead log, or WAL), then to memory. When the memory reaches a certain threshold, it is written to the database. Before returning success for a put call, the data only has to be written to the log file; it does not need to be written to the database or validated against a schema.
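The log-first flow can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `TinyLogStore`, the file names `wal.log` and `data.json`, and the `flush_threshold` parameter are hypothetical, and it does not reflect HBase's actual implementation or API.

```python
import json
import os
import tempfile


class TinyLogStore:
    """Toy log-first store; names and file formats here are hypothetical."""

    def __init__(self, directory, flush_threshold=3):
        self.wal_path = os.path.join(directory, "wal.log")
        self.data_path = os.path.join(directory, "data.json")
        self.flush_threshold = flush_threshold
        self.memory = {}
        self._recover()

    def _recover(self):
        # Replay the log so writes acknowledged before a crash are not lost.
        if os.path.exists(self.wal_path):
            with open(self.wal_path) as wal:
                for line in wal:
                    record = json.loads(line)
                    self.memory[record["key"]] = record["value"]

    def put(self, key, value):
        # 1. Append the record to the end of the log and force it to disk.
        #    After this step the write is durable.
        with open(self.wal_path, "a") as wal:
            wal.write(json.dumps({"key": key, "value": value}) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        # 2. Update the in-memory copy.
        self.memory[key] = value
        # 3. Once memory reaches the threshold, write it out to the database.
        #    (Done inline here for simplicity.)
        if len(self.memory) >= self.flush_threshold:
            self._flush()
        return True

    def _flush(self):
        stored = self._read_data()
        stored.update(self.memory)
        with open(self.data_path, "w") as data:
            json.dump(stored, data)
        self.memory.clear()
        open(self.wal_path, "w").close()  # flushed records no longer need the log

    def _read_data(self):
        if os.path.exists(self.data_path):
            with open(self.data_path) as data:
                return json.load(data)
        return {}

    def get(self, key):
        if key in self.memory:
            return self.memory[key]
        return self._read_data().get(key)


with tempfile.TemporaryDirectory() as directory:
    store = TinyLogStore(directory, flush_threshold=3)
    store.put("row1", "a")
    store.put("row2", "b")
    # Simulate a crash: a new instance rebuilds its memory from the log.
    recovered = TinyLogStore(directory, flush_threshold=3)
    recovered_value = recovered.get("row1")
```

Note that `put` never checks the value against a schema, and the database file is only touched when the threshold is reached. The flush here is not crash-safe; a real system would handle that step much more carefully.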

Log files (the first step in NoSQL) are only appended to at the end, which is much faster than writing to the database (the first step in RDBMS). The data flow discussed above gives NoSQL databases a higher insert/update rate compared to an RDBMS.
