What’s so Big about Big Data?

About 90% of the world’s data has been created in the last two years alone. According to Google, every two days we create as much information as we did up to 2003. There are around 200 million tweets every day, and Facebook handles around 6 billion messages per day. A conventional RDBMS reaches its limits when faced with a data explosion of this kind; that is where Big Data comes into the picture. Big Data is about handling the 3Vs: Volume, Velocity, and Variety.
Volume: Big Data can handle data in petabytes or more; this is a difficult task for an RDBMS.
Velocity: Velocity describes the frequency at which data is generated, captured and shared. Big Data is capable of handling dynamic data from diverse sources—online systems, sensors, social media, web clickstream, and other channels.
Variety: Big Data comprises all types of data—structured, semi-structured and unstructured data (such as text, sensor data, audio, video, click streams, log files and more).

According to O’Reilly Media, “Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the structures of your database architectures. To gain value from this data, you must choose an alternative way to process it.”

The comparison chart shown below throws more light on the differences between RDBMS and Big Data.

Variety
RDBMS: Places data inside well-defined structures or tables using metadata, but can’t handle semi-structured and unstructured data such as photos, videos and messages posted on social media.
Big Data: Can handle a variety of data (structured, semi-structured and unstructured) through different NoSQL databases such as graph, document, key-value and column-family databases.

Volume
RDBMS: Handles data in MBs and GBs better than any Big Data system, but its performance goes down as the data size grows to TBs or PBs. An RDBMS is typically scaled up rather than scaled out, and the cost of scaling up a system is high.
Big Data: Handles large volumes of data well, which is why it is used by sites like Facebook, LinkedIn and Twitter, where the data size is huge. Big Data handles this task by scaling out on commodity hardware.

Velocity
RDBMS: Can handle small sets of data, but can’t keep up with the speed at which data arrives on sites like Facebook and Twitter, so performance is poor when the velocity is high.
Big Data: Can easily handle high-velocity data such as the millions or billions of messages arriving on social networking sites. It does so through parallel processing across many machines, which a conventional RDBMS is not designed for.
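
To make the Variety comparison more concrete, the sketch below models one hypothetical user record in each of the four NoSQL families named there. It uses plain Python structures rather than any real database API, and every name and value in it (`user:42`, `alice`, the `follows` edge, the helper `followers_of`) is invented for illustration.

```python
import json

# Key-value: the store sees only an opaque value behind a key.
key_value = {"user:42": json.dumps({"name": "alice", "city": "Pune"})}

# Document: the value is a nested structure whose fields can be queried.
document = {
    "_id": 42,
    "name": "alice",
    "posts": [{"text": "hello", "tags": ["intro"]}],
}

# Column family: row key -> column family -> column -> value.
# Rows are free to carry different columns.
column_family = {
    "42": {"profile": {"name": "alice"}, "activity": {"last_login": "2013-01-01"}},
    "43": {"profile": {"name": "bob", "city": "Pune"}},
}

# Graph: nodes plus labeled edges between them.
graph = {
    "nodes": {42: {"name": "alice"}, 43: {"name": "bob"}},
    "edges": [(42, "follows", 43)],
}


def followers_of(g, node_id):
    """Return the ids of nodes with a 'follows' edge pointing at node_id."""
    return [src for src, label, dst in g["edges"]
            if label == "follows" and dst == node_id]
```

None of these shapes needs a fixed table schema up front, which is the point the Variety comparison makes.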

Apart from data storage and retrieval, Big Data has capabilities to process the data efficiently. For example, we can divide the data to be processed across hundreds or thousands of commodity machines, and each machine can process its share of the data independently.
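
The divide-and-process idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular Big Data framework is implemented: worker threads stand in for separate commodity machines, and the names `partition`, `count_words` and `parallel_word_count` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_parts):
    """Split records into n_parts chunks, dealing them out round-robin."""
    return [records[i::n_parts] for i in range(n_parts)]


def count_words(chunk):
    """Work done independently on one 'machine': count words in its chunk."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(records, n_workers=4):
    """Partition the records, count each chunk in parallel, merge the results."""
    chunks = partition(records, n_workers)
    # Threads are an assumption standing in for separate machines.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)  # merging is a simple sum of per-chunk counts
    return total


messages = ["big data is big", "data arrives fast", "big volume"]
result = parallel_word_count(messages, n_workers=2)
```

Because each chunk is counted without reference to any other chunk, the merge step only has to add the partial counts together. That independence is what lets the same pattern spread over more machines as the data grows.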

So the bottom line is that the ongoing buzz around Big Data is definitely not a passing fad. This is borne out by the fact that today’s technology-driven, net-enabled businesses continue to count on Big Data, big time!
