Big Data Introduction

Understanding Big Data today can be confusing. Here is my attempt to explain it.

Big Data is an umbrella term for the technology behind collecting and analyzing large volumes of data at high speed. In the last few years, the number of devices and services that customers use has increased manyfold. As customers use more of everything, they create more data. By interconnecting this data, you can know your customers better and provide better service. Big Data helps you store and connect this data.


Here are some of the Big Data Drivers:

  • The high adoption of data capture and creation technologies
  • Increased “interconnectivity” drives consumption and creates more data
  • Inexpensive storage makes it possible to keep more data for a longer period
  • Hadoop software and analysis tools turn data into information

Big Data Characteristics

Below is one of the better pictures I have found for explaining the characteristics of Big Data in simple terms. Volume, velocity, variety and veracity are the four characteristics of Big Data. Let us take volume as an example. Facebook generates around 500 terabytes of data each day. Now compare that with a plane, which creates 10 terabytes of data for every 30 minutes of flight. At that rate, a single plane kept in the air around the clock would create almost as much data each day as Facebook. Until Big Data technology arrived, airlines could not harvest this data in any meaningful way. Velocity is the speed at which data arrives, and variety is the range of data types involved. Veracity comes into play when the data is ambiguous.

[Image: Big Data attributes]
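
The volume comparison above is easy to check with a little arithmetic. Here is a minimal Python sketch that uses only the two figures quoted in the text (500 terabytes a day for Facebook, 10 terabytes per 30 minutes of flight). The function names and the flight-hour scenarios are my own assumptions for illustration, not airline data.

```python
# Figures quoted in the text; everything else here is an illustrative assumption.
FACEBOOK_TB_PER_DAY = 500
PLANE_TB_PER_HALF_HOUR = 10


def plane_tb_per_day(hours_flown):
    """Terabytes one plane creates in a day, given the hours it spends in the air."""
    half_hours = hours_flown * 2
    return half_hours * PLANE_TB_PER_HALF_HOUR


def hours_to_match_facebook():
    """Flight hours one plane needs to create as much data as Facebook does in a day."""
    tb_per_hour = PLANE_TB_PER_HALF_HOUR * 2
    return FACEBOOK_TB_PER_DAY / tb_per_hour


if __name__ == "__main__":
    # Hypothetical scenarios: a plane flying 12 hours a day, and one flying nonstop.
    for hours in (12, 24):
        print(f"{hours} flight hours -> {plane_tb_per_day(hours)} TB")
    print(f"Hours needed to match Facebook: {hours_to_match_facebook()}")
```

So a plane would have to fly slightly more than a full day to match Facebook exactly, which is why the comparison is best read as "roughly the same order of magnitude" rather than an exact match.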

Big Data Contexts

Now that we know the attributes, let us look at how data complexity has progressed in the context of IT applications. Here is an image from Teradata that captures the essence of data variety and complexity. In the ERP days, we had low data volume and little data complexity. As we moved through CRM and the web, data volume and complexity increased. Now, with Big Data, we are adding much more data with far more complexity.

[Image: Big Data contexts]