85% of the respondents have a big data initiative either in progress or planned.
Customer and risk management are the main focus areas for companies using big data solutions. In total, the survey found 17 different business functions using big data technology.
There is nothing like knowing that someone is using this technology to bring value to their business. It is even better if they have made Big Data part of their business strategy to win over the competition. Here are some examples.
Understanding Big Data today is confusing. Here is my attempt to explain Big Data.
Big Data is an umbrella term for the technology behind collecting and analyzing large volumes of data at high speed. In the last few years, the number of devices and services customers use has increased manyfold. As customers use more of everything, they create more data. By interconnecting this data, you can know your customer better and provide better service. Big Data helps you store and connect this data.
Before we start, I want to give credit to the author of the book “Hadoop: The Definitive Guide” for the read and write diagrams. These are some of the best diagrams available to show the read and write operations in Hadoop.
In the beginning, we said that the Hadoop architecture is fault tolerant. To recover from failing data nodes, it replicates data. While replicating, it follows the principles listed in the picture.
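To make the replication idea concrete, here is a minimal sketch of the commonly documented default HDFS placement for three replicas: the first on the writer's node, the second on a node in a different rack, and the third on another node in the same rack as the second. The function name `place_replicas`, the node and rack names, and the deterministic "first match in sorted order" choice are all assumptions for illustration. Real HDFS picks candidates at random and weighs node load, and the picture may list the principles differently.

```python
def place_replicas(writer, topology):
    """Pick three nodes for a block's replicas.

    topology maps node name -> rack name (hypothetical names).
    Assumes the writer runs on a data node and that a second rack
    with at least two nodes exists; otherwise StopIteration is raised.
    """
    # Replica 1: the node the writer is running on.
    first = writer
    # Replica 2: a node on a different rack, to survive a rack failure.
    second = next(
        node for node in sorted(topology)
        if topology[node] != topology[first]
    )
    # Replica 3: a different node on the same rack as replica 2,
    # which keeps cross-rack traffic low.
    third = next(
        node for node in sorted(topology)
        if topology[node] == topology[second] and node != second
    )
    return [first, second, third]


if __name__ == "__main__":
    cluster = {"n1": "rack1", "n2": "rack1", "n3": "rack2", "n4": "rack2"}
    print(place_replicas("n1", cluster))
```

With this layout, losing any single node or any single rack still leaves at least one copy of the block available.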
Here we will talk about the steps and sequence of Hadoop processing. Every step is numbered so you can follow the sequence easily. By following the sequence, you can also understand the relationship among these components.
Let us explore HDFS a little further. Some of the attributes of HDFS are commodity hardware, fault tolerance, and the ability to handle large sets of data. It should also have high throughput and streaming access to the file system. Streaming access is simply the ability to read groups of data blocks in one read. This helps Hadoop read data faster. For convenience, let us list them.
In this post, we will look into the core components of Hadoop and find out their roles. There are two major concepts to understand in Hadoop. Once you understand these, you will start to understand why Hadoop works the way it does. These concepts are related to HDFS and MapReduce. At a very high level, HDFS handles storage and MapReduce handles fast computation.
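As a rough illustration of the MapReduce half of that split, the sketch below counts words in plain Python on a single machine. It is not Hadoop code: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and a real job would run the map and reduce steps in parallel across the cluster against data stored in HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big value", "data at scale"]))
```

Because each line is mapped independently and each key is reduced independently, the work can be spread over many machines, which is where the speed comes from.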