Loading, please wait...

A to Z Full Forms and Acronyms

Hadoop - A quick reference

Jul 26, 2019 Big Data, 2701 Views
Big Data Solutions

Overview Of Hadoop

Traditional Approach

Traditionally the company uses the computer to store and analyze the big data. In this approach, the user interacts with the application which in turn takes care of the part of data storage and analysis.

Limitation

This approach works fine with those applications which process the limited amount of data up to limit of processor that is processing the data. But when it comes to dealing with the massive amount of data, it is a hectic task to process such data through a single database server.

Google's Approach

Google solved this big data problem using his algorithm MapReduce. This algorithm divides the problem into a number of problems and distribute it to the many computers, and collects the results the result from them which when integrated, through result datasets.

Hadoop

Doug Cutting and his team using the solution provided by Google developed an Open Source Project called Hadoop. Hadoop is an open-source framework for writing and running the distributed applications that process a huge amount of data. The key distinction of Hadoop are-

  • Accessible- Hadoop runs on large clusters of commodity hardware.
  • Robust- As it is intended to run on commodity hardware, it is architected with the assumption of a frequent hardware breakdown. It elegantly handles most of the failures.
  • Scalable- It supports both horizontal and vertical scaling.
  • Simple- It allows users to write efficient parallel code.

As you can see the figure how one interacts with a Hadoop cluster. A Hadoop cluster is a set of commodity hardware networked together in one location. Data storage and processing all happens within this 'cloud' of machines. Different users can submit computing jobs to Hadoop from their own computer in remote locations from the Hadoop cluster.

Hadoop runs the application using MapReduce algorithms, where data is processed in parallel with others. Hadoop is used to develop applications that could perform statistical analysis on a huge amount of data.

A to Z Full Forms and Acronyms

Related Article