Friday, September 27, 2013

What is Hadoop?

Hadoop
Hadoop is an open-source software framework that allows distributed processing of large datasets across clusters of computers using a simple programming model.

Hadoop is a platform designed to solve problems that involve large amounts of data. It is designed to scale from a single server to thousands of machines, each offering local computation and storage.

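The Hadoop library itself is designed to detect and handle failures at the application layer, rather than relying on the hardware for high availability.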

The Hadoop project includes the following modules:

1. Hadoop Common
2. Hadoop Distributed File System
3. Hadoop YARN
4. Hadoop MapReduce
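
To give a feel for the "simple programming model" mentioned above, here is a minimal sketch of the MapReduce idea as a word count. It is plain Python running on one machine, not the Hadoop API. The function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration. On a real cluster, Hadoop would run the map and reduce steps in parallel on many machines and do the grouping step itself.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
```

The programmer only writes the map and reduce functions. Splitting the input, moving data between machines, and rerunning failed tasks are left to the framework.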