Getting started with MapReduce Programming

MapReduce programming

In this article, you will learn to write MapReduce program using Java programming language. This program is to just understand the concept of MapReduce programming, which will simply take some input file and same data will be passed through Mappers and Reducer to generate the final output. Pre-Requisites: 1. Eclipse […]

Big Data – Hadoop Interview Questions

To crack an interview, it is must that your basic concepts about frameworks are pretty clear. On request of many of our students, we’ve put together a comprehensive list of questions to help you get through your Big Data – Hadoop interview. We’ve made sure that the most probable questions […]

Writing Custom Combiner in MapReduce

Combiner function is used as an optimization technique for MapReduce jobs. Combiner class combines/reduce the data generated by Mappers before it gets transferred to the Reducers. In previous post, you learned about how combiner works in MapReduce programming. In most of cases you can use Reducer class as Combiner class. […]

How Combiner works in Hadoop MapReduce

Hadoop is a framework used for handling Big Data. It uses HDFS as the distributed storage mechanism and MapReduce as the parallel processing paradigm for data residing in HDFS. The key components of Mapreduce are Mapper and Reducer. When a MapReduce Job runs on a large dataset, Mappers generate large […]

MapReduce Introduction

Hadoop MapReduce is a software framework designed to develop applications to process large dataset in parallel in a reliable and fault tolerant manner. A MapReduce application processes the input dataset into chunks in parallel on multiple nodes. The below diagram shows the different phases for a MapReduce application:   There are two […]