Elasticsearch Interview Questions

What is Elasticsearch? Elasticsearch is a search engine based on Lucene. It provides a distributed, multitenant-capable full-text search engine. Elasticsearch is developed in Java and is released as open source under the terms of the Apache License. What is an index in Elasticsearch? An index is a collection of documents that have somewhat […]

Getting started with MapReduce Programming

In this article, you will learn to write a MapReduce program in Java. The program is meant only to illustrate the concept of MapReduce programming: it takes an input file and passes the same data through the Mappers and the Reducer to generate the final output. Pre-Requisites: 1. Eclipse […]

Writing Custom Combiner in MapReduce

A Combiner function is used as an optimization technique for MapReduce jobs. The Combiner class combines (reduces) the data generated by the Mappers before it is transferred to the Reducers. In the previous post, you learned how a combiner works in MapReduce programming. In most cases, you can use the Reducer class as the Combiner class. […]
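To make the idea concrete, here is a minimal sketch of what a combiner does, written as a plain-Python simulation of a word count rather than with the Hadoop Java API. The function names (`map_words`, `sum_counts`, `run_job`) and the sample input are hypothetical and exist only for illustration. The same summing function is used as both combiner and reducer, which works here because addition is associative and commutative.

```python
from collections import defaultdict


def map_words(line):
    """Mapper: emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def sum_counts(pairs):
    """Sum the values per key. Used as both the combiner and the reducer."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


def run_job(splits, use_combiner=True):
    """Run a toy word count; return (final counts, pairs sent to the reducer)."""
    shuffled = []
    for split in splits:  # one simulated mapper per input split
        mapper_output = [pair for line in split for pair in map_words(line)]
        if use_combiner:
            # Local pre-aggregation on the mapper side, before the "shuffle".
            mapper_output = sum_counts(mapper_output)
        shuffled.extend(mapper_output)
    return dict(sum_counts(shuffled)), len(shuffled)


if __name__ == "__main__":
    splits = [["a b a", "b a"], ["a c", "c c"]]
    print(run_job(splits, use_combiner=False))
    print(run_job(splits, use_combiner=True))
```

In this toy run the final counts are identical with and without the combiner; only the number of pairs handed to the reducer changes, which is the saving a combiner is meant to provide.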

How Combiner works in Hadoop MapReduce

Hadoop is a framework used for handling Big Data. It uses HDFS as the distributed storage mechanism and MapReduce as the parallel processing paradigm for data residing in HDFS. The key components of MapReduce are the Mapper and the Reducer. When a MapReduce job runs on a large dataset, Mappers generate large […]

Importing Data using Sqoop

Sqoop is a top-level Apache project in the Hadoop ecosystem, designed to move data between Hadoop and relational databases (RDBMS). Sqoop is a collection of related tools. To use Sqoop, you specify the tool you want to use and the arguments that control the tool.

In this post, we will cover how to […]