Prerequisites: Before starting the Hadoop 2 single-node installation, make sure that the node meets the following prerequisites: a) Any Linux operating system. b) Sun Java 1.6 or above should already be installed, and the version should be the same across all the nodes. To install Java, you can refer to the installation steps […]
Hadoop
A guide for professionals to start working with Hadoop, understand its architecture and explore the power of Hadoop.
Prepare Node for Hadoop
Hadoop is a distributed processing framework in which multiple nodes are connected to each other over a network. An administrator needs to prepare each node for Hadoop, i.e. configure it to be used as part of a Hadoop cluster. This blog lists the prerequisites and describes how to configure them before using a node as […]
Create a New VM using Oracle VirtualBox
Oracle VirtualBox is a tool to create and host virtual machines on your system. The operating system running inside a virtual machine is known as the guest operating system. This presentation covers the steps to create a new VM using Oracle VirtualBox. To know more about Oracle VirtualBox, click here. Create New VM using Oracle VirtualBox […]
Build Hadoop-2.4.0 Source on Windows and Configure in Eclipse
We can now build the Hadoop source, version 2.4.0, on Windows and configure it for use in Eclipse. Follow the steps below to configure the Hadoop source on Windows. Requisites: 1. Download the Hadoop distribution hadoop-2.4.0-src.tar.gz from here. 2. 7-Zip (right-click on any folder to check if it is already installed […]
HDFS High Availability Overview
Background. Single Point of Failure (SPOF) in HDFS: Each cluster had a single NameNode, and if that machine or process became unavailable, the cluster as a whole would be unavailable until the NameNode was either restarted or brought up on a separate machine. Ecosystem Dependency: The Hadoop ecosystem components like […]
YARN Introduction
YARN is a sub-project of Hadoop introduced in Hadoop 2.0. It is the next-generation framework for resource management. With MapReduce focusing only on batch processing, YARN was conceptualised to provide a more general processing platform for data stored in HDFS. This document summarises the growing need of big data […]
Configure Static IP Address in Ubuntu
You can configure a network adapter of a machine to use a static IP address. To configure a static IP address in Ubuntu, you need to edit the /etc/network/interfaces file. Open the file with sudo:
sudo nano /etc/network/interfaces
Assuming that eth1 is the adapter for which the static IP address is to be […]
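As a rough illustration of the kind of entry this post is about, the sketch below prints a static stanza for eth1 in the syntax used by /etc/network/interfaces. The address, netmask and gateway are placeholder values assumed for illustration and do not come from the original post. The script only prints the stanza; it does not modify any file.

```shell
#!/bin/sh
# Hypothetical values (assumptions); replace them with the ones for your network.
IFACE="eth1"
ADDRESS="192.168.56.101"
NETMASK="255.255.255.0"
GATEWAY="192.168.56.1"

# Build a static stanza in /etc/network/interfaces (ifupdown) syntax.
stanza="auto $IFACE
iface $IFACE inet static
    address $ADDRESS
    netmask $NETMASK
    gateway $GATEWAY"

# Print only; review the output before adding it to /etc/network/interfaces.
printf '%s\n' "$stanza"
```

After adding such a stanza to the file, the interface typically has to be restarted (or the machine rebooted) before the new address takes effect.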
Install Java in Linux
To install Java in Linux, refer to the following instructions:
1. Download the 32-bit or 64-bit compressed binary “.tar.gz” file from here.
2. Create a system directory such as /usr/lib/jvm to install the JDK, and move the tar file to that directory.
sudo mkdir -p /usr/lib/jvm
sudo mv jdk-7u3-linux-x64.tar.gz /usr/lib/jvm/
3. Change the present […]
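Step 2 above can be wrapped in a small helper that checks the archive exists before moving it. This is only a sketch: the function name stage_jdk and its optional second argument are hypothetical and not part of the original post. The destination defaults to /usr/lib/jvm as in the post, and writing there requires root, so the script would need to be run with sudo in that case.

```shell
#!/bin/sh
# stage_jdk ARCHIVE [DEST]  (hypothetical helper, not from the original post)
# Creates DEST if needed and moves ARCHIVE into it.
# DEST defaults to /usr/lib/jvm; writing there requires root privileges.
stage_jdk() {
    archive="$1"
    dest="${2:-/usr/lib/jvm}"

    # Refuse to continue if the downloaded archive is missing.
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi

    # Same two operations as step 2: create the directory, then move the file.
    mkdir -p "$dest" && mv "$archive" "$dest/"
}
```

For example, stage_jdk jdk-7u3-linux-x64.tar.gz would stage the archive named in the post into /usr/lib/jvm.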