Install Hive with embedded metastore

The Hive package comes with Derby as the default embedded metastore. Follow the steps below to install Hive with the embedded metastore:

1. Download the latest version of Hive from here.
2. Uncompress the package on Linux:
   tar -xzvf apache-hive-0.13.1-bin.tar.gz
3. Add the following to ~/.bash_profile:
   sudo nano ~/.bash_profile
   export HIVE_HOME=/home/hduser/hive-0.13.1
   export PATH=$PATH:$HIVE_HOME/bin

Where […]

Start HDFS High Availability Cluster

In the previous blog, we discussed the HDFS high availability configuration. This blog describes the steps to start an HDFS high availability cluster. Pre-requisites: Before starting the HDFS high availability cluster, make sure that the cluster meets the following pre-requisites: a) If you have enabled automatic failover for hot backup during NameNode failover, then before starting the HDFS high availability cluster, […]

HDFS High Availability Configuration

In the previous blog, we discussed the HDFS high availability architecture. This blog describes the configuration for HDFS high availability in a Hadoop cluster. Pre-requisites: Before configuring HDFS high availability, make sure that your Hadoop cluster meets the following pre-requisites: a) You must have at least two nodes to enable HDFS high availability. b) If you want to configure […]

MongoDB installation from tar distribution

Follow the steps below to install MongoDB from the tar distribution:

1. Download the stable release of MongoDB from here.
2. Extract the distribution:
   tar -xzvf mongodb-linux
3. Create a directory for MongoDB in /opt:
   mkdir /opt/mongodb
4. Move the distribution files to the mongodb directory:
   mv mongolinux/* /opt/mongodb
5. Add […]

MapReduce Introduction

Hadoop MapReduce is a software framework for developing applications that process large datasets in parallel in a reliable, fault-tolerant manner. A MapReduce application splits the input dataset into chunks that are processed in parallel on multiple nodes. The diagram below shows the different phases of a MapReduce application: There are two […]
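
To make the idea of chunked, parallel processing concrete, here is a minimal sketch of the commonly described map, shuffle, and reduce phases as a word count in plain Python. It runs in a single process and does not use the Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are assumptions chosen for illustration only.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one input chunk (a list of lines)."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle_phase(mapped_outputs):
    """Group the values emitted by all map calls by their key."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the grouped values for each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # In a real cluster each chunk would be mapped on a different node;
    # here the chunks are simply processed one after another.
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    # Hypothetical input, already split into two chunks.
    chunks = [["hadoop hive hadoop"], ["hive mongodb"]]
    print(word_count(chunks))
```

Because each call to map_phase only sees its own chunk, the map step could run on separate nodes without coordination; only the shuffle step needs to bring values for the same key together.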