Zookeeper Clustered Mode Installation

Pre-requisites: Before starting the Zookeeper cluster mode installation, make sure that each node meets the following pre-requisites: a) Supported platforms: GNU/Linux, Win32, Mac OS X, FreeBSD and Sun Solaris. This blog describes the installation steps for Linux. b) Sun Java 1.6 or above should already be installed. To install Java, you can refer to the installation steps […]

Flume Installation

Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating and moving large amounts of log data from many different sources to a centralized data store. In this post, we discuss Flume installation. The use of Apache Flume is not restricted to log data aggregation. […]

Apache Pig Installation

Apache Pig is a data analytics framework in the Hadoop ecosystem. It is a high-level platform for creating programs that run on Apache Hadoop. The language for this platform is a textual language called Pig Latin. Internally, Pig executes its Hadoop jobs in MapReduce. Pig’s infrastructure layer consists of a […]

Hive Metastore

Hive Metastore Introduction: Hive Metastore is a central repository for Hive metadata. It has two components: a service to which the Hive Driver connects and which it queries for the database schema, and a backing database that stores the metadata. Currently Hive supports 5 backend databases: Derby, MySQL, MS SQL Server, Oracle […]