Pre-requisites: Before starting the ZooKeeper cluster-mode installation, make sure that each node meets the following pre-requisites: a) Supported platforms: GNU/Linux, Win32, Mac OS X, FreeBSD and Sun Solaris. This blog describes the installation steps for Linux. b) Sun Java 1.6 or above should already be installed. To install Java, you can refer to the installation steps […]
Hadoop Ecosystem
Flume Installation
Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating and moving large amounts of log data from many different sources to a centralized data store. In this post, we discuss Flume installation. The use of Apache Flume is not restricted to log data aggregation. […]
Apache Pig Installation
Apache Pig is a data analytics framework in the Hadoop ecosystem. It is a high-level platform for creating programs that run on Apache Hadoop. The language for this platform is a textual language called Pig Latin. Internally, Pig executes its Hadoop jobs in MapReduce. Pig’s infrastructure layer consists of a […]
Hive Metastore
Hive Metastore Introduction: Hive Metastore is a central repository for Hive metadata. It has 2 components: a service to which the Hive Driver connects and which it queries for the database schema, and a backing database to store the metadata. Currently Hive supports 5 backend databases: Derby, MySQL, MS SQL Server, Oracle […]
Apache Hive Introduction
Hive Introduction: Hive is an Apache Software Foundation project that originated at Facebook. It is a data warehousing system built on top of Hadoop to analyse big data using an SQL-like query language. This blog covers an overview of Hive architecture and its design goals. The RDBMS and NoSQL databases failed to fulfil […]