In the previous blog, we discussed the HDFS high availability configuration. This blog describes the steps to start an HDFS high availability cluster.

Prerequisites

Before starting the HDFS high availability cluster, make sure that it meets the following prerequisites:

a)  If you have enabled automatic failover for a hot backup during NameNode failover, make sure that the Zookeeper quorum specified in the hdfs-site.xml file is running and accessible from both the NameNodes before starting the HDFS high availability cluster. To install and start a Zookeeper quorum, you can refer to the blog here.
b)  Both the NameNodes should have passwordless SSH configured to all the DataNodes in the cluster (see the sketch after this list).
c)  You need a Hadoop 2.x bundle and a minimum of two nodes to be configured as the two NameNodes. You can download a Hadoop 2 bundle from here.
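
If passwordless SSH is not yet set up, a minimal sketch is shown below, assuming the cluster runs as the user hadoop and datanode1 is one of your DataNode hostnames (both names are placeholders for illustration). Repeat the ssh-copy-id step from each NameNode for every DataNode in the cluster.

ssh-keygen -t rsa
ssh-copy-id hadoop@datanode1
ssh hadoop@datanode1 hostname

The final command should print the DataNode hostname without prompting for a password.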

Steps to Start HDFS High Availability Cluster

Starting an HDFS high availability cluster can be summarised as a six-step process, covering two scenarios:

a)  Start HDFS High Availability Cluster (with manual failover, steps 4 and 5 can be skipped)
b)  Start HDFS High Availability Cluster with Automatic Failover Enabled

Step 1 # Start Journal Nodes / Mount Shared Directory

Go to any one node and run the below command to stop all running HDFS services:

cd $HADOOP_PREFIX
sbin/stop-dfs.sh

If you are using Quorum Journal Manager as the mechanism to share edit log files, you need to start the Journal Nodes. Log in to the nodes specified as Journal Nodes and use the hadoop-daemon.sh script in the Hadoop sbin directory to start the journal node service.

cd $HADOOP_PREFIX
sbin/hadoop-daemon.sh start journalnode

If you are using a network shared directory instead, you need to mount the directory on both the NameNodes with read/write access. To mount your drive, you can refer to the blog here.
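
If you started Journal Nodes, you can verify that the process came up on each node with the jps command; a process named JournalNode should be listed (the grep filter below is just a convenience):

jps | grep JournalNode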

Step 2 # Format and Start a NameNode

Note: You should run the format command (hdfs namenode -format) only if you are setting up a fresh HDFS cluster, and only on one of the NameNodes.

Log in to any one of the NameNodes and execute the NameNode format command.

cd $HADOOP_PREFIX
bin/hdfs namenode -format

Start the NameNode:

sbin/hadoop-daemon.sh start namenode
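
You can confirm that the NameNode daemon is up with the jps command; a process named NameNode should be listed. Its web UI (by default on port 50070 in Hadoop 2.x) should also respond.

jps | grep NameNode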

Step 3 # Start the other NameNode

Log in to the other NameNode and copy over the NameNode metadata from the first NameNode using the bootstrapStandby option.

cd $HADOOP_PREFIX
bin/hdfs namenode -bootstrapStandby

If you are converting a non-HA NameNode to be HA, you should run the command “hdfs namenode -initializeSharedEdits“, which will initialize the JournalNodes with the edits data from the local NameNode edits directories.

bin/hdfs namenode -initializeSharedEdits

Start the NameNode process on this machine:

sbin/hadoop-daemon.sh start namenode

Note: Both the NameNodes are currently in Standby state. You either need to configure HA with automatic failover or initiate a manual transition to the Active state using the haadmin command. You can execute the below command on any one of the NameNodes to change the state from Standby to Active.

cd $HADOOP_PREFIX
bin/hdfs haadmin -transitionToActive nn1

nn1 is the NameNode ID of the node as configured in the hdfs-site.xml file. With manual failover, you can skip steps 4 and 5 and start the DataNode services (step 6).
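
After the transition, you can verify which NameNode is active with the haadmin command (nn1 and nn2 are the example NameNode IDs used here; substitute the IDs from your hdfs-site.xml):

cd $HADOOP_PREFIX
bin/hdfs haadmin -getServiceState nn1
bin/hdfs haadmin -getServiceState nn2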

If automatic failover is enabled, you need to initialize the HA state in Zookeeper and start the ZKFC service on both the NameNodes (steps 4 and 5).

Step 4 # Initialize HA in Zookeeper

You need to initialize the HA state in the Zookeeper quorum. Your Zookeeper quorum should be running before executing the initialize command. If you need to configure a Zookeeper quorum, you can refer to the blog here. Log in to any one of the NameNodes and execute the below command:

cd $HADOOP_PREFIX
bin/hdfs zkfc -formatZK
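
To confirm that the HA state was created, you can check from any Zookeeper node with the zkCli.sh client; by default the parent znode is /hadoop-ha (configurable via ha.zookeeper.parent-znode). The $ZOOKEEPER_HOME path below is an assumption about where Zookeeper is installed.

$ZOOKEEPER_HOME/bin/zkCli.sh
ls /hadoop-ha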

Step 5 # Start ZKFC Services
ZKFC runs on both the NameNodes to maintain a session with the Zookeeper quorum and to monitor the health of the NameNode process. Log in to both the NameNodes and execute the following command to start the ZKFC service.

cd $HADOOP_PREFIX
sbin/hadoop-daemon.sh start zkfc

You can verify that the ZKFC service started successfully by executing the jps command. A process named DFSZKFailoverController should be running on each NameNode.
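
For example, on each NameNode the jps output should include entries similar to the following (process IDs omitted):

jps
NameNode
DFSZKFailoverController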

Step 6 # Start DataNode Services
Log in to the active NameNode and use the hadoop-daemons.sh script to start all the DataNode services in one go.

cd $HADOOP_PREFIX
sbin/hadoop-daemons.sh start datanode

Or, log in to each DataNode and start the service individually using the hadoop-daemon.sh script.

cd $HADOOP_PREFIX
sbin/hadoop-daemon.sh start datanode
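
Once the DataNodes are up, you can check that they have registered with the active NameNode; the dfsadmin report lists the live DataNodes in the cluster.

cd $HADOOP_PREFIX
bin/hdfs dfsadmin -report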

Related Blogs
a) HDFS High Availability Overview
b) HDFS High Availability Architecture
c) HDFS High Availability Configuration
