Master node of a Multi-node Hadoop Cluster

This image provides the master node of a multi-node Hadoop cluster. It runs Ubuntu 14.04, was built on a Docker host also running Ubuntu 14.04, and uses div4/hadoop as its base image. The master node runs the NameNode, Secondary NameNode, NodeManager and ResourceManager.

The cluster can be set up on a single host using Docker's link mechanism, or across multiple hosts using Weave or another container-networking tool. Weave is straightforward to install and use. For Weave installation and commands, see:

https://github.com/weaveworks/weave

http://xmodulo.com/networking-between-docker-containers.html
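As a hedged sketch of the two options (the slave image name div4/hadoop_slave, the HOST_A variable, and the classic Weave launch/run syntax are assumptions, not taken from this README):

```shell
# Single host: attach a slave container to the master via a Docker link
# (div4/hadoop_slave is an assumed name for the companion slave image).
sudo docker run -it --name hadoop_slave --link hadoop_master:master div4/hadoop_slave /bin/bash

# Multiple hosts, using classic Weave: launch the Weave router on each host,
# then start each container with the cluster IP used later in this README.
# On the master's host:
weave launch
weave run 10.0.0.5/24 -it --name hadoop_master -p 50070:50070 div4/hadoop_master /bin/bash
# On the slave's host (HOST_A is the master host's address, an assumption):
weave launch $HOST_A
weave run 10.0.0.7/24 -it --name hadoop_slave div4/hadoop_slave /bin/bash
```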

To start the master node, pull this image and run the container:

sudo docker run -it --name hadoop_master -P -p 50070:50070 -p 50090:50090 div4/hadoop_master /bin/bash

The dedicated Hadoop group is hadoop_group and the user account is hduser1. Switch to that user and start the SSH service:

su - hduser1
sudo service ssh start
ssh localhost

Update the /etc/hosts file with the IPs assigned to the Docker containers by Weave (or whichever networking tool you use). In the published containers, the master is assigned 10.0.0.5/24 and the slave 10.0.0.7/24 on the cluster subnetwork. Make this edit on both the master and slave nodes before proceeding.

vi /etc/hosts

and add the following entries:

10.0.0.5 master
10.0.0.7 slave1

Allow the master node to connect to the slave node by copying the master's public key to the slave's Hadoop account (hduser2). For this, run:

ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser2@slave1
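If the image does not already ship a keypair for hduser1, generate one first. A minimal sketch, assuming an empty passphrase is acceptable; the id_rsa path matches the ssh-copy-id command above:

```shell
# Create ~/.ssh and an RSA keypair with an empty passphrase, but only if
# no key exists yet at the path used by ssh-copy-id above.
mkdir -p "$HOME/.ssh"
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -P "" -f "$HOME/.ssh/id_rsa"
```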

Check the connection from the master to the slave account:

ssh master
ssh hduser2@slave1
exit

Start the Cluster

cd /usr/local/hadoop
sbin/start-all.sh
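Note that start-all.sh is deprecated in Hadoop 2.x; if you prefer the explicit scripts, the equivalent is (same /usr/local/hadoop layout as above):

```shell
cd /usr/local/hadoop
sbin/start-dfs.sh    # HDFS daemons: NameNode, SecondaryNameNode, DataNodes
sbin/start-yarn.sh   # YARN daemons: ResourceManager, NodeManagers
```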

Check that the daemons are running; jps should list the NameNode, SecondaryNameNode, ResourceManager and NodeManager:

jps

To stop Hadoop, run:

/usr/local/hadoop/sbin/stop-all.sh

The Hadoop web interfaces are accessible at http://localhost:50070/ (the NameNode web UI) and http://localhost:50090/ (the SecondaryNameNode web UI, matching the second -p mapping above).
If you are running Hadoop on a cloud server, replace localhost with the server's IP.

If a password is required, it is H4doop.

This multi-host configuration can be changed through the Hadoop configuration files. To run more than one slave, start the required number of slave containers and edit the Hadoop config files accordingly. These links may be helpful:

http://doctuts.readthedocs.org/en/latest/hadoop.html
http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php
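For example, with a second slave the edits could look like this (10.0.0.8 is a hypothetical IP for the extra container, and the slaves file path assumes a stock Hadoop 2.x layout under /usr/local/hadoop):

```
# /etc/hosts, on every node:
10.0.0.5 master
10.0.0.7 slave1
10.0.0.8 slave2

# /usr/local/hadoop/etc/hadoop/slaves, on the master:
slave1
slave2
```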
