Docker Hadoop 2.7.3

The image is built on the official Debian base image from Docker Hub.

In this image:

  • The packages installed include:

    • openssh-server
    • hadoop 2.7.3
    • supervisor
    • wget
    • curl

Size of image:

  • 1.08 GB
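
To check the size once the image is pulled or built locally, the standard Docker CLI is enough:

docker images dungvo/hadoop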

Build

docker build -t hadoop .

Usage

Single Node

docker run -d -p 8088:8088 -p 50070:50070 -p 9000:9000 -e "NODE_TYPE=master" dungvo/hadoop
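
To verify the daemons came up, you can exec into the container and query HDFS and YARN. A minimal sketch, assuming you add --name hadoop to the command above (the original command does not name the container) and using the Hadoop binaries under /usr/local/hadoop listed below:

# assumes the container was started with an added --name hadoop
docker exec -it hadoop /usr/local/hadoop/bin/hdfs dfsadmin -report   # datanode summary
docker exec -it hadoop /usr/local/hadoop/bin/yarn node -list         # registered NodeManagers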

Cluster with 3 nodes

docker network create --subnet=172.20.0.0/16 mynet

docker run -d --net mynet --ip 172.20.0.3 \
           -e "MASTER_IP=172.20.0.2" \
           -e "MASTER_HOST=hadoop-master" \
           --name slave1 -h slave1 dungvo/hadoop

docker run -d --net mynet --ip 172.20.0.4 \
           -e "MASTER_IP=172.20.0.2" \
           -e "MASTER_HOST=hadoop-master" \
           --name slave2 -h slave2 dungvo/hadoop

docker run -d --net mynet --ip 172.20.0.2 \
           -e "MASTER_IP=172.20.0.2" \
           -e "MASTER_HOST=hadoop-master" \
           -e "NODE_TYPE=master" \
           -e "SLAVES=slave1,slave2" \
           --link slave1:slave1 \
           --link slave2:slave2 \
           --name hadoop-master -h hadoop-master \
           -p 8088:8088 -p 50070:50070 -p 9000:9000 dungvo/hadoop

NOTE: MASTER_HOST must match the value passed to the -h option on the master node, in this case hadoop-master.

Join a new node to the cluster

docker run -d --net mynet --ip 172.20.0.5 \
           -e "MASTER_IP=172.20.0.2" \
           -e "MASTER_HOST=hadoop-master" \
           -e "NODE_TYPE=slave" \
           --name slave3 -h slave3 dungvo/hadoop
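
To confirm the new node has registered, ask the master for its datanode and NodeManager lists. A quick sketch, assuming the containers above are running:

# run from the host against the hadoop-master container started earlier
docker exec hadoop-master /usr/local/hadoop/bin/hdfs dfsadmin -report   # slave nodes should appear as datanodes
docker exec hadoop-master /usr/local/hadoop/bin/yarn node -list         # NodeManagers registered with YARN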

Environment variables at container startup

There are five environment variables you can set when starting a container (a combined example follows the list):
  1. MASTER_HOST

    • Default: master (if not set)

    • Description: hostname of the master node.

    • More: this value is written into core-site.xml and yarn-site.xml and added to /etc/hosts together with $MASTER_IP.

  2. MASTER_IP

    • Default: 0.0.0.0

    • Description: IP address of the master node.

    • More: this value is written into core-site.xml and yarn-site.xml and added to /etc/hosts together with $MASTER_HOST.

  3. NODE_TYPE (there are 3 types)

    • master: the container starts with the ssh service, DFS daemons (namenode, secondarynamenode, datanode) and YARN (use this type for a single node, or for the node playing the master role in a cluster).

    • slave: the container starts with the ssh service, datanode and nodemanager (use this type when adding a new node to the cluster as a slave).

    • node: the container starts without any Hadoop services (use this type when starting a cluster where the node plays the slave role; the master node will send it the command to start the Hadoop services).

    • Default: node

    • Description: the role the container starts with.

  4. SLAVES

    • Default: localhost

    • Description: comma-separated list of all nodes playing the slave role.

  5. NODE_MANAGER_WEB_PORT

    • Default: 8042

    • Description: NodeManager web port.
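
A combined sketch: a master started with all five variables set explicitly (the values mirror the 3-node example above; NODE_MANAGER_WEB_PORT is simply pinned to its default):

docker run -d --net mynet --ip 172.20.0.2 \
           -e "MASTER_HOST=hadoop-master" \
           -e "MASTER_IP=172.20.0.2" \
           -e "NODE_TYPE=master" \
           -e "SLAVES=slave1,slave2" \
           -e "NODE_MANAGER_WEB_PORT=8042" \
           -p 8088:8088 -p 50070:50070 -p 9000:9000 \
           --name hadoop-master -h hadoop-master dungvo/hadoop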

Add more hosts to /etc/hosts

  • To add more hosts to the container's /etc/hosts, mount a hosts file as a volume at /tmp/hosts inside the container. Every entry in that file will be appended to the container's /etc/hosts (see the example after the snippet below).

  • The appending is done by this snippet in startup.sh:

    HOSTS_FILE_TMP=/tmp/hosts
    ...
    if [ -f "$HOSTS_FILE_TMP" ]; then
      cat $HOSTS_FILE_TMP >> /etc/hosts
    fi
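
A minimal usage sketch; the extra-hosts file name and the addresses in it are hypothetical, the rest mirrors the slave3 command above:

# ./extra-hosts on the Docker host, e.g.:
#   172.20.0.10 gateway1
#   172.20.0.11 gateway2
docker run -d --net mynet --ip 172.20.0.5 \
           -v $(pwd)/extra-hosts:/tmp/hosts \
           -e "MASTER_IP=172.20.0.2" \
           -e "MASTER_HOST=hadoop-master" \
           -e "NODE_TYPE=slave" \
           --name slave3 -h slave3 dungvo/hadoop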
    

GUI on Browser

HDFS: localhost:50070

YARN: localhost:8088
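
As a quick check from the host (assuming the ports were published as in the commands above):

curl -s http://localhost:50070/dfshealth.html | head   # NameNode web UI
curl -s http://localhost:8088/cluster | head           # ResourceManager web UI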

List environment variables

VISIBLE=now
HADOOP_HOME=/usr/local/hadoop
HADOOP_PREFIX=/usr/local/hadoop
HADOOP_COMMON_HOME=/usr/local/hadoop
HADOOP_HDFS_HOME=/usr/local/hadoop
HADOOP_MAPRED_HOME=/usr/local/hadoop
HADOOP_YARN_HOME=/usr/local/hadoop
HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
YARN_CONF_DIR=/usr/local/hadoop/etc/hadoop
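
These variables are exported inside the container, so, as a small sketch, you can use them directly via docker exec (the hadoop-master container name is taken from the cluster example):

docker exec hadoop-master bash -c 'echo $HADOOP_CONF_DIR && ls $HADOOP_CONF_DIR'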

Exposed ports

  • HDFS ports: 9000 50070
  • YARN ports: 8040 8042 8088 8030 8031 8032 8033
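
To see how these map onto the host for a running container, a quick sketch using the hadoop-master container from the cluster example:

docker port hadoop-master                                               # published port mappings
docker inspect --format '{{json .Config.ExposedPorts}}' dungvo/hadoop   # ports declared by the image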

Done
