Last pushed: a year ago
Build a Hadoop cluster with Docker

hadoopcluster (built FROM centos-7.2.1511)

1. pull docker image

docker pull elbertmalone/hadoopcluster:2.7.2

2. clone git.oschina repository

git clone https://git.oschina.net/elbertmalone/hadoopcluster.git

3. create hadoop network

docker network create --driver=bridge hadoop
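
Running `docker network create` a second time fails because the name is already taken. The sketch below shows one way to make this step safe to repeat. The `ensure_network` helper is a hypothetical name, not something shipped in the repository; it only wraps `docker network inspect` and the `docker network create` command shown above.

```shell
# Hypothetical helper (not part of the repository): create the bridge
# network only when `docker network inspect` reports that it is missing.
ensure_network() {
    name="$1"
    if docker network inspect "$name" >/dev/null 2>&1; then
        echo "network $name already exists"
    else
        # Same command as in step 3; stop here if creation fails.
        docker network create --driver=bridge "$name" >/dev/null || return 1
        echo "network $name created"
    fi
}
```

Calling `ensure_network hadoop` before step 4 would then replace the plain `docker network create` call.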

4. start container

cd hadoopcluster
sudo ./start-container.sh

5. start hadoop

./start-hadoop.sh

6. run wordcount

./run-wordcount.sh

output

input file1.txt:
Hello Hadoop

input file2.txt:
Hello Docker

wordcount output:
Docker    1
Hadoop    1
Hello    2
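
To check what the job should print without starting the cluster, the same counts can be reproduced locally with standard tools. This is only a sanity check under the assumption that the inputs are the two one-line files shown above; `count_words` is a hypothetical helper and is unrelated to how `run-wordcount.sh` works internally.

```shell
# Local sanity check (not part of the repository): count whitespace-separated
# words across the given files and print "word<TAB>count", sorted by word.
count_words() {
    cat "$@" \
        | tr -s '[:space:]' '\n' \
        | sed '/^$/d' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2 "\t" $1 }'
}

# Recreate the two sample inputs in a temporary directory and count them.
workdir=$(mktemp -d)
echo "Hello Hadoop" > "$workdir/file1.txt"
echo "Hello Docker" > "$workdir/file2.txt"
count_words "$workdir/file1.txt" "$workdir/file2.txt"
rm -rf "$workdir"
```

The three lines printed should match the wordcount output listed above.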

Arbitrary size Hadoop cluster

1. pull docker images and clone git.oschina repository

Follow steps 1 and 2 of the first section. If the hadoop network does not exist yet, also create it as in step 3.

2. rebuild docker image

./resize-cluster.sh 5
  • specify a node count greater than 1: 2, 3, ...

3. start container

./start-container.sh 5
  • use the same parameter as in step 2
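
Because steps 2 and 3 must receive the same value, one option is to validate the node count once and pass it to both scripts. The wrapper below is a sketch under that assumption; `is_valid_node_count` and `resize_and_start` are hypothetical names, and only `resize-cluster.sh` and `start-container.sh` come from the repository.

```shell
# Hypothetical wrapper (not part of the repository).

# Succeeds only for an integer greater than 1, as step 2 requires.
is_valid_node_count() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -gt 1 ]
}

# Rebuild the image and start the containers with one shared node count.
resize_and_start() {
    nodes="$1"
    if ! is_valid_node_count "$nodes"; then
        echo "node count must be an integer greater than 1" >&2
        return 1
    fi
    ./resize-cluster.sh "$nodes" && ./start-container.sh "$nodes"
}
```

Run from the hadoopcluster directory, `resize_and_start 5` would be equivalent to steps 2 and 3 with the parameter 5.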

4. run hadoop cluster

Start Hadoop and run wordcount as in steps 5 and 6 of the first section.

reference

hadoop-cluster-docker

centos-ssh

Owner: elbertmalone