Public Repository

Last pushed: 2 years ago
Short Description
Three node Hadoop cluster for the Apcera Platform
Full Description


Run a three node Hadoop cluster on the Apcera Platform. Feel free to test this out on the Apcera Platform Community Edition

Version info

Hadoop 2.7.2
Pig 0.16.0
OpenJDK 8

Create the virtual network.

apc network create hadoop --batch

Run the Docker images

apc docker create hadoop1 -i apcerademos/hadoop -m 3g --port 50070 --port 8088 --port 22 -ae -e CONFIG=MASTER -e HADOOP_HOSTNAME="hadoop1.apcera.local"
apc docker create hadoop2 -i apcerademos/hadoop -m 3g -e HADOOP_HOSTNAME="hadoop2.apcera.local" 
apc docker create hadoop3 -i apcerademos/hadoop -m 3g -e HADOOP_HOSTNAME="hadoop3.apcera.local"

Join the Hadoop containers to the virtual network with a discovery address

apc network join hadoop -j hadoop1 -da hadoop1
apc network join hadoop -j hadoop2 -da hadoop2
apc network join hadoop -j hadoop3 -da hadoop3

Do not change the -da name because the configuration files are hard coded

Start the Hadoop containers

apc app start hadoop1
apc app start hadoop2
apc app start hadoop3

Add a route to the central host of the Apcera Cluster for SSH access.

If you are running Apcera Community Edition, you can get the IP with "./apcera-setup status". You can change port 9999 to a port that you want to connect to the cluster externally.

apc route add central_ip:9999 --app hadoop1 --tcp --port 22 --batch

Add routes for the YARN and DFS web console

apc route add --app hadoop1 --port 50070 --batch
apc route add --app hadoop1 --port 8088 --batch

SSH credentials

Username: root
Password: training

Run a sample Hadoop Job that parses a log with Pig

ssh root@central_ip -p 9999
bash /

The test script copies a Debian package installation log to the Hadoop File System and searches for instances of the word "Java".

Docker Pull Command