Public | Automated Build

Last pushed: 2 days ago
Short Description
Dockerized Hadoop based on https://github.com/sequenceiq/hadoop-docker
Full Description

Apache Hadoop 2.7.1 in a Docker container

Hadoop in a pseudo distributed mode in a Docker container, forked from sequenceiq. The added values are:

  • Bug fix
  • Use openjdk
  • Tez installation

Start a container

Make sure that SELinux is disabled on the host. If you are using boot2docker you don't need to do anything.

docker run -P -it ouyi/hadoop-docker /etc/bootstrap.sh -bash

Testing

You can run one of the stock examples:

cd $HADOOP_PREFIX
# run the mapreduce
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar grep input output 'dfs[a-z.]+'

# check the output
bin/hadoop fs -cat output/{*}
Docker Pull Command
Owner
ouyi
Source Repository

Comments (0)