Public Repository

Last pushed: a year ago
Short Description
Run a fully-distributed Hadoop cluster in Docker containers!
Full Description

How to use

These images can be used to run a container for any component service of Apache Hadoop.

Just mount in a volume with config files in it, and you're good to go.

docker run -v $(pwd)/etc/hadoop:/etc/hadoop jakelow/hadoop:2 hdfs namenode

Of course, Hadoop is a multi-role architecture, so you'll likely want to run several containered services based on a shared set of configs. You can leverage Compose for this; these examples will get you started.

Supported versions

All versions of Hadoop ≥ 1.0.0 are supported. Tag names exactly follow the convention of the Apache software archive, but with the hadoop- prefix dropped.

Exact version tags, like 2.6.1, point to the version you'd expect. If you drop a least-significant digit, e.g. 2.5, you'll get the most recent release of the 2.5.x family. Similarly, 2 (and latest) point to the most recent Hadoop 2.x release and 1 points to the most recent Hadoop 1.x release.

Special case: Hadoop 2.0.x and 2.1.x families were alpha and beta releases respectively. Their names have -alpha or -beta suffixed to them, but otherwise function equivalently -- for example, 2.0.2-alpha is an exact release, and 2.1-beta is the most recent 2.1.x release.


These images are open source; you can open issues or feature requests on the public GitHub repository.

Docker Pull Command