Last pushed: 9 months ago

Docker image for Apache Tez

This repository contains a Dockerfile for building a Docker image with Apache Tez. The Dockerfile is an adaptation of this repo, except that the Tez image runs on top of the Hadoop 2.6.0 Docker base image from my other GitHub repo (docker-hadoop).

Current Version

  • Apache Tez 0.8.4
  • Apache Hadoop 2.6.0


Prerequisites

  • Docker (on a Mac, Docker for Mac is preferable to Boot2Docker)

Pull the image

You can either pull the pre-built image from Docker Hub or build the image locally (see the next section).

docker pull prasanthj/docker-tez:0.8.4

Building the image

If you do not want to pull the image from Docker Hub, you can build it locally using the following steps:

  • To build the Tez Docker image locally from the Dockerfile, first check out the source using
    git clone
  • Change to the docker-tez directory and build the image
    cd docker-tez
    docker build -t local-tez-0.8.4 .

Running the image

docker run -i -t -P local-tez-0.8.4 /etc/ -bash


Testing

When you run one of the stock MapReduce examples, the Tez DAG ApplicationMaster runs the job instead of the YARN MR AppMaster.
You can verify this in the YARN ResourceManager UI.

$HADOOP_PREFIX/bin/hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'
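The same check can also be made from the command line inside the container. The sketch below wraps `yarn application -list` with its `-appTypes` and `-appStates` filters. It assumes that jobs run by the Tez ApplicationMaster register with the application type TEZ. The function name and the YARN_BIN override are illustrative and not part of this image.

```shell
#!/bin/sh
# Sketch: list YARN applications of type TEZ in any state.
# YARN_BIN is a hypothetical override; by default the yarn launcher
# under $HADOOP_PREFIX is used.
list_tez_apps() {
    "${YARN_BIN:-$HADOOP_PREFIX/bin/yarn}" application -list -appTypes TEZ -appStates ALL
}
```

After running the grep example, call `list_tez_apps`. If the job ran on Tez, it should appear in the listing.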

There is also a basic Tez MRR job example in one of the Tez jars. You can test it by running the following:

$HADOOP_PREFIX/bin/hadoop jar $TEZ_DIST/tez-examples-0.8.4.jar orderedwordcount input output-owc
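To confirm that a job produced results, you can print the start of its HDFS output directory. This is a minimal sketch using `hdfs dfs -cat`. The function name, the HDFS_BIN override and the default of 20 lines are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: print the first lines of a job's HDFS output directory.
# HDFS_BIN is a hypothetical override; by default the hdfs launcher
# under $HADOOP_PREFIX is used.
show_job_output() {
    # $1: HDFS output directory, $2: number of lines to show (default 20).
    # The glob is quoted so that HDFS, not the local shell, expands it.
    "${HDFS_BIN:-$HADOOP_PREFIX/bin/hdfs}" dfs -cat "$1/*" | head -n "${2:-20}"
}
```

For the two examples above, this would be called as `show_job_output output` and `show_job_output output-owc`.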

Viewing Web UI

If you are running Docker using Boot2Docker, do the following steps:

  • Set up routing on the host machine (Mac OS X) using the following command
    sudo route add -net
    NOTE: 172.17.0.X is usually the IP address of the Docker container. is the IP address exported in DOCKER_HOST

  • Get the container's IP address

    • To get the container's IP address we need its CONTAINER_ID. The following command lists all running containers and their IDs
      docker ps
    • Use the following command to get the container's IP address (where CONTAINER_ID is the ID of the container running the local-tez-0.8.4 image)
      docker inspect -f "{{.NetworkSettings.IPAddress}}" CONTAINER_ID
  • Launch a web browser and go to http://<container-ip-address>:8088 to view the Hadoop cluster web UI.
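The lookup steps above can be combined into a small helper. This is a sketch only: the function names are illustrative, and it assumes a Docker version whose `docker ps` supports the `ancestor` filter and a container attached to the default bridge network.

```shell
#!/bin/sh
# Sketch: build the ResourceManager UI URL for the first running
# container started from a given image. Function names are illustrative.

# Print the ID of the first running container created from image $1.
tez_container_id() {
    docker ps -q --filter "ancestor=$1" | head -n 1
}

# Print the bridge-network IP address of container $1.
tez_container_ip() {
    docker inspect -f '{{.NetworkSettings.IPAddress}}' "$1"
}

# Print the ResourceManager UI URL (port 8088) for image $1.
tez_ui_url() {
    id=$(tez_container_id "$1")
    if [ -z "$id" ]; then
        echo "no running container found for image $1" >&2
        return 1
    fi
    echo "http://$(tez_container_ip "$id"):8088"
}
```

For a locally built image, `tez_ui_url local-tez-0.8.4` prints the address to open in the browser.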


Comments (3)
2 years ago

Ah, I discovered that it works if I use the following command

docker run -i -t -P prasanthj/docker-tez /etc/ -bash

in place of the command provided under the heading "Running the Image"

docker --tls run -i -t -P local-tez-latest /etc/ -bash

I was then able to get the two example commands provided under the Testing section to work.

2 years ago

I meant to say I can run the following images

docker run ubuntu /bin/echo hello world
docker run -t -i ubuntu /bin/bash
docker run -i -t -P sequenceiq/tez /etc/ -bash
docker run -it -p 8088:8088 -p 8042:8042 -h sandbox sequenceiq/spark:1.6.0 bash
docker run --rm -it wernight/funbox

I cannot run
docker --tls run -i -t -P local-tez-latest /etc/ -bash

2 years ago

Hi Prasanth, I am having a problem running this image. It looks interesting
so I would like to understand if this error is due to a problem with the
docker image or is it a problem with my configuration (Ubuntu 14.04) which I can fix?

I can run the following docker images without a problem

docker run ubuntu /bin/echo hello world
docker run -t -i ubuntu /bin/bash
docker --tls run -i -t -P local-tez-latest /etc/ -bash
docker run --rm -it wernight/funbox

but when I download and run "prasanthj/docker-tez" as follows
docker --tls run -i -t -P local-tez-latest /etc/ -bash

I get this error
Cannot connect to the Docker daemon. Is the docker daemon running on this host?

Do you have any suggestions regarding what I might try to get this to work?