Short Description
Personal build of Spark, set up to run as a worker
Full Description

drpaulbrewer/spark-worker is a personal build of Spark (1.3.1 as of April 2015) with scripts to run it as a worker.

The version, build options, JDK, settings, etc. come from drpaulbrewer/spark-roasted-elephant and are subject to change.

This is not an official public build and there is NO WARRANTY for this code. USE IT AT YOUR OWN RISK. If it works for you, great. But don't expect it to always work, or to always have the same options compiled in.

Here's an example of how to start a worker.

In this example the worker has a preset IP address, and the master is assumed to be at .10
You'll need to edit the IP addresses to match your network. All machines involved must be able to reach each other.
Docker hostnames and networking seem more confusing than helpful at this stage.
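
Since every machine has to be able to reach the others, it can help to confirm that the master's port is reachable from the worker host before starting anything. The following is a minimal sketch, not part of this image: the helper names master_host, master_port and check_master are hypothetical, it assumes a netcat that supports the -z and -w options, and it falls back to 7077 (Spark's default standalone master port) when the URL carries no port.

```shell
#!/bin/sh
# Sketch only. master_host, master_port and check_master are hypothetical helpers.

# Print the host part of a spark://HOST:PORT URL.
master_host() {
    hp=${1#spark://}        # drop the scheme
    echo "${hp%%:*}"        # keep everything before the first colon
}

# Print the port part, or 7077 (Spark's default master port) if none is given.
master_port() {
    hp=${1#spark://}
    case $hp in
        *:*) echo "${hp##*:}" ;;
        *)   echo 7077 ;;
    esac
}

# Return 0 if a TCP connection to the master opens within 3 seconds.
# Assumes a netcat that supports -z (scan only) and -w (timeout).
check_master() {
    nc -z -w 3 "$(master_host "$1")" "$(master_port "$1")"
}
```

For example, `check_master "$MASTER_URL" || echo "master not reachable"` could be run on the worker host, where MASTER_URL is whatever spark:// URL you pass to the container.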


SPARK=$(docker run --name="spark1" --expose=1-65535 --env SPARKDIR=/spark/spark-1.3.1 --env mem=10G --env master=spark:// --env SPARK_LOCAL_IP= -v /data:/data -v /tmp:/tmp -d drpaulbrewer/spark-worker:latest)
sudo pipework eth0 $SPARK
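
Because the IP addresses have to be edited for each network, one option is to wrap the two commands in a small function so the addresses are given in one place. This is only a sketch under assumptions: start_worker, DOCKER and PIPEWORK are hypothetical names, and it assumes pipework's usual argument order of host interface, container, then address/prefix. The docker options are the ones from the example above.

```shell
#!/bin/sh
# Sketch only. start_worker, DOCKER and PIPEWORK are hypothetical names.
# Set DOCKER=echo and PIPEWORK=echo for a dry run that only prints the commands.
start_worker() {
    worker_cidr=$1   # worker address with prefix length, as passed to pipework
    master_url=$2    # spark://<master-ip>:<port>
    if [ -z "$worker_cidr" ] || [ -z "$master_url" ]; then
        echo "usage: start_worker WORKER_IP/PREFIX MASTER_URL" >&2
        return 2
    fi
    worker_ip=${worker_cidr%/*}   # strip the /prefix for SPARK_LOCAL_IP
    # Start the container detached and capture its ID.
    container=$(${DOCKER:-docker} run --name=spark1 --expose=1-65535 \
        --env SPARKDIR=/spark/spark-1.3.1 --env mem=10G \
        --env "master=$master_url" --env "SPARK_LOCAL_IP=$worker_ip" \
        -v /data:/data -v /tmp:/tmp -d drpaulbrewer/spark-worker:latest) || return 1
    # Attach the container to the host's eth0 with the chosen address.
    ${PIPEWORK:-sudo pipework} eth0 "$container" "$worker_cidr"
}
```

Running it with DOCKER=echo and PIPEWORK=echo first shows the exact commands that would be executed, which is a cheap way to check the addresses before touching Docker.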

For a worker on a wireless LAN, pipework wasn't useful, even when pointed at the proper interface.

For wireless, run the container directly on the host's network stack. In that case, SPARK_LOCAL_IP can usually be omitted.


sudo -v
docker run --net="host" --expose=1-65535 --env SPARKDIR=/spark/spark-1.3.1 --env mem=10G --env master=spark:// -v /data:/data -v /tmp:/tmp -d drpaulbrewer/spark-worker:latest
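
The host-network variant can be wrapped the same way, with SPARK_LOCAL_IP passed only when it is actually needed. Again this is a sketch rather than anything shipped with the image: start_worker_hostnet and DOCKER are hypothetical names, and the docker options are copied from the command above.

```shell
#!/bin/sh
# Sketch only. start_worker_hostnet and DOCKER are hypothetical names.
# Set DOCKER=echo for a dry run that only prints the arguments.
start_worker_hostnet() {
    master_url=$1    # spark://<master-ip>:<port>
    local_ip=$2      # optional; leave empty to omit SPARK_LOCAL_IP
    if [ -z "$master_url" ]; then
        echo "usage: start_worker_hostnet MASTER_URL [LOCAL_IP]" >&2
        return 2
    fi
    # Build the argument list with "set --" so each value stays one word.
    set -- run --net=host --expose=1-65535 \
        --env SPARKDIR=/spark/spark-1.3.1 --env mem=10G \
        --env "master=$master_url"
    if [ -n "$local_ip" ]; then
        set -- "$@" --env "SPARK_LOCAL_IP=$local_ip"
    fi
    set -- "$@" -v /data:/data -v /tmp:/tmp -d drpaulbrewer/spark-worker:latest
    ${DOCKER:-docker} "$@"
}
```

Calling it with only the master URL reproduces the command above; a second argument adds SPARK_LOCAL_IP for hosts where Spark picks the wrong interface.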

To shut down and clean up, you may want to create a shell script similar to this, changing the container names as appropriate. Run it on the host, not inside the container:


docker kill master spark1
docker rm master spark1
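
A slightly more forgiving version of such a script might take the container names as arguments and keep going when one of them is already stopped or missing. This is a sketch only; cleanup_containers and DOCKER are hypothetical names, and it uses the same docker kill and docker rm commands as above.

```shell
#!/bin/sh
# Sketch only. cleanup_containers and DOCKER are hypothetical names.
# Set DOCKER=echo for a dry run.
cleanup_containers() {
    for name in "$@"; do
        # kill fails on a container that is not running; ignore that
        # so the remaining containers are still cleaned up.
        ${DOCKER:-docker} kill "$name" >/dev/null 2>&1 || true
        # Report, but do not stop on, a container that cannot be removed.
        ${DOCKER:-docker} rm "$name" || echo "could not remove $name" >&2
    done
}
```

With the names used above, this would be invoked as `cleanup_containers master spark1`.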