Last pushed: 2 years ago
Short Description
personal spark worker with nfs4
Full Description

drpaulbrewer/spark-worker-nfs is a personal build of Spark (1.3.1 as of April 2015) with a startup script that runs it as a worker, using nfs4 for a shared file system.

Note that running parallel analyses through nfs4 creates a bottleneck, since every worker reads from the same server. It is better to copy the data files to each worker's own file system before running anything.
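One way to do that copy, sketched here as a small bash helper. The function name stage_data and the destination path in the example are hypothetical and not part of this image; only /Z/data comes from the worker script.

```shell
#!/bin/bash
# Hypothetical helper (not part of this image): copy the contents of a
# shared directory to a directory on the worker's own file system.
stage_data() {
  local src="$1" dst="$2"
  mkdir -p "$dst"
  # cp -a preserves modes and timestamps; "$src"/. copies the directory
  # contents, including dotfiles, rather than the directory itself.
  cp -a "$src"/. "$dst"/
}
# Example (destination path is an assumption):
#   stage_data /Z/data /tmp/localdata
```

Jobs would then read from the local copy instead of the nfs4 mount.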

The version, build options, settings, etc. are from drpaulbrewer/spark-roasted-elephant and subject to change.

This is not an official public build and there is NO WARRANTY for this code. USE IT AT YOUR OWN RISK. If it works for you, great. But don't expect it to always work, or to always have the same options compiled in.

Dockerfile (subject to change)

FROM drpaulbrewer/spark-roasted-elephant:latest
ADD /spark/
RUN apt-get update && apt-get install --yes nfs-common
CMD /spark/

container included file /spark/

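#!/bin/bash -e
# expects $nfsdata (nfs4 export), $mem (worker memory) and $master (master URL) to be set
cd /spark/spark-1.3.1
sleep 10
# don't use ./sbin/ it won't take numeric URL
mkdir -p /Z/data
# mount the shared data read-only over nfs4
mount -o ro -t nfs4 "$nfsdata" /Z/data
# run the worker as the spark user
su -c "cd /spark/spark-1.3.1 && ./bin/spark-class org.apache.spark.deploy.worker.Worker --memory $mem $master" spark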
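
The script above reads nfsdata, mem and master without checking them. A minimal pre-flight check might look like the sketch below. The function name require_env is hypothetical. The docker run line in the comment is an untested assumption with placeholder values: mounting nfs4 inside a container generally needs extra privileges, which is why --privileged appears there.

```shell
#!/bin/bash
# Hypothetical pre-flight check (not part of this image): report every
# named variable that is unset or empty, and return non-zero if any is.
require_env() {
  local name missing=0
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable
    # whose name is stored in $name.
    if [ -z "${!name}" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
# Example (all values are placeholders, flags are an assumption):
#   docker run --privileged -e nfsdata=192.168.1.10:/data -e mem=2g \
#     -e master=spark://192.168.1.10:7077 drpaulbrewer/spark-worker-nfs
```

Calling require_env nfsdata mem master before the mount step would make a missing setting fail early with a readable message.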