hashicorp/spark-nomad

By HashiCorp, Inc.

Updated over 7 years ago

Apache Spark distribution with HashiCorp Nomad scheduler support

Apache Spark on Nomad

This image provides a build of Apache Spark 2.1.1 with added support for scheduling on HashiCorp Nomad. The pull request that implements this support is https://github.com/apache/spark/pull/18209.

Usage Instructions

Full usage instructions can be found here: https://github.com/barnardb/spark/blob/nomad/docs/running-on-nomad.md

Sample usage

Start a Nomad agent in developer mode on a machine with Docker:

$ sudo nomad agent -dev

This will create a Nomad server and client to which we can submit our example Spark job.
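
Before submitting a job, it can help to confirm that the agent is answering. The sketch below polls the Nomad HTTP API's leader endpoint. It assumes curl is installed and that the agent listens on Nomad's default HTTP address, http://127.0.0.1:4646, unless NOMAD_ADDR is set. The function names nomad_addr and wait_for_nomad are hypothetical helpers, not part of Nomad or of this image.

```shell
#!/bin/sh
# Default HTTP API address of a local Nomad agent; NOMAD_ADDR overrides it.
nomad_addr() {
  echo "${NOMAD_ADDR:-http://127.0.0.1:4646}"
}

# Hypothetical helper: poll the leader endpoint until the agent answers.
# Usage: wait_for_nomad [attempts]   (one attempt per second, default 10)
wait_for_nomad() {
  attempts="${1:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # /v1/status/leader returns the current leader's address once one is elected.
    if curl -fsS "$(nomad_addr)/v1/status/leader" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "Nomad agent not reachable at $(nomad_addr)" >&2
  return 1
}
```

After sourcing the functions, `wait_for_nomad 30 && echo ready` waits up to roughly 30 seconds for the agent started above.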

Client Mode

Run the following to submit a Spark job in Spark client mode to the Nomad agent created above:

$ docker run --network=host -it hashicorp/spark-nomad ./opt/spark/bin/spark-submit \
  --master nomad \
  --docker-image hashicorp/spark-nomad \
  --distribution local:///opt/spark \
  --driver-memory 512m \
  --executor-memory 512m \
  --class org.apache.spark.examples.SparkPi \
  local:/opt/spark/examples/jars/spark-examples_2.11-2.1.1.jar \
  10

Cluster Mode

Run the following to submit a Spark job in Spark cluster mode to the Nomad agent created above:

$ docker run --network=host -it hashicorp/spark-nomad ./opt/spark/bin/spark-submit \
  --master nomad \
  --deploy-mode cluster \
  --docker-image hashicorp/spark-nomad \
  --distribution local:///opt/spark \
  --driver-memory 512m \
  --executor-memory 512m \
  --class org.apache.spark.examples.SparkPi \
  local:/opt/spark/examples/jars/spark-examples_2.11-2.1.1.jar \
  10
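
The two sample commands differ only in the `--deploy-mode cluster` flag. As a rough sketch of that difference, the hypothetical wrapper below (run_spark_pi and DRY_RUN are illustrative names, not part of this image or of Spark) builds the same command line for either mode and can print it instead of running it.

```shell
#!/bin/sh
# Hypothetical wrapper around the two sample commands above.
# Usage: run_spark_pi client|cluster
# Set DRY_RUN=1 to print the command instead of running it.
run_spark_pi() {
  mode="${1:-client}"
  case "$mode" in
    client)  deploy_args="" ;;                       # spark-submit defaults to client mode
    cluster) deploy_args="--deploy-mode cluster" ;;
    *) echo "usage: run_spark_pi client|cluster" >&2; return 2 ;;
  esac

  # Unquoted $deploy_args is intentional: it expands to zero or two words.
  set -- docker run --network=host -it hashicorp/spark-nomad \
    ./opt/spark/bin/spark-submit \
    --master nomad \
    $deploy_args \
    --docker-image hashicorp/spark-nomad \
    --distribution local:///opt/spark \
    --driver-memory 512m \
    --executor-memory 512m \
    --class org.apache.spark.examples.SparkPi \
    local:/opt/spark/examples/jars/spark-examples_2.11-2.1.1.jar \
    10

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

For example, `DRY_RUN=1 run_spark_pi cluster` prints the cluster-mode command without contacting Docker or Nomad, and `run_spark_pi client` runs the client-mode sample.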

Docker Pull Command

docker pull hashicorp/spark-nomad