Last pushed: a year ago
Short Description
Run Spark job server in a docker container on Marathon
Full Description

Spark JobServer

This Docker container is meant to run as a Marathon app on Mesos 0.24.1 with Spark 1.5.2.

Many thanks to danisla

Build your own Docker Container

The Dockerfile and container image are built by running sbt assembly docker in the Spark Job Server project.

Update the following files for your settings:

  • spark-jobserver/config/docker.sh
  • spark-jobserver/config/docker.conf
  • spark-jobserver/project/Dependencies.scala

Run sbt assembly docker.

Optionally, push the image to a Docker repository.
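
The build and push steps above can be sketched as a small script. This is only a sketch: the image_name and build_and_push helpers are hypothetical, the tag pattern is copied from the image name used in the Marathon config in this README, and it assumes the sbt build already tags the image with that name (check docker images if yours differs). Run it from the spark-jobserver project root.

```shell
#!/bin/sh
# Compose an image name in the same pattern as the Marathon config's "image"
# value: <owner>/spark-jobserver:<sjs-version>.mesos-<mesos>.spark-<spark>.
# (Hypothetical helper; not part of spark-jobserver.)
image_name() {
    printf '%s/spark-jobserver:%s.mesos-%s.spark-%s\n' "$1" "$2" "$3" "$4"
}

# Build the assembly and Docker image, then push the image.
# Assumption: the sbt build has already tagged the image with the composed
# name, so "docker push" can find it locally.
build_and_push() {
    image=$(image_name "$@")
    sbt assembly docker && docker push "$image"
}

# Example (not run here):
#   build_and_push eodgooch 0.6.2-SNAPSHOT 0.24.1 1.5.2
```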

Marathon config file

{
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "eodgooch/spark-jobserver:0.6.2-SNAPSHOT.mesos-0.24.1.spark-1.5.2"
        }
    },
    "ports": [
        0
    ],
    "id": "spark-job-server",
    "instances": 1,
    "cpus": 1,
    "mem": 1024,
    "uris": [],
    "cmd": "sed -i 's/port = 8090/port = ${?SPARK_JOBSERVER_PORT}/g' /app/docker.conf ; export SPARK_JOBSERVER_PORT=$PORT0 ; /app/server_start.sh --conf spark.master=${SPARK_MASTER} --conf spark.mesos.coarse=true --conf spark.mesos.executor.docker.image=${SPARK_MESOS_EXECUTOR_DOCKER_IMAGE} --conf spark.mesos.executor.home=/opt/spark",
    "env": {
        "SPARK_MASTER": "mesos://zk://master.mesos:2181",
        "SPARK_HOME": "/spark",
        "SPARK_MESOS_EXECUTOR_DOCKER_IMAGE": "eodgooch/spark:1.5.2-hdfs"
    },
    "upgradeStrategy": {
        "minimumHealthCapacity": 0.0
    },
    "backoffSeconds": 15,
    "backoffFactor": 1.15,
    "healthChecks": [{
        "protocol": "HTTP",
        "portIndex": 0,
        "path": "/",
        "gracePeriodSeconds": 5,
        "intervalSeconds": 20,
        "maxConsecutiveFailures": 3
    }]
}
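
The cmd in the config above rewrites the fixed port in /app/docker.conf so that the job server listens on the port Marathon assigns ($PORT0, exported as SPARK_JOBSERVER_PORT). Here is a minimal sketch of that substitution, run against a throwaway file. The three-line sample config is an assumption for illustration, not the real docker.conf, and sed is used without -i so no file is modified in place.

```shell
#!/bin/sh
# Hypothetical stand-in for /app/docker.conf, written to a temporary file.
conf_file=$(mktemp)
printf 'spark {\n  jobserver.port = 8090\n}\n' > "$conf_file"

# Same sed expression as in the Marathon cmd: replace the fixed port with
# ${?SPARK_JOBSERVER_PORT}, an optional environment substitution in the
# config file syntax. Single quotes keep the shell from expanding it.
rewritten=$(sed 's/port = 8090/port = ${?SPARK_JOBSERVER_PORT}/g' "$conf_file")

echo "$rewritten"
rm -f "$conf_file"
```

Because the substitution is resolved when the job server reads its config, SPARK_JOBSERVER_PORT must be exported before /app/server_start.sh runs, which is the order the cmd uses.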

Testing Spark

Upload a Spark binary package to HDFS, S3, or another location that your cluster can reach by URL. I use HDFS and set spark.master.executor.uri=hdfs://hdfs/spark-1.5.2-bin-hadoop2.4.tgz in the Spark JobServer docker.conf file.

  • Update spark-job-tests/src/spark/jobserver/WordCountExample.scala to use the Mesos master URL instead of local[4].
  • Run sbt assembly to build the test jar file.
  • From the spark-jobserver directory, upload the jar to Spark JobServer (SJS): curl --data-binary @job-server-tests/target/scala-2.10/job-server-tests_2.10-0.6.2-SNAPSHOT.jar <spark-jobserver:port>/jars/test
  • Once the jar is loaded on SJS, submit the job: curl -d "input.string = a b c a b see" '<spark-jobserver:port>/jobs?appName=test&classPath=spark.jobserver.WordCountExample'
  • Check the job status asynchronously: curl http://<spark-jobserver:port>/jobs/<job id>
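
The upload, submit, and status steps can be wrapped in a few shell functions so the host and port are written only once. This is a sketch, not part of spark-jobserver: SJS_URL and all function names are hypothetical, and the localhost default is an assumption (8090 is the port that docker.conf uses before the Marathon cmd rewrites it).

```shell
#!/bin/sh
# Base URL of the job server, i.e. http://<spark-jobserver:port>.
# Hypothetical variable; override it for your cluster.
SJS_URL="${SJS_URL:-http://localhost:8090}"

# URL builders for the three endpoints used in the steps above.
jar_url()    { printf '%s/jars/%s\n' "$SJS_URL" "$1"; }
submit_url() { printf '%s/jobs?appName=%s&classPath=%s\n' "$SJS_URL" "$1" "$2"; }
status_url() { printf '%s/jobs/%s\n' "$SJS_URL" "$1"; }

# upload_jar <app name> <path to jar>
upload_jar() { curl --data-binary @"$2" "$(jar_url "$1")"; }

# submit_job <app name> <class path> <job config string>
submit_job() { curl -d "$3" "$(submit_url "$1" "$2")"; }

# job_status <job id>
job_status() { curl "$(status_url "$1")"; }

# Example (not run here):
#   upload_jar test job-server-tests/target/scala-2.10/job-server-tests_2.10-0.6.2-SNAPSHOT.jar
#   submit_job test spark.jobserver.WordCountExample "input.string = a b c a b see"
#   job_status <job id>
```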

Owner
eodgooch