Short Description
PredictionIO with recent Scala (2.11.11) and Spark (2.1.1) bundled with MySQL JDBC driver
Full Description

Dockerized PredictionIO containing the following:

  • PredictionIO 0.11.0
  • Scala 2.11.11
  • Spark 2.1.1
  • MySQL JDBC driver 5.1.42

The docker image also includes curl and jq to make subsequent scripting a bit easier.

PredictionIO has been preconfigured to use MySQL, but you'll still need to point it to an actual MySQL instance using the following environment variables:


These can be set on the command line when using docker run, either individually with -e NAME=VALUE or together from a file with --env-file.
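As a rough sketch of the env-file approach, the snippet below writes a small file and prints the matching docker run command for review instead of executing it. The variable names EXAMPLE_NAME_ONE and EXAMPLE_NAME_TWO and the file name pio.env are placeholders invented for illustration; they are not the variables this image expects, so substitute the real names and values.

```shell
# Placeholder names only: replace these with the variables the image actually expects.
cat > pio.env <<'EOF'
# Lines starting with '#' are treated as comments by docker run --env-file.
EXAMPLE_NAME_ONE=first-value
EXAMPLE_NAME_TWO=second-value
EOF

# Build the command as a string and print it for review rather than running it.
RUN_CMD="docker run -p 7070:7070 --env-file pio.env geoffwa/predictionio-mysql pio eventserver --port 7070"
echo "$RUN_CMD"
```

Keeping credentials in a file rather than in -e flags keeps them out of shell history. Values in an env-file are taken literally, so they should not be wrapped in quotes.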

You can run the PredictionIO event server straight off this docker image using:

$ docker run -p 7070:7070 -e ...  geoffwa/predictionio-mysql pio eventserver --port 7070

For deploying to Cloud Foundry, ensure the required environment variables are present, then run:

$ cf push eventserver --docker-image geoffwa/predictionio-mysql -c 'pio eventserver --port 8080'

For building a PredictionIO engine to train and deploy, you can create a Dockerfile that looks something like:

FROM geoffwa/predictionio-mysql
RUN mkdir -p /engine /engine/lib /engine/project
COPY build.sbt engine.json template.json /engine/
COPY src/ /engine/src/
COPY project/assembly.sbt project/ /engine/project/
RUN cd /engine \
  && pio build --clean --uber-jar --sbt-extra -211 \
  && rm -rf project/project project/target target/streams target/resolution-cache target/scala-*/classes

You can then subsequently run pio train and pio deploy using the built engine. See geoffwa/pio-mysql-similar-engine for an example using PredictionIO's Similar Product engine template.

Be aware that, because you're running in a container, pio train should only save model data to the configured persistent store. Some of the available engine templates persist training information to local disk, which does not survive once the container is removed.
