Dockerized PredictionIO containing the following:
- PredictionIO 0.11.0
- Scala 2.11.11
- Spark 2.1.1
- MySQL JDBC driver 5.1.42
The docker image also includes jq to make subsequent scripting a bit easier.
PredictionIO has been preconfigured to use MySQL, but you'll still need to point it to an actual MySQL instance using the following environment variables:
These can be set on the command line when using docker run with -e NAME=VALUE, or supplied through an env-file (--env-file).
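As a minimal sketch of the env-file route: the helper below checks that a file only contains comments, blank lines, and NAME=VALUE assignments before it is handed to docker run. The function name check_env_file and the file name pio.env are hypothetical, and the check is deliberately stricter than docker itself, which also accepts a bare NAME that is read from the host environment.

```shell
# Succeeds only if every non-blank, non-comment line looks like NAME=VALUE.
# Stricter than docker, which also accepts a bare NAME (taken from the host environment).
check_env_file() {
  file="$1"
  # First grep drops blank lines and comments; second keeps lines that are NOT assignments.
  bad=$(grep -v -E '^[[:space:]]*(#|$)' "$file" | grep -v -E '^[A-Za-z_][A-Za-z0-9_]*=' || true)
  if [ -n "$bad" ]; then
    printf 'malformed env-file lines:\n%s\n' "$bad" >&2
    return 1
  fi
}

# Usage (pio.env is a hypothetical file holding the MySQL variables):
#   check_env_file pio.env && \
#     docker run --env-file pio.env -p 7070:7070 geoffwa/predictionio-mysql pio eventserver --port 7070
```

Note that docker does not strip quotes from env-file values, so write NAME=value rather than NAME="value".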
You can run the PredictionIO event server straight off this docker image using:
$ docker run -p 7070:7070 -e ... geoffwa/predictionio-mysql pio eventserver --port 7070
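Once the container is up, a quick liveness check can be scripted with jq. The sketch below assumes curl and jq are installed on the machine running the check, that the event server is published on localhost:7070 as in the command above, and that it answers GET / with a small JSON document containing a status field. The function name event_server_status is hypothetical.

```shell
# Read the event server's JSON reply on stdin and print its "status" field,
# or "unknown" if the field is missing.
event_server_status() {
  jq -r '.status // "unknown"'
}

# Usage:
#   curl -s http://localhost:7070/ | event_server_status
```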
For deploying to Cloud Foundry, ensure the required environment variables are present, then run:
$ cf push eventserver --docker-image geoffwa/predictionio-mysql -c 'pio eventserver --port 8080'
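One way to get the environment variables in place is to push without starting, set them with cf set-env, and then start the app. The sketch below turns a NAME=VALUE file into the matching cf set-env commands so they can be reviewed before running. The function name cf_set_env_commands and the file name pio.env are hypothetical, and values containing single quotes are not handled.

```shell
# Print one `cf set-env` command per NAME=VALUE line of an env-file.
# Blank lines, comments, and bare names without "=" are skipped.
cf_set_env_commands() {
  app="$1"
  file="$2"
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      ''|'#'*) continue ;;   # blank line or comment
      *=*) ;;                # NAME=VALUE, handled below
      *) continue ;;         # bare name, nothing to set
    esac
    name=${line%%=*}         # text before the first "="
    value=${line#*=}         # text after the first "="
    printf "cf set-env %s %s '%s'\n" "$app" "$name" "$value"
  done < "$file"
}

# Usage:
#   cf push eventserver --docker-image geoffwa/predictionio-mysql -c 'pio eventserver --port 8080' --no-start
#   cf_set_env_commands eventserver pio.env | sh
#   cf start eventserver
```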
For building a PredictionIO engine to train and deploy, you can create a Dockerfile that looks something like:
FROM geoffwa/predictionio-mysql

RUN mkdir -p /engine /engine/lib /engine/project

COPY build.sbt engine.json template.json /engine/
COPY src/ /engine/src/
COPY project/assembly.sbt project/build.properties /engine/project/

RUN cd /engine \
 && pio build --clean --uber-jar --sbt-extra -211 \
 && rm -rf project/project project/target target/streams target/resolution-cache target/scala-*/classes
You can subsequently run pio train and pio deploy using the built engine. See geoffwa/pio-mysql-similar-engine for an example using PredictionIO's Similar Product engine template.
Be aware that because you're running in a container, pio train should only save model data to the configured persistent store. Some of the engine templates available from http://predictionio.incubator.apache.org/gallery/template-gallery/ persist training information to local disk, which a later container will not be able to see.
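A rough sketch of training and deploying from an engine image, assuming it was built from a Dockerfile like the one above (for example with docker build -t my-engine .). The image tag my-engine, the env-file pio.env, and the two function names are hypothetical. Each docker run starts from a fresh filesystem, so the deploy step only works if training stored the model in MySQL.

```shell
# Hypothetical defaults: override ENGINE_IMAGE and ENV_FILE to match your setup.
ENGINE_IMAGE="${ENGINE_IMAGE:-my-engine}"
ENV_FILE="${ENV_FILE:-pio.env}"

# Train in a throwaway container; the model must end up in the persistent store.
train_engine() {
  docker run --rm --env-file "$ENV_FILE" "$ENGINE_IMAGE" \
    sh -c 'cd /engine && pio train'
}

# Serve the trained engine on the given port (defaults to 8000).
deploy_engine() {
  port="${1:-8000}"
  docker run --rm --env-file "$ENV_FILE" -p "$port:$port" "$ENGINE_IMAGE" \
    sh -c "cd /engine && pio deploy --port $port"
}

# Usage:
#   train_engine && deploy_engine 8000
```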