
Last pushed: 2 years ago
Short Description
Spark for Data Science
Full Description

A Spark and Zeppelin Docker image based on rocker/r-base:latest.

This image contains:

  • Spark 2.0.0, Hadoop 2.7 and Zeppelin 0.7.0
  • R 3.3.1
  • PySpark support with Python 3.5, NumPy, PandaSQL, SciPy and scikit-learn
  • Some interpreters out of the box: spark, shell, angular, markdown, postgresql, jdbc, python, base, elastic search, simple usage. If your favorite interpreter isn't included, consider adding it with the API.
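
Before adding an interpreter, it can help to see which ones are already configured. The following is a minimal sketch, assuming Zeppelin's REST API is served under /api/interpreter/setting on the same port 8080 that the container publishes, and that curl is installed. The function names and the ZEPPELIN_HOST variable are hypothetical, not part of this image.

```shell
#!/bin/sh
# Hypothetical helper: build the URL of the interpreter-settings endpoint.
# Assumption: the REST API lives under /api on port 8080.
interpreter_settings_url() {
  printf 'http://%s:8080/api/interpreter/setting\n' "$1"
}

# Hypothetical helper: fetch the configured interpreter settings as JSON.
# Requires curl and a running container; defaults to localhost.
list_interpreters() {
  curl -s "$(interpreter_settings_url "${ZEPPELIN_HOST:-localhost}")"
}
```

Once the container is running, calling list_interpreters should print a JSON document describing the interpreter settings, if the endpoint is available in this Zeppelin version.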

To start Zeppelin, pull the latest image and run the container:

docker pull josepcurto/sparkzeppelin

docker run --rm --name zeppelin -p 8080:8080 josepcurto/sparkds

Zeppelin will be running at http://${YOUR_DOCKER_HOST}:8080.
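
The value of ${YOUR_DOCKER_HOST} depends on how Docker is set up. The sketch below assumes that a remote or docker-machine host is exposed through the DOCKER_HOST variable in the form tcp://HOST:PORT, and that Docker is local otherwise. The function names zeppelin_url and wait_for_zeppelin are hypothetical helpers, not part of the image.

```shell
#!/bin/sh
# Print the Zeppelin URL for the current Docker host.
# Assumption: DOCKER_HOST, when set to tcp://HOST:PORT, names the machine
# that publishes port 8080; any other value means Docker runs locally.
zeppelin_url() {
  case "${DOCKER_HOST:-}" in
    tcp://*)
      host="${DOCKER_HOST#tcp://}"   # strip the scheme
      host="${host%%:*}"             # strip the port
      ;;
    *)
      host="localhost"
      ;;
  esac
  printf 'http://%s:8080\n' "$host"
}

# Poll the URL until Zeppelin answers or the attempts run out.
# The first argument is the number of attempts (default 30, 2 s apart).
wait_for_zeppelin() {
  url="$(zeppelin_url)"
  attempts="${1:-30}"
  while [ "$attempts" -gt 0 ]; do
    if curl -s -o /dev/null "$url"; then
      echo "Zeppelin is up at $url"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 2
  done
  echo "Zeppelin did not respond at $url" >&2
  return 1
}
```

Running wait_for_zeppelin after docker run gives the container time to start before the notebook is opened in a browser.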
