
Last pushed: 2 years ago
Short Description
Spark Notebook Demo container
Full Description

Demo-ready container

It bundles tools such as Spark, Cassandra, Kafka and, of course, the spark-notebook.

Problems? Check the troubleshooting section at the bottom of this page, add a comment, or contact me on this email.


Spark 2.0.0 preview

You can now use the latest Spark version, 2.0.0-preview at the time of writing, within minutes and with no pain or constraints.

To do so, pull the image, run it, and start the services (see below).


docker pull andypetrella/spark-notebook-demo:master-2.0.0-preview


docker run --rm -it --net=host -m 8g andypetrella/spark-notebook-demo:master-2.0.0-preview bash


The following 3 scripts set up all the tools (incl. Cassandra and Kafka) and configure them with predefined content (such as keyspaces or topics).



Open port 9000 on your Docker VM's IP (or on localhost on a Linux box) and create a new notebook in the interface to start using Spark 2.0.0.

On Linux: http://localhost:9000. On Mac: http://<VM-IP>:9000
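To make the two URLs above concrete, here is a small sketch that builds the address and waits for the notebook server to answer. Only port 9000 comes from this page; the function names are made up for illustration, curl is assumed to be installed, and the Mac example assumes Docker Machine with a machine named "default".

```shell
#!/bin/sh
# Hypothetical helpers; only port 9000 comes from the text above.

# Print the spark-notebook URL for a host (defaults to localhost, the Linux case).
notebook_url() {
    printf 'http://%s:9000\n' "${1:-localhost}"
}

# Poll the URL until the server answers, giving up after 30 attempts.
# Assumes curl is installed on the machine running the check.
wait_for_notebook() {
    url=$(notebook_url "$1")
    tries=0
    until curl -s -o /dev/null "$url"; do
        tries=$((tries + 1))
        [ "$tries" -ge 30 ] && return 1
        sleep 2
    done
    echo "spark-notebook is answering at $url"
}

# Linux:
#   wait_for_notebook
# Mac, assuming Docker Machine with a machine named "default":
#   wait_for_notebook "$(docker-machine ip default)"
```

The services take a little while to start, so polling is usually more convenient than refreshing the browser by hand.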


Resources (VM)

The container runs many services, so make sure your VM (on Mac/Win) has enough memory and CPUs allocated to it.
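A rough way to check this from inside the container is sketched below. The 8 GiB threshold simply mirrors the -m 8g flag used in the run command; treating it as the minimum is an assumption, and the function names are hypothetical. Inside a container, /proc/meminfo generally reflects the VM (or the Linux host) rather than the -m limit, which is what we want to look at here.

```shell
#!/bin/sh
# Rough resource check, meant to be run inside the container.
# The 8 GiB threshold mirrors -m 8g; using it as a minimum is an assumption.

REQUIRED_KB=$((8 * 1024 * 1024))   # 8 GiB expressed in kB

# Succeed if the given amount of memory (in kB) meets the threshold.
has_enough_memory() {
    [ "$1" -ge "$REQUIRED_KB" ]
}

# Read the total memory in kB from /proc/meminfo (Linux only).
total_memory_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

if [ -r /proc/meminfo ]; then
    if has_enough_memory "$(total_memory_kb)"; then
        echo "memory looks sufficient"
    else
        echo "less than 8 GiB visible; consider giving the VM more memory"
    fi
    # nproc may be missing on minimal images, so only call it if present.
    if command -v nproc >/dev/null 2>&1; then
        echo "CPUs visible: $(nproc)"
    fi
fi
```

MemTotal is usually a little below the configured size, so a VM set to exactly 8 GB may still trigger the warning. Take the result as a hint only.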


On Mac/Win, use the VM's IP instead of localhost to access the services (such as the spark-notebook on port 9000).

Out of resources

Even if you allocated enough resources to your VM, opening too many notebooks may cause performance issues, since each notebook starts a new JVM (with a Spark driver in it). So make sure to shut down unused notebooks (see the running tab in the main UI).
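To see how many JVMs are currently alive in the container, something like the sketch below may help. It assumes every JVM shows up with the command name "java" and that ps (from procps) is available in the image; the function name is hypothetical. Cassandra and Kafka also run on the JVM, so the count includes them. What matters is how the number changes as you open and shut down notebooks.

```shell
#!/bin/sh
# Count JVM processes from a list of process names read on stdin, one per line.
# Assumes each JVM appears with the command name "java".
count_jvms() {
    # grep -c prints the count; "|| true" keeps a zero count from being an error.
    grep -cx 'java' || true
}

# Inside the container (assumes ps from procps is installed):
#   ps -eo comm | count_jvms
```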
