Demo ready container
Built with tools like Spark, Cassandra, Kafka and, of course, the spark-notebook.
Problems? Check the troubleshooting section at the bottom of this page, add a comment, or contact me by email.
Spark 2.0.0 preview
You can now use the latest Spark version, such as 2.0.0-preview (at the time of writing), painlessly, without constraints, and within minutes.
To do so, pull the image, run it, and start the services:
docker pull andypetrella/spark-notebook-demo:master-2.0.0-preview
docker run --rm -it --net=host -m 8g andypetrella/spark-notebook-demo:master-2.0.0-preview bash
Inside the container, the following three scripts set up all the tools (including Cassandra and Kafka) and configure them with predefined content (such as keyspaces or topics).
source var.sh
source start.sh
source create.sh
Open port 9000 on your Docker VM's IP (or on localhost on a Linux box), then create a new notebook in the interface to start using Spark 2.0.0.
The container runs many things, so make sure your VM (on Mac/Win) has enough memory and CPUs allocated to it.
On Mac/Win, you'll need to use the VM's IP instead of localhost to access the services (such as the spark-notebook on port 9000).
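As a rough sketch of the localhost-versus-VM-IP distinction, the helper below builds the notebook URL for a given host. The function name `notebook_url` is hypothetical and not part of the image. The commented-out call assumes Docker Machine with a machine named `default`, so adapt it to however your VM is managed.

```shell
# Hypothetical helper: print the spark-notebook URL for a host.
# Port 9000 is the notebook port mentioned above; the host defaults to localhost.
notebook_url() {
  host="${1:-localhost}"
  printf 'http://%s:9000\n' "$host"
}

# Linux box: the container shares the host network (--net=host).
notebook_url localhost

# Mac/Win with Docker Machine (assumption: the machine is named "default"):
#   notebook_url "$(docker-machine ip default)"
```

Open the printed URL in a browser to reach the spark-notebook interface.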
Out of resources
Even if you have allocated enough resources to your VM, opening too many notebooks may cause performance issues, since each notebook starts a new JVM (with a Spark driver in it). So make sure to shut down unused notebooks (see the running tab in the main UI).