JupyterHub image integrated with Spark

Steps to set it up:
1 - Create and start the container:

docker run -d -p 8000:8000 --name jupyterhub hselvaggi/jupyterhub_spark

This lets you access JupyterHub at localhost:8000. A default user (guest / guest) already exists, so you can skip step 2 unless you want to set up additional users.
2 - To create a user that can log in to JupyterHub, run the following two commands:

docker exec -it jupyterhub /bin/bash
adduser <username to login in jupyter>

After filling in all the data requested by adduser, you can leave the container by running exit.
3 - There is some work in progress; in the meantime you need to add the following lines to your notebook to properly instantiate the Spark context:
from pyspark import SparkContext, SparkConf
conf = SparkConf().setAppName('Your application name').setMaster('local[*]')
sc = SparkContext.getOrCreate(conf)
Python 2, Python 3 and R kernels are available.
Notes: The latest version removes the need to set the SPARK_HOME path via Python's os.environ. The template for new users has been fixed, so creating new users requires no configuration at all from the administrator.
Support for reading data from a Cassandra database has been added.
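Reading from Cassandra typically goes through the spark-cassandra-connector's DataFrame API. The sketch below is an assumption about how you would use it from this image: it presumes the connector is already on the classpath, and the host name, keyspace (my_keyspace) and table (my_table) are hypothetical placeholders you must replace with your own.

```python
from pyspark.sql import SparkSession

# 'cassandra-host', 'my_keyspace' and 'my_table' are placeholders --
# replace them with your cluster's actual host, keyspace and table.
spark = (SparkSession.builder
         .appName('Cassandra read example')
         .config('spark.cassandra.connection.host', 'cassandra-host')
         .getOrCreate())

df = (spark.read
      .format('org.apache.spark.sql.cassandra')
      .options(keyspace='my_keyspace', table='my_table')
      .load())

df.show()
```

This requires a reachable Cassandra cluster, so it cannot be run against the container alone.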