A debian:jessie-based Spark and Zeppelin Docker container.
This image is large and opinionated. It contains:
- Spark 2.1.0 and Hadoop 2.7.3
- PySpark support with Python 3.4, NumPy, PandaSQL, and SciPy, but no matplotlib.
- A partial list of interpreters out of the box. If your favorite interpreter isn't included, consider adding it with the API.
A prior build of dylanmei/zeppelin:latest contained Spark 1.6.0, Python 2.7, and all of the stock interpreters. That image is still available as
To start Zeppelin, pull the latest image and run the container:
docker pull dylanmei/zeppelin
docker run --rm -p 8080:8080 dylanmei/zeppelin
Zeppelin will be running at
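If you script the startup, a small readiness check can poll the web UI until it responds. This is only a sketch and not part of the image: the wait_for_zeppelin helper is hypothetical, and the http://localhost:8080 default assumes the -p 8080:8080 mapping shown above with Docker reachable on localhost, which may not hold on setups that run Docker inside a VM.

```shell
#!/bin/bash
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Usage: wait_for_zeppelin [url] [attempts] [delay_seconds]
# The default URL is an assumption based on the 8080:8080 port mapping.
wait_for_zeppelin() {
  local url="${1:-http://localhost:8080}"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1

  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failures, -s: silent, -o: discard the body
    if curl -fs -o /dev/null "$url"; then
      echo "Zeppelin answered at $url after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done

  echo "Zeppelin did not answer at $url after $attempts attempt(s)" >&2
  return 1
}
```

Calling `wait_for_zeppelin` with no arguments tries the assumed default URL up to 30 times, two seconds apart. Pass your Docker host's address as the first argument if the container is not reachable on localhost.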
You can use docker-compose to easily run Zeppelin in more complex configurations. See this project's ./examples directory for examples of using Zeppelin with docker-compose to:
- read and write from local data files
- read and write documents in ElasticSearch
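As a rough sketch of the first case (local data files), the script below writes a minimal docker-compose.yml. The compose file format version, the ./data host directory, and the /usr/zeppelin/data mount point inside the container are assumptions made for illustration. They are not taken from this project's ./examples directory, so check the examples for the actual layout.

```shell
#!/bin/bash
# Sketch only: write a minimal compose file that runs Zeppelin with a local
# directory mounted into the container.
# Assumptions (not from this project): compose file format "2", the ./data
# host directory, and the /usr/zeppelin/data path inside the container.

# Host directory that will hold the data files shared with the container.
mkdir -p ./data

# The quoted delimiter keeps the heredoc content literal.
cat > ./docker-compose.yml <<'COMPOSE'
version: "2"
services:
  zeppelin:
    image: dylanmei/zeppelin
    ports:
      - "8080:8080"
    volumes:
      - ./data:/usr/zeppelin/data
COMPOSE

echo "Wrote ./docker-compose.yml"
```

With that file in place, running `docker-compose up` in the same directory should start the container with the same port mapping as the plain `docker run` command.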
The onbuild container is still a part of this project, but I have no plans to keep it updated. See the onbuild directory to view its
To use it, create a new Dockerfile based on dylanmei/zeppelin:onbuild and supply a new, executable install.sh file in the same directory. It will override the base one via Docker's ONBUILD instruction.
The steps, expressed here as a script, can be as simple as:
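#!/bin/bash
# Write a Dockerfile that builds on the onbuild image.
cat > ./Dockerfile <<DOCKERFILE
FROM dylanmei/zeppelin:onbuild

ENV ZEPPELIN_MEM="-Xmx1024m"
DOCKERFILE

# Write the install.sh that replaces the base image's build script.
cat > ./install.sh <<INSTALL
git pull
mvn clean package -DskipTests \
  -Pspark-1.5 \
  -Dspark.version=1.5.2 \
  -Phadoop-2.2 \
  -Dhadoop.version=2.0.0-cdh4.2.0 \
  -Pyarn
INSTALL
# install.sh must be executable, as noted above.
chmod +x ./install.sh

docker build -t my_zeppelin .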
Hi, does this build connect to a Hive database? I am trying to create a new Hive context, and I get the following error:
error: object hive is not a member of package org.apache.spark.sql
I am using the following code:
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
I am running on Kitematic (OS X), yet when I go to localhost:8080, I get a "Site can't be reached" error. Any ideas?
First of all, great job! You really made my life easier with this container.
However, I need to use the Cassandra interpreter in Zeppelin. I have tried using the dynamic loading API, but with no success.
I am posting at 127.0.0.1:8080/api/interpreter/load/cassandra/cassandra
the following body
but the server returns a 500 Server Error.
I think it might be because the credentials are missing, but I do not know them. Can you help me?
Hi, thanks for the great work! Any plans to release a 0.6.1 version?