Docker image with Zeppelin, Spark, and the Python libraries needed for data science

Docker image of the Zeppelin notebook

Author: Anderson Santos

DockerHub repository: https://hub.docker.com/r/supergarotinho/zeppelin/

Main features

  • Spark - 2.1.1
  • Zeppelin - 0.7.1, with interpreters:
    • spark
    • shell
    • angular
    • markdown
    • postgresql
    • jdbc
    • python
    • hbase
    • elasticsearch
  • Python libs:
    • Python 3.5
    • Data
      • NumPy
      • pandas
      • PandaSQL
    • ML and Math
      • scikit-learn (sklearn)
      • SciPy
    • Visualization
      • matplotlib
      • seaborn
      • folium (GeoVisualization)
      • wordcloud
    • Util
      • ijson
      • datetime
      • tweepy
    • NLP
      • nltk
        • punkt - sentence segmentation
        • stopwords
        • rslp - Portuguese stemmer (RSLP) by Viviane Orengo
        • floresta - Floresta Sintática corpus for PT_BR
      • gensim (Topic and language modelling)
    • Graphs
      • networkx
      • igraph

How to use it with Docker

docker run --rm -d -p 8080:8080 -v "$PWD":/notebook -e ZEPPELIN_NOTEBOOK_DIR='/notebook' supergarotinho/zeppelin
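
The command above mounts the current directory at /notebook and publishes container port 8080 on the same host port. If a different notebook directory or host port is needed, something like the sketch below may help. The function name `zeppelin_run_cmd` and its two positional arguments are hypothetical (they are not part of the image); it only prints the command, so it can be reviewed before anything is started.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the image): prints a docker run
# command for supergarotinho/zeppelin with a configurable notebook
# directory and host port. The flags mirror the command shown above.
zeppelin_run_cmd() {
  notebook_dir="${1:-$PWD}"   # host directory to mount at /notebook
  host_port="${2:-8080}"      # host port mapped to container port 8080
  printf 'docker run --rm -d -p %s:8080 -v "%s":/notebook -e ZEPPELIN_NOTEBOOK_DIR=/notebook supergarotinho/zeppelin\n' \
    "$host_port" "$notebook_dir"
}

# Print the command for a notebooks folder in the home directory on port 9090.
zeppelin_run_cmd "$HOME/notebooks" 9090
```

Piping the printed line to `sh` would start the container; with the arguments above, the notebook UI should then be reachable at http://localhost:9090, assuming the port is free.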