alexmerced/spark35nb
Quick Access to Spark and a Notebook for Practice and Small Jobs
alexmerced/spark35nb Docker Image Documentation

The alexmerced/spark35nb Docker image provides an environment for data engineers and data scientists to work with Apache Spark 3.5.2, Python 3.10, and JupyterLab. The image includes a broad set of popular Python libraries for data processing, machine learning, and visualization. It runs Apache Spark in single-node mode alongside a JupyterLab server, which makes it well suited to development, testing, and educational use.
Pre-installed libraries: pandas, numpy, scikit-learn, tensorflow, pyspark, pyarrow, ibis-framework, dask, and more.

This image comes with a wide range of pre-installed Python libraries, including but not limited to:
pandas, numpy, dask, polars, daft, datafusion
scikit-learn, tensorflow, torch, xgboost, lightgbm
matplotlib, seaborn, plotly
pyspark, pyarrow, ibis-framework, duckdb, sqlframe, pyiceberg
requests, beautifulsoup4, lxml, boto3, s3fs, minio
sqlalchemy, psycopg2-binary, dremio-simple-query
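
Once a notebook is open, it can be useful to confirm that the libraries listed above actually resolve in the running kernel. The following is a minimal sketch using only the standard library. The `IMPORT_NAMES` mapping and the `check_packages` helper are illustrative names, not part of the image, and only a subset of the listed packages is covered. Several packages are imported under a different name than they are installed under (for example, scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Package name as listed above -> module name used in an import statement.
# This mapping is an illustrative subset, not an inventory shipped with the image.
IMPORT_NAMES = {
    "pandas": "pandas",
    "numpy": "numpy",
    "polars": "polars",
    "duckdb": "duckdb",
    "pyspark": "pyspark",
    "pyarrow": "pyarrow",
    "scikit-learn": "sklearn",
    "ibis-framework": "ibis",
    "beautifulsoup4": "bs4",
    "psycopg2-binary": "psycopg2",
}


def check_packages(import_names):
    """Return {package: bool} indicating whether each module can be located."""
    return {pkg: find_spec(mod) is not None for pkg, mod in import_names.items()}


if __name__ == "__main__":
    for pkg, found in sorted(check_packages(IMPORT_NAMES).items()):
        print(f"{pkg:20s} {'ok' if found else 'MISSING'}")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as tensorflow.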
To pull the image from Docker Hub:
docker pull alexmerced/spark35nb
To run a container with the JupyterLab and Spark ports published:
docker run -p 8888:8888 -p 4040:4040 -p 7077:7077 -p 8080:8080 -p 18080:18080 -p 6066:6066 -p 7078:7078 -p 8081:8081 alexmerced/spark35nb
Then open localhost:8888 in a browser to access JupyterLab.
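
If the JupyterLab page does not load, a quick check from the host can show which of the published ports are accepting connections. This is a sketch under assumptions: `port_is_open` and `PUBLISHED_PORTS` are hypothetical names introduced here, and the port list is copied from the `docker run` command. A closed port does not necessarily indicate a problem, since the corresponding service may not be running yet (the Spark application UI on 4040, for instance, is only served while a Spark application is active).

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Host ports published by the docker run command.
PUBLISHED_PORTS = [8888, 4040, 7077, 8080, 18080, 6066, 7078, 8081]

if __name__ == "__main__":
    for port in PUBLISHED_PORTS:
        state = "open" if port_is_open("localhost", port) else "closed"
        print(f"localhost:{port} {state}")
```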