alexmerced/datanotebook
An image for quickly setting up a Python notebook for data engineering
Link to Dockerfile and Documentation
This Docker image provides a Python environment with a wide range of data science libraries pre-installed. It is designed for easy access via a Jupyter Notebook in your web browser. The image is built from a minimal Python base and includes libraries for data manipulation, machine learning, and database connectivity.
Base image: python:3.9-slim
Data manipulation: pandas, numpy, polars, dask, ibis, pyiceberg, datafusion, sqlframe
Machine learning: scikit-learn, tensorflow, torch, xgboost, lightgbm
Visualization: matplotlib, seaborn, plotly
Database connectivity: psycopg2-binary, mysqlclient, sqlalchemy, duckdb, pyarrow, pyiceberg
Object storage: boto3, s3fs, minio
Other utilities: openpyxl, requests, beautifulsoup4, lxml, pyspark, dremio-simple-query
User: pydata, with the working directory set to /home/pydata/work
Exposed port: 8888 (Jupyter Notebook)
To build the Docker image, navigate to the directory containing the Dockerfile and run:
docker build -t python-notebook .
To run the container and start the Jupyter Notebook server, use the following command:
docker run -p 8888:8888 -v "$(pwd)":/home/pydata/work python-notebook
Once the container is running, you can access the Jupyter Notebook in your web browser by navigating to:
http://localhost:8888
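The server can take a few seconds to come up after the container starts. The following is a minimal sketch of one way to wait for it from a script on the host, using only the Python standard library. The helper name wait_for_server and its default timeout and interval are illustrative assumptions, not part of the image.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url="http://localhost:8888", timeout=30.0, interval=1.0):
    """Poll `url` until it answers over HTTP or `timeout` seconds elapse.

    Returns True as soon as any HTTP response arrives, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, just not with a 2xx status; it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out; the server is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling wait_for_server() with no arguments polls the address above for up to 30 seconds and returns True once the notebook server responds.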
Token Authentication: The Jupyter Notebook server is started with the --NotebookApp.token='' flag, which disables token authentication. This allows direct access to the notebook without requiring a login token.
User Configuration: The image uses the user pydata with the home directory set to /home/pydata. The working directory for the notebook is /home/pydata/work.
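To confirm these settings from inside a running notebook, a cell along the following lines could be used. This is a sketch only, and the helper name describe_environment is hypothetical.

```python
import getpass
import os
import sys


def describe_environment():
    """Return the current user, working directory, and Python version."""
    try:
        user = getpass.getuser()
    except (KeyError, OSError, ImportError):
        # No passwd entry or environment variable identifies the current user.
        user = "unknown"
    return {
        "user": user,
        "cwd": os.getcwd(),
        "python": ".".join(str(part) for part in sys.version_info[:3]),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

In the container described here, this would be expected to report the user pydata and the working directory /home/pydata/work.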
If you need to add more libraries or make additional configurations, you can extend this image by creating a new Dockerfile that builds on top of it. For example:
FROM python-notebook
RUN pip install additional-library
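After building an extended image, one way to check that a newly added package, or any of the pre-installed ones, can be imported is a notebook cell like the sketch below. The helper name check_modules and the EXPECTED list are assumptions for illustration. Note that import names sometimes differ from pip package names.

```python
import importlib.util


def check_modules(names):
    """Map each top-level module name to True if it can be found, else False."""
    results = {}
    for name in names:
        try:
            results[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken packages or unusual module states.
            results[name] = False
    return results


# Import names, which sometimes differ from the pip package names:
# scikit-learn -> sklearn, beautifulsoup4 -> bs4, psycopg2-binary -> psycopg2.
EXPECTED = ["pandas", "numpy", "polars", "sklearn", "bs4", "psycopg2", "duckdb", "pyarrow"]

for module, found in check_modules(EXPECTED).items():
    print(f"{module}: {'ok' if found else 'missing'}")
```

Adding the import name of the extra library to EXPECTED extends the check to the new image.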
docker pull alexmerced/datanotebook