This is a collection of Jupyter notebooks for transforming data files and running the pipelines.
First, install Docker by following the instructions in this link.
How to run this container from the command line.
Pull the image:
docker pull knowengdev/jupyter_notebooks:08_18_2017
Create a directory "user_data" (containing your data) in the directory where you will run the container, then start the container from that directory:
docker run -v `pwd`:/home/jovyan/work -it --rm -p 8888:8888 knowengdev/jupyter_notebooks:08_18_2017
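The directory and run steps above can be collected into a short script. This is a minimal sketch, not part of the original instructions: the `IMAGE` variable and the `start_notebooks` function name are illustrative, and the commented `cp` line uses a placeholder path. The `docker run` flags are copied from the command above.

```shell
# Image name and tag, taken from the pull command above.
IMAGE="knowengdev/jupyter_notebooks:08_18_2017"

# Create the data directory in the directory where the container will run.
# -p makes this a no-op if the directory already exists.
mkdir -p user_data

# Copy your own data files in (placeholder path, replace with a real one):
# cp /path/to/your_data_file user_data/

# Hypothetical helper that starts the container with the current directory
# mounted at /home/jovyan/work and port 8888 published.
start_notebooks() {
    docker run -v "$(pwd)":/home/jovyan/work -it --rm -p 8888:8888 "$IMAGE"
}
```

Calling `start_notebooks` from the directory that contains "user_data" then starts the container in the same way as the single command above.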
From the terminal window, copy the one-time connection token into a browser's address bar to run the notebooks.
Output will be saved in "user_data/results" after the container is stopped.
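Once the container has stopped, the results directory can be checked from the host. The following is a small sketch that assumes it is run from the directory that contains "user_data"; the `RESULTS_DIR` variable name is illustrative.

```shell
# Location of the output, as stated above.
RESULTS_DIR="user_data/results"

# List the results if the directory exists; otherwise say so.
if [ -d "$RESULTS_DIR" ]; then
    ls -l "$RESULTS_DIR"
else
    echo "No results directory found at $RESULTS_DIR"
fi
```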
The browser window will display the mounted Jupyter directories. Navigate to the transformation notebook:
- select: knoweng_transform
- select: run_transform
- click on the transformation notebook: Data_File_Transformations.ipynb
The browser will report that the notebook is not trusted, so after selecting "Trust" the cells must be run manually:
- in the "Cell" menu, select "Run All"
- the list boxes and action buttons will appear below the directions for each transformation
Development repositories included are:
Production repositories included are:
Example of a notebook that runs pipelines: /knoweng_dev_tools/runner_notebooks/samples_clustering_w_cleanup.ipynb