This image is for working with data sets stored in CSV files.
It contains the basic data-science toolkit needed to work with pandas DataFrames in Python.
The image itself is not specific to the hackthefed project and can be used to
work with any filesystem-based dataset. I may add database drivers etc. later,
but I don't need them right now and haven't decided which toolset I'll use.
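As a rough sketch of the kind of CSV work the image is meant for, the snippet below loads a small table into a pandas DataFrame and computes a summary value. The column names and values are made up for illustration and are not hackthefed data; an in-memory string stands in for a CSV file that would normally sit on a mounted volume.

```python
import io

import pandas as pd

# Hypothetical CSV content (illustrative only). In practice this would be
# a path to a file on disk, e.g. pd.read_csv("some_file.csv").
csv_text = (
    "date,rate\n"
    "2015-01-01,0.25\n"
    "2015-02-01,0.25\n"
    "2015-03-01,0.50\n"
)

# Parse the "date" column as datetimes while loading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Average of the numeric "rate" column.
mean_rate = df["rate"].mean()

print(df.dtypes)
print(mean_rate)
```

The same `pd.read_csv` call accepts a file path, so swapping the `io.StringIO` object for a real path is the only change needed for on-disk data.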
All tools are built from their latest versions every time I re-create the image. Pandas, NumPy, SciPy, etc. are not tied to the Debian packages, so they are newer than the packaged versions. To make this possible, Fortran and its supporting libraries are installed as well.
* Python 3
* Pandas
* SciPy
* NumPy
* Pillow
* ipython[all] - This includes notebook and matplotlib
* pyYAML
* rodeo - cause it looks interesting
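Since the versions change with each rebuild, it can help to check what a given build actually contains. The following is a minimal sketch using only the standard library; the distribution names are assumed to match their PyPI names, and `installed_versions` is a hypothetical helper, not something shipped in the image.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI (an assumption about how the
# image installs them; adjust if a package is installed under another name).
PACKAGES = ["pandas", "numpy", "scipy", "Pillow", "ipython", "PyYAML"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(PACKAGES)
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```

Running this inside a container prints one line per package, which makes it easy to compare two builds of the image.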
The Dockerfile source and related files are here: https://github.com/hackthefed/python-datakit/tree/develop