Docker image for bioinformatics in Python.
Includes a number of popular libraries and dependencies for bioinformatic data
analysis in Python. See the Dockerfile for details of which
software libraries are included.
For convenience, a biipy_run.sh wrapper script for running a single command
using the biipy docker image is available from this repository.
For example, save biipy_run.sh to a local file on your host
system, then run:
$ ./biipy_run.sh v2.0.0 ipython
This will run a docker container using the biipy image and execute an IPython
interactive shell.
To run a Jupyter notebook server, omit the last argument, e.g.:
$ ./biipy_run.sh v2.0.0
You will probably want to map more directories from your host filesystem
into the container, and may want to change other settings such as the
default port mapping for the Jupyter notebook server, in which case you can
edit and customise your local copy of the biipy_run.sh script.
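As a rough illustration of the kind of customisation meant here, the sketch
below assembles a docker run command line with extra volume and port mappings.
It is not the contents of biipy_run.sh: the function name build_docker_run and
its parameters are hypothetical, the image name cggh/biipy is taken from the
docker ps output shown further down, and only the standard docker run flags
--rm, -it, -p and -v are used.

```python
import shlex


def build_docker_run(tag, command=(), volumes=(), host_port=8888,
                     container_port=8888, image="cggh/biipy"):
    """Return an argv list for `docker run` (hypothetical helper).

    Nothing is executed here; the list can be inspected, printed,
    or passed to subprocess.run().
    """
    argv = ["docker", "run", "--rm", "-it"]
    # -p HOST_PORT:CONTAINER_PORT publishes the notebook port on the host.
    argv += ["-p", f"{host_port}:{container_port}"]
    # -v HOST_DIR:CONTAINER_DIR maps a host directory into the container.
    for host_dir, container_dir in volumes:
        argv += ["-v", f"{host_dir}:{container_dir}"]
    argv.append(f"{image}:{tag}")
    # An empty command falls through to the image's default command.
    argv += list(command)
    return argv


if __name__ == "__main__":
    argv = build_docker_run(
        "v2.0.0",
        volumes=[("/data", "/data")],  # example mapping, adjust to taste
        host_port=9999,                # serve the notebook on port 9999
    )
    print(shlex.join(argv))
```

Building the command as a list rather than a single string avoids quoting
problems when a host path contains spaces.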
If you have a Jupyter notebook server already running and want to also run
other commands using the same container, find out the container name:
$ docker ps
CONTAINER ID   IMAGE               COMMAND                 CREATED          STATUS          PORTS                    NAMES
fb030ddae198   cggh/biipy:v2.0.0   "/bin/bash /biipy/no    10 seconds ago   Up 10 seconds   0.0.0.0:8888->8888/tcp   aliman_biipy_v2.0.0
...then use docker exec, e.g.:
$ docker exec -it aliman_biipy_v2.0.0 ipython
You can also use this as a quick way to install additional software into a
running container if you need to, e.g.:
$ docker exec -it --user=root aliman_biipy_v2.0.0 pip3 install somepackage
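If you want to script the container lookup rather than read the docker ps
table by eye, something like the following sketch may help. The helper names
parse_container_names and running_containers are hypothetical; the only docker
feature relied on is the --format option of docker ps with the .Image and
.Names template fields.

```python
import subprocess


def parse_container_names(ps_output, image):
    """Return names of containers whose image matches `image`.

    Expects one container per line in the form "IMAGE<TAB>NAME".
    """
    names = []
    for line in ps_output.splitlines():
        if not line.strip():
            continue
        line_image, _, name = line.partition("\t")
        if line_image == image:
            names.append(name.strip())
    return names


def running_containers(image):
    """Ask docker for running containers and filter them by image."""
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Image}}\t{{.Names}}"],
        check=True, capture_output=True, text=True,
    )
    return parse_container_names(result.stdout, image)


if __name__ == "__main__":
    # Sample text standing in for real `docker ps` output.
    sample = "cggh/biipy:v2.0.0\taliman_biipy_v2.0.0\nubuntu:16.04\tother\n"
    print(parse_container_names(sample, "cggh/biipy:v2.0.0"))
```

The returned name can then be passed to docker exec as in the examples above.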
Customising the Jupyter notebook server
By default, biipy will run a Jupyter notebook server with default settings.
You can change the Jupyter configuration by creating and editing a
configuration file. This is useful, e.g., if you want to secure the notebook
server with HTTPS and a password (highly recommended).
To generate a default configuration file, do e.g.:
$ ./biipy_run.sh v2.0.0 jupyter notebook --generate-config
Writing default config to: /home/aliman/.jupyter/jupyter_notebook_config.py
You can then edit the configuration file on the host system, assuming you
have mapped your home directory into the container. For example, here are the
lines I have uncommented and edited in mine:
$ grep '^[^#]' .jupyter/jupyter_notebook_config.py
c.NotebookApp.allow_origin = '*'
c.NotebookApp.certfile = 'mycert.pem'
c.NotebookApp.cookie_secret = b'...'
c.NotebookApp.enable_mathjax = False
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False
c.NotebookApp.password = 'sha1:...'
c.NotebookApp.port = 8888
You will want to replace the placeholder values (shown as '...') with your
own. To generate an SHA1 hash of your password, run an
IPython interactive shell:
$ ./biipy_run.sh v2.0.0 ipython
In [1]: from notebook.auth import passwd; passwd()
...and copy-paste the SHA1 string into the config file.
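For the curious, the sketch below shows how a salted hash string of the form
algorithm:salt:digest can be built and checked with the standard library. As
far as I understand it, this matches the layout of the string that passwd()
returns, but that is an assumption; the function names make_hash and
check_hash are hypothetical, and for a real configuration you should use
passwd() itself.

```python
import hashlib
import hmac
import secrets


def make_hash(passphrase, algorithm="sha1", salt_len=12):
    """Build an 'algorithm:salt:digest' string (illustrative only)."""
    # token_hex(n) returns 2*n hex characters, so this gives salt_len chars.
    salt = secrets.token_hex(salt_len // 2)
    h = hashlib.new(algorithm)
    h.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return ":".join((algorithm, salt, h.hexdigest()))


def check_hash(hashed, passphrase):
    """Return True if `passphrase` matches an 'algorithm:salt:digest' string."""
    try:
        algorithm, salt, digest = hashed.split(":", 2)
        h = hashlib.new(algorithm)
    except ValueError:
        # Wrong number of fields or an unknown hash algorithm.
        return False
    h.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(h.hexdigest(), digest)


if __name__ == "__main__":
    hashed = make_hash("correct horse")
    print(hashed.split(":")[0], check_hash(hashed, "correct horse"))
```

Because the salt is random, hashing the same password twice gives two
different strings, both of which verify against that password.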
Further instructions on setting up HTTPS and other matters relating to
securing a notebook server are available from the Jupyter docs.
If you have mapped your home directory as a volume and are running biipy
with your own UID, then Jupyter should pick up the changes you have made
to the configuration file the next time you run the notebook server.
If there are features you would like to add or other changes you'd like to
make, please feel free to raise an issue on GitHub
or create a pull request.
Minor changes to the Dockerfile that do not add, remove or alter dependencies
(e.g., change ordering) get a micro version bump, e.g., 0.1.0 -> 0.1.1.
Adding, removing or changing (e.g., upgrading) a dependency in the Dockerfile
gets a minor version bump, e.g., 0.1 -> 0.2.
Changing the base image (e.g., to a different version of Ubuntu) gets a major
version bump, e.g., 0.1 -> 1.0.
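The three rules above can be written down as a small function. This is only a
sketch of the policy as described: the name bump_version and the change labels
"minor_edit", "dependency" and "base_image" are hypothetical, and versions are
assumed to have exactly three numeric parts.

```python
def bump_version(version, change):
    """Return the next version string for a given kind of change.

    `change` is one of (hypothetical labels):
      "minor_edit"  - Dockerfile edit that leaves dependencies unchanged
      "dependency"  - a dependency was added, removed or changed
      "base_image"  - the base image was changed
    """
    # Accept an optional leading "v", as in the tag v2.0.0.
    major, minor, micro = (int(p) for p in version.lstrip("v").split("."))
    if change == "base_image":
        return f"{major + 1}.0.0"
    if change == "dependency":
        return f"{major}.{minor + 1}.0"
    if change == "minor_edit":
        return f"{major}.{minor}.{micro + 1}"
    raise ValueError(f"unknown change type: {change!r}")


if __name__ == "__main__":
    for kind in ("minor_edit", "dependency", "base_image"):
        print(kind, bump_version("0.1.0", kind))
```

Lower-order parts are reset to zero on a minor or major bump, which is the
usual convention and consistent with the examples given above.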
- For reasons that we don't yet understand, running a Jupyter
notebook server by providing the command directly (e.g.,
biipy_run.sh v2.0.0 jupyter notebook) leads to kernel connection
issues. However, there is a bash script baked into the container that works:
biipy_run.sh v2.0.0 /biipy/scripts/notebook.sh. This is the default
command in the image, so you can just run biipy_run.sh v2.0.0.
For some information on how to set up on your system, see here
Significant numbers of package upgrades
- try to reduce log size for travis
- [CI skip] use Python 3.5.2
- trivial change to trigger CI
- add travis CI support
- upgrade numexpr
- upgrade scikit-allel, downgrade dask
- upgrade zarr
- solve conda conflicts
- add hmmlearn
- upgrade vcfnp
- upgrade pymysql, install via conda
- upgrade py-cpuinfo
- upgrade prettypandas
- upgrade petl, install via conda
- upgrade openpyxl, install via conda
- install intervaltree via conda
- upgrade fastcluster, install via conda
- add scipy explicitly
- upgrade ete3, install via conda
- add cytoolz
- upgrade toolz
- upgrade sqlalchemy
- upgrade scikit-learn
- upgrade rpy2
- upgrade pysamstats
- add pyfastaq
- upgrade psutil
- upgrade pillow
- upgrade pandas
- upgrade numpy
- upgrade numba
- upgrade msprime
- upgrade matplotlib-venn
- upgrade matplotlib
- upgrade line_profiler
- upgrade joblib
- upgrade icu
- upgrade gdal
- upgrade dask
- upgrade cython
- upgrade cartopy
- upgrade bokeh
- upgrade bcolz and source via pip because 1.1.0 not available via conda
- Updated versions of a large number of packages, including:
biopython=1.68, cartopy=0.14.2, cython=0.24.1, gdal=2.1.1, joblib=0.10.2, matplotlib=1.5.3, msprime=0.3.2, numba=0.28.1, numpy=1.11.2, numexpr=2.6.1, pillow=3.4.1, psutil=4.3.1, psycopg2=2.6.2, pysamstats=0.24.2, pytables=3.3.0, pyvcf=0.6.8, scikit-learn=0.18, seaborn=0.7.1, sqlalchemy=1.0.13, toolz=0.8.0, whoosh=2.7.4, xlrd=1.0.0, zarr=2.1.3
- Added mapping software: gdal and cartopy
- Updates to allel
- Included zarr
- updates to humanize
- bwa/samtools/tabix moved to same env.
- Re-added simupop from bpeng conda repo.
- Remove simupop
- Minor formatting
- Major version change
- splitting into two separate dockerfiles
- Hitching wagon to anaconda/conda management. Slight loss of control on versioning, but gains in stability and build time
- Several packages now come via the bioconda project
- uses text files to hold package requirements
- Several version updates including numpy with MKL, ipython to 4.2.0, cython to 0.24 and others. basemap now packaged by conda
- Removed simupop to try to get build time down
- Upgraded scikit-allel, bug fix
- Added simupop forward simulation tool
- Add sudo, nano Ubuntu packages.
- Add psutil, py-cpuinfo, prettypandas, joblib, fastcluster Python packages.
- Upgrade seaborn.
- Minor changes to biipy_run.sh example script to enable better mapping of user
information into containers.
- Reorganise Dockerfile and minimise dependencies installed to reduce image size.
- Upgrade pysamstats.
- Add bokeh, numba, zarr, openblas.
- Upgrade numpy (which should now build against openblas), Jupyter notebook,
IPython, rpy2, matplotlib, sqlalchemy, pymysql, openpyxl, pillow,
memory_profiler, psutil, msprime, anhima, dask, ete3.