Last pushed: a year ago
Short Description
Docker run environment for PGAP (Pan-Genome Analysis Pipeline)
Full Description

Docker container run environment for PGAP (Pan-Genome Analysis Pipeline)

PGAP is a pan-genome analysis pipeline written in Perl. It can perform five analyses with a single command: cluster analysis of functional genes, pan-genome profile analysis, genetic variation analysis of functional genes, species evolution analysis, and function enrichment analysis of gene clusters.


To run, mount your input and output directories with -v, and then use a standard call with options:

seq_dir=/path/to/input  # Directory of PGAP-formatted input files (.nuc, .pep, .function)
out_dir=/path/to/output  # Output directory (should exist)
strains=StrainA+StrainB  # Placeholder strain names, joined with "+"

docker run \
  -v "${seq_dir}":/input \
  -v "${out_dir}":/output \
  -w /pgap kastman/pgap:1.12 \
    perl ./ --strains "${strains}" \
    --input /input --output /output \
    --cluster --pangenome --variation --evolution --function \
    --method MP --thread 1

This produces a full run with all five analysis steps, using a single thread. Note that I had problems with segfaults in the Docker container when using multiple threads, though there may be a way to adjust that.
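Given the multi-thread segfaults mentioned above, one cautious pattern is to try the requested thread count first and fall back to a single thread if the command exits with an error. The sketch below is a hypothetical wrapper: the name `run_with_thread_fallback` is not part of PGAP or this image. It assumes the wrapped command takes `--thread N` as its last arguments, as in the call above, and that repeating a failed run into the same output directory is safe, which has not been verified here.

```shell
# Hypothetical helper (not part of PGAP): run a command with "--thread N"
# appended; if it fails and N > 1, retry once with "--thread 1".
run_with_thread_fallback() {
  local threads="$1"
  shift
  local status=0

  # First attempt with the requested thread count.
  "$@" --thread "${threads}" || status=$?

  # Done if it succeeded, or if we were already running single-threaded.
  if [ "${status}" -eq 0 ] || [ "${threads}" -le 1 ]; then
    return "${status}"
  fi

  echo "Run with ${threads} threads failed (exit ${status}); retrying with 1 thread" >&2
  "$@" --thread 1
}
```

To use it, pass the thread count followed by the full docker run command from above without its trailing `--thread 1`. A crash partway through may leave partial files in the output directory, so it may be wise to empty that directory before relying on the retry.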

Also note the -w /pgap option, which sets the working directory. This is important for correctly finding the sub-modules in the /pgap directory.
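The call above expects a `strains` variable that is not defined in the snippet. As a rough sketch, it could be derived from the input directory, assuming (this is an assumption about the PGAP input layout, not something stated on this page) that each strain has `<name>.pep`, `<name>.nuc` and `<name>.function` files and that `--strains` takes the names joined with `+`. The function name `strains_from_dir` is hypothetical.

```shell
# Hypothetical helper: print the strain names found in a directory, joined
# with "+". Assumes every strain has <name>.pep, <name>.nuc and
# <name>.function files in that directory.
strains_from_dir() {
  local dir="$1" strains="" pep name ext
  for pep in "${dir}"/*.pep; do
    [ -e "${pep}" ] || continue            # glob matched nothing
    name=$(basename "${pep}" .pep)
    # Each strain needs the other two file types as well.
    for ext in nuc function; do
      if [ ! -e "${dir}/${name}.${ext}" ]; then
        echo "Missing ${name}.${ext} in ${dir}" >&2
        return 1
      fi
    done
    # Append with a "+" separator, but no leading "+".
    strains="${strains:+${strains}+}${name}"
  done
  if [ -z "${strains}" ]; then
    echo "No .pep files found in ${dir}" >&2
    return 1
  fi
  printf '%s\n' "${strains}"
}

# Usage: strains=$(strains_from_dir "${seq_dir}")
```

Running the check on the host before starting the container catches a missing or misnamed input file early, rather than partway through an analysis.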


Adapts the Dockerfile from . However, that contained version 2.12 of BLAST, which seemed to be incompatible with PGAP, specifically the -C option to blastall called by


For more information, see: Zhao Y, Wu J, Yang J, Sun S, Xiao J, Yu J. PGAP: pan-genomes analysis pipeline. Bioinformatics. 2012;28(3):416-418. doi:10.1093/bioinformatics/btr655.
