CGE Tools

This project documents the tools and pipeline of the Center for Genomic Epidemiology (CGE), packaged to run in a Docker container.

Installation

Follow the recommended Docker installation instructions, which install both Docker and the VM.

Test that Docker is installed properly:

docker version
docker run hello-world
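
Before running the checks above, it can help to confirm that the command-line tools this README relies on are on the PATH. The following is a minimal sketch; the helper name `require_cmds` is hypothetical and not part of the CGE toolkit.

```shell
# Succeed if every named command is on PATH; report missing ones on stderr.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (not run here):
#   require_cmds docker git git-lfs && docker version
```

The function only checks that the executables exist. `docker run hello-world` is still needed to confirm that the Docker daemon is reachable.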

Install the CGE Docker toolkit:

# For MacOS running docker-machine, do the following to make sure the machine
# is running, and that the environment is set:
docker-machine stop default
docker-machine start default
eval "$(docker-machine env default)"

# Build cgetools Docker image
docker build -t cgetools .

Install the databases

# First make sure you have git-lfs installed
# https://git-lfs.github.com/
# Download databases
scripts/download_databases.sh /path/to/databases
# Download kmerfinder database files
# (these can be very large, so make sure you have at least 60 GB of free space!)
cd /path/to/databases/kmerfinder
git lfs pull
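
Because the kmerfinder files need roughly 60 GB, it may be worth checking free space before starting the download. The following is a minimal sketch; the helper name `has_free_gb` is hypothetical, and it assumes a POSIX `df` and a directory that already exists.

```shell
# Succeed if the filesystem holding directory $1 has at least $2 GB free.
has_free_gb() {
    dir=$1
    need_gb=$2
    # df -P gives stable POSIX columns; -k reports 1024-byte blocks.
    # Column 4 of the second output line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Example (not run here):
#   has_free_gb /path/to/databases 60 || echo "not enough free space" >&2
```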

Test

# Set paths:
toolpath=/path/to/cge-tools-docker
dbpath=/path/to/databases
cwd=/path/to/place/output

# Test BAP contigs
docker run -ti --rm \
   -v $toolpath/test/databases:/databases \
   -v $cwd:/workdir \
   cgetools BAP --wdir /workdir \
   --fa /usr/src/cgepipeline/test/test.fa

# Check the output
cat $cwd/out.tsv

# The output should look exactly like this:
#contigs_file    sequencing_size    genome_size    contigs    n50    depth    species    mlst    mlst_genes    resistance_genes    virulence_genes    plasmids    pmlsts
#NA    NA    4834953    606    28438    NA    Escherichia coli    ecoli[ST44],ecoli_2[ST2]    ecoli[adk-10,fumc-11,gyrb-4,icd-8,mdh-8,pura-8,reca-7],ecoli_2[dinb_8,icda_2,pabb_7,polb_3,putp_7,trpa_1,trpb_4,uida_2]    strA,aac(6')Ib-cr,aadA5,strB,aac(3)-IIa    gad    Col(MGD2),Col(MG828),ColRNAI,IncFII,IncFIB(AP001918),IncFIA    IncF[F31:A4:B1]

# Test BAP Illumina paired end reads
docker run -ti --rm \
   -v $toolpath/test/databases:/databases \
   -v $cwd:/workdir \
   cgetools BAP --wdir /workdir --fq1 /usr/src/cgepipeline/test/test_1.fq.gz \
   --fq2 /usr/src/cgepipeline/test/test_2.fq.gz --Asp Illumina --Ast paired

# Check the contigs and output
head -3 $cwd/Assembler/contigs.fsa

#>NODE_1_length_720_cov_6.647222
#AGCTCACTGCATAGCTATGCATGAAAGTGAATGGCGATCGGTTTGGGGCCTTACGGCGTT
#CATACCGTCTGTTTTCGACAGTTTCTCTCCGGGAAGCTAATCTGCCATAAGCCTGGATAA

cat $cwd/out.tsv

#contigs_file    sequencing_size    genome_size    contigs    n50    depth    species    mlst    mlst_genes    resistance_genes    virulence_genes    plasmids    pmlsts
#/workdir/Assembler/contigs.fsa    NA    16284    47    325    NA    unknown    NA    NA    NA    NA    Col(MGD2),Col(MG828),ColRNAI    

# Test databases
docker run -ti --rm \
   -v $dbpath:/databases \
   -v $cwd:/workdir \
   cgetools BAP --wdir /workdir \
   --fa /usr/src/cgepipeline/test/test.fa

# Check the output
cat $cwd/out.tsv

#contigs_file    sequencing_size    genome_size    contigs    n50    depth    species    mlst    mlst_genes    resistance_genes    virulence_genes    plasmids    pmlsts
#NA    NA    4834953    606    28438    NA    Escherichia coli    ecoli[ST44],ecoli_2[ST2]    ecoli[adk-10,fumc-11,gyrb-4,icd-8,mdh-8,pura-8,reca-7],ecoli_2[dinb_8,icda_2,pabb_7,polb_3,putp_7,trpa_1,trpb_4,uida_2]    strA,aac(6')Ib-cr,aadA5,strB,aac(3)-IIa,mph(A),sul1,sul2,dfrA17,aac(6')Ib-cr,tet(B),catB3,blaCTX-M-15,blaOXA-1    gad    Col(MGD2),Col(MG828),ColRNAI,IncFII,IncFIB(AP001918),IncFIA    IncF[F31:A4:B1]
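
The expected rows above are long, so comparing them by eye is error-prone. The following is a minimal sketch for pulling a single named column out of out.tsv. It assumes the file is tab-separated with one header row, possibly prefixed with `#`. The helper name `tsv_field` is hypothetical and not part of cgetools.

```shell
# Print the value of column $2 from the first data row of TSV file $1.
tsv_field() {
    awk -F '\t' -v name="$2" '
        NR == 1 {
            sub(/^#/, "", $1)      # tolerate a leading "#" on the header
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            next
        }
        col { print $col; exit }   # prints nothing if the column is unknown
    ' "$1"
}

# Example (not run here), using a value from the expected output above:
#   [ "$(tsv_field "$cwd/out.tsv" species)" = "Escherichia coli" ] && echo "species OK"
```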

Usage

# Run terminal shell on selected image
docker run -t -i cgetools /bin/bash

# Assembly
docker run -ti --rm -w /output \
   -v /path/to/input:/input \
   -v /path/to/output:/output \
   cgetools Assembler --sequencing_platform Illumina --sequencing_type paired \
   --files "/input/my_file_1.fq.gz,/input/my_file_2.fq.gz"

# ContigAnalyzer
docker run -ti --rm -w /output \
   -v /path/to/input:/input \
   -v /path/to/output:/output \
   cgetools ContigAnalyzer -f /input/my_file.fa 

# KmerFinder
docker run -ti --rm -w /output \
   -v /path/to/database:/database \
   -v /path/to/input:/input \
   -v /path/to/output:/output \
   cgetools KmerFinder -f /input/my_file.fa -s bacteria_organisms -p ATGAC -w 

# MLST
docker run -ti --rm -w /output \
   -v /path/to/database:/database \
   -v /path/to/input:/input \
   -v /path/to/output:/output \
   cgetools MLST -f /input/my_file.fa -s ecoli

# PlasmidFinder
docker run -ti --rm -w /output \
   -v /path/to/database:/database \
   -v /path/to/input:/input \
   -v /path/to/output:/output \
   cgetools PlasmidFinder -f /input/my_file.fa -s enterobacteriaceae -k 80.00

# pMLST
docker run -ti --rm -w /output \
   -v /path/to/database:/database \
   -v /path/to/input:/input \
   -v /path/to/output:/output \
   cgetools pMLST -f /input/my_file.fa -s incf

# ResFinder
docker run -ti --rm -w /output \
   -v /path/to/database:/database \
   -v /path/to/input:/input \
   -v /path/to/output:/output \
   cgetools ResFinder -f /input/my_file.fa -s phenicol -k 90.00 -l 0.60

# VirulenceFinder
docker run -ti --rm -w /output \
   -v /path/to/database:/database \
   -v /path/to/input:/input \
   -v /path/to/output:/output \
   cgetools VirulenceFinder -f /input/my_file.fa -s virulence_ecoli -k 90.00

# Bacterial Analysis Pipeline (contigs)
docker run -ti --rm \
   -v /path/to/database:/database \
   -v /path/to/input:/input \
   -v /path/to/output:/output \
   cgetools BAP --wdir /output --fa /input/my_file.fa

# Bacterial Analysis Pipeline (Illumina paired end reads)
docker run -ti --rm \
   -v /path/to/database:/database \
   -v /path/to/input:/input \
   -v /path/to/output:/output \
   cgetools BAP --wdir /output --fq1 /input/my_file_1.fq.gz \
   --fq2 /input/my_file_2.fq.gz --Asp Illumina --Ast paired
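
To run the contigs pipeline on several samples, the BAP command above can be generated once per FASTA file. The following is a minimal sketch that only prints the commands. The helper name `bap_commands` and the one-output-directory-per-sample layout are assumptions, and paths are assumed to contain no spaces.

```shell
# Print one BAP command per *.fa file in input directory $1.
# $2 = output root, $3 = database directory (all host paths).
bap_commands() {
    in_dir=$1; out_root=$2; db_dir=$3
    for fa in "$in_dir"/*.fa; do
        [ -e "$fa" ] || continue       # no matches: the glob stays literal
        name=$(basename "$fa" .fa)
        printf 'docker run -ti --rm -v %s:/database -v %s:/input -v %s/%s:/output cgetools BAP --wdir /output --fa /input/%s.fa\n' \
            "$db_dir" "$in_dir" "$out_root" "$name" "$name"
    done
}

# Example (not run here): review the commands, then pipe them to a shell.
#   bap_commands /path/to/input /path/to/output /path/to/database
```

Each sample gets its own output directory because BAP writes out.tsv into its working directory, so a shared directory would be overwritten by each run.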

Useful commands

# Go to Docker repository
cd /your/path/to/cge-tools-docker

# Update Docker repository (stashes any local changes first)
git stash; git pull

# Switch to the master git branch (stashes any local changes first)
git stash; git checkout master

# Shut down the Docker daemon
docker-machine stop default

# Start the Docker daemon
docker-machine start default

# Reset Docker env (needed to run docker containers from a different terminal)
eval "$(docker-machine env default)"

# Build cgetools Docker image
docker build -t cgetools .

# Docker Cleanup
# Stop and remove all containers (instances of images)
docker rm $(docker stop $(docker ps -aq))
# Remove all exited containers
docker rm -v $(docker ps -aq -f status=exited)
# Remove all dangling images
docker rmi $(docker images -qf "dangling=true")
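
The cleanup commands above fail with a usage error when the inner `docker ps` or `docker images` call returns no IDs, because `docker rm`, `docker stop` and `docker rmi` each require at least one argument. The following is a minimal sketch of a guard. The helper name `run_with_ids` is hypothetical.

```shell
# Read IDs from stdin and run the given command with them appended,
# but only when at least one ID was read.
run_with_ids() {
    ids=$(cat)
    [ -n "$ids" ] || return 0
    # Intentionally unquoted: split the ID list into separate arguments.
    "$@" $ids
}

# Example (not run here):
#   docker ps -aq -f status=exited | run_with_ids docker rm -v
#   docker images -qf "dangling=true" | run_with_ids docker rmi
```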

License

See LICENSE.md

Owner: mettevoldbylarsen