staphb/serotypefinder
Tool for identifying the serotype of E. coli isolates from reads or assemblies
A docker container that contains SerotypeFinder, a tool for serotyping E. coli isolates from reads or assemblies
SerotypeFinder version: 2.0.1 https://bitbucket.org/genomicepidemiology/serotypefinder/src/2.0.1/ made on 2019‑01‑28
SerotypeFinder database version: Git commit 39c68c6e1a3d94f823143a2e333019bb3f8dddba, made on 2020-09-24 (link to commit history)
You may be familiar with the web version of SerotypeFinder: https://cge.cbs.dtu.dk/services/SerotypeFinder/
usage: serotypefinder.py [-h] -i INFILE [INFILE ...] [-o OUTDIR] [-tmp TMP_DIR] [-mp METHOD_PATH] [-p DB_PATH] [-d DATABASES] [-l MIN_COV] [-t THRESHOLD] [-x] [-q]
optional arguments:
-h, --help show this help message and exit
-i INFILE [INFILE ...], --infile INFILE [INFILE ...]
FASTA or FASTQ input files.
-o OUTDIR, --outputPath OUTDIR
Path to blast output
-tmp TMP_DIR, --tmp_dir TMP_DIR
Temporary directory for storage of the results from the external software.
-mp METHOD_PATH, --methodPath METHOD_PATH
Path to method to use (kma or blastn)
-p DB_PATH, --databasePath DB_PATH
Path to the databases
-d DATABASES, --databases DATABASES
Databases chosen to search in - if non is specified all is used
-l MIN_COV, --mincov MIN_COV
Minimum coverage
-t THRESHOLD, --threshold THRESHOLD
Minimum threshold for identity
-x, --extented_output
Give extented output with allignment files, template and query hits in fasta and a tab seperated file with gene profile results
-q, --quiet
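You do not need to use the -p or -d flags: the database is included in the image at /database
If you need to use your own database, you will first have to index it with kma and use the serotypefinder.py -p flag. You can find instructions for this on the SerotypeFinder Bitbucket README. kma is included in this docker container for database indexing.
SerotypeFinder does not create the output directory given with the -o flag. You MUST create it beforehand or it will throw an error.
Adjust the minimum identity threshold with -t, for example -t 0.95
Adjust the minimum coverage with -l, for example -l 0.70
Use the -x flag (extended output) if you want the traditional/legacy SerotypeFinder output files results_tab.tsv, results.txt, Serotype_allele_seq.fsa and Hit_in_genome_seq.fsa. Otherwise you will need to parse the default output file data.json for results.
When the input is raw reads, SerotypeFinder uses kma (instead of ncbi-blast+)
When the input is an assembly, ncbi-blast+ is used, and the files tmp/out_H_type.xml and tmp/out_O_type.xml will exist in the specified output directory
# download the image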
$ docker pull staphb/serotypefinder:2.0.1
# input files are in my PWD
$ ls
E-coli.skesa.fasta E-coli.R1.fastq.gz E-coli.R2.fastq.gz
# make the output directories (SerotypeFinder will not create them)
$ mkdir output-dir-reads output-dir-asm
# query reads, mount PWD to /data inside container (broken into two lines for readability)
$ docker run --rm -u $(id -u):$(id -g) -v $PWD:/data staphb/serotypefinder:2.0.1 \
serotypefinder.py -i /data/E-coli.R1.fastq.gz /data/E-coli.R2.fastq.gz -o /data/output-dir-reads
# query assembly
$ docker run --rm -u $(id -u):$(id -g) -v $PWD:/data staphb/serotypefinder:2.0.1 \
serotypefinder.py -i /data/E-coli.skesa.fasta -o /data/output-dir-asm
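The notes about the output directory, the -t and -l thresholds, and the -x flag can be combined into one small wrapper. The following is only a sketch: the function name serotypefinder_cmd and the DRY_RUN variable are made up for illustration, the threshold values are the example values from the notes, and the output directory is assumed to be a relative path under the mounted PWD. By default it only prints the docker command so you can inspect it first.

```shell
# Sketch: build a SerotypeFinder docker command with explicit thresholds.
# serotypefinder_cmd and DRY_RUN are illustrative names, not part of the image.
# Usage: serotypefinder_cmd OUTDIR INFILE [INFILE ...]   (paths relative to $PWD)
serotypefinder_cmd() {
  if [ "$#" -lt 2 ]; then
    echo "usage: serotypefinder_cmd OUTDIR INFILE [INFILE ...]" >&2
    return 2
  fi
  outdir="$1"
  shift

  # SerotypeFinder does not create the -o directory, so create it beforehand.
  mkdir -p "$outdir" || return 1

  # Rewrite each input file name to its path inside the container (/data/...).
  n=$#
  for f in "$@"; do
    set -- "$@" "/data/$f"
  done
  shift "$n"

  # Assemble the full command: identity >= 0.95, coverage >= 0.70, extended output.
  set -- docker run --rm -u "$(id -u):$(id -g)" -v "$PWD:/data" \
    staphb/serotypefinder:2.0.1 \
    serotypefinder.py -i "$@" -o "/data/$outdir" -t 0.95 -l 0.70 -x

  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run: print the command instead of executing it.
    echo "$*"
  else
    "$@"
  fi
}
```

For example, serotypefinder_cmd output-dir-asm E-coli.skesa.fasta prints the command for the assembly; prefix the call with DRY_RUN=0 to actually run it.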
# download the image
$ singularity build serotypefinder.2.0.1.sif docker://staphb/serotypefinder:2.0.1
# files are in my PWD
$ ls
E-coli.skesa.fasta E-coli.R1.fastq.gz E-coli.R2.fastq.gz
# make the output directories (SerotypeFinder will not create them)
$ mkdir output-dir-reads output-dir-asm
# query reads; mount PWD to /data inside container
$ singularity exec --no-home -B $PWD:/data serotypefinder.2.0.1.sif \
serotypefinder.py -i /data/E-coli.R1.fastq.gz /data/E-coli.R2.fastq.gz -o /data/output-dir-reads
# assembly
$ singularity exec --no-home -B $PWD:/data serotypefinder.2.0.1.sif \
serotypefinder.py -i /data/E-coli.skesa.fasta -o /data/output-dir-asm
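Because the set of output files depends on the flags and the input type, it can help to check what a run actually produced. The sketch below lists the files named in the notes and reports whether each one exists; the function name list_serotypefinder_outputs is hypothetical, and the assumption that all of these files sit directly under the output directory (with the blast XML files under tmp/) is taken from the notes rather than verified against every version.

```shell
# Sketch: report which of the documented SerotypeFinder output files exist.
# list_serotypefinder_outputs is an illustrative name, not part of the tool.
# Usage: list_serotypefinder_outputs OUTDIR
list_serotypefinder_outputs() {
  outdir="$1"
  # data.json: default output
  # results_tab.tsv ... Hit_in_genome_seq.fsa: written with -x (extended output)
  # tmp/out_*_type.xml: written when an assembly is searched with ncbi-blast+
  for f in data.json \
           results_tab.tsv results.txt \
           Serotype_allele_seq.fsa Hit_in_genome_seq.fsa \
           tmp/out_H_type.xml tmp/out_O_type.xml; do
    if [ -f "$outdir/$f" ]; then
      echo "found   $f"
    else
      echo "missing $f"
    fi
  done
}
```

Running list_serotypefinder_outputs output-dir-asm after the assembly example gives a quick view of whether you need to parse data.json or can use the legacy files.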