Ensembl's variant effect predictor (VEP) tool

Variant Effect Predictor (VEP)


Quick start

Clone this repo and build the image:

docker build -t opengenomics/variant-effect-predictor .

After building the image and downloading the offline cache, you can test the image like so:

docker run -v /vep/data/path/homo_sapiens:/mnt/homo_sapiens opengenomics/variant-effect-predictor --species homo_sapiens --assembly GRCh37 --offline \
    --no_progress --no_stats --vcf --minimal --dir /mnt/ --fasta /mnt/homo_sapiens/86_GRCh37/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz \
    --input_file example_GRCh37.vcf --output_file /opt/ensembl-tools-release-84/scripts/variant_effect_predictor/example_GRCh37.vep.vcf --everything --dir_cache /mnt/
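
Because the container only sees what is mounted under /mnt, a missing cache directory or FASTA on the host tends to surface as a confusing error from inside the container. The sketch below is a hypothetical pre-flight helper, not part of this image: the function name `check_vep_cache` and its two arguments are assumptions made for illustration.

```shell
# Hypothetical helper (not shipped with the image): check the host-side
# cache layout before mounting it into the container.
check_vep_cache() {
    cache_dir="$1"   # host directory that will be mounted as /mnt/homo_sapiens
    fasta="$2"       # FASTA path relative to the cache directory
    if [ ! -d "$cache_dir" ]; then
        echo "missing cache directory: $cache_dir" >&2
        return 1
    fi
    if [ ! -f "$cache_dir/$fasta" ]; then
        echo "missing FASTA: $cache_dir/$fasta" >&2
        return 1
    fi
    echo "ok"
}
```

It could be chained in front of the test command, for example `check_vep_cache /vep/data/path/homo_sapiens 86_GRCh37/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz && docker run ...`.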

Download and Prepare VEP Data Dependencies

Download and unpack VEP's offline cache for GRCh37:

export VEP_DATA=/home/.vep
rsync -zvh rsync:// $VEP_DATA
tar xvfz homo_sapiens_vep_86_GRCh37.tar.gz
gunzip Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz
bgzip Homo_sapiens.GRCh37.75.dna.primary_assembly.fa

Download and index a custom ExAC r0.3.1 VCF that skips variants overlapping known somatic hotspots:

curl -L > $VEP_DATA/ExAC_nonTCGA.r0.3.1.sites.vep.vcf.gz
bcftools filter --targets ^2:25457242-25457243,12:121176677-121176678 --output-type z --output $VEP_DATA/ExAC_nonTCGA.r0.3.1.sites.minus_somatic.vep.vcf.gz $VEP_DATA/ExAC_nonTCGA.r0.3.1.sites.vep.vcf.gz
mv -f $VEP_DATA/ExAC_nonTCGA.r0.3.1.sites.minus_somatic.vep.vcf.gz $VEP_DATA/ExAC_nonTCGA.r0.3.1.sites.vep.vcf.gz
tabix -p vcf $VEP_DATA/ExAC_nonTCGA.r0.3.1.sites.vep.vcf.gz
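
In the bcftools command, the leading `^` on the `--targets` list inverts it, so records at the listed positions are dropped and everything else is kept. To illustrate the effect without bcftools installed, here is a toy stand-in in plain awk. The function name `exclude_hotspots` and the sample records are made up for this sketch; it only compares the POS column and is not a replacement for the real filter.

```shell
# Toy stand-in for the bcftools step: drop records whose position falls inside
# either excluded interval (1-based, inclusive) and pass header lines through.
exclude_hotspots() {
    awk -F'\t' '
        /^#/ { print; next }                                       # keep header lines
        $1 == "2"  && $2 >= 25457242  && $2 <= 25457243  { next }  # first excluded interval
        $1 == "12" && $2 >= 121176677 && $2 <= 121176678 { next }  # second excluded interval
        { print }                                                  # keep everything else
    '
}

# Made-up records: one inside and one outside each excluded interval.
printf '#CHROM\tPOS\tID\tREF\tALT\n2\t25457241\t.\tC\tT\n2\t25457242\t.\tC\tT\n12\t121176678\t.\tG\tA\n12\t121176679\t.\tG\tA\n' \
    | exclude_hotspots
```

Only the header and the records at 2:25457241 and 12:121176679 should survive.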

Download and index the files required for the dbNSFP plugin:

head -n1 dbNSFP2.9.1_variant.chr1 > h
cat dbNSFP2.9.1_variant.chr* | grep -v ^#chr | sort -k1,1 -k2,2n - | cat h - | bgzip -c > dbNSFP.gz
tabix -s 1 -b 2 -e 2 dbNSFP.gz
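
The pipeline above sets the header aside, concatenates the per-chromosome files, strips their repeated headers, sorts by chromosome and then numerically by position, and puts the single header back on top. The tabix flags `-s 1 -b 2 -e 2` then tell it that column 1 holds the sequence name and column 2 is both the start and the end coordinate. A minimal sketch of the same merge on toy data follows; the file names and column values are invented, and bgzip and tabix are left out so it runs with standard tools only.

```shell
# Toy version of the merge: two tiny per-chromosome files stand in for
# dbNSFP2.9.1_variant.chr*; compression and indexing are omitted.
workdir=$(mktemp -d)
printf '#chr\tpos(1-coor)\tref\talt\n2\t300\tA\tG\n2\t50\tC\tT\n' > "$workdir/toy_variant.chr2"
printf '#chr\tpos(1-coor)\tref\talt\n1\t200\tG\tA\n1\t9\tT\tC\n'  > "$workdir/toy_variant.chr1"

# Save one copy of the header, then merge, drop the repeated headers,
# sort by chromosome and numeric position, and re-attach the header.
head -n1 "$workdir/toy_variant.chr1" > "$workdir/h"
cat "$workdir"/toy_variant.chr* \
    | grep -v '^#chr' \
    | sort -k1,1 -k2,2n - \
    | cat "$workdir/h" - > "$workdir/merged.tsv"

cat "$workdir/merged.tsv"
```

The merged file should start with the single header line, followed by the chromosome 1 rows in position order and then the chromosome 2 rows.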

Convert the offline cache for use with tabix, which significantly speeds up the lookup of known variants:

docker run -v $VEP_DATA:/mnt vep /root/vep/ --species homo_sapiens --version 86_GRCh37 --dir /mnt