Docs: Parker et al. (2018, in prep) ID-by-sequencing paper
The paper and this code document the rapid-raw-read-reference for ID ('R4ID') approach. This comprises the following steps:
- A training step ('r4ids-training'), in which samples of known origin are MinION-sequenced, or public references downloaded, and used to create labelled BLASTN databases for ID;
- A resequencing step ('r4ids-resequencing'), in which samples of unknown origin are MinION-sequenced and compared in real time to the BLASTN R4ID databases to generate lists of read alignments to the R4ID data;
- A visualisation step ('r4ids-visualisation'), in which the lists of BLAST hits are parsed and displayed as a web GUI over a network connection.
The images for the three steps are tagged r4ids-training-v0.4, r4ids-resequencing-v0.4 and r4ids-visualisation-v0.4, respectively.
USAGE (briefly; see [USAGE.md](https://github.com/lonelyjoeparker/oddjects-sandbox/blob/master/R4IDs/USAGE.md) in the repo):
1) Train BLAST DBs:
docker run -v <directory of .fasta reference genome or R4ID files>:/input_training -v <desired BLAST DB directory>:/blast_db r4ids-training
2) Analyse new reads from Albacore in real time:
docker run -v <dir for new fasta reads from Albacore>:/input_resequencing -v <dir with BLAST DBs from r4ids-training step>:/blast_db -v <real-time BLAST analysis output dir / r4ids-visualisation www dir>:/output_web r4ids-resequencing
3) Visualise results in real time over HTTP:
docker run -v <real-time BLAST analysis output dir / r4ids-visualisation www dir>:/www -p 80:80 r4ids-visualisation
Steps 2 and 3 (r4ids-resequencing, r4ids-visualisation) can and should run in parallel. Both need concurrent access to the shared <real-time BLAST analysis output dir / r4ids-visualisation www dir> on the physical host, which is mounted at /output_web in the resequencing container and at /www in the visualisation container.
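
As a minimal sketch of running steps 2 and 3 side by side, the script below prints the two `docker run` commands with the same host directory bound to /output_web and /www. The host paths (`albacore_reads`, `blast_db`, `r4ids_www`) and the function name `r4ids_parallel_commands` are hypothetical placeholders, not part of the R4ID images. The `-d` flag (Docker's detached mode) is an assumption about how you might want to background both containers, and the untagged image names are assumed to resolve on your host.

```shell
#!/bin/sh
# Hypothetical host directories (assumptions); override via the environment.
READS_DIR="${READS_DIR:-$PWD/albacore_reads}"    # new fasta reads from Albacore
BLAST_DB_DIR="${BLAST_DB_DIR:-$PWD/blast_db}"    # BLAST DBs from the training step
SHARED_DIR="${SHARED_DIR:-$PWD/r4ids_www}"       # shared output / www directory

# Print one docker command per step. Both commands mount the SAME host
# directory: at /output_web for resequencing and at /www for visualisation.
# Note: paths containing spaces are not handled by this sketch.
r4ids_parallel_commands() {
    printf 'docker run -d -v %s:/input_resequencing -v %s:/blast_db -v %s:/output_web r4ids-resequencing\n' \
        "$READS_DIR" "$BLAST_DB_DIR" "$SHARED_DIR"
    printf 'docker run -d -v %s:/www -p 80:80 r4ids-visualisation\n' \
        "$SHARED_DIR"
}

# Show the commands so they can be inspected before anything is started.
r4ids_parallel_commands
```

Once the printed commands look right, they could be executed with `r4ids_parallel_commands | sh`. Because both containers are detached, the resequencing container keeps writing BLAST results while the visualisation container serves them from the same directory.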