P3: Portable Proteomics Pipeline
P3 is a Docker container for mass-spectrometry data pre-processing pipelines. The pipelines consist of protein identification (using the MSGF+ tool developed by PNNL) and quantification (Bioconductor / MSnbase).
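The container takes mass-spectrometry raw files (*.pkl) and a peptide sequences file (*.fasta). The folder containing these files must be mounted to the container's "/root/data" (e.g. using the -v option of docker run, as in the commands under "Running the container").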
See the Docker documentation on volumes for more information.
The files can be retrieved from mounted local storage.
For more information, see the documentation on sharing folders between the VM and the host machine.
Alternatively, the container can retrieve the files from the internet if either an FTP address or a PrideID is provided in the p3.config file. If no configuration file is provided, a p3.config template is created. The configuration file controls how the pipeline is run. Inspect p3.config before re-running the container.
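When the inputs come from a local folder, it can help to confirm that the folder actually holds the expected files before mounting it. The function below is a minimal sketch of such a check. The name check_p3_inputs is hypothetical and not part of P3, and requiring at least one *.pkl and one *.fasta file is an assumption based on the description above. The check does not apply when the files are fetched from FTP or PRIDE.

```shell
# Hypothetical pre-flight helper; not part of P3 itself.
# Succeeds when DIR holds at least one *.pkl and one *.fasta file.
check_p3_inputs() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # find + wc counts matches without tripping over unmatched globs.
    pkl_count=$(find "$dir" -maxdepth 1 -type f -name '*.pkl' | wc -l | tr -d ' ')
    fasta_count=$(find "$dir" -maxdepth 1 -type f -name '*.fasta' | wc -l | tr -d ' ')
    if [ "$pkl_count" -lt 1 ]; then
        echo "no *.pkl files in $dir" >&2
        return 1
    fi
    if [ "$fasta_count" -lt 1 ]; then
        echo "no *.fasta file in $dir" >&2
        return 1
    fi
    echo "found $pkl_count *.pkl and $fasta_count *.fasta file(s) in $dir"
}

# Example (paths are placeholders):
#   check_p3_inputs ~/Desktop/p3-data && \
#       docker run --rm -v ~/Desktop/p3-data:/root/data kristiyanto/p3
```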
Running the container
    docker pull kristiyanto/p3
    docker run --rm -v /path/to/files:/root/data kristiyanto/p3

    # e.g. Windows
    docker run --rm -v /c/Users/daniel/p3-data:/root/data kristiyanto/p3

    # e.g. Mac/Linux
    docker run --rm -v ~/Desktop/p3-data:/root/data kristiyanto/p3
Once the process is done, the following files are created:
- *.txt : a tab-delimited file of the results (spec-evalue, identified peptides, quantification results, etc.)
- *.rda : R object of the results. Ensure the MSnbase package is installed before loading it.
- *.mzid : results from MSGF+
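For a quick look at the tab-delimited result without opening R, the columns and row count can be listed from the shell. This is a sketch only: the function names list_columns and count_rows are hypothetical, and it assumes the first line of the *.txt file is a header row, which the description above does not confirm.

```shell
# Hypothetical helpers for inspecting a tab-delimited result file.
# Assumption: the first line of the file is a header row.

# Print one column name per line.
list_columns() {
    head -n 1 "$1" | tr '\t' '\n'
}

# Print the number of data rows (every line after the header).
count_rows() {
    tail -n +2 "$1" | wc -l | tr -d ' '
}

# Example (the file name is a placeholder):
#   list_columns ~/Desktop/p3-data/result.txt
#   count_rows ~/Desktop/p3-data/result.txt
```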
To run P3, Docker Engine must be installed. See the Docker documentation for detailed instructions on installing Docker Engine on various operating systems, including Windows and macOS.
Protein identification and quantification are computationally intensive. Depending on the size of the data, at least 4 GB of available memory on the Docker Machine is required. See the documentation on increasing the memory allocation for Docker Engine on a VirtualBox machine (macOS and Windows users).
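The memory visible to Docker can be compared against this requirement before a long run. The sketch below assumes the 4 GB figure means 4 x 1000^3 bytes, and the function name has_enough_memory is hypothetical. The byte count is expected to come from docker info, which reports total memory in its MemTotal field.

```shell
# Hypothetical helper; not part of P3.
# Succeeds when BYTES is at least MIN_GB gigabytes (default 4).
# Assumption: 1 GB is treated as 1000^3 bytes.
has_enough_memory() {
    bytes="$1"
    min_gb="${2:-4}"
    min_bytes=$((min_gb * 1000 * 1000 * 1000))
    [ "$bytes" -ge "$min_bytes" ]
}

# Example (requires a running Docker daemon):
#   mem=$(docker info --format '{{.MemTotal}}')
#   has_enough_memory "$mem" || echo "Docker has less than 4 GB of memory" >&2
```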
Use Case / Config File Samples:
- Using PRIDE as source (Spectrum Count): https://github.com/kristiyanto/P3/tree/master/SAMPLES/PrideID
- Using FTP as source (Spectrum Count): https://github.com/kristiyanto/P3/tree/master/SAMPLES/FTP
- Local files (iTRAQ4): https://github.com/kristiyanto/P3/tree/master/SAMPLES/iTRAQ4
- Local files (Spectrum Count): https://github.com/kristiyanto/P3/tree/master/SAMPLES/LOCAL
You are invited to contribute new features, updates, and fixes by sending pull requests.
Daniel Kristiyanto & Samuel Payne
Pacific Northwest National Laboratory