Evolinc-II pipeline is designed to perform a series of comparative genomic and transcriptomic analyses across an evolutionary timescale of the user’s choosing and on any number (1-1000s) of query lincRNAs.
Evolinc-II minimally requires the following input data
- A FASTA file of lincRNA sequences of all genomes to be interrogated.
- A single column text file with all species listed in order of phylogenetic relatedness to the query species. See Discussion_on_species_list.txt for more info.
- Evolinc-II can optionally incorporate genome annotation files (GTF) and known lincRNA datasets from target species in FASTA format.
- Currently, Evolinc-II iterates through all of its analyses using a tab-delimited file that contains all of this information, including file names and species abbreviations. See Discussion_on_BLASTing_file.md for more info.
Using Docker image
Since there are several dependencies (can be seen in Dockerfile) for running Evolinc-II on your linux or MAC OS, we highly recommend using the Docker image for Evolinc-II or use the Dockerfile to build an image and then use the built image.
# Pull the image from CyVerse Dockerhub docker pull evolinc/evolinc-ii:1.2 # See the command line help for the image docker run evolinc/evolinc-ii:1.2 -h # Download the test data wget https://github.com/Evolinc/Evolinc-II/releases/download/v1.0/sample_data.zip unzip sample_data.zip # Run Evolinc-II on the test data docker run --rm -v $(pwd):/working-dir -w /working-dir evolinc/evolinc-ii:1.2 -b sample_data/Blasting_list.txt -l sample_data/Sample_query_lincRNA_data_set_for_webinar.fasta -q Atha -i sample_data -s sample_data/test_species_list.txt -o test_out -v 1e-20
Using CyVerse Discovery Environment
The Evolinc-II app is currently integrated in CyVerse’s Discovery Environment and is free to use by researchers. The complete tutorial is available at this CyVerse wiki. If you do not have access to a high performance computing cluster we highly recommend using Evolinc-II within the DE. Limited computing resources can lead to extremely long Evolinc-II run times, especially when searching through large genomes. Not only is the CyVerse DE free to researchers, but many of the genomes are already integrated (and almost all others are available through CoGe; genomevolution.org), thereby alleviating the data storage hurdle common with these types of analyses.
If you experience any issues with running Evolinc-I (DE app or source code or Docker image), please open an issue on this github repo.
The sources in this github repository are copyright free. Thus you are allowed to use these sources in which ever way you like. Please be aware that other license terms apply for the
Notung.jar used inside the image. Here is the full MIT license.
If you have used Evolinc-I manuscript in your research, please cite as below..
Andrew D. Nelson*, Upendra K. Devisetty*, Kyle Palos, Asher K. Haug-Baltzell, Eric Lyons, Mark A. Beilstein (2017). "Evolinc: a comparative transcriptomics and genomics pipeline for quickly identifying sequence conserved lincRNAs for functional analysis". Frontiers in Genetics. 1(10)