Check the Wiki for
ZetaHunter is a command line script designed to assign user-supplied
small subunit ribosomal RNA (SSU rRNA) gene sequences to OTUs defined
by a reference sequence database.
By default, ZetaHunter uses a curated database of full-length,
non-chimeric, Zetaproteobacteria SSU rRNA gene sequences derived from
arb SILVA (release 128) and Zetaproteobacteria genomes from JGI's
Integrated Microbial Genomes (IMG). OTU definitions are the same as
those suggested by McAllister et al. (2011) at 97% identity, with
novel OTUs discovered since that publication named ZetaOTU29 and
higher (curated OTUs only). Infiles aligned by the arb SILVA SINA web
aligner are masked using the same 1282 bp mask used in McAllister et
al. (2011) to obtain reproducible OTU calls through closed reference
OTU binning. User sequences that represent novel Zetaproteobacteria
OTUs are de novo binned into NewZetaOTUs, numbered by abundance.
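The masking and closed-reference steps described above can be sketched as follows. This is only an illustration, not ZetaHunter's internal implementation: it assumes the mask is a string of 0/1 flags with one flag per alignment column, uses a single representative sequence per reference OTU, and assumes the most abundant novel cluster is numbered 1. The function names (`apply_mask`, `percent_identity`, `closed_reference_bin`, `name_new_otus`) are hypothetical.

```python
def apply_mask(aligned_seq, mask):
    """Keep only the alignment columns flagged '1' in the mask (assumed format)."""
    if len(aligned_seq) != len(mask):
        raise ValueError("sequence and mask must have the same aligned length")
    return "".join(base for base, keep in zip(aligned_seq, mask) if keep == "1")


def percent_identity(a, b):
    """Percent of matching columns, skipping columns that are gaps in both."""
    pairs = [(x, y) for x, y in zip(a, b) if not (x in "-." and y in "-.")]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def closed_reference_bin(masked_query, masked_refs, threshold=97.0):
    """Return the best-matching reference OTU, or None if below the threshold."""
    best_otu, best_identity = None, 0.0
    for otu, ref in masked_refs.items():
        identity = percent_identity(masked_query, ref)
        if identity > best_identity:
            best_otu, best_identity = otu, identity
    return best_otu if best_identity >= threshold else None


def name_new_otus(cluster_sizes):
    """Map cluster id -> NewZetaOTU name; the largest cluster gets number 1 (assumed)."""
    ranked = sorted(cluster_sizes, key=lambda c: (-cluster_sizes[c], c))
    return {c: f"NewZetaOTU{i}" for i, c in enumerate(ranked, start=1)}
```

A sequence for which `closed_reference_bin` returns `None` would be a candidate for de novo binning; once those candidates are clustered, `name_new_otus` assigns names in order of decreasing cluster size.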
OTU network analysis is a simple way to visualize the connectivity of
OTUs within a sample or environment type. ZetaHunter outputs tab-delimited
edge and node files for import into Cytoscape. The node file
contains the abundance information for each node. The edge file lists
OTUs that are found within the same sample (node1, node2, sample), thus
allowing for visualization. Note: Samples with only one ZetaOTU will contain
a self-referential edge. Otherwise, only non-self connections are shown.
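To make the edge and node layout concrete, here is a minimal sketch of how such files could be derived from per-sample OTU counts. It is an illustration under assumptions, not ZetaHunter's own code: the input structure, the function names (`build_network`, `to_tsv`), and the choice to sum node abundance across samples are all hypothetical.

```python
from itertools import combinations


def build_network(sample_counts):
    """Build node and edge tables from {sample: {otu: read_count}}.

    Returns (nodes, edges), where nodes maps OTU -> total abundance
    (summed across samples, an assumption) and edges is a list of
    (node1, node2, sample) tuples.
    """
    nodes = {}
    edges = []
    for sample in sorted(sample_counts):
        counts = sample_counts[sample]
        present = sorted(otu for otu, n in counts.items() if n > 0)
        for otu in present:
            nodes[otu] = nodes.get(otu, 0) + counts[otu]
        if len(present) == 1:
            # A sample with a single OTU gets a self-referential edge.
            edges.append((present[0], present[0], sample))
        else:
            # Otherwise only non-self pairs are listed.
            edges.extend((a, b, sample) for a, b in combinations(present, 2))
    return nodes, edges


def to_tsv(rows):
    """Render rows as tab-delimited text, one row per line."""
    return "\n".join("\t".join(str(x) for x in row) for row in rows)
```

For example, a sample containing ZetaOTU1 and ZetaOTU2 yields one edge between them, while a sample containing only ZetaOTU1 yields the edge (ZetaOTU1, ZetaOTU1, sample).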
ZetaHunter also supports user-provided curated OTU databases for
sequence OTU binning of any SINA-aligned SSU rRNA sequences.
- Stable SSU rRNA gene OTU binning to a curated database
- Supports import of multiple files for easy comparison of NewZetaOTUs across samples
- Database and sequence mask management options
- Multi-threaded processing
- Chimera checking
- Flags for sequences not related to the curated database (i.e., not Zetaproteobacteria)
- Cytoscape-compatible output file for OTU network analysis
Running ZetaHunter with Docker
Note: If you have Windows, running ZetaHunter with Docker is the only
supported option.
After installing Docker, open the Launchpad and click the
Perl script, and change the permissions to executable. In this case,
it will be placed in the ~/software/ZetaHunter directory.
$ mkdir -p ~/software/ZetaHunter
$ \curl "https://raw.githubusercontent.com/mooreryan/ZetaHunter/master/bin/run_zeta_hunter" > ~/software/ZetaHunter/run_zeta_hunter
$ chmod 755 ~/software/ZetaHunter/run_zeta_hunter
You can create a symbolic link to somewhere on your path so that you
can use the run_zeta_hunter command from any folder. Assuming that
/usr/local/bin is on your path, you can use this command.
$ sudo ln -s $HOME/software/ZetaHunter/run_zeta_hunter /usr/local/bin
If you don't want to use a symbolic link, you can also move the program to your path directly.
$ sudo mv ~/software/ZetaHunter/run_zeta_hunter /usr/local/bin
Try it out! Running this command
$ run_zeta_hunter -h
will display the help banner.
Zetaproteobacteria database curation
McAllister, S. M., R. E. Davis, J. M. McBeth, B. M. Tebo, D. Emerson, and C. L. Moyer. 2011. Biodiversity and emerging biogeography of the neutrophilic iron-oxidizing Zetaproteobacteria. Appl. Environ. Microbiol. 77:5445–5457. doi:10.1128/AEM.00533-11
ZetaHunter uses a lot of other software internally. Please cite the
following references as well.
Quast, C., E. Pruesse, P. Yilmaz, J. Gerken, T. Schweer, P. Yarza, J. Peplies, and F. O. Glöckner. 2013. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucl. Acids Res. 41(D1): D590-D596.
Pruesse, E., J. Peplies, and F. O. Glöckner. 2012. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes. Bioinformatics 28:1823–1829.
Schloss, P. D., et al. 2009. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl. Environ. Microbiol. 75:7537–7541.
Kopylova, E., L. Noé, and H. Touzet. 2012. SortMeRNA: fast and accurate filtering of ribosomal RNAs in metatranscriptomic data. Bioinformatics. doi:10.1093/bioinformatics/bts611
Edgar, R. C., B. J. Haas, J. C. Clemente, C. Quince, and R. Knight. 2011. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics. doi:10.1093/bioinformatics/btr381
silva.gold.align.gz is from
NOTE: This file will be temporarily unzipped (requires 247 MB of
hard drive space) if chimera checking is turned on.
Lines beginning with # are considered comments.
Headers are split on " " (space) characters. The first part is
taken to be the sequence ID, and it must be unique.
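As a rough illustration of the header rule, the sketch below extracts sequence IDs and rejects duplicates. It assumes FASTA-style input in which header lines start with ">"; the function name `sequence_ids` is hypothetical and this is not ZetaHunter's own parser.

```python
def sequence_ids(fasta_lines):
    """Return sequence IDs in order, raising ValueError on a duplicate ID.

    The ID is the part of the header before the first " " character.
    """
    ids = []
    seen = set()
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence data, not a header
        header = line[1:].rstrip("\n")
        seq_id = header.split(" ")[0]
        if seq_id in seen:
            raise ValueError(f"duplicate sequence ID: {seq_id}")
        seen.add(seq_id)
        ids.append(seq_id)
    return ids
```

Under this rule, the headers ">seq1 sample A" and ">seq1 sample B" would collide, because both reduce to the ID seq1.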
The entropy file needs to be rebuilt each time