qiime
http://qiime.sourceforge.net/install/install.html
Introduction
QIIME (pronounced "chime") stands for Quantitative Insights Into Microbial Ecology. QIIME is a pipeline application that uses numerous third-party applications. QIIME takes users from their raw sequencing output through initial analyses such as OTU picking, taxonomic assignment, and construction of phylogenetic trees from representative sequences of OTUs, and through downstream statistical analysis, visualization, and production of publication-quality graphics.
Usage
Create a .qiime_config or use the template
The proper setup of the ".qiime_config" file is critical to successful operation of QIIME.
Anyone running QIIME MUST have this file in their home directory.
Create a .qiime_config file in your home directory.
A template is available at /sw/qiime/Qiime/.qiime_config
To use it, run:
cp /sw/qiime/Qiime/.qiime_config ~/
OR, for a specific version:
cp /sw/qiime/Qiime/1.5.0/.qiime_config ~/
cp /sw/qiime/Qiime/1.9.1/qiime_config ~/.qiime_config
Modify it to suit you.
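The copy step above overwrites any existing ~/.qiime_config. A minimal sketch of a safer copy is shown here; the function name install_qiime_config and the ".bak" backup suffix are illustrative choices, not part of QIIME or of this installation.

```shell
# Copy a QIIME config template to TARGET, keeping a backup of any existing file.
# Usage: install_qiime_config TEMPLATE [TARGET]   (TARGET defaults to ~/.qiime_config)
# Hypothetical helper: the name and the .bak suffix are assumptions.
install_qiime_config() {
    template=$1
    target=${2:-"$HOME/.qiime_config"}
    if [ ! -r "$template" ]; then
        echo "template not readable: $template" >&2
        return 1
    fi
    # Preserve the previous config so local edits are not lost.
    if [ -e "$target" ]; then
        cp -p "$target" "$target.bak" || return 1
    fi
    cp "$template" "$target"
}
```

For example, install_qiime_config /sw/qiime/Qiime/1.5.0/.qiime_config would copy the 1.5.0 template and leave the old file as ~/.qiime_config.bak.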
cat ~/.qiime_config
# qiime_config
# WARNING: DO NOT EDIT OR DELETE Qiime/qiime_config
# To overwrite defaults, copy this file to $HOME/.qiime_config or a full path
# specified by $QIIME_CONFIG_FP and edit that copy of the file.
cluster_jobs_fp /sw/qiime/Qiime/1.5.0/bin/start_parallel_jobs.py
python_exe_fp python
working_dir
blastmat_dir /sw/bioinformatics/ncbi/2009.3/ncbi/data
blastall_fp blastall
pynast_template_alignment_fp /sw/qiime/greengenes/core_set_aligned.fasta.imputed
pynast_template_alignment_blastdb
template_alignment_lanemask_fp /sw/qiime/greengenes/lanemask_in_1s_and_0s
jobs_to_start 1
seconds_to_sleep 60
qiime_scripts_dir /sw/qiime/Qiime/1.5.0/bin
#qiime_scripts_dir /sw/qiime/Qiime/1.5.0/bin
temp_dir /tmp
denoiser_min_per_core 50
cloud_environment False
topiaryexplorer_project_dir
torque_queue workq
sc_queue all.q
topiaryexplorer_project_dir
assign_taxonomy_reference_seqs_fp
qiime_test_data_dir
Lines which may be altered by the user include:
cluster_jobs_fp
Set to "start_parallel_jobs_torque.py" for cluster jobs
Set to "start_parallel_jobs.py" for single node threading
working_dir
blastmat_dir (default appears above, blast matrices location)
jobs_to_start (the MAXIMUM number of jobs to auto-spawn)
seconds_to_sleep (change is not recommended)
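The user-editable settings listed above can also be changed without opening an editor. The following is a sketch only: set_qiime_config_value is a hypothetical helper, and it assumes the config file holds one whitespace-separated "key value" pair per line, as in the listing shown earlier.

```shell
# Set KEY to VALUE in a QIIME config file.
# Every uncommented line whose first field is KEY is replaced;
# if no such line exists, the pair is appended at the end.
# Hypothetical helper, not shipped with QIIME.
set_qiime_config_value() {
    cfg=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        $1 == k { print k "\t" v; found = 1; next }
        { print }
        END { if (!found) print k "\t" v }
    ' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For example, set_qiime_config_value ~/.qiime_config jobs_to_start 4 would raise the maximum number of auto-spawned jobs to 4.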
Lines which you must pay careful attention to during configuration and testing:
qiime_scripts_dir - must be set to scripts location for your installation
torque_queue - must be set to "workq" to comply with the name of the Gowonda HPC main queue (default from template is friendlyq)
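The two settings above can be checked before submitting jobs. This is a rough sketch under the same "key value per line" assumption; check_qiime_config is a hypothetical name, and the default queue "workq" is taken from the note above.

```shell
# Report problems with qiime_scripts_dir and torque_queue in a QIIME config.
# Usage: check_qiime_config [CONFIG] [EXPECTED_QUEUE]
# Returns 0 when both settings look right, non-zero otherwise.
check_qiime_config() {
    cfg=${1:-"$HOME/.qiime_config"}
    expected_queue=${2:-workq}
    problems=0
    # Use the last uncommented occurrence of each key.
    scripts_dir=$(awk '$1 == "qiime_scripts_dir" { print $2 }' "$cfg" | tail -n 1)
    queue=$(awk '$1 == "torque_queue" { print $2 }' "$cfg" | tail -n 1)
    if [ ! -d "$scripts_dir" ]; then
        echo "qiime_scripts_dir is not a directory: '$scripts_dir'"
        problems=$((problems + 1))
    fi
    if [ "$queue" != "$expected_queue" ]; then
        echo "torque_queue is '$queue', expected '$expected_queue'"
        problems=$((problems + 1))
    fi
    [ "$problems" -eq 0 ]
}
```

Running check_qiime_config with no arguments inspects ~/.qiime_config and prints one line per problem found.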
module command to load qiime dependencies
module load qiime/1.5.0
module load qiime/1.9.1
Testing
module load qiime/1.5.0
print_qiime_config.py -t

System information
==================
Platform: linux2
Python version: 2.7.1 (r271:86832, Jun 29 2011, 09:08:45) [GCC 4.4.5 20110214 (Red Hat 4.4.5-6)]
Python executable: /sw/python/2.7.1/bin/python

Dependency versions
===================
PyCogent version: 1.5.1
NumPy version: 1.5.1
matplotlib version: 1.1.0
biom-format version: 0.9.3
QIIME library version: 1.5.0
QIIME script version: 1.5.0
PyNAST version (if installed): 1.1
RDP Classifier version (if installed): rdp_classifier-2.3.jar

QIIME config values
===================
blastmat_dir: /sw/bioinformatics/ncbi/2009.3/ncbi/data
sc_queue: all.q
topiaryexplorer_project_dir: None
pynast_template_alignment_fp: /sw/qiime/greengenes/core_set_aligned.fasta.imputed
cluster_jobs_fp: /sw/qiime/Qiime/1.5.0/bin/start_parallel_jobs.py
pynast_template_alignment_blastdb: None
assign_taxonomy_reference_seqs_fp: None
torque_queue: workq
qiime_test_data_dir: None
template_alignment_lanemask_fp: /sw/qiime/greengenes/lanemask_in_1s_and_0s
jobs_to_start: 1
cloud_environment: False
qiime_scripts_dir: /sw/qiime/Qiime/1.5.0/bin
denoiser_min_per_core: 50
working_dir: None
python_exe_fp: python
temp_dir: /tmp
blastall_fp: blastall
seconds_to_sleep: 60
assign_taxonomy_id_to_taxonomy_fp: None

running checks:

test_FastTree_supported_version (__main__.Qiime_config)
FastTree is in path and version is supported ... ok
test_INFERNAL_supported_version (__main__.Qiime_config)
INFERNAL is in path and version is supported ... ok
test_ParsInsert_supported_version (__main__.Qiime_config)
ParsInsert is in path and version is supported ... ok
test_R_supported_version (__main__.Qiime_config)
R is in path and version is supported ... ok
test_ampliconnoise_install (__main__.Qiime_config)
AmpliconNoise install looks sane. ... ok
test_blast_supported_version (__main__.Qiime_config)
blast is in path and version is supported ... ok
test_blastall_fp (__main__.Qiime_config)
blastall_fp is set to a valid path ... ok
test_blastmat_dir (__main__.Qiime_config)
blastmat_dir is set to a valid path. ... ok
test_cdbtools_supported_version (__main__.Qiime_config)
cdbtools is in path and version is supported ... ok
test_cdhit_supported_version (__main__.Qiime_config)
cd-hit is in path and version is supported ... ok
test_chimeraSlayer_install (__main__.Qiime_config)
no obvious problems with ChimeraSlayer install ... ok
test_clearcut_supported_version (__main__.Qiime_config)
clearcut is in path and version is supported ... ok
test_cluster_jobs_fp (__main__.Qiime_config)
cluster_jobs_fp is set to a valid path and is executable ... ok
test_denoiser_supported_version (__main__.Qiime_config)
denoiser aligner is ready to use ... ok
test_for_obsolete_values (__main__.Qiime_config)
local qiime_config has no extra params ... ok
test_matplotlib_suported_version (__main__.Qiime_config)
maptplotlib version is supported ... ok
test_mothur_supported_version (__main__.Qiime_config)
mothur is in path and version is supported ... ok
test_muscle_supported_version (__main__.Qiime_config)
muscle is in path and version is supported ... ok
test_numpy_suported_version (__main__.Qiime_config)
numpy version is supported ... ok
test_pplacer_supported_version (__main__.Qiime_config)
pplacer is in path and version is supported ... ok
test_pynast_suported_version (__main__.Qiime_config)
pynast version is supported ... ok
test_pynast_template_alignment_blastdb_fp (__main__.Qiime_config)
pynast_template_alignment_blastdb, if set, is set to a valid path ... ok
test_pynast_template_alignment_fp (__main__.Qiime_config)
pynast_template_alignment, if set, is set to a valid path ... ok
test_python_exe_fp (__main__.Qiime_config)
python_exe_fp is set to a working python env ... ok
test_python_supported_version (__main__.Qiime_config)
python is in path and version is supported ... ok
test_qiime_scripts_dir (__main__.Qiime_config)
qiime_scripts_dir, if set, is set to a valid path ... ok
test_qiime_test_data_dir (__main__.Qiime_config)
qiime_test_data_dir, if set, is set to a valid path ... ok
test_raxmlHPC_supported_version (__main__.Qiime_config)
raxmlHPC is in path and version is supported ... ok
test_rtax_supported_version (__main__.Qiime_config)
rtax is in path and version is supported ... ok
test_temp_dir (__main__.Qiime_config)
temp_dir, if set, is set to a valid path ... ok
test_template_alignment_lanemask_fp (__main__.Qiime_config)
template_alignment_lanemask, if set, is set to a valid path ... ok
test_uclust_supported_version (__main__.Qiime_config)
uclust is in path and version is supported ... ok
test_usearch_supported_version (__main__.Qiime_config)
usearch is in path and version is supported ... ok
test_working_dir (__main__.Qiime_config)
working_dir, if set, is set to a valid path ... ok

----------------------------------------------------------------------
Ran 34 tests in 0.318s

OK
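With 34 checks it is easy to miss a failure in the output. A small sketch for tallying results from a saved log follows; summarise_qiime_tests is a hypothetical helper, and it assumes each result line ends in " ... STATUS" as in the output above (the unittest statuses ok, FAIL and ERROR).

```shell
# Count test results by status in a saved "print_qiime_config.py -t" log.
# Prints one "STATUS COUNT" line per status, sorted by status name.
summarise_qiime_tests() {
    awk '
        / \.\.\. [A-Za-z]+$/ { count[$NF]++ }
        END { for (s in count) printf "%s %d\n", s, count[s] }
    ' "$1" | sort
}
```

One way to use it: save the output with print_qiime_config.py -t > qiime_tests.log 2>&1, then run summarise_qiime_tests qiime_tests.log.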
Installation
External Dependencies
QIIME 1.5.0 relies on the following third-party applications and reference data sets (some are optional):
AmpliconNoise 1.25
biom-format 0.9.3
blast-2.2.22
cd-hit 3.1.1
cdbtools
ChimeraSlayer
clearcut v1.0.9
Cytoscape v2.7.0 (visualization)
fasttree 2.1.3
ghc 6.8
greengenes alignment lanemask file
greengenes core set data file
GSL
infernal 1.0.2
jre1.6.0_05
MatPlotLib 1.1.0
mothur 1.25.0
muscle 3.8.31
Numpy 1.5.1
ParsInsert 1.04
pplacer 1.1
PyCogent 1.5.1
PyNAST 1.1
Python 2.7.1
raxml 7.3.0
rdp_classifier-2.2
rtax 0.981
sfffile and sffinfo (proprietary)
uclust 1.2.22q
usearch v5.2.32
Pre-requisite
Load the apps modules files:
module load python/2.7.1
module load qiime/1.3.0
module load bioinformatics/uclust/1.2.22
module load bioinformatics/fasttree/2.1.3
module load bioinformatics/rdpclassifier/2.3
module load R/2.13.0
module load blast/2.2.25
module load bioinformatics/cd-hit/4.5.4
module load bioinformatics/cdbfasta/0.99
module load bioinformatics/chimeraslayer/r20110519
module load bioinformatics/mothur/1.21.1
module load bioinformatics/clearcut/1.0.9
module load bioinformatics/raxml/7.2.8a-mpi #(other versions available)
module load bioinformatics/infernal/1.0.2-mpi #(OR module load bioinformatics/infernal/1.0.2-serial)
module load bioinformatics/mafft/6.857 # (OR: module load bioinformatics/mafft/6.857-extensions)
module load bioinformatics/muscle/3.8.31
module load gsl/gsl-1.15
module load bioinformatics/greengenes/gg
module load bioinformatics/ampliconnoise/1.25
module load bioinformatics/ghc/7.0.3
module load cytoscape/2.8.1
module load mpi/openMPI/1.4.3-gnu #(optional)
Also installed on login nodes: yum groupinstall "Development Tools"
Development Tools on the image
mount --bind /proc/ /compute/proc/
mount --bind /dev /compute/dev
yum --installroot=/compute/ groupinstall "Development Tools"
umount /compute/dev
umount /compute/proc
Installation of Dependent apps
Core Apps
Python (Already Installed)
http://confluence.rcs.griffith.edu.au:8080/display/GHPC/python
module load python/2.7.1
PyCogent (Already Installed)
http://confluence.rcs.griffith.edu.au:8080/display/GHPC/PyCogent
http://pycogent.sourceforge.net/
module load python/2.7.1
sh /sw/pycogent/PyCogent-1.5.1/run_tests
Numpy (Already Installed)
>>> import numpy
>>> numpy.__version__
'1.5.1'
http://confluence.rcs.griffith.edu.au:8080/display/GHPC/numpy
module load python/2.7.1
PyNAST alignment, tree-building, taxonomy assignment, OTU picking, and other data generation steps (required in default pipeline)
uclust
version: 1.2.22
module load bioinformatics/uclust/1.2.22
http://www.drive5.com/uclust/downloads1_2_22q.html
Extreme high-speed sequence clustering, alignment and database search
Files are in "tarball" (.tar.gz) format. To extract, use:
tar -zxf filename
Linux binaries must have the executable bit set. Sometimes this bit gets lost (not sure why), in which case you will need to set it using:
chmod a+x filename
mkdir -p /sw/qiime/uclust/1.2.22
cp /data1/qiime/uclustq1.2.22_i86linux64 /sw/qiime/uclust/1.2.22
cd /sw/qiime/uclust/1.2.22
chmod a+x uclustq1.2.22_i86linux64
mv uclustq1.2.22_i86linux64 uclust
cp uclust uclustq1.2.22_i86linux64
Place the following in the PATH
/sw/qiime/uclust/1.2.22
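On this system the module file normally takes care of PATH. For a manual setup, a minimal sketch that adds a directory only once is shown here; path_prepend is a hypothetical helper name.

```shell
# Prepend DIR to PATH unless it is already listed.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}
```

For example, path_prepend /sw/qiime/uclust/1.2.22 makes the uclust binary installed above visible to the shell, and calling it again has no further effect.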
PyNAST
version: 1.1
PyNAST is a python implementation of the NAST sequence alignment tool.
http://sourceforge.net/projects/pynast/
http://pynast.sourceforge.net/install.html
module load python/2.7.1
Citing PyNAST
==============
If you make use of PyNAST in published work, please cite:
*PyNAST: a flexible tool for aligning sequences to a template alignment.* J. Gregory Caporaso, Kyle Bittinger, Frederic D. Bushman, Todd Z. DeSantis, Gary L. Andersen, and Rob Knight. January 15, 2010, DOI 10.1093/bioinformatics/btp636. Bioinformatics 26: 266-267.
Need help?
==========
For PyNAST support, you can contact Greg Caporaso @ gregcaporaso AT gmail.com.
mkdir -p /sw/qiime/PyNAST
cd /sw/qiime/PyNAST
cp -i /data1/qiime/PyNAST-1.1.tgz .
tar -zxvf PyNAST-1.1.tgz
mv PyNAST-1.1 1.1
cd /sw/qiime/PyNAST/1.1
python setup.py install
cd tests
python all_tests.py
cd
pynast -h
Output from PyNAST install:
python setup.py install
running install
running build
running build_py
creating build
creating build/lib
creating build/lib/pynast
copying pynast/__init__.py -> build/lib/pynast
copying pynast/logger.py -> build/lib/pynast
copying pynast/util.py -> build/lib/pynast
running build_scripts
creating build/scripts-2.7
copying and adjusting scripts/pynast -> build/scripts-2.7
changing mode of build/scripts-2.7/pynast from 644 to 755
running install_lib
creating /sw/python/2.7.1/lib/python2.7/site-packages/pynast
copying build/lib/pynast/__init__.py -> /sw/python/2.7.1/lib/python2.7/site-packages/pynast
copying build/lib/pynast/logger.py -> /sw/python/2.7.1/lib/python2.7/site-packages/pynast
copying build/lib/pynast/util.py -> /sw/python/2.7.1/lib/python2.7/site-packages/pynast
byte-compiling /sw/python/2.7.1/lib/python2.7/site-packages/pynast/__init__.py to __init__.pyc
byte-compiling /sw/python/2.7.1/lib/python2.7/site-packages/pynast/logger.py to logger.pyc
byte-compiling /sw/python/2.7.1/lib/python2.7/site-packages/pynast/util.py to util.pyc
running install_scripts
copying build/scripts-2.7/pynast -> /sw/python/2.7.1/bin
changing mode of /sw/python/2.7.1/bin/pynast to 755
running install_egg_info
Writing /sw/python/2.7.1/lib/python2.7/site-packages/PyNAST-1.1-py2.7.egg-info
To get usage information for the PyNAST command line application run:
pynast -h
greengenes core set data file
http://greengenes.lbl.gov/Download/Sequence_Data/Fasta_data_files/
http://greengenes.lbl.gov/Download/Sequence_Data/Fasta_data_files/core_set_aligned.fasta.imputed
mkdir -p /sw/qiime/greengenes
cd /sw/qiime/greengenes
wget http://greengenes.lbl.gov/Download/Sequence_Data/Fasta_data_files/core_set_aligned.fasta.imputed
wget http://greengenes.lbl.gov/Download/Sequence_Data/lanemask_in_1s_and_0s
module load bioinformatics/greengenes/gg
fasttree 2.1.3
http://www.microbesonline.org/fasttree/
module load bioinformatics/fasttree/2.1.3
mkdir -p /sw/qiime/fasttree/2.1.3
wget http://www.microbesonline.org/fasttree/FastTree-2.1.3.c
gcc -lm -O3 -finline-functions -funroll-loops -Wall -o FastTree FastTree-2.1.3.c
For the multi-threaded "FastTreeMP", use:
gcc -DOPENMP -fopenmp -lm -O3 -finline-functions -funroll-loops -Wall -o FastTreeMP FastTree-2.1.3.c
jre1.6.0_05
ls -la /usr/bin/java
lrwxrwxrwx. 1 root root 22 May 26 05:30 /usr/bin/java -> /etc/alternatives/java
ls -la /etc/alternatives/java
lrwxrwxrwx. 1 root root 35 May 26 05:30 /etc/alternatives/java -> /usr/lib/jvm/jre-1.6.0-sun/bin/java
rdp_classifier-2.3
http://rdp.cme.msu.edu/
http://sourceforge.net/projects/rdp-classifier/files/rdp-classifier/rdp_classifier_2.3/rdp_classifier_2.3.zip/download
module load bioinformatics/rdpclassifier/2.3
General RDP Contact
Ribosomal Database Project
2225A Biomedical and Physical Sciences Building
Michigan State University
East Lansing, MI, 48824
rdpstaff@msu.edu
Define an RDP_JAR_PATH variable in the module file.
set base_path /sw/qiime/
set rdp_base $base_path/rdp_classifier/2.3
setenv RDP_JAR_PATH $rdp_base/rdp_classifier-2.3.jar
OR:
setenv RDP_JAR_PATH /sw/qiime/rdp_classifier/2.3/rdp_classifier-2.3.jar
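After loading the module, it can help to confirm that RDP_JAR_PATH really points at the jar. The check below is a sketch; check_rdp_jar is a hypothetical helper and only tests that the variable is set and names an existing file.

```shell
# Verify that RDP_JAR_PATH is set and points to an existing file.
check_rdp_jar() {
    if [ -z "${RDP_JAR_PATH:-}" ]; then
        echo "RDP_JAR_PATH is not set" >&2
        return 1
    fi
    if [ ! -f "$RDP_JAR_PATH" ]; then
        echo "RDP_JAR_PATH does not point to a file: $RDP_JAR_PATH" >&2
        return 1
    fi
    echo "RDP classifier jar: $RDP_JAR_PATH"
}
```

A typical use would be module load bioinformatics/rdpclassifier/2.3 followed by check_rdp_jar.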
Alignment, tree-building, taxonomy assignment, OTU picking, and other data generation steps (required for alternative pipelines)
blast-2.2.25
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/
(legacy BLAST from NCBI, NOT BLAST+)
ncbi-toolbox:
ftp://ftp.ncbi.nih.gov/toolbox/
Install Directory:
/sw/blast/2.2.25
module load blast/2.2.25
module display blast/2.2.25
-------------------------------------------------------------------
/sw/com/modulefiles/blast/2.2.25:

prepend-path PATH /sw/blast/2.2.25/ncbi/bin
prepend-path MANPATH /sw/blast/2.2.25/ncbi/doc/man
setenv NCBI /sw/blast/2.2.25/local
-------------------------------------------------------------------
cd-hit
Version: v4.5.4-2011-03-07
http://bioinformatics.org/cd-hit/
CD-HIT was originally a protein clustering program. The main advantage of this program
is its ultra-fast speed. It can be hundreds of times faster than other clustering programs, for
example, BLASTCLUST. Therefore it can handle very large databases, like NR.
Current CD-HIT package can perform various jobs like clustering a protein database,
clustering a DNA/RNA database, comparing two databases (protein or DNA/RNA),
generating protein families, and many others.
Most CD-HIT programs were written in C++. Installing CD-HIT package is very simple:
1. download current CD-HIT at http://bioinformatics.org/cd-hit/, for example cd-hit-2006-0215.tar.gz
2. unpack the file with “tar xvf cd-hit-XXX.tar.gz --gunzip”
3. change dir by “cd cd-hit”
4. compile the programs by “make”
5. you will have all cd-hit programs compiled
User guide: http://confluence.rcs.griffith.edu.au:8080/download/attachments/25952300/cdhit-user-guide.pdf
cd /sw/qiime
tar -zxvf /data1/cd-hit-v4.5.4-2011-03-07.tgz
mkdir /sw/qiime/cd-hit
cd /sw/qiime/cd-hit
mv ../cd-hit-v4.5.4-2011-03-07 4.5.4
cd /sw/qiime/cd-hit/4.5.4
You can take advantage of the multi-threaded function of cd-hit to speed up calculation:
make openmp=yes 2>&1 |tee cd-hit-install.txt
module load bioinformatics/cd-hit/4.5.4
cdbtools
ftp://occams.dfci.harvard.edu/pub/bio/tgi/software/cdbfasta/cdbfasta.tar.gz
cd /sw/qiime
tar -zxvf /data1/cdbfasta.tar.gz
mv cdbfasta 0.99
mkdir /sw/qiime/cdbfasta
mv 0.99 /sw/qiime/cdbfasta
cd /sw/qiime/cdbfasta/0.99
export GCLDIR=/sw/qiime/cdbfasta/gclib
make 2>&1 |tee cdbfasta.txt
Usage:
module load bioinformatics/cdbfasta/0.99
ChimeraSlayer
module load bioinformatics/chimeraslayer/r20110519
Requirement:
The following software tools must be separately installed and made available via your standard PATH setting.
megablast: http://www.ncbi.nlm.nih.gov/BLAST/download.shtml
Installed. See: http://confluence.rcs.griffith.edu.au:8080/display/GHPC/qiime#qiime-blast2.2.25
module load blast/2.2.25
module load bioinformatics/cdbfasta/0.99
http://qiime.sourceforge.net/install/install.html#chimeraslayer-install
Source: http://sourceforge.net/projects/microbiomeutil/
module load blast/2.2.25
module load bioinformatics/cdbfasta/0.99
cd /sw/qiime
tar -zxvf /data1/microbiomeutil-r20110519.tgz
mkdir microbiomeutil
mv microbiomeutil-r20110519 microbiomeutil/r20110519
cd /sw/qiime/microbiomeutil/r20110519
make 2>&1 |tee microbiomeutil_install.txt
Run tests in the following order:
make testNast 2>&1 |tee make_testNast.txt
make testChimeraSlayer 2>&1 |tee make_testChimeraSlayer.txt
make testWigeon 2>&1 |tee make_testWigeon.txt
http://microbiomeutil.sourceforge.net/
mothur
module load bioinformatics/mothur/1.21.1
http://www.mothur.org/wiki/Installation
http://www.mothur.org/w/images/6/64/Mothur.1.21.1.zip
http://www.mothur.org/wiki/Download_mothur
http://www.mothur.org/
cd /sw/qiime
unzip /data1/Mothur.1.21.1.zip
mkdir mothur
mv Mothur.source mothur/1.21.1
cd /sw/qiime/mothur/1.21.1
module load mpi/openMPI/1.4.3-gnu
vi makefile

Make the following changes:

Change from:
============
CC_OPTIONS = -O3
CXXFLAGS += -O3
#in 1.13 or 1.12 CC_OPTIONS = CXXFLAGS

To:
===
CC_OPTIONS = -O3 -mtune=native -march=native -m64
CXXFLAGS += -O3 -m64
#in 1.13 or 1.12 CC_OPTIONS = CXXFLAGS
64BIT_VERSION ?= yes
USEMPI ?= yes
MOTHUR_FILES="\"/sw/qiime/mothur/1.21.1\""
#if you are a linux user use the following line
CXXFLAGS += -mtune=native -march=native -m64
#if you are a mac user use the following line
#TARGET_ARCH += -arch x86_64 <===Comment this out
yum install readline-devel ncurses-devel
==> Install these on the compute nodes as well, as they are important for the mothur install:
mount --bind /proc/ /compute/proc/
mount --bind /dev /compute/dev
yum --installroot=/compute/ install readline-devel ncurses-devel
umount /compute/dev
umount /compute/proc
module load mpi/openMPI/1.4.3-gnu
make 2>&1 |tee make_mothur_install.txt
mothur 1.25.0
mkdir -p /sw/qiime/mothur/1.25.0/src
cd /sw/qiime/mothur/1.25.0/src
wget http://www.mothur.org/w/images/6/6d/Mothur.1.25.0.zip
unzip Mothur.1.25.0.zip
cd /sw/qiime/mothur/1.25.0/src/Mothur.source
module load mpi/openMPI/1.4.3-gnu
vi makefile
Edit makefile and comment out line 31:
#TARGET_ARCH += -arch x86_64
make 2>&1 |tee makeLog.txt
mkdir /sw/qiime/mothur/1.25.0/bin
cp mothur /sw/qiime/mothur/1.25.0/bin
(or ln -s /sw/qiime/mothur/1.25.0/mothur /sw/qiime/mothur/1.25.0/bin/mothur)
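The makefile edit above (commenting out the TARGET_ARCH line) can be scripted instead of done by hand in vi, which avoids depending on the line number. This is a sketch only; comment_out_target_arch is a hypothetical helper, and it matches the line by its "TARGET_ARCH +=" prefix rather than by position.

```shell
# Comment out every uncommented "TARGET_ARCH +=" line in a makefile.
# The file is rewritten through a temporary copy so it works with any POSIX sed.
comment_out_target_arch() {
    makefile=$1
    tmp=$(mktemp) || return 1
    # "&" re-inserts the matched text, so the line is kept and prefixed with "#".
    sed 's/^[[:space:]]*TARGET_ARCH[[:space:]]*+=/#&/' "$makefile" > "$tmp" \
        && cat "$tmp" > "$makefile"
    status=$?
    rm -f "$tmp"
    return $status
}
```

Run from the Mothur.source directory, comment_out_target_arch makefile would replace the manual edit before running make.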
Clearcut
http://www.mothur.org/wiki/Clearcut
module load bioinformatics/clearcut/1.0.9
Install Dir: /sw/qiime/Clearcut
cd /sw/qiime
unzip /data1/Clearcut.source.zip
mv clearcut 1.0.9
mkdir -p /sw/qiime/clearcut
mv 1.0.9 clearcut/
cd /sw/qiime/clearcut/1.0.9
Edit "Makefile" and select the appropriate optimization flags
Type "make" to compile and link clearcut
Type "make install" to install clearcut on your system
vi Makefile
# Specify your compiler here
CC = mpicc
# DEFAULT GCC OPTIMIZATION CONFIGURATION (ALL ARCHITECTURES)
#CFLAGS = -O3 -Wall -funroll-loops -fomit-frame-pointer
CFLAGS = -O3 -Wall -funroll-loops -fomit-frame-pointer -mtune=native -march=native -m64

make 2>&1 |tee make_clearcut_install.txt
mkdir bin
cd /sw/qiime/clearcut/1.0.9/bin
RAxML v7.2.8alpha
module load bioinformatics/raxml/7.2.8a-mpi
http://www.exelixis-lab.org/
http://wwwkramer.in.tum.de/exelixis/countSource728.php
Documentation:
http://wwwkramer.in.tum.de/exelixis/software.html
http://confluence.rcs.griffith.edu.au:8080/download/attachments/25952300/RAxML-Manual.7.0.4.pdf
Installation instructions:
==========================
This version comes in three flavors:
1. raxmlHPC just the standard sequential version, compile it with gcc by typing make -f Makefile.gcc
for LINUX and MAC.
2. raxmlHPC-PTHREADS the Pthreads-parallelized version of RAxML which is intended for shared-memory
and multi-core architectures. It is compiled with the gcc compiler by typing make -f Makefile.PTHREADS
or make -f Makefile.PTHREADS.MAC on MACs.
3. raxmlHPC-MPI the MPI-parallelized version for all types of clusters to perform parallel bootstraps, rapid
parallel bootstraps, or multiple inferences on the original alignment, compile with the mpicc (MPI)
compiler by typing make -f Makefile.MPI.
Other compilers: It might make sense to use the now much improved Intel-compiler icc instead of gcc
on some systems. The icc version 10.0 I have on my laptop produces 20-30% faster code than gcc.
IMPORTANT WARNING FOR MPI and PTHREADS VERSIONS: If you want to compile the MPI or
PTHREADS version of RAxML but have previously compiled the sequential version, make sure to remove
all object files of the sequential code by typing “rm *.o”, everything needs to be re-compiled for MPI and
PTHREADS!
When to use which Version? Check the user guide.
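The warning above about stale object files is easy to forget when building several flavors from one source tree. A minimal sketch of the cleanup step follows; clean_raxml_objects is a hypothetical helper that does the same job as "rm *.o" but reports what it removed.

```shell
# Remove object files left behind by a previous RAxML build.
# Usage: clean_raxml_objects [SRC_DIR]   (defaults to the current directory)
clean_raxml_objects() {
    src_dir=${1:-.}
    removed=0
    for obj in "$src_dir"/*.o; do
        [ -e "$obj" ] || continue      # the glob matched nothing
        rm -f "$obj" && removed=$((removed + 1))
    done
    echo "removed $removed object file(s) from $src_dir"
}
```

For example, clean_raxml_objects . followed by make -f Makefile.PTHREADS.gcc rebuilds the Pthreads flavor from a tree that previously held the sequential build.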
mkdir -p /sw/bioinformatics/raxml
mkdir -p /sw/bioinformatics/raxml/7.2.8a/mpi/ompi
cd /sw/bioinformatics/raxml/7.2.8a/mpi/ompi
bunzip2 /data1/RAxML-7.2.8-ALPHA.tar.bz2
tar -xvf /data1/RAxML-7.2.8-ALPHA.tar

MPI version
===========
cd /sw/bioinformatics/raxml/7.2.8a/mpi/ompi
make -f Makefile.MPI.gcc 2>&1 |tee make_RAxML_install.txt

Pthreads version (installed under omp/)
=======================================
mkdir /sw/bioinformatics/raxml/7.2.8a/omp
cd /sw/bioinformatics/raxml/7.2.8a/omp
tar -xvf /data1/RAxML-7.2.8-ALPHA.tar
mv RAxML-7.2.8-ALPHA/* .
rmdir RAxML-7.2.8-ALPHA
make -f Makefile.PTHREADS.gcc 2>&1 |tee make_RAxML_openMI_install.txt
mkdir bin;cp -rp raxmlHPC-PTHREADS bin/

Serial
======
mkdir /sw/bioinformatics/raxml/7.2.8a/serial
cd /sw/bioinformatics/raxml/7.2.8a/serial
tar -xvf /data1/RAxML-7.2.8-ALPHA.tar
mv RAxML-7.2.8-ALPHA/* .
rmdir RAxML-7.2.8-ALPHA
make -f Makefile.gcc 2>&1 |tee make_RAxML_serial_install.txt
mkdir bin;cp -rp raxmlHPC bin/
raxml-7.3.0
raxmlHPC
========
mkdir -p /sw/bioinformatics/raxml/7.3.0/src
cd /sw/bioinformatics/raxml/7.3.0/src
wget ftp://thebeast.colorado.edu/pub/QIIME-v1.5.0-dependencies/stamatak-standard-RAxML-5_7_2012.tgz
tar -zxvf stamatak-standard-RAxML-5_7_2012.tgz
cd /sw/bioinformatics/raxml/7.3.0/src/stamatak-standard-RAxML-5_7_2012
make -f Makefile.SSE3.gcc
mkdir /sw/bioinformatics/raxml/7.3.0/bin
cp raxmlHPC-SSE3 /sw/bioinformatics/raxml/7.3.0/bin/raxmlHPC
#OR: ln -s /sw/bioinformatics/raxml/7.3.0/raxmlHPC-SSE3 /sw/bioinformatics/raxml/7.3.0/bin/raxmlHPC
infernal 1.0.2
module load bioinformatics/infernal/1.0.2-mpi
OR
module load bioinformatics/infernal/1.0.2-serial
Sequence analysis using profiles of RNA secondary structure consensus
ftp://selab.janelia.org/pub/software/infernal/
http://confluence.rcs.griffith.edu.au:8080/download/attachments/25952300/infernal-Userguide.pdf
Installation instructions from the user guide
=============================================
Download the source tarball (infernal.tar.gz) from ftp://selab.janelia.org/pub/software/infernal/ or http://infernal.janelia.org.
Unpack the software:
> tar xvf infernal.tar.gz
Go into the newly created top-level directory (named either infernal, or infernal-xx where xx is a release number):
> cd infernal
Configure for your system, and build the programs:
> ./configure --prefix=/usr/mylocal --enable-mpi
> make
Run the automated testsuite. This is optional. All these tests should pass:
> make check
The programs are now in the src/ subdirectory. The user's guide (this document) is in the documentation/userguide subdirectory. The man pages are in the documentation/manpages subdirectory. You can manually move or copy all of these to appropriate locations if you want. You will want the programs to be in your $PATH.
> make install
To run a program in MPI mode, you must run them in an MPI environment with mpirun or mpiexec, with the --mpi option enabled. For example, in our LAM environment:
> mpirun C cmsearch --mpi query.cm target.fa
Other environments besides LAM MPI should work also, but may require different command syntax.
mkdir -p /sw/bioinformatics/infernal/1.0.2
cd /sw/bioinformatics/infernal/1.0.2/
tar -zxvf /data1/infernal.tar.gz
mkdir ompi
mkdir serial
cp -rp infernal-1.0.2 ompi/
cp -rp infernal-1.0.2 serial/

Using openMPI
=============
cd /sw/bioinformatics/infernal/1.0.2/ompi/infernal-1.0.2
module load mpi/openMPI/1.4.3-gnu
./configure --prefix=/sw/bioinformatics/infernal/1.0.2/ompi --enable-mpi 2>&1 |tee configure_infernal-1.0.2.txt
make 2>&1 |tee make_infernal-1.0.2.txt
make check
(output: All 59 exercises at level <= 4 passed.)
make install

Serial build
============
cd /sw/bioinformatics/infernal/1.0.2/serial/infernal-1.0.2
./configure --prefix=/sw/bioinformatics/infernal/1.0.2/serial 2>&1 |tee configure_serial_infernal-1.0.2.txt
make 2>&1 |tee make_serial_infernal-1.0.2.txt
make check
make install
muscle 3.8.31
Multiple sequence alignment
Faster and more accurate than CLUSTALW
http://drive5.com/muscle/
http://www.drive5.com/muscle/downloads.htm
http://www.drive5.com/muscle/muscle_userguide3.8.html
Userguide ==> http://confluence.rcs.griffith.edu.au:8080/download/attachments/25952300/muscle_userguide3.8.pdf
module load bioinformatics/muscle/3.8.31
Installation
============
Copy the muscle binary file to a directory that is accessible from your computer. That's it—there are no configuration files, libraries, environment variables or other settings to worry about. From now on muscle should be understood to mean "the file or path name of your executable file".
mkdir -p /sw/bioinformatics/muscle/3.8.31
cd /sw/bioinformatics/muscle/3.8.31
cp /data1/muscle3.8.31_i86linux64 .
mv muscle3.8.31_i86linux64 muscle
chmod +x muscle
sfffile and sffinfo
Not installed as it is proprietary
Processing sff files:
sfffile and sffinfo (optional, QIIME 1.2.0 and later contain built-in tools for processing sff files although they are about 10x slower than the tools from Roche) (license: proprietary - must be obtained from Roche/454)
Denoising 454 data
GNU Science Library
module load gsl/gsl-1.15
http://confluence.rcs.griffith.edu.au:8080/display/GHPC/gsl
MAFFT
http://mafft.cbrc.jp/alignment/software/
MAFFT is a prerequisite for AmpliconNoise. It is a multiple sequence alignment program for Unix-like operating systems. It offers a range of multiple alignment methods: L-INS-i (accurate; for alignment of <~200 sequences), FFT-NS-2 (fast; for alignment of <~10,000 sequences), etc.
module load bioinformatics/mafft/6.857
OR
module load bioinformatics/mafft/6.857-extensions
http://mafft.cbrc.jp/alignment/software/installation_without_root.html
http://mafft.cbrc.jp/alignment/software/source.html
mkdir /sw/bioinformatics/mafft
tar -zxvf /data1/mafft-6.857-with-extensions-src.tgz
mv mafft-6.857-with-extensions 6.857
cd /sw/bioinformatics/mafft/src/core

To enable multithreading, uncomment line 8 of Makefile.
vi Makefile (any other text editor is ok.)
ENABLE_MULTITHREAD = -Denablemultithread
#Uncomment this to enable multithreading (linux only)

>>
PREFIX = /sw/bioinformatics/mafft/6.857
LIBDIR = $(PREFIX)/libexec/mafft
BINDIR = $(PREFIX)/bin
MANDIR = $(PREFIX)/share/man/man1
#MNO_CYGWIN = -mno-cygwin
ENABLE_MULTITHREAD = -Denablemultithread
>>>>>>

make clean
make 2>&1 |tee make_muffat.txt
make install 2>&1 |tee make_install_muffat.txt

MAFFT Extensions
================
cd /sw/bioinformatics/mafft/src/extensions
vi Makefile
>>>>>
PREFIX = /sw/bioinformatics/mafft/6.857
>>>>>
make clean
make 2>&1 |tee make_mafft_ext.txt
make install 2>&1 |tee make_install_muffat_ext.txt
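The PREFIX edit is made twice above, once in the core Makefile and once in the extensions Makefile. As a sketch of how that could be scripted, the helper below rewrites the first "PREFIX =" line; set_makefile_prefix is a hypothetical name, and the assumption is that the Makefile defines PREFIX at the start of a line, as in the excerpt above.

```shell
# Replace the first "PREFIX = ..." line of a Makefile with a new install prefix.
# Returns non-zero and leaves the file untouched if no PREFIX line is found.
set_makefile_prefix() {
    makefile=$1 prefix=$2
    tmp=$(mktemp) || return 1
    awk -v p="$prefix" '
        !seen && /^PREFIX[ \t]*=/ { print "PREFIX = " p; seen = 1; next }
        { print }
        END { exit(seen ? 0 : 1) }
    ' "$makefile" > "$tmp" && cat "$tmp" > "$makefile"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For example, set_makefile_prefix Makefile /sw/bioinformatics/mafft/6.857 could be run in both the core and extensions directories before make.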
AmpliconNoise 1.25
module load bioinformatics/ampliconnoise/1.25
AmpliconNoise is a collection of programs for the removal of noise from 454 sequenced PCR amplicons. It involves two steps: the removal of noise from the sequencing itself, and the removal of PCR point errors. This project also includes the Perseus algorithm for chimera removal.
A cluster is not strictly necessary, but reasonably sized data sets will only run on a cluster or a good server.
A version of Message Passing Interface (MPI) is necessary to install the programs. OpenMPI is a good choice.
http://code.google.com/p/ampliconnoise/
http://code.google.com/p/ampliconnoise/downloads/detail?name=AmpliconNoiseV1.25.tar.gz&can=2&q=
User Guide ==> http://confluence.rcs.griffith.edu.au:8080/download/attachments/25952300/AmpliconNoiseV1.25UserGuide.pdf
Install Directory : /sw/bioinformatics/AmpliconNoise/1.25/
cd /sw/bioinformatics
tar -zxvf /data1/AmpliconNoiseV1.25.tar.gz
mkdir -p AmpliconNoise/1.25
mv AmpliconNoiseV1.25 AmpliconNoise/1.25/
cd /sw/bioinformatics/AmpliconNoise/1.25/AmpliconNoiseV1.25
module load gsl/gsl-1.15
module load bioinformatics/mafft/6.857
make clean
make 2>&1 |tee make_ampliconNoise.txt
make install
GHC
The Glasgow Haskell Compiler
module load bioinformatics/ghc/7.0.3
http://haskell.org/ghc/
Documentation: ==> http://haskell.org/haskellwiki/GHC
User guide: http://confluence.rcs.griffith.edu.au:8080/download/attachments/25952300/GHC-users_guide.pdf
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Stop!
For most users, we recommend installing the Haskell Platform instead of GHC. The current Haskell Platform release includes a recent GHC release as well as some other tools (such as cabal), and a larger set of libraries that are known to work together.
This 7.2.1 release is intended to be more of a "technology preview" than normal GHC stable branches. In particular, it supports a significantly improved version of DPH, as well as new features such as compiler plugins and "safe Haskell". The design of these new features may evolve as we get more experience with them.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
For the reasons above, installing the Haskell Platform was considered (see the notes below). However, the platform itself requires GHC 7.0.3, so a generic version of GHC was installed.
mkdir /sw/bioinformatics/ghc
cd /sw/bioinformatics/ghc
bunzip2 /data1/ghc-7.0.3-x86_64-unknown-linux.tar.bz2
tar -xvf /data1/ghc-7.0.3-x86_64-unknown-linux.tar
mkdir /sw/bioinformatics/ghc/7.0.3
cd /sw/bioinformatics/ghc/ghc-7.0.3
./configure --prefix=/sw/bioinformatics/ghc/7.0.3 2>&1 |tee configure_ghc703.txt
(Examine the generated config.status and Makefile files to check if all is well...)
make show-install-setup
make install 2>&1 |tee make_ghc703.txt
Haskell Platform
Not Installed as GHC 7.0.3 was installed.
http://hackage.haskell.org/platform/
http://hackage.haskell.org/platform/linux.html
You need GHC 7.0.3 installed before building the platform.
Visualization and plotting steps
MatPlotLib
module load python/2.7.1
Already installed as part of python 2.7.1
Cytoscape 2.8.1
Installation directory: /sw/cytoscape/2.8.1
module load cytoscape/2.8.1
Supervised learning (supervised_learning.py)
module load R/2.13.0
http://confluence.rcs.griffith.edu.au:8080/display/GHPC/R
Build R http://www.r-project.org/
module load R/2.13.0
Once R is installed, run R and execute the commands:
install.packages('randomForest')
q()
* installing *source* package ârandomForestâ ... ** libs gcc -std=gnu99 -I/sw/R/2.13.0/lib64/R/include -I/usr/local/include -fpic -g -O2 -c classTree.c -o classTree.o gcc -std=gnu99 -I/sw/R/2.13.0/lib64/R/include -I/usr/local/include -fpic -g -O2 -c regTree.c -o regTree.o gcc -std=gnu99 -I/sw/R/2.13.0/lib64/R/include -I/usr/local/include -fpic -g -O2 -c regrf.c -o regrf.o gcc -std=gnu99 -I/sw/R/2.13.0/lib64/R/include -I/usr/local/include -fpic -g -O2 -c rf.c -o rf.o gfortran -fpic -g -O2 -c rfsub.f -o rfsub.o gcc -std=gnu99 -I/sw/R/2.13.0/lib64/R/include -I/usr/local/include -fpic -g -O2 -c rfutils.c -o rfutils.o gcc -std=gnu99 -shared -L/usr/local/lib64 -o randomForest.so classTree.o regTree.o regrf.o rf.o rfsub.o rfutils.o -lgfortran -lm -L/sw/R/2.13.0/lib64/R/lib -lR installing to /sw/R/2.13.0/lib64/R/library/randomForest/libs ** R ** data ** inst ** preparing package for lazy loading ** help *** installing help indices ** building package indices ... ** testing if installed package can be loaded * DONE (randomForest) The downloaded packages are in â/tmp/RtmprZm6wd/downloaded_packagesâ Updating HTML index of packages in '.Library' Making packages.html ... done > q() Save workspace image? [y/n/c]: n
Assigning taxonomy using BLAST or picking OTUs against Greengenes filtered at 97% identity
module load bioinformatics/greengenes/gg
cd /sw/qiime/greengenes
cp /data1/gg_otus_6oct2010.zip .
unzip gg_otus_6oct2010.zip
cd /sw/qiime/greengenes/gg_otus_6oct2010/
QIIME documentation locally - Sphinx
module load python/2.7.1
Sphinx-1.0.7-py2.7.egg
http://pypi.python.org/pypi/Sphinx
Sphinx is a Python documentation generator.
Sphinx is a tool that makes it easy to create intelligent and beautiful documentation for Python projects (or other documents consisting of multiple reStructuredText sources), written by Georg Brandl. It was originally created for the new Python documentation, and has excellent facilities for Python project documentation, but C/C++ is supported as well, and more languages are planned.
wget http://pypi.python.org/packages/2.7/S/Sphinx/Sphinx-1.0.7-py2.7.egg#md5=1321a3888d4fad656a5ede5838686e12
module load python/2.7.1
easy_install Sphinx-1.0.7-py2.7.egg
[root@admin data1]# easy_install Sphinx-1.0.7-py2.7.egg
Processing Sphinx-1.0.7-py2.7.egg
creating /sw/python/2.7.1/lib/python2.7/site-packages/Sphinx-1.0.7-py2.7.egg
Extracting Sphinx-1.0.7-py2.7.egg to /sw/python/2.7.1/lib/python2.7/site-packages
Adding Sphinx 1.0.7 to easy-install.pth file
Installing sphinx-build script to /sw/python/2.7.1/bin
Installing sphinx-quickstart script to /sw/python/2.7.1/bin
Installing sphinx-autogen script to /sw/python/2.7.1/bin
Installed /sw/python/2.7.1/lib/python2.7/site-packages/Sphinx-1.0.7-py2.7.egg
Processing dependencies for Sphinx==1.0.7
Searching for docutils>=0.5
Reading http://pypi.python.org/simple/docutils/
Reading http://docutils.sourceforge.net/
Best match: docutils 0.8
Downloading http://pypi.python.org/packages/source/d/docutils/docutils-0.8.tar.gz#md5=f57474b69bfbf0eb608706a104f92dda
Processing docutils-0.8.tar.gz
Running docutils-0.8/setup.py -q bdist_egg --dist-dir /tmp/easy_install-oAgwED/docutils-0.8/egg-dist-tmp-F6zwlI
warning: no files found matching 'MANIFEST'
warning: no previously-included files matching '.cvsignore' found under directory '*'
warning: no previously-included files matching '*.pyc' found under directory '*'
warning: no previously-included files matching '*~' found under directory '*'
warning: no previously-included files matching '.DS_Store' found under directory '*'
zip_safe flag not set; analyzing archive contents...
docutils.writers.s5_html.__init__: module references __file__
docutils.writers.html4css1.__init__: module references __file__
docutils.writers.latex2e.__init__: module references __file__
docutils.writers.pep_html.__init__: module references __file__
docutils.writers.odf_odt.__init__: module references __file__
docutils.parsers.rst.directives.misc: module references __file__
Adding docutils 0.8 to easy-install.pth file
Installing rst2xml.py script to /sw/python/2.7.1/bin
Installing rst2odt_prepstyles.py script to /sw/python/2.7.1/bin
Installing rstpep2html.py script to /sw/python/2.7.1/bin
Installing rst2xetex.py script to /sw/python/2.7.1/bin
Installing rst2pseudoxml.py script to /sw/python/2.7.1/bin
Installing rst2man.py script to /sw/python/2.7.1/bin
Installing rst2html.py script to /sw/python/2.7.1/bin
Installing rst2odt.py script to /sw/python/2.7.1/bin
Installing rst2s5.py script to /sw/python/2.7.1/bin
Installing rst2latex.py script to /sw/python/2.7.1/bin
Installed /sw/python/2.7.1/lib/python2.7/site-packages/docutils-0.8-py2.7.egg
Searching for Jinja2>=2.2
Reading http://pypi.python.org/simple/Jinja2/
Reading http://jinja.pocoo.org/
Best match: Jinja2 2.6
Downloading http://pypi.python.org/packages/source/J/Jinja2/Jinja2-2.6.tar.gz#md5=1c49a8825c993bfdcf55bb36897d28a2
Processing Jinja2-2.6.tar.gz
Running Jinja2-2.6/setup.py -q bdist_egg --dist-dir /tmp/easy_install-evp4VJ/Jinja2-2.6/egg-dist-tmp-YT9MZB
warning: no previously-included files matching '*' found under directory 'docs/_build'
warning: no previously-included files matching '*.pyc' found under directory 'jinja2'
warning: no previously-included files matching '*.pyc' found under directory 'docs'
warning: no previously-included files matching '*.pyo' found under directory 'jinja2'
warning: no previously-included files matching '*.pyo' found under directory 'docs'
Adding Jinja2 2.6 to easy-install.pth file
Installed /sw/python/2.7.1/lib/python2.7/site-packages/Jinja2-2.6-py2.7.egg
Searching for Pygments>=0.8
Reading http://pypi.python.org/simple/Pygments/
Reading http://pygments.org/
Reading http://pygments.pocoo.org/
Best match: Pygments 1.4
Downloading http://pypi.python.org/packages/2.7/P/Pygments/Pygments-1.4-py2.7.egg#md5=acbdde4dae30efaba8cfa86dcb6070f2
Processing Pygments-1.4-py2.7.egg
creating /sw/python/2.7.1/lib/python2.7/site-packages/Pygments-1.4-py2.7.egg
Extracting Pygments-1.4-py2.7.egg to /sw/python/2.7.1/lib/python2.7/site-packages
Adding Pygments 1.4 to easy-install.pth file
Installing pygmentize script to /sw/python/2.7.1/bin
Installed /sw/python/2.7.1/lib/python2.7/site-packages/Pygments-1.4-py2.7.egg
Finished processing dependencies for Sphinx==1.0.7
qiime
module load qiime/1.3.0
Using Qiime/setup.py (and thereby python's distutils package) is the recommended way of installing the Qiime library code and scripts. You can optionally specify where the library code and scripts should be installed; depending on your setup, you may want to do this. By default, the QIIME library code will be placed under python's site-packages, and the QIIME scripts will be placed in /usr/local/bin/. You may need to run setup.py using sudo if you do not have permission to place files in the default locations.
http://qiime.sourceforge.net/install/install.html http://sourceforge.net/projects/qiime/ http://www.qiime.org/
mkdir /sw/qiime/Qiime
cd /sw/qiime/Qiime
tar -zxvf /data1/Qiime-1.3.0.tar.gz
mv Qiime-1.3.0 1.3.0
cd /sw/qiime/Qiime/1.3.0
By default the QIIME scripts will be installed in /usr/local/bin. As there are a lot of QIIME scripts, we highly recommend customizing the script directory to keep your system organized. This can be customized with the --install-scripts option. You can also specify an alternate directory for the library files with --install-purelib, but if you do so you must also specify --install-data as the same directory. Failure to do this will result in a broken QIIME install. An example command is:
python setup.py install --install-scripts=/sw/qiime/Qiime/1.3.0/bin 2>&1 |tee qiime_installLOG.txt
OR:
python setup.py install --install-scripts=/home/qiime/bin/ --install-purelib=/home/qiime/lib/ --install-data=/home/qiime/lib/ 2>&1 |tee qiime_installLOG.txt
The first form was run, and its output ended with:
python setup.py install --install-scripts=/sw/qiime/Qiime/1.3.0/bin 2>&1 |tee qiime_installLOG.txt
copying static files... done
dumping search index... done
dumping object inventory... done
build succeeded, 36 warnings.
Build finished. The HTML pages are in _build/html.
Local documentation built with Sphinx. Open the following path with a web browser:
/sw/qiime/Qiime/1.3.0/doc/_build/html/index.html
####cp /sw/qiime/Qiime/1.3.0/qiime/support_files/qiime_config /sw/qiime/Qiime/1.3.0/.qiime_config
cp /sw/qiime/Qiime/1.3.0/.qiime_config ~/
Edit .qiime_config: find the line beginning qiime_scripts_dir and add a tab, followed by the QIIME scripts directory (/sw/qiime/Qiime/1.3.0/bin).
http://qiime.sourceforge.net/install/qiime_config.html
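The qiime_scripts_dir edit described above can also be scripted. The following is a minimal sketch in Python, assuming the config is plain text with one key and one tab-separated value per line. The helper name set_config_value is made up for illustration and is not part of QIIME.

```python
def set_config_value(config_text, key, value):
    """Return config_text with `key` set to `value` (tab-separated).

    Hypothetical helper, not part of QIIME. Comment lines and all other
    keys are left untouched; the key is appended if it is not present.
    """
    out = []
    found = False
    for line in config_text.splitlines():
        fields = line.split()
        is_comment = line.lstrip().startswith("#")
        if fields and not is_comment and fields[0] == key:
            out.append(key + "\t" + value)
            found = True
        else:
            out.append(line)
    if not found:
        out.append(key + "\t" + value)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "# qiime_config\nqiime_scripts_dir\ntemp_dir\t/tmp\n"
    print(set_config_value(sample, "qiime_scripts_dir",
                           "/sw/qiime/Qiime/1.3.0/bin"))
```

To apply it to a real file, read ~/.qiime_config into a string, pass it through the function, and write the result back, keeping a backup copy first.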
cd /sw/qiime/Qiime/1.3.0/bin
python print_qiime_config.py -t
python print_qiime_config.py -t >/tmp/qq 2>&1
test_FastTree_supported_version (__main__.Qiime_config)
FastTree is in path and version is supported ... ok
test_INFERNAL_supported_version (__main__.Qiime_config)
INFERNAL is in path and version is supported ... ok
test_ampliconnoise_install (__main__.Qiime_config)
AmpliconNoise install looks sane. ... ok
test_blast_supported_version (__main__.Qiime_config)
blast is in path and version is supported ... FAIL
test_blastall_fp (__main__.Qiime_config)
blastall_fp is set to a valid path ... ok
test_blastmat_dir (__main__.Qiime_config)
blastmat_dir is set to a valid path. ... ok
test_cdbtools_supported_version (__main__.Qiime_config)
cdbtools is in path and version is supported ... ok
test_cdhit_supported_version (__main__.Qiime_config)
cd-hit is in path and version is supported ... ok
test_chimeraSlayer_install (__main__.Qiime_config)
no obvious problems with ChimeraSlayer install ... ok
test_clearcut_supported_version (__main__.Qiime_config)
clearcut is in path and version is supported ... ok
test_cluster_jobs_fp (__main__.Qiime_config)
cluster_jobs_fp is set to a valid path and is executable ... ok
test_denoiser_supported_version (__main__.Qiime_config)
denoiser aligner is ready to use ... FAIL
test_for_obsolete_values (__main__.Qiime_config)
local qiime_config has no extra params ... ok
test_matplotlib_suported_version (__main__.Qiime_config)
maptplotlib version is supported ... FAIL
test_mothur_supported_version (__main__.Qiime_config)
mothur is in path and version is supported ... ERROR
test_muscle_supported_version (__main__.Qiime_config)
muscle is in path and version is supported ... FAIL
test_numpy_suported_version (__main__.Qiime_config)
numpy version is supported ... FAIL
test_pynast_suported_version (__main__.Qiime_config)
pynast version is supported ... ok
test_pynast_template_alignment_blastdb_fp (__main__.Qiime_config)
pynast_template_alignment_blastdb, if set, is set to a valid path ... ok
test_pynast_template_alignment_fp (__main__.Qiime_config)
pynast_template_alignment, if set, is set to a valid path ... ok
test_python_exe_fp (__main__.Qiime_config)
python_exe_fp is set to a working python env ... ok
test_python_supported_version (__main__.Qiime_config)
python is in path and version is supported ... FAIL
test_qiime_scripts_dir (__main__.Qiime_config)
qiime_scripts_dir, if set, is set to a valid path ... ok
test_raxmlHPC_supported_version (__main__.Qiime_config)
raxmlHPC is in path and version is supported ... FAIL
test_temp_dir (__main__.Qiime_config)
temp_dir, if set, is set to a valid path ... ok
test_template_alignment_lanemask_fp (__main__.Qiime_config)
template_alignment_lanemask, if set, is set to a valid path ... ok
test_uclust_supported_version (__main__.Qiime_config)
uclust is in path and version is supported ... ok
test_working_dir (__main__.Qiime_config)
working_dir, if set, is set to a valid path ... ok
======================================================================
ERROR: test_mothur_supported_version (__main__.Qiime_config)
mothur is in path and version is supported
----------------------------------------------------------------------
Traceback (most recent call last):
  File "print_qiime_config.py", line 472, in test_mothur_supported_version
    version_string = stdout.strip().split(' ')[1].strip('v.')
IndexError: list index out of range
======================================================================
FAIL: test_blast_supported_version (__main__.Qiime_config)
blast is in path and version is supported
----------------------------------------------------------------------
Traceback (most recent call last):
  File "print_qiime_config.py", line 374, in test_blast_supported_version
    % ('.'.join(map(str,acceptable_version)), version_string))
AssertionError: Unsupported blast version. 2.2.22 is required, but running 2.2.25.
======================================================================
FAIL: test_denoiser_supported_version (__main__.Qiime_config)
denoiser aligner is ready to use
----------------------------------------------------------------------
Traceback (most recent call last):
  File "print_qiime_config.py", line 497, in test_denoiser_supported_version
    "which components of QIIME you plan to use.")
AssertionError: Denoiser flowgram aligner not found or not executable.This may or may not be a problem depending on which components of QIIME you plan to use.
======================================================================
FAIL: test_matplotlib_suported_version (__main__.Qiime_config)
maptplotlib version is supported
----------------------------------------------------------------------
Traceback (most recent call last):
  File "print_qiime_config.py", line 332, in test_matplotlib_suported_version
    version_string))
AssertionError: Unsupported matplotlib version. Must be >= 0.98.5.3 and < 0.98.5.4 , but running 1.0.1.
======================================================================
FAIL: test_muscle_supported_version (__main__.Qiime_config)
muscle is in path and version is supported
----------------------------------------------------------------------
Traceback (most recent call last):
  File "print_qiime_config.py", line 458, in test_muscle_supported_version
    % ('.'.join(map(str,acceptable_version)), version_string))
AssertionError: Unsupported muscle version. 3.6 is required, but running 3.8.31.
======================================================================
FAIL: test_numpy_suported_version (__main__.Qiime_config)
numpy version is supported
----------------------------------------------------------------------
Traceback (most recent call last):
  File "print_qiime_config.py", line 314, in test_numpy_suported_version
    version_string))
AssertionError: Unsupported numpy version. Must be >= 1.3.0 and < 1.5.1 , but running 1.5.1.
======================================================================
FAIL: test_python_supported_version (__main__.Qiime_config)
python is in path and version is supported
----------------------------------------------------------------------
Traceback (most recent call last):
  File "print_qiime_config.py", line 296, in test_python_supported_version
    version_string))
AssertionError: Unsupported python version. Must be >= 2.6.0 and < 2.7.0 , but running 2.7.1.
======================================================================
FAIL: test_raxmlHPC_supported_version (__main__.Qiime_config)
raxmlHPC is in path and version is supported
----------------------------------------------------------------------
Traceback (most recent call last):
  File "print_qiime_config.py", line 504, in test_raxmlHPC_supported_version
    "which components of QIIME you plan to use.")
AssertionError: raxmlHPC not found. This may or may not be a problem depending on which components of QIIME you plan to use.
----------------------------------------------------------------------
Ran 28 tests in 0.240s
FAILED (failures=7, errors=1)
System information
==================
Platform: linux2
Python version: 2.7.1 (r271:86832, Jun 29 2011, 09:08:45) [GCC 4.4.5 20110214 (Red Hat 4.4.5-6)]
Python executable: /sw/python/2.7.1/bin/python
Dependency versions
===================
PyCogent version: 1.5.1
NumPy version: 1.5.1
matplotlib version: 1.0.1
QIIME library version: 1.3.0
QIIME script version: 1.3.0
PyNAST version (if installed): 1.1
RDP Classifier version (if installed): rdp_classifier-2.3.jar
QIIME config values
===================
blastmat_dir: None
topiaryexplorer_project_dir: None
pynast_template_alignment_fp: /sw/qiime/greengenes/core_set_aligned.fasta.imputed
cluster_jobs_fp: /sw/qiime/Qiime/1.3.0/bin/start_parallel_jobs.py
pynast_template_alignment_blastdb: None
torque_queue: friendlyq
template_alignment_lanemask_fp: /sw/qiime/greengenes/lanemask_in_1s_and_0s
jobs_to_start: 1
cloud_environment: False
qiime_scripts_dir: /sw/qiime/Qiime/1.3.0/bin
denoiser_min_per_core: 50
working_dir: None
python_exe_fp: python
temp_dir: None
blastall_fp: blastall
seconds_to_sleep: 60
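With 28 checks in the output, it can help to list only the ones that did not pass. The sketch below is not part of QIIME. It assumes the verbose unittest layout shown above (test name, then a description ending in "... ok", "... FAIL" or "... ERROR"), and the function name failing_tests is made up for illustration.

```python
import re

# Matches one verbose unittest result, for example:
#   test_blast_supported_version (__main__.Qiime_config)
#   blast is in path and version is supported ... FAIL
RESULT_RE = re.compile(
    r"(test_\w+) \(__main__\.Qiime_config\)\s+(.*?) \.\.\. (ok|FAIL|ERROR)\b"
)


def failing_tests(output):
    """Return [(test_name, status)] for every result that is not 'ok'."""
    return [(name, status)
            for name, _description, status in RESULT_RE.findall(output)
            if status != "ok"]


if __name__ == "__main__":
    sample = (
        "test_blast_supported_version (__main__.Qiime_config)\n"
        "blast is in path and version is supported ... FAIL\n"
        "test_blastall_fp (__main__.Qiime_config)\n"
        "blastall_fp is set to a valid path ... ok\n"
        "test_mothur_supported_version (__main__.Qiime_config)\n"
        "mothur is in path and version is supported ... ERROR\n"
    )
    for name, status in failing_tests(sample):
        print("%s: %s" % (status, name))
```

To run it on the real output, pass it the contents of the file captured above, for example failing_tests(open("/tmp/qq").read()).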
test
align_seqs.py -h
Usage: align_seqs.py [options] {-i/--input_fasta_fp INPUT_FASTA_FP}
[] indicates optional input (order unimportant)
{} indicates required input (order unimportant)
This script aligns the sequences in a FASTA file to each other or to a template sequence alignment, depending on the method chosen. Currently, there are three methods which can be used by the user:
1. PyNAST (Caporaso et al., 2009) - The default alignment method is PyNAST, a python implementation of the NAST alignment algorithm. The NAST algorithm aligns each provided sequence (the "candidate" sequence) to the best-matching sequence in a pre-aligned database of sequences (the "template" sequence). Candidate sequences are not permitted to introduce new gap characters into the template database, so the algorithm introduces local mis-alignments to preserve the existing template sequence.
2. MUSCLE (Edgar, 2004) - MUSCLE is an alignment method which stands for MUltiple Sequence Comparison by Log-Expectation.
3. INFERNAL (Nawrocki, Kolbe, & Eddy, 2009) - Infernal ("INFERence of RNA ALignment") is for an alignment method for using RNA structure and sequence similarities.
Example usage:
Print help message and exit
align_seqs.py -h
Alignment with MUSCLE: One could also use the MUSCLE algorithm. The following command can be used to align sequences (i.e. the resulting FASTA file from pick_rep_set.py), where the output is written to the directory "muscle_alignment/"
align_seqs.py -i repr_set_seqs.fasta -m muscle -o muscle_alignment/
Alignment with PyNAST: The default alignment method is PyNAST, a python implementation of the NAST alignment algorithm. The NAST algorithm aligns each provided sequence (the "candidate" sequence) to the best-matching sequence in a pre-aligned database of sequences (the "template" sequence). Candidate sequences are not permitted to introduce new gap characters into the template database, so the algorithm introduces local mis-alignments to preserve the existing template sequence. The quality thresholds are the minimum requirements for matching between a candidate sequence and a template sequence. The set of matching template sequences will be searched for a match that meets these requirements, with preference given to the sequence length. By default, the minimum sequence length is 150 and the minimum percent id is 75%. The minimum sequence length is much too long for typical pyrosequencing reads, but was chosen for compatibility with the original NAST tool. The following command can be used for aligning sequences using the PyNAST method, where we supply the program with a FASTA file of unaligned sequences (i.e. resulting FASTA file from pick_rep_set.py, a FASTA file of pre-aligned sequences (this is the template file, which is typically the Greengenes core set - available from http://greengenes.lbl.gov/), and the results will be written to the directory "pynast_aligned/"
align_seqs.py -i repr_set_seqs.fasta -t core_set_template.fasta -o pynast_aligned/
Alternatively, one could change the minimum sequence length ("-e") requirement and minimum sequence identity ("-p"), using the following command
align_seqs.py -i repr_set_seqs.fasta -t core_set_template.fasta -o pynast_aligned/ -e 500 -p 95.0
Alignment with Infernal: An alternative alignment method is to use Infernal. Infernal is similar to the PyNAST method, in that you supply a template alignment, although Infernal has several distinct differences. Infernal takes a multiple sequence alignment with a corresponding secondary structure annotation. This input file must be in Stockholm alignment format. There is a fairly good description of the Stockholm format rules at: http://en.wikipedia.org/wiki/Stockholm_format. Infernal will use the sequence and secondary structural information to align the candidate sequences to the full reference alignment. Similar to PyNAST, Infernal will not allow for gaps to be inserted into the reference alignment. Using Infernal is slower than other methods, and therefore is best used with sequences that do not align well using PyNAST. The following command can be used for aligning sequences using the Infernal method, where we supply the program with a FASTA file of unaligned sequences, a STOCKHOLM file of pre-aligned sequences and secondary structure (this is the template file - an example file can be obtained from: http://bmf.colorado.edu/QIIME/seed.16s.reference_model.sto.zip), and the results will be written to the directory "infernal_aligned/"
align_seqs.py -m infernal -i repr_set_seqs.fasta -t seed.16s.reference_model.sto -o infernal_aligned/
Options:
--version show program's version number and exit
-h, --help show this help message and exit
-v, --verbose Print information during execution -- useful for debugging [default: False]
-t TEMPLATE_FP, --template_fp=TEMPLATE_FP Filepath for template against [REQUIRED if -m pynast or -m infernal]
-m ALIGNMENT_METHOD, --alignment_method=ALIGNMENT_METHOD Method for aligning sequences. Valid choices are: pynast, infernal, clustalw, muscle, infernal, mafft [default: pynast]
-a PAIRWISE_ALIGNMENT_METHOD, --pairwise_alignment_method=PAIRWISE_ALIGNMENT_METHOD method for performing pairwise alignment in PyNAST. Valid choices are muscle, pair_hmm, clustal, blast, uclust, mafft [default: uclust]
-d BLAST_DB, --blast_db=BLAST_DB Database to blast against when -m pynast [default: created on-the-fly from template_alignment]
-o OUTPUT_DIR, --output_dir=OUTPUT_DIR Path to store result file [default: <ALIGNMENT_METHOD>_aligned]
-e MIN_LENGTH, --min_length=MIN_LENGTH Minimum sequence length to include in alignment [default: 150]
-p MIN_PERCENT_ID, --min_percent_id=MIN_PERCENT_ID Minimum percent sequence identity to closest blast hit to include sequence in alignment [default: 0.75]
REQUIRED options:
The following options must be provided under all circumstances.
-i INPUT_FASTA_FP, --input_fasta_fp=INPUT_FASTA_FP path to the input fasta file [REQUIRED]
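When align_seqs.py is driven from a script, the command line can be assembled from the options in the help text above. This is only a sketch: the helper name align_seqs_command is hypothetical, it uses only the -i, -m, -o, -t, -e and -p flags shown in the help output, and it mirrors the help text in treating -t as required for pynast and infernal.

```python
def align_seqs_command(input_fasta, output_dir, method="pynast",
                       template=None, min_length=None, min_percent_id=None):
    """Build an argument list for align_seqs.py (hypothetical helper)."""
    if method in ("pynast", "infernal") and template is None:
        # Mirrors the help text: -t is REQUIRED for these two methods.
        raise ValueError("-t is required when -m is pynast or infernal")
    argv = ["align_seqs.py", "-i", input_fasta, "-m", method, "-o", output_dir]
    if template is not None:
        argv += ["-t", template]
    if min_length is not None:
        argv += ["-e", str(min_length)]
    if min_percent_id is not None:
        argv += ["-p", str(min_percent_id)]
    return argv


if __name__ == "__main__":
    cmd = align_seqs_command("repr_set_seqs.fasta", "pynast_aligned/",
                             template="core_set_template.fasta",
                             min_length=500, min_percent_id=95.0)
    print(" ".join(cmd))
```

The returned list can be handed to subprocess.call once the qiime module is loaded, which avoids shell quoting problems with unusual file names.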
Extra Notes
QIIME Denoiser Install Notes
If you do not install QIIME using setup.py and you plan to use the QIIME Denoiser, you’ll need to compile the FlowgramAlignment program. To do this you’ll need to have ghc installed. Then from the Qiime/qiime/support_files/denoiser/FlowgramAlignment/ directory, run the following command:
make ; make install
cd /sw/qiime/Qiime/1.3.0/qiime/support_files/denoiser/FlowgramAlignment
make 2>&1 |tee make_FlowgramAlignment.txt
make install |tee make_install_FlowgramAlignment.txt
sample pbs script
Please make sure you copy the .qiime_config file and make changes to it if necessary.
cp /sw/qiime/Qiime/1.3.0/.qiime_config ~/
#PBS -m abe
#PBS -M <youremail>@griffith.edu.au
#PBS -N qiimeOpenMPI
#PBS -l select=2:ncpus=3:mem=12g:mpiprocs=3
source $HOME/.bashrc
module load qiime/1.3.0
WORK_DIR=/export/home/snumber/pbs/qiime
HOME_DIR=/export/home/snumber/
echo "Starting job"
## The number of nodes is given by the select=<NUM> above
NODES=2
### $PBS_NODEFILE is a node-list file created with select and mpiprocs options by PBS
### The number of MPI processes available is mpiprocs * nodes (=NPROCS)
NPROCS=6
mpirun --prefix /sw/openMPI/1.4.3-gnu/ -machinefile $PBS_NODEFILE -np $NPROCS env PATH=$PATH:$WORK_DIR:$HOME_DIR env LD_LIBRARY_PATH=$LD_LIBRARY_PATH make_otu_heatmap_html.py -i otus/otu_table.txt -o otus/OTU_Heatmap/
echo "Done with job"
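The NPROCS value in the script is the number of chunks requested with select multiplied by mpiprocs. As a quick arithmetic check, here is a small sketch; the function name mpi_process_count is made up, and the fallback of one MPI process per chunk when mpiprocs is absent is an assumption.

```python
def mpi_process_count(select_spec):
    """Return chunks * mpiprocs for a PBS select string such as
    'select=2:ncpus=3:mem=12g:mpiprocs=3' (hypothetical helper)."""
    fields = dict(item.split("=", 1) for item in select_spec.split(":"))
    chunks = int(fields["select"])
    # Assumption: one MPI process per chunk when mpiprocs is not given.
    mpiprocs = int(fields.get("mpiprocs", 1))
    return chunks * mpiprocs


if __name__ == "__main__":
    print(mpi_process_count("select=2:ncpus=3:mem=12g:mpiprocs=3"))
```

For the select line used in the script above this gives 2 * 3 = 6, which matches NPROCS=6.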
Another Sample PBS script
#!/bin/bash
#PBS -N QIIME
#PBS -m abe
#PBS -M YourEmail@griffith.edu.au
#PBS -l select=1:ncpus=2:mem=14gb -l walltime=100:00:00
module load qiime/1.5.0
export FEKO_USER_HOME=/export/home/s12345/pbs/feko
source $HOME/.bashrc
echo "Starting job: "
pick_otus_through_otu_table.py -i /export/home/s12345/pbs/qiime/seq.fasta -o /export/home/s12345/pbs/qiime/otus/
An example seq.fasta file
>34
TGGCACGTGCGATGGCAGTTCAACGTTCA
>23
TAAATCCCGTAGGCTTTTAGCATCGACG
>2
TTATTCGAAGCGCTTTACGACACGCGCGCA
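Before submitting a job it can be worth checking that the input file parses as FASTA. The following is a minimal sketch of a reader, assuming each record is a header line starting with ">" followed by one or more sequence lines. The function name read_fasta is illustrative and this is not the parser QIIME itself uses.

```python
def read_fasta(lines):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # The identifier is the first word after '>'.
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


if __name__ == "__main__":
    example = [">34", "TGGCACGTGCGATGGCAGTTCAACGTTCA",
               ">23", "TAAATCCCGTAGGCTTTTAGCATCGACG",
               ">2", "TTATTCGAAGCGCTTTACGACACGCGCGCA"]
    for seq_id, seq in read_fasta(example):
        print("%s\t%d" % (seq_id, len(seq)))
```

For a file on disk, pass the open file object directly, for example read_fasta(open("seq.fasta")).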
Version 1.5.0
Usage
module load qiime/1.5.0
OR:
module load python/2.7.1
module load qiime/1.3.0
module load bioinformatics/uclust/1.2.22
module load bioinformatics/fasttree/2.1.3
module load bioinformatics/rdpclassifier/2.3
module load R/2.13.0
module load blast/2.2.25
module load bioinformatics/cd-hit/4.5.4
module load bioinformatics/cdbfasta/0.99
module load bioinformatics/chimeraslayer/r20110519
module load bioinformatics/mothur/1.21.1
module load bioinformatics/clearcut/1.0.9
module load bioinformatics/raxml/7.2.8a-mpi #(other versions available)
module load bioinformatics/infernal/1.0.2-mpi #(OR module load bioinformatics/infernal/1.0.2-serial)
module load bioinformatics/mafft/6.857 # (OR: module load bioinformatics/mafft/6.857-extensions)
module load bioinformatics/muscle/3.8.31
module load gsl/gsl-1.15
module load bioinformatics/greengenes/gg
module load bioinformatics/ampliconnoise/1.25
module load bioinformatics/ghc/7.0.3
module load cytoscape/2.8.1
module load mpi/openMPI/1.4.3-gnu #(optional)
module load qiime/1.5.0
Installation of Version Qiime 1.5.0
biom Version 0.9.3
biom-format 0.9.3 is a requirement for qiime version 1.5.0
cd /sw/qiime/biom/0.9.3/src/biom-format-0.9.3
module load python/2.7.1
module load bioinformatics/uclust/1.2.22
module load bioinformatics/fasttree/2.1.3
module load bioinformatics/rdpclassifier/2.3
module load R/2.13.0
module load blast/2.2.25
module load bioinformatics/cd-hit/4.5.4
module load bioinformatics/cdbfasta/0.99
module load bioinformatics/chimeraslayer/r20110519
module load bioinformatics/mothur/1.21.1
module load bioinformatics/clearcut/1.0.9
module load bioinformatics/raxml/7.2.8a-mpi #(other versions available)
module load bioinformatics/infernal/1.0.2-mpi #(OR module load bioinformatics/infernal/1.0.2-serial)
module load bioinformatics/mafft/6.857 # (OR: module load bioinformatics/mafft/6.857-extensions)
module load bioinformatics/muscle/3.8.31
module load gsl/gsl-1.15
module load bioinformatics/greengenes/gg
module load bioinformatics/ampliconnoise/1.25
module load bioinformatics/ghc/7.0.3
module load cytoscape/2.8.1
module load mpi/openMPI/1.4.3-gnu #(optional)
module load qiime/1.5.0
python setup.py install
mkdir /sw/qiime/Qiime/1.5.0/
cd /sw/qiime/Qiime/1.5.0/
tar -zxvf Qiime-1.5.0.tar.gz
cp /sw/qiime/Qiime/1.5.0/src/Qiime-1.5.0/qiime/support_files/qiime_config /sw/qiime/Qiime/1.5.0/.qiime_config
Find the line beginning qiime_scripts_dir and add a tab, followed by the QIIME scripts directory (/sw/qiime/Qiime/1.5.0/bin)
cd /sw/qiime/Qiime/1.5.0/src/Qiime-1.5.0
mv * /sw/qiime/Qiime/1.5.0/
cd /sw/qiime/Qiime/1.5.0
python setup.py install --install-scripts=/sw/qiime/Qiime/1.5.0/bin 2>&1 |tee qiime_installLOG.txt
>>>>>>>>>>>>>
You need to make sure that the install-data path points to the same place as the purelib path, e.g.:
python setup.py install --install-scripts=/sw/qiime/Qiime/1.5.0/bin/ --install-purelib=/sw/qiime/Qiime/1.5.0/lib/ --install-data=/sw/qiime/Qiime/1.5.0/lib/
The above syntax was not used in this install, so the denoiser binary was copied into place as a temporary fix:
mkdir /sw/python/2.7.1/lib/python2.7/site-packages/qiime/support_files/denoiser/bin
cp -i /sw/qiime/Qiime/1.5.0/qiime/support_files/denoiser/bin/FlowgramAli_4frame /sw/python/2.7.1/lib/python2.7/site-packages/qiime/support_files/denoiser/bin/
>>>>>>>>>>>>>
CHECK for errors:
cd /sw/qiime/Qiime/1.5.0/bin
python print_qiime_config.py -t
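The temporary fix above places the FlowgramAli_4frame binary under the library directory. A quick way to confirm it is where the library expects it is sketched below. The function names are hypothetical, and the relative path is taken from the commands above rather than from QIIME's source.

```python
import os


def denoiser_binary_path(purelib_dir):
    """Path where FlowgramAli_4frame is expected under purelib_dir."""
    return os.path.join(purelib_dir, "qiime", "support_files", "denoiser",
                        "bin", "FlowgramAli_4frame")


def denoiser_binary_ok(purelib_dir):
    """True if the binary exists at that path and is executable."""
    path = denoiser_binary_path(purelib_dir)
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    site_packages = "/sw/python/2.7.1/lib/python2.7/site-packages"
    print(denoiser_binary_path(site_packages))
    print(denoiser_binary_ok(site_packages))
```

If the second line prints False, the denoiser check in print_qiime_config.py -t is likely to fail as well.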
Create .qiime_config file in your home directory
vi ~/.qiime_config
more .qiime_config
# qiime_config
# WARNING: DO NOT EDIT OR DELETE Qiime/qiime_config
# To overwrite defaults, copy this file to $HOME/.qiime_config or a full path
# specified by $QIIME_CONFIG_FP and edit that copy of the file.
cluster_jobs_fp /sw/qiime/Qiime/1.5.0/bin/start_parallel_jobs.py
python_exe_fp python
working_dir None
blastmat_dir
blastall_fp blastall
pynast_template_alignment_fp /sw/qiime/greengenes/core_set_aligned.fasta.imputed
pynast_template_alignment_blastdb
template_alignment_lanemask_fp /sw/qiime/greengenes/lanemask_in_1s_and_0s
jobs_to_start 1
seconds_to_sleep 60
qiime_scripts_dir /sw/qiime/Qiime/1.5.0/bin
temp_dir /tmp
denoiser_min_per_core 50
cloud_environment False
topiaryexplorer_project_dir
torque_queue workq
Another sample
blastmat_dir None
sc_queue all.q
topiaryexplorer_project_dir None
pynast_template_alignment_fp /export/home/s123456/install/QIIME/data/gg_otus_4feb2011/rep_set/gg_97_otus_4feb2011_aligned.fasta
cluster_jobs_fp /export/home/s123456/Install/QIIME/bin/start_parallel_jobs.py
pynast_template_alignment_blastdb /export/home/s123456/install/QIIME/data/core_set_aligned.fasta.imputed
assign_taxonomy_reference_seqs_fp None
torque_queue workq
qiime_test_data_dir None
template_alignment_lanemask_fp /export/home/s123456/install/QIIME/data/lanemask_in_1s_and_0s
jobs_to_start 10
cloud_environment False
qiime_scripts_dir /export/home/s123456/Install/QIIME/bin
denoiser_min_per_core 50
working_dir None
python_exe_fp python
temp_dir /tmp
blastall_fp blastall
seconds_to_sleep 60
assign_taxonomy_id_to_taxonomy_fp None
See also: https://wiki.hpcc.msu.edu/display/Bioinfo/Using+QIIME
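A personal .qiime_config is easy to get wrong when paths are copied from another user's sample, as above. The sketch below reads the key and value pairs and lists path settings that point nowhere. It is an illustration only, not QIIME's own config parser: the function names are made up, and treating every key ending in _fp or _dir as a filesystem path is an assumption.

```python
import os


def read_config_lines(lines):
    """Parse 'key value' lines into a dict; empty or 'None' values become None."""
    settings = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 1)
        value = fields[1].strip() if len(fields) > 1 else None
        settings[fields[0]] = None if value in (None, "None") else value
    return settings


def missing_paths(settings):
    """Keys ending in _fp or _dir whose absolute path value does not exist.

    Bare command names such as 'python' or 'blastall' are skipped because
    they are resolved through PATH rather than as file paths.
    """
    return sorted(key for key, value in settings.items()
                  if value and key.endswith(("_fp", "_dir"))
                  and os.path.isabs(value) and not os.path.exists(value))


if __name__ == "__main__":
    config_path = os.path.expanduser("~/.qiime_config")
    if os.path.exists(config_path):
        with open(config_path) as handle:
            for key in missing_paths(read_config_lines(handle)):
                print("path does not exist for: %s" % key)
```

Running it after copying a template config gives a short list of entries to fix before print_qiime_config.py -t is attempted.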
Reference
1. https://groups.google.com/forum/?fromgroups=#!topic/qiime-forum/HAWoYltmS68
2. https://github.com/biocore/qiime/blob/1.5.0/scripts/start_parallel_jobs.py
3. https://github.com/biocore/qiime