RepeatSeq
Introduction
RepeatSeq determines genotypes for microsatellite repeats in high-throughput sequencing data.
If you use this code, please cite the manuscript:
G. Highnam, C. Franck, A. Martin, C. Stephens, A. Puthige, and D. Mittelman (2012) Accurate human microsatellite genotypes from high-throughput resequencing data using informed error profiles, Nucleic Acids Res, Epub Oct 22.
Usage
Readme
Required Input
RepeatSeq requires a BAM file, a FASTA file, and a region file as the minimal parameters.
Optional Input
The user can specify a number of command-line options to customize the behavior of RepeatSeq:
Command-line Options:
  -r          use only a specific read length or range of read lengths (e.g. LENGTH or MIN:MAX)
  -L          required number of reference matching bases BEFORE the repeat [3]
  -R          required number of reference matching bases AFTER the repeat [3]
  -M          minimum mapping quality for a read to be used for allele determination
  -multi      exclude reads flagged with the XT:A:R tag
  -pp         exclude reads that are not properly paired (for PE reads only)
  -error      manually override the RepeatSeq error model and set a constant error rate [0.05]
  -haploid    assume a haploid rather than diploid genome
  -repeatseq  write .repeatseq file (described in the RepeatSeq README)
  -calls      write .calls file (described in the RepeatSeq README)
  -t          include user-defined tag in the output filename
  -o          number of flanking bases to output from each read
Running RepeatSeq
Usage: repeatseq [options] <in.bam> <in.fasta> <in.regions>
If an improper command-line option is found, RepeatSeq will exit and print usage information.
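As an illustration of the usage line above, here is a minimal sketch of one invocation. The file names (sample.bam, reference.fa, repeats.regions) are placeholders assumed for this example and are not shipped with RepeatSeq. The option values are arbitrary choices from the documented list. The command is printed first and only executed if repeatseq is on the PATH and all three input files exist.

```shell
#!/bin/sh
# Placeholder input names (assumptions for this sketch, not real files).
BAM="sample.bam"
FASTA="reference.fa"
REGIONS="repeats.regions"

# Options come from the documented list: require 5 reference-matching bases
# before (-L) and after (-R) the repeat, drop improperly paired reads (-pp),
# and write the .calls file (-calls). Options precede the three inputs.
CMD="repeatseq -L 5 -R 5 -pp -calls $BAM $FASTA $REGIONS"

# Show the command so it can be reviewed before anything runs.
echo "$CMD"

# Run only when the binary and all inputs are actually available.
if command -v repeatseq >/dev/null 2>&1 \
   && [ -f "$BAM" ] && [ -f "$FASTA" ] && [ -f "$REGIONS" ]; then
    $CMD
fi
```

Printing before running keeps the sketch safe to paste on a machine where RepeatSeq is not yet installed.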
Installation
Download the following packages as well:
BamTools (http://sourceforge.net/projects/bamtools/)
fastahack (https://github.com/ekg/fastahack)
Steps to install (requires CMake):
[..navigate to repeatseq directory..]
(1) download bamtools, place in repeatseq/ in directory named "bamtools" (https://github.com/pezmaster31/bamtools)
(2) download fastahack, place in repeatseq/ in directory named "fastahack" (https://github.com/ekg/fastahack)
(3) build bamtools
    $ mkdir bamtools/build
    $ cd bamtools/build/
    $ cmake ..
    $ make
(4) build repeatseq
    $ cd ../..
    $ make
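Before running steps (3) and (4), it can help to confirm that the directory layout and build tools are in place. The following is a minimal sketch, not part of RepeatSeq itself. The function names check_layout and check_tools are hypothetical helpers introduced here. It only checks for the two subdirectories named in steps (1) and (2) and for cmake and make on the PATH.

```shell
#!/bin/sh
# check_layout DIR: print each required subdirectory missing under DIR.
# The names "bamtools" and "fastahack" come from install steps (1) and (2).
check_layout() {
    for sub in bamtools fastahack; do
        [ -d "$1/$sub" ] || echo "missing: $1/$sub"
    done
}

# check_tools: print each build tool that is not found on the PATH.
check_tools() {
    for tool in cmake make; do
        command -v "$tool" >/dev/null 2>&1 || echo "missing tool: $tool"
    done
}

# Check the directory given as the first argument, or the current one.
check_layout "${1:-.}"
check_tools
```

Run it from the repeatseq directory. No output means both subdirectories and both tools were found.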
Actual Install on gowonda:
cd /sw/bioinformatics/repeatseq/0.6.4/
unzip /sw/bioinformatics/repeatseq/0.6.4/src/repeatseq-master.zip
cd /sw/bioinformatics/repeatseq/0.6.4/repeatseq-master
mv * ../
cd ../
rmdir repeatseq-master
mkdir /sw/bioinformatics/repeatseq/0.6.4/bamtools
mkdir /sw/bioinformatics/repeatseq/0.6.4/fastahack
>>>>>>>> bamtools build <<<<<<<<<<<
cd /sw/bioinformatics/repeatseq/0.6.4/bamtools
unzip /sw/bioinformatics/repeatseq/0.6.4/src/bamtools-master.zip
cd /sw/bioinformatics/repeatseq/0.6.4/bamtools/bamtools-master
mv * ../
cd /sw/bioinformatics/repeatseq/0.6.4/bamtools/
rmdir bamtools-master
cd /sw/bioinformatics/repeatseq/0.6.4/
mkdir bamtools/build
cd bamtools/build/
cmake ..
[to build a debug version which can be used with gdb, do as follows]
[cmake -DCMAKE_BUILD_TYPE=Debug ../ ]
.....................
cmake ..
-- The C compiler identification is GNU
-- The CXX compiler identification is GNU
-- Check for working C compiler: /sw/gcc/4.7.1/bin/gcc
-- Check for working C compiler: /sw/gcc/4.7.1/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /sw/gcc/4.7.1/bin/g++
-- Check for working CXX compiler: /sw/gcc/4.7.1/bin/g++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Configuring done
-- Generating done
-- Build files have been written to: /sw/bioinformatics/repeatseq/0.6.4/bamtools/build
......................
make
>>>>>>>>>>>>>>>>>>>>
cd /sw/bioinformatics/repeatseq/0.6.4/fastahack
unzip /sw/bioinformatics/repeatseq/0.6.4/src/fastahack-master.zip
cd /sw/bioinformatics/repeatseq/0.6.4/fastahack/fastahack-master
mv * ../
mv .gitignore ../
cd /sw/bioinformatics/repeatseq/0.6.4/fastahack/
rmdir fastahack-master
.....build repeatseq.............
cd /sw/bioinformatics/repeatseq/0.6.4/
make
..................................
Reference
1. https://github.com/BioinformaticsArchive/repeatseq#readme
2. https://github.com/pezmaster31/bamtools
3. https://github.com/ekg/fastahack