*** Welcome to Meta-aligner ***

Meta-aligner is a package for long-read mapping. The source files are available here: meta-aligner-package

============================

Command Line:

usage:
============================

./meta-aligner [options]* -x <index name> -fa <ref name> -r <read name> [-o <output name>]

Main arguments:
============================

-x <index name>

The base name of the reference-genome indexes used at each step of Meta-aligner. Every aligner used at the alignment or assignment stage must have an index with this base name. For the Soap2 and mrsFast aligners, which are used at the first stage of Meta-aligner, give this name without any suffix.

-fa <ref name>

The reference genome used for the local alignment (assumed to be in Fasta format).

-r <read name>

The base name of the input read set (assumed to be in FastQ format).

-o <output name>

File to write SAM alignments to. By default, alignments are written to “output.sam”.

============================
Options:
============================
Input options:
============================

-FA

Reads are in Fasta format. Fasta files usually have the extension .fa or .fasta.

-pg <double>

Percent of gaps within the input read set. This value is used to set the default of the -ed option. The default value is 0.01. This parameter can be estimated when the user runs the PE algorithm.

============================
Alignment options:
============================

-al <int>

Selects the short-read aligner used at the alignment stage only. Values are: 1 = Bowtie, 2 = mrsFast, 3 = SOAP2. The default is 1 (Bowtie).

-l1 <int>

The fragment size (l1) used at the alignment stage and the first step of the assignment stage. The default value is 40. This parameter can be learned when the user runs the PE algorithm.

-sl1 <int>

The length of the sliding window. This parameter is used at the alignment stage and the first step of the assignment stage of Meta-aligner. The default value is 0, i.e., no sliding is used. This parameter can be learned when the user runs the PE algorithm.
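
As a rough illustration of how -l1 and -sl1 could interact, the Python sketch below cuts a read into fragments. It assumes that -sl1 is the offset between the starts of consecutive fragments and that 0 means back-to-back, non-overlapping fragments. This interpretation, the handling of a trailing partial fragment, and the name split_into_fragments are assumptions, not taken from the Meta-aligner source.

```python
def split_into_fragments(read: str, l1: int = 40, sl1: int = 0) -> list:
    """Cut `read` into fragments of length l1 (hypothetical helper).

    Assumption: sl1 > 0 is the step between fragment starts, giving
    overlapping fragments; sl1 == 0 means no sliding, so the step is l1.
    """
    if l1 <= 0 or sl1 < 0:
        raise ValueError("l1 must be positive and sl1 non-negative")
    step = sl1 if sl1 > 0 else l1
    # Keep only full-length fragments; a trailing partial fragment is dropped
    # (also an assumption).
    return [read[i:i + l1] for i in range(0, len(read) - l1 + 1, step)]


read = "ACGT" * 25  # a 100-base toy read
print(len(split_into_fragments(read)))                 # defaults: l1=40, no sliding
print(len(split_into_fragments(read, l1=25, sl1=10)))  # values from sample run 2
```

Under this reading, a smaller step yields more fragments per read, which fits the note in the sample runs that sliding is used to increase the mapping rate.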

-cfd1 <int>

The distance between two consecutive anchored fragments (g1), which is used to confirm two fragments of a read and anchor the read. This parameter is used at the alignment stage and the first step of the assignment stage of Meta-aligner. The default value is g1=0.1*l1.

-d <int>

The edit distance allowed between fragments and the reference genome during alignment. This parameter is used at the alignment stage and the first step of the assignment stage of Meta-aligner. Setting this value to zero means that only exact matches are accepted. The default value is 2. This parameter can be learned when the user runs the PE algorithm.

For Bowtie: this option works like -v (an integer from 0 through 3) and sets only the number of mismatches.
For mrsFast: this option works like -e.
For Soap2: this option works like -v.
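
The mapping between -al, -d and the underlying aligner options can be summarized in a small lookup table. The Python sketch below only restates the list above; the names ALIGNERS and aligner_distance_args are hypothetical, and it makes no claim about how Meta-aligner invokes the aligners internally.

```python
# -al value -> (aligner name, aligner option that -d is said to work like)
ALIGNERS = {
    1: ("Bowtie", "-v"),
    2: ("mrsFast", "-e"),
    3: ("Soap2", "-v"),
}


def aligner_distance_args(al: int = 1, d: int = 2):
    """Return (aligner name, [option, value]) for the given -al and -d values."""
    if al not in ALIGNERS:
        raise ValueError("-al must be 1, 2 or 3")
    if d < 0:
        raise ValueError("-d must be non-negative")
    name, option = ALIGNERS[al]
    # The text above limits Bowtie's -v to the range 0 through 3.
    if name == "Bowtie" and d > 3:
        raise ValueError("for Bowtie, -d must be between 0 and 3")
    return name, [option, str(d)]


print(aligner_distance_args())      # the documented defaults: -al 1, -d 2
print(aligner_distance_args(2, 1))
```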

-tr <int>

Trims reads so that only <int> bases of each read are used for anchoring at the alignment stage and the first step of the assignment stage of Meta-aligner. The remaining bases of each read are used in the local alignment. By default, this option is not used.

============================
Assignment options:
============================

-l2 <int>

The fragment size (l2) used at the second step of the assignment stage. The default value is 150.

-sl2 <int>

The length of the sliding window for the second step of the assignment stage. The default value is 50.

-cfd2 <int>

The distance between two consecutive fragments (g2), which is used at the second step of the assignment stage to confirm two fragments of a read and anchor the read. The default value is g2=0.1*l2.
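
For reference, the default confirmation distances follow directly from the fragment sizes. The short Python sketch below evaluates g=0.1*l for the default l1 and l2. The function name is hypothetical, and how Meta-aligner rounds the result when the fragment size is not a multiple of 10 is not stated here, so no rounding is applied.

```python
def default_confirm_distance(fragment_size: int) -> float:
    """Default g = 0.1 * fragment size, as stated for -cfd1 and -cfd2."""
    return fragment_size / 10


print(default_confirm_distance(40))   # g1 for the default l1=40  -> 4.0
print(default_confirm_distance(150))  # g2 for the default l2=150 -> 15.0
```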

-seedmm2 <int>

Number of mismatches allowed in a seed alignment at the second step of the assignment stage. The default value is 1.

-seedlen2 <int>

Length of the seed substrings to align at the second step of the assignment stage. The default value is 20.

-ls1 <int>

List size of the assignment stage when Bowtie is used (at the second step of the assignment stage of Meta-aligner). The default value is 10.

-ls2 <int>

List size of the assignment stage when Bowtie2 is used (at the third step of the assignment stage of Meta-aligner). The default value is 40.

-thrsc <double>

Threshold for the path-selection step at the assignment stage of Meta-aligner (for both Bowtie and Bowtie2). Paths are filtered by their scores. The default value is 0.3.

============================
Scoring options:
============================

-ms <double>

The match score.

-mp <double>

The mismatch penalty.

-gp <double>

The gap penalty.

============================
Reporting options:
============================

-dis

This option skips the local alignment of the anchored reads. With this option, only the reads, their flags and their positions on the reference genome are reported. This parameter is only used at the first stage of Meta-aligner.

-disHeader

This option suppresses the header of the output SAM file.

============================
Other options:
============================

-step <1 or 2 or 3>

Runs Meta-aligner up to the selected step. “1”: run only the alignment stage; “2”: run the alignment stage and the first step of the assignment stage; “3”: run all steps of Meta-aligner. The default value is “2”.

-dir <address>

If this parameter is used, Meta-aligner creates a new directory at the given address, and all steps are executed in that directory. The default address is “./results”.

-p <int>

Number of threads used to run Meta-aligner (both stages). The default value is 1.

-ed <double>

This parameter controls the normalized cutting length of the local alignment table in the Smith-Waterman algorithm (relative to each read's length). With this parameter, only ed/2*(read-length) cells adjacent to the original diagonal of the local alignment table are used in the local alignment procedure. The value must be between 0 (consider only the original diagonal cells of the dynamic programming table) and 2 (consider all cells of the table). The default value is 5*pg. This parameter can be estimated from the indel rate when the user runs the PE algorithm.
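
To make the band size concrete, the Python sketch below evaluates ed/2*(read-length) with the documented default ed=5*pg. The function name band_cells is hypothetical, and whether Meta-aligner counts these cells on one side of the diagonal or across both sides is not specified here, so the result is only the raw value of the formula.

```python
from typing import Optional


def band_cells(read_length: int, ed: Optional[float] = None, pg: float = 0.01) -> float:
    """Evaluate ed/2 * read_length, the number of cells kept near the diagonal."""
    if ed is None:
        ed = 5 * pg  # documented default for -ed
    if not 0 <= ed <= 2:
        raise ValueError("-ed must be between 0 and 2")
    return ed / 2 * read_length


# Default pg=0.01 gives ed=0.05, so a 1000-base read keeps about 25 cells.
print(band_cells(1000))
# ed=2 keeps read_length cells, i.e. the whole table according to the text.
print(band_cells(1000, ed=2))
```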

-ram <double>

Sets the amount of RAM (in GB) available to Meta-aligner. With this parameter, Meta-aligner can be run on any platform regardless of how much RAM it has: Meta-aligner adjusts the number of threads (-p) and the length of the reads handled at the local alignment step to stay within the limit. If some reads cannot be processed within this amount of RAM (even with one thread), Meta-aligner writes them to a file named “NotEnoughRAM.txt”, which contains the reads in Fastq format, with their flags and anchored positions added to the header line, separated by underscores.

-h

Prints the list of Meta-aligner commands.

*** Sample runs:

1) ./meta-align.out -x genomeindex -fa chr1.fa -r reads.fq -o output.sam -dir aligndir:

Aligns the reads in “reads.fq” to “chr1.fa” using the BOWTIE index base name “genomeindex”. Results are written to the “output.sam” file in the “aligndir” directory.

The default directory is “<current directory>/results/”.

2) ./meta-align.out -x genomeindex -fa chr1.fa -r reads.fq -o output.sam -l1 25 -d 1 -sl1 10:

Sets the fragment size to 25 base pairs and the mapping distance for BOWTIE to “d=1”. The sliding window length is set to 10, which is used to increase the mapping rate.

3) ./meta-align.out -x genomeindex -fa chr1.fa -r reads.fq -o output.sam -al 2:

Uses Soap2 as the short-read mapper in the first step. Note that “genomeindex” is now the base name of the index for both Soap2 and BOWTIE.
We recommend using “-al 1” for mapping.

4) ./meta-align.out -x genomeindex -fa chr1.fa -r reads.fq -o output.sam -p 4 -ram 4:

Sets the number of threads to 4 and limits RAM to 4 GB.

5) ./meta-align.out -x genomeindex -fa chr1.fa -r reads.fq -o output.sam -dis:

Skips local alignment at the first stage. With this option, output is generated very quickly and contains the following information for each read:
I. Flag of mapping
II. Reference name
III. Position of mapping

****** Help to fix parameters ******

# For Illumina/454/SOLiD input reads:

Use “-l1 30 -d 0 -sl1 10”.

If the depth of coverage is high (>10):

Use “-l1 30” with either “-d 0” or “-d 1”.

# For PacBio/Nanopore input reads:

Use “-l1 25 -d 2 -sl1 5”.
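
The recommendations above can be kept as named presets when launching Meta-aligner from a script. The Python sketch below only assembles an argument list from options documented in this file; the preset names and the build_command helper are hypothetical, the executable path is passed in by the caller, and the high-coverage variant is left out because it leaves the choice of -d open.

```python
import shlex

# Parameter presets copied from the recommendations above (names are made up).
PRESETS = {
    "short": ["-l1", "30", "-d", "0", "-sl1", "10"],  # Illumina/454/SOLiD
    "long": ["-l1", "25", "-d", "2", "-sl1", "5"],    # PacBio/Nanopore
}


def build_command(executable, index, ref, reads, out="output.sam", preset="short"):
    """Return the argument list for one Meta-aligner run (hypothetical helper)."""
    if preset not in PRESETS:
        raise ValueError("unknown preset: " + preset)
    return [executable, "-x", index, "-fa", ref, "-r", reads, "-o", out,
            *PRESETS[preset]]


cmd = build_command("./meta-align.out", "genomeindex", "chr1.fa", "reads.fq",
                    preset="long")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the indexes and input files exist.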