1. Hanfei Sun
  2. MISP

Overview

HTTPS SSH
===========================
 misp: a fast Motif Scanner
===========================

Install
=======


Type in this command under the project's directory to install:
    make

To see the test result, type in:
    make test


Usage
=====

For one motif:
    misp <in.seq> <in.db> <p-value> <motif-id> <output-prefix>

For all motifs:
    misp <in.seq> <in.db> <p-value> all <output-prefix>
    
The output file will be named `<output-prefix>_<motif-id>` or `<output-prefix>_all`


Input
=====

1. in.seq:   FASTA format (masked sequence is recommended)
2. in.db:    locates at `database/cistrome.db`
3. p-value:  0.001 is recommended
4. motif-id: check `database/cistrome.db` and `database/cistrome.xml` to get this ID
5. output-prefix: the prefix of the output file name

Test
====

Enter the directory of misp, and then type in:
    ./misp test.seq database/cistrome.db 0.001 all test

or:
    ./misp test.seq database/cistrome.db 0.001 EN_0055 test
    
Output
======

For one motif:

Column1: sequence name
    
Column2: sequence length
    
Column3: hit score:
    The higher this score is, the better this sequence matches the motif.
    Scores lower than thredshold(tolerance) will be changed to zero.

Column4: hit postion:
    The position of the hits. This offset starts from 0. If the hits locate on the negative strand, this value is negative or followed by (-).

Column5: sequence:
    The sequence at hits region

For all motifs:

Different motifs are separated by `\n` and different intervals are separeted by `,`.
The value is hits score and the value in the round bracket is hits position.

Formatter
=========

Try this command for the result of all motifs to get a tabular format:

awk -f mis_formatter.awk <result_of_mis>