1. Hanfei Sun
  2. MISP


 misp: a fast Motif Scanner


Type in this command under the project's directory to install:

To see the test result, type in:
    make test


For one motif:
    misp <in.seq> <in.db> <p-value> <motif-id> <output-prefix>

For all motifs:
    misp <in.seq> <in.db> <p-value> all <output-prefix>
The output file will be named `<output-prefix>_<motif-id>` or `<output-prefix>_all`


1. in.seq:   FASTA format (masked sequence is recommended)
2. in.db:    locates at `database/cistrome.db`
3. p-value:  0.001 is recommended
4. motif-id: check `database/cistrome.db` and `database/cistrome.xml` to get this ID
5. output-prefix: the prefix of the output file name


Enter the directory of misp, and then type in:
    ./misp test.seq database/cistrome.db 0.001 all test

    ./misp test.seq database/cistrome.db 0.001 EN_0055 test

For one motif:

Column1: sequence name
Column2: sequence length
Column3: hit score:
    The higher this score is, the better this sequence matches the motif.
    Scores lower than thredshold(tolerance) will be changed to zero.

Column4: hit postion:
    The position of the hits. This offset starts from 0. If the hits locate on the negative strand, this value is negative or followed by (-).

Column5: sequence:
    The sequence at hits region

For all motifs:

Different motifs are separated by `\n` and different intervals are separeted by `,`.
The value is hits score and the value in the round bracket is hits position.


Try this command for the result of all motifs to get a tabular format:

awk -f mis_formatter.awk <result_of_mis>