An error when running with mouse databases
Dear Eduardo,
After downloading 'ftp://mirbase.org/pub/mirbase/CURRENT/genomes/mmu.gff3' and 'ftp://ftp.ccb.jhu.edu/pub/data/bowtie_indexes/mm9.ebwt.zip', I tried to generate 'ReadCount results'. However, the output is different from the one I had from your Example data:
S177_ATCGTT  S178_GACGTT  S179_AGAGTT  S180_GTTCTT  S83_ATGTTT  S84_GAGTTT  S85_AGCTTT  S86_GTATTT  S89_ACATTT  S90_GGTGTT  S91_AATGTT  S92_GCGGTT
5            5            23           6            41          12          71          69          7           44          15          61
It looks like my mouse mmu.gff3 may not be compatible with the miARma program. I wonder if you can provide your comments on how to solve this problem.
thank you in advance,
Hoon
Comments (7)
-
-
reporter Thank you very much for your comments. Now it works.
By the way, I think your program can also measure the number of reads for other small RNAs (<100 bp) if the corresponding GTF is provided. Is that correct?
thank you,
Hoon
-
reporter Dear Eduardo,
I have another question. After creating an annotation GTF consisting of all small mouse transcripts (<=100 bp), I ran miARma.
A part of 'summary_results.xls' is:

Alignment [/scratch/bcb/hkim6/SH-mouse-miRNA/work/merged-fastq/miARmaSeq/Known_miRNAs/results_mmu.NCBIM37.67_100bp.gtf//Bowtie1_results/]
Filename            Processed Reads   Aligned reads      Failed to align
S177_ATCGTT.fastq   1189389           893362 (75.11%)    296027 (24.89%)
S178_GACGTT.fastq   768298            580088 (75.50%)    188210 (24.50%)
S179_AGAGTT.fastq   1183877           866511 (73.19%)    317366 (26.81%)
S180_GTTCTT.fastq   1378839           1039826 (75.41%)   339013 (24.59%)
S83_ATGTTT.fastq    2264620           1633403 (72.13%)   631217 (27.87%)
S84_GAGTTT.fastq    1422850           1108686 (77.92%)   314164 (22.08%)
S85_AGCTTT.fastq    2364015           1656693 (70.08%)   707322 (29.92%)
S86_GTATTT.fastq    2570765           1656320 (64.43%)   914445 (35.57%)
S89_ACATTT.fastq    1338242           1084348 (81.03%)   253894 (18.97%)
S90_GGTGTT.fastq    2634816           1691761 (64.21%)   943055 (35.79%)
S91_AATGTT.fastq    1121329           828391 (73.88%)    292938 (26.12%)
S92_GCGGTT.fastq    2574415           1778573 (69.09%)   795842 (30.91%)

ReadCount [/scratch/bcb/hkim6/SH-mouse-miRNA/work/merged-fastq/miARmaSeq/Known_miRNAs/results_mmu.NCBIM37.67_100bp.gtf//Readcount_results/]
Filename              Processed Reads   Assigned reads   Strand   Number of identified entities
S177_ATCGTT_nat_bw1   1189389           57491 (4.8%)     no       399
S178_GACGTT_nat_bw1   768298            21267 (2.8%)     no       347
S179_AGAGTT_nat_bw1   1183877           62632 (5.3%)     no       415
S180_GTTCTT_nat_bw1   1378839           43788 (3.2%)     no       411
S83_ATGTTT_nat_bw1    2264620           135825 (6.0%)    no       453
S84_GAGTTT_nat_bw1    1422850           43822 (3.1%)     no       387
S85_AGCTTT_nat_bw1    2364015           159310 (6.7%)    no       440
S86_GTATTT_nat_bw1    2570765           139655 (5.4%)    no       450
S89_ACATTT_nat_bw1    1338242           39425 (2.9%)     no       378
S90_GGTGTT_nat_bw1    2634816           161102 (6.1%)    no       450
S91_AATGTT_nat_bw1    1121329           53323 (4.8%)     no       393
S92_GCGGTT_nat_bw1    2574415           176444 (6.9%)    no       472
Overall, only a small fraction (roughly 3-7%) of the processed reads were assigned to the transcript GTF, even though 64-81% of the reads aligned, and I think these assignment fractions are too low. I wonder if you, as an expert in the analysis of miRNA sequencing, can comment on what might be causing such low assignment fractions.
Thank you in advance,
Hoon
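
To put the numbers above side by side, here is a minimal Python sketch that recomputes the assigned fraction both relative to all processed reads (what the ReadCount table reports) and relative to the aligned reads only. The read totals are copied from the summary table in this comment; the names `counts`, `percent` and `assignment_rates` are hypothetical helpers and not part of miARma.

```python
# Read totals copied from the summary table above:
# sample -> (processed reads, aligned reads, assigned reads)
counts = {
    "S177_ATCGTT": (1189389, 893362, 57491),
    "S178_GACGTT": (768298, 580088, 21267),
    "S179_AGAGTT": (1183877, 866511, 62632),
    "S180_GTTCTT": (1378839, 1039826, 43788),
    "S83_ATGTTT": (2264620, 1633403, 135825),
    "S84_GAGTTT": (1422850, 1108686, 43822),
    "S85_AGCTTT": (2364015, 1656693, 159310),
    "S86_GTATTT": (2570765, 1656320, 139655),
    "S89_ACATTT": (1338242, 1084348, 39425),
    "S90_GGTGTT": (2634816, 1691761, 161102),
    "S91_AATGTT": (1121329, 828391, 53323),
    "S92_GCGGTT": (2574415, 1778573, 176444),
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def assignment_rates(counts):
    """Map each sample to (assigned % of processed, assigned % of aligned)."""
    rates = {}
    for sample, (processed, aligned, assigned) in counts.items():
        rates[sample] = (percent(assigned, processed), percent(assigned, aligned))
    return rates


rates = assignment_rates(counts)
for sample, (of_processed, of_aligned) in rates.items():
    print(f"{sample}\t{of_processed:.1f}% of processed\t{of_aligned:.1f}% of aligned")
```

Even relative to aligned reads only, the assigned fraction stays below 10% for every sample, so most aligned reads fall outside the features in the GTF.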
-
Dear Hoon, I see that you are already an expert on miARma ;). I hope you find it useful and easy-to-use.
Regarding your question, you only have to consider whether your results make sense. Changing the GTF file does not mean that you will necessarily find anything; what I mean is that in some cases the wet-lab protocol removes the larger fragments. For example, for miRNAs the protocol needs fragmentation and sonication to remove big fragments. Thus, for lncRNAs it is better to use an RNA-Seq protocol rather than a miRNA-Seq protocol.
As a test, you can change the strand setting (you currently have strand=no, so try strand=yes) to make sure it is not a quantification problem. Another way to check it visually is to load a BAM file from the Bowtie1_results folder together with your GTF file in IGV. That way you can see the number of reads on each small mouse transcript.
Those are my suggestions
Regards
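
As an illustration of why the strand setting can change the counts, here is a minimal Python sketch of overlap-based read counting with and without strand matching. It is not how miARma itself quantifies reads; the GTF lines, read coordinates and the names `parse_gtf` and `count_reads` are all hypothetical toy examples.

```python
from collections import Counter


def parse_gtf(lines):
    """Yield (chrom, start, end, strand, gene_id) for each GTF feature line."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        chrom, start, end = fields[0], int(fields[3]), int(fields[4])
        strand, attrs = fields[6], fields[8]
        name = attrs  # fall back to the raw attribute string
        for item in attrs.split(";"):
            item = item.strip()
            if item.startswith("gene_id"):
                name = item.split(" ", 1)[1].strip('"')
        yield chrom, start, end, strand, name


def count_reads(features, reads, stranded):
    """Count reads overlapping each feature; require same strand if stranded."""
    counts = Counter()
    for chrom, start, end, strand in reads:
        for f_chrom, f_start, f_end, f_strand, name in features:
            overlaps = chrom == f_chrom and start <= f_end and end >= f_start
            if overlaps and (not stranded or strand == f_strand):
                counts[name] += 1
    return counts


# Toy annotation and toy reads (chrom, start, end, strand), for illustration only.
gtf_lines = [
    'chr1\ttoy\texon\t100\t180\t.\t+\t.\tgene_id "toyA";',
    'chr1\ttoy\texon\t500\t560\t.\t-\t.\tgene_id "toyB";',
]
reads = [
    ("chr1", 110, 131, "+"),  # overlaps toyA, same strand
    ("chr1", 120, 141, "-"),  # overlaps toyA, opposite strand
    ("chr1", 510, 531, "-"),  # overlaps toyB, same strand
    ("chr2", 110, 131, "+"),  # different chromosome, never counted
]

features = list(parse_gtf(gtf_lines))
unstranded = count_reads(features, reads, stranded=False)
stranded = count_reads(features, reads, stranded=True)
print("strand=no :", dict(unstranded))
print("strand=yes:", dict(stranded))
```

In this sketch a read overlapping several features is counted once per feature; real quantification tools usually resolve such ambiguous reads differently.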
-
Dear Hoon, should I close the issue?
Regards
-
reporter Sorry, I should have done it. Please close it. Thank you,
HK
-
- changed status to resolved
Dear Hoon, all kinds of GFF files are compatible with miARma, since GFF/GTF files are standard annotation files. I wonder which parameters you used. For this mmu.gff3, you will need:
Greetings