Error running DeNovo

Issue #80 resolved

Sebastien created an issue 2017-06-09

Hello, I want to use miARma-seq to discover new miRNAs with miRDeep2, since miARma-seq is easier to use than miRDeep2 on its own. In my lab, we are trying a new small RNA kit. The reads need special bioinformatic processing before being mapped to a genome. After this processing, I would like to use the processed reads with miARma-seq, but I get this error.

MINION ERROR :: system args failed: 7936 (minion search-adapter -i .//data/STAGE_snin/Results/cutadapt//mirD2_test_smallRNA.PF.R1_cutadapt_length.fastq -show 4 1>/tmp/minion.sq 2>> /data/STAGE_snin/Results/miARma_Test_smallRNA/denovo/miARma_stat.20368.log) Are you sure this fastq file contains and adapter sequence? at /data/software/miARma/cbbio-miarma-ba7e51dd6d2e//lib//CbBio/RNASeq/Adapt.pm line 1539.

I already trimmed the adapters, which is why minion didn't find any adapter sequence. Is there a way to use the DeNovo module without adapter prediction?

Here is my config file:

;General parameters
[General]
; type of analysis (miRNA, mRNA or circRNA)
type=miRNA
; Folder for miRNA reads
read_dir=/data/STAGE_snin/Results/cutadapt/
; Number of process to run at the same time
threads=4
; label for the analsysis
label=Denovo_smallRNA
; Folder where miARma has been instaled
miARmaPath=/data/software/miARma/cbbio-miarma-ba7e51dd6d2e/
; Folder to store results
output_dir=/data/STAGE_snin/Results/miARma_Test_smallRNA/denovo/
; organism used
organism=mouse
; Whether the data is from a strand-specific assay (yes, no or reverse, yes by default) for featureCounts analysis
strand=yes
stats_file=/data/STAGE_snin/Results/miARma_Test_smallRNA/denovo/miARma_stat.20368.log
logfile=/data/STAGE_snin/Results/miARma_Test_smallRNA/denovo/miARma_logfile.20368.log

[Quality]
prefix=Pre

[DeNovo]
; Indexed genome to align your reads in format .ebwt (Mandatory for analysis with miRDeep)
bowtie1index=/data/Genomes/mm10_GRCm38.68/bowtie1/mm10
; a fasta file with all mature sequence from your organism
mature_miRNA_file=/data/Genomes/mirbase/mature_mmu.fa
;a fasta file with all known pre-miRNa sequence
precursor_miRNA_file=/data/Genomes/mirbase/hairpin.fa
;fasta file for the cmplete genome of our organism
genome=/data/Genomes/mm10_GRCm38.68/multifasta/mm10_multifasta.fa

Thank you for your help.

Sebastien

Comments (61)

Eduardo Andres Leon

Hi Sebastien, I'm out of the office so I can't access the code, but you should be able to do this:

[DeNovo]
;Indexed genome to align your reads in format .ebwt (Mandatory for analysis with miRDeep)
bowtie1index=Genomes/Indexes/bowtie1/human/bw1_homo_sapiens19
;Reads already trimmed
adapter=No
;a fasta file with all mature sequence from your organism
mature_miRNA_file=Examples/basic_examples/miRNAs/data/hsa_mature_miRBase20.fasta
;a fasta file with all known pre-miRNa sequence
precursor_miRNA_file=Examples/basic_examples/miRNAs/data/precursors_miRBase20.fasta
;fasta file for the cmplete genome of our organism
genome=Genomes/Indexes/bowtie1/human/homo_sapiens19.fa

let me know

2017-06-09T09:15:31+00:00

Sebastien reporter
Thank you for your fast answer. I tried with your options, but I got this error:

ERROR :: system args failed: 2 (mkdir -p /data/STAGE_snin/Results/miARma_Test_smallRNA/denovo//miRDeep_results/ ;export PERL5LIB=/data/software/miARma/cbbio-miarma-ba7e51dd6d2e//lib/Perl/; mapper.pl /data/STAGE_snin/Results/cutadapt//mirD2_test_smallRNA.PF.R1_cutadapt_length.fastq -e -h -i -j -n -m -o 4 -p /data/Genomes/mm10_GRCm38.68/bowtie1/mm10 -s /data/STAGE_snin/Results/miARma_Test_smallRNA/denovo//miRDeep_results/mirD2_test_smallRNA.PF.R1_cutadapt_length.fa -t /data/STAGE_snin/Results/miARma_Test_smallRNA/denovo//miRDeep_results/mirD2_test_smallRNA.PF.R1_cutadapt_length_vs_genome.arf >> /data/STAGE_snin/Results/miARma_Test_smallRNA/denovo/miARma_stat.20368.log 2>&1) at /data/software/miARma/cbbio-miarma-ba7e51dd6d2e//lib//CbBio/RNASeq/Aligner.pm line 2210.

The run did not create the .arf file. I tried to find the cause but couldn't. I can run mapper.pl with the same command line without any problem, but I get this error when it is run through miARma.
- 2017-06-09T13:04:50+00:00
Eduardo Andres Leon
Could you send me the log files (the miARma-log and miARma-stat files)?
- 2017-06-12T06:17:13+00:00
Sebastien reporter
- attached miARma_stat.20368.log
- attached miARma_logfile.20368.log
Here are the log files
- 2017-06-12T06:31:51+00:00
Eduardo Andres Leon
I don't see any error in either log file, which is really weird. Is there any way to get your input files so I can reproduce your setup and analysis?

Eduardo
- 2017-06-12T06:36:26+00:00
Sebastien reporter
I didn't get any error while miARma-seq was running. The script keeps running for hours if I don't stop it, and I got the "system args failed" error when I stopped the run. My input files are big: I have 8 files of more than 2 GB each. Do you know how I can send these files to you?

Sebastien
- 2017-06-12T06:47:32+00:00
Eduardo Andres Leon
Try compressing the files. Then you can use wetransfer

Eduardo
- 2017-06-12T06:56:11+00:00
Sebastien reporter
I'll try this, thank you. I'll paste the WeTransfer link when it is done.
- 2017-06-12T07:03:22+00:00
Sebastien reporter
Here are the links to my fastq files. Thank you for your help.

https://we.tl/mbEZzcKCDG https://we.tl/Sl5qYjo0d4 https://we.tl/0WZEv2CF8y https://we.tl/9HJx39LgAh
- 2017-06-12T08:16:06+00:00
Eduardo Andres Leon
Thanks to you.

Could you also include your files mature_mmu.fa and hairpin.fa?
- 2017-06-12T09:03:20+00:00
Sebastien reporter
here are the mirbase files, I included my ini file to run miARma-seq https://we.tl/nVGr03HgZv

The genome I used is mm10_GRCm38.68. The genome files are too big for wetransfer.

Thank you
- 2017-06-12T09:25:10+00:00

Eduardo Andres Leon

Could you confirm that the number of lines in each fastq file is correct?

200567044 L_Ol-6bis-MedakamiR_PF.R1.fastq
76069296 mirCE1_test_smallRNA.PF.R1.fastq
86983560 mirD1-200ng.PF.R1.fastq
88930032 mirD1_test_smallRNA.PF.R1.fastq
72564740 mirD2-200ng.PF.R1.fastq
69891968 mirD2_test_smallRNA.PF.R1.fastq
78812848 mirP2_test_smallRNA.PF.R1.fastq
91571936 mirQ1_test_smallRNA.PF.R1.fastq
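A quick way to check these numbers, as a minimal sketch assuming uncompressed FASTQ files with the usual four lines per read (the helper name `fastq_reads` is made up for this example): count the lines with `wc -l` and divide by four. For instance, 86983560 lines would correspond to 21745890 reads.

```shell
# Number of reads in an uncompressed FASTQ file, assuming exactly
# four lines per record (no wrapped sequence or quality lines).
fastq_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Print line and read counts for every .fastq file in the current folder.
for f in *.fastq; do
    [ -e "$f" ] || continue   # skip when the pattern matches nothing
    printf '%s\t%s lines\t%s reads\n' "$f" "$(( $(wc -l < "$f") ))" "$(fastq_reads "$f")"
done
```

A line count that is not a multiple of four would suggest a truncated or damaged file.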

2017-06-12T09:56:14+00:00

Eduardo Andres Leon
The first thing I see is that you are providing the mature sequences for mmu miRNAs as a fastq file, whereas miRDeep2 expects a fasta file.

Then, as a recommendation (and of course this will depend on your hypothesis): since miRNA sequences (mostly the seed) are evolutionarily conserved, you should provide a hairpin fasta file with all known miRNAs (provided by miRBase).
- 2017-06-12T09:59:52+00:00
Sebastien reporter
I confirm the number of lines for each file.

I clicked on the wrong mature_mmu file when I sent it to you. I do use a fasta file, as you can see in the ini file. Sorry for the mix-up.

Thank you for your advice, I'll take note of it.
- 2017-06-12T11:46:02+00:00
Sebastien reporter
- attached mature_mmu.fa
Here is the mature_mmu.fa file
- 2017-06-12T12:40:21+00:00
Sebastien reporter
Hi Eduardo, did you succeed in running miARma with my files?
- 2017-06-21T07:54:34+00:00
Eduardo Andres Leon
Hi Sebastien, sorry for the delay. I've been out of the office until today, so I couldn't take a closer look at your error. I'll check it this week for sure.

Edu
- 2017-06-26T09:41:40+00:00
Sebastien reporter
Hi Eduardo, thank you ;) I hope we can find a solution :)
- 2017-06-26T09:47:20+00:00
Eduardo Andres Leon
Still working on it
- 2017-07-04T12:03:16+00:00
Eduardo Andres Leon
Hi Sebastien. Just to be 100% sure. Are these samples from mouse ?

Eduardo
- 2017-07-05T12:10:53+00:00
Sebastien reporter
Hi Eduardo, yes, the samples are from mouse.

Sébastien
- 2017-07-05T14:37:23+00:00

Eduardo Andres Leon

Hi, The thing is that none of the reads are mapped against the mouse genome. That is what miRDeep says:

Mapping statistics from mirD1-200ng.PF.R1.fastq

#desc   total   mapped  unmapped        %mapped %unmapped
total: 21745890 15      21745875        0.000   1.000
seq: 21745890   15      21745875        0.000   1.000

I'm using the mm10 build. I downloaded the fasta genome from Ensembl and created a bowtie1 index using the miARma utility. ¿?

2017-07-07T06:35:59+00:00

Sebastien reporter

Hi, well, that's strange. miARma is still running in an infinite loop. Using bowtie, I get:

Alignment [/data/STAGE_snin/Results/miARma_Test_smallRNA//Bowtie1_results/]
Filename        Processed Reads Aligned reads   Failed to align
mirCE1_test_smallRNA.PF.R1_cutadapt_length.fastq        14489448        11574834 (79.88%)       2914614 (20.12%)
mirD1-200ng.PF.R1_cutadapt_length.fastq 16826809        13692134 (81.37%)       3134675 (18.63%)
mirD1_test_smallRNA.PF.R1_cutadapt_length.fastq 17247179        14041978 (81.42%)       3205201 (18.58%)
mirD2-200ng.PF.R1_cutadapt_length.fastq 12470600        10258910 (82.26%)       2211690 (17.74%)
mirD2_test_smallRNA.PF.R1_cutadapt_length.fastq 14989552        12201156 (81.40%)       2788396 (18.60%)
mirP2_test_smallRNA.PF.R1_cutadapt_length.fastq 13446332        10526848 (78.29%)       2919484 (21.71%)
mirQ1_test_smallRNA.PF.R1_cutadapt_length.fastq 16063664        12842004 (79.94%)       3221660 (20.06%)

I created the index using bowtie1 directly, not through miARma. I'll investigate that.

2017-07-07T08:28:20+00:00

Sebastien reporter
I forgot to say that I got these alignments using miARma for known miRNAs, since the DeNovo module runs in an infinite loop.
- 2017-07-07T08:38:47+00:00

Eduardo Andres Leon

It should be my index:

bowtie1 -q -p 12 /mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/DeNovo_results/Bowtie1_index/bw1_mm10 /mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/reads/mirD1-200ng.PF.R1.fastq > file.sam
# reads processed: 21745890
# reads with at least one reported alignment: 272 (0.00%)
# reads that failed to align: 21745618 (100.00%)
Reported 272 alignments to 1 output stream(s)

Although it is also strange that the total number of reads in your mirD1-200ng sample differs from mine.

2017-07-07T08:49:29+00:00

Sebastien reporter
Hmm, well, I'm a bad student ^^ I gave you the raw fastq files; when I sent them I didn't know that the fastq files needed a special trimming treatment. I'll send you the trimmed fastq files. That explains your alignment.

Really sorry for that.
- 2017-07-07T08:59:04+00:00
Eduardo Andres Leon
Haha, very bad!!!

Don't worry
- 2017-07-07T09:05:10+00:00
Sebastien reporter
Here is the link for the file: https://we.tl/ehGeWli9Lo I'm really tired today, so tell me if you have trouble with the file. I'm making mistakes at work because of exhaustion ^^
- 2017-07-07T09:35:07+00:00

Eduardo Andres Leon

Hi Sebastien, although it hasn't finished yet, the DeNovo pipeline is working correctly with your trimmed reads. Two xls result files have already been created:

mirD1-200ng.PF.R1_cutadapt_randomsequences_vs_genome.arf
mirD1-200ng.PF.R1_cutadapt_randomsequences.fa
mirD1_test_smallRNA.PF.R1_cutadapt_randomsequences.xls<-
mirD1_test_smallRNA.PF.R1_cutadapt_randomsequences_vs_genome.arf
mirD1_test_smallRNA.PF.R1_cutadapt_randomsequences.fa
mirQ1_test_smallRNA.PF.R1_cutadapt_randomsequences.xls<-
mirQ1_test_smallRNA.PF.R1_cutadapt_randomsequences_vs_genome.arf
mirQ1_test_smallRNA.PF.R1_cutadapt_randomsequences.fa

I will let you know once it finishes.

2017-07-10T09:05:46+00:00

Sebastien reporter
Hi Eduardo, thank you for the time you are devoting to my issue. I don't get the same result here. Can you send me your config file and tell me what you did so I can reproduce it here? How long has your miARma run been going?

Thank you very much
- 2017-07-10T09:14:25+00:00

Eduardo Andres Leon

It is a pleasure. It started 1:30 hours ago:

#########################################################################
#   miARma, miRNA and RNASeq Multiprocess Analysis                      #
#                miARma v 1.6.1 (Feb-2017)                              #
#                                                                       #
#   Created at Computational Biology and Bioinformatics Group (CbBio)   #
#   Institute of Biomedicine of Seville. IBIS (Spain)                   #
#   Copyright (c) 2017 IBIS. All rights reserved.                       #
#   mail : miARma-devel@cbbio.es                                        #
#########################################################################

[Mon Jul 10 09:52:09 2017] Starting a miARma analysis for miRNA
[Mon Jul 10 09:52:09 2017] Checking provided parameters for: DeNovo.
[Mon Jul 10 09:52:09 2017] All parameters are correct.
[Mon Jul 10 09:52:09 2017] Starting a De novo identification and quantification of miRNAs

2017-07-10T09:16:51+00:00

Sebastien reporter
I'll run some tests on that dataset. I'll come back if I have trouble. Did you use the config file I sent, or did you modify it?
- 2017-07-10T09:23:49+00:00

Eduardo Andres Leon

Hi, I've used your ini file (I'll paste it because I can't attach any file). I've also used your hairpin file and mature mmu file. The only differences are the mmu genome and the index (which I built using the miARma pipeline).

;General parameters
[General]
; type of analysis (miRNA, mRNA or circRNA)
type=miRNA
; Folder for miRNA reads
read_dir=/mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/trimmed_reads/
; Number of process to run at the same time
threads=12
; label for the analsysis
label=Denovo_smallRNA
; Folder where miARma has been instaled
miARmaPath=/mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/miARma/
; Folder to store results
output_dir=/mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/DeNovo_results/
; organism used
organism=mouse
; Whether the data is from a strand-specific assay (yes, no or reverse, yes by default) for featureCounts analysis
strand=yes
stats_file=/mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/DeNovo_results//miARma_stat.23504.log
logfile=/mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/DeNovo_results//miARma_logfile.23504.log

;[Quality]
;prefix=Pre
[DeNovo]
; Indexed genome to align your reads in format .ebwt (Mandatory for analysis with miRDeep)
bowtie1index=/mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/untrimmed/DeNovo_results/Bowtie1_index/bw1_mm10
; a fasta file with all mature sequence from your organism
mature_miRNA_file=/mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/mature_mmu.fa
; Reads already trimmed
adapter=no
;a fasta file with all known pre-miRNa sequence 
precursor_miRNA_file=/mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/hairpin_mmu.fa
;fasta file for the cmplete genome of our organism
genome=/mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/Genomes/Mus_musculus.GRCm38.fa

2017-07-10T10:34:49+00:00

Sebastien reporter

Thanks, I'll try to use miARma to create the index for mm10. But I used bowtie-build with default parameters when I created the index, so it should lead to the same files. My run has been going for more than 3 hours and I still get only one file.

Here is the run:

#########################################################################
#   miARma, miRNA and RNASeq Multiprocess Analysis                      #
#                miARma v 1.6.1 (Feb-2017)                              #
#                                                                       #
#   Created at Computational Biology and Bioinformatics Group (CbBio)   #
#   Institute of Biomedicine of Seville. IBIS (Spain)                   #
#   Copyright (c) 2017 IBIS. All rights reserved.                       #
#   mail : miARma-devel@cbbio.es                                        #
#########################################################################

[Mon Jul 10 11:27:47 2017] Starting a miARma analysis for miRNA
[Mon Jul 10 11:27:47 2017] Checking provided parameters for: Quality,DeNovo,DEAnalysis,TargetPrediction. 
[Mon Jul 10 11:27:47 2017] All parameters are correct.
[Mon Jul 10 11:27:47 2017] Starting Quality Analysis.
[Mon Jul 10 11:29:11 2017] Quality Analysis finished.
[Mon Jul 10 11:29:11 2017] Starting a De novo identification and quantification of miRNAs

Here are my result files

ll miRDeep_results/
total 12112
-rw-rw-r-- 1 snin mgx 12398859 10 juil. 11:32 mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fa

I've got nothing else in my folder.

My config file is the same. I really don't see what I'm doing wrong.

2017-07-10T12:46:40+00:00

Eduardo Andres Leon

Hi again. It finished in around 3 hours (using 20 threads):

#########################################################################
#   miARma, miRNA and RNASeq Multiprocess Analysis                      #
#                miARma v 1.6.1 (Feb-2017)                              #
#                                                                       #
#   Created at Computational Biology and Bioinformatics Group (CbBio)   #
#   Institute of Biomedicine of Seville. IBIS (Spain)                   #
#   Copyright (c) 2017 IBIS. All rights reserved.                       #
#   mail : miARma-devel@cbbio.es                                        #
#########################################################################

[Mon Jul 10 09:52:09 2017] Starting a miARma analysis for miRNA
[Mon Jul 10 09:52:09 2017] Checking provided parameters for: DeNovo.
[Mon Jul 10 09:52:09 2017] All parameters are correct.
[Mon Jul 10 09:52:09 2017] Starting a De novo identification and quantification of miRNAs
[Mon Jul 10 12:58:46 2017] De novo identification and quantification of miRNAs finished
[Mon Jul 10 12:58:46 2017] miARma finished. Job took 186 minutes

Once you manage to run the pipeline correctly, I would recommend using the hairpins from all organisms. Using only mouse hairpins and mouse mature miRNAs should not find any new miRNA.

2017-07-11T06:18:16+00:00

Sebastien reporter

Dear Eduardo, I stopped my miARma run because it only produced one fasta file within one day and no arf file. I got this error when I stopped the run:

miRDeep ERROR :: system args failed: 2 (mkdir -p /data/STAGE_snin/Results/miARma_DeNovo//miRDeep_results/ ;export PERL5LIB=/data/software/miARma/cbbio-miarma-ba7e51dd6d2e//lib/Perl/; mapper.pl /data/STAGE_snin/Results/cutadapt/fastq_traites_random//mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq -e -h -i -j -n -m -o 20 -p /data/Genomes/mm10_GRCm38.68/bowtie1/mm10 -s /data/STAGE_snin/Results/miARma_DeNovo//miRDeep_results/mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fa -t /data/STAGE_snin/Results/miARma_DeNovo//miRDeep_results/mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences_vs_genome.arf >> /data/STAGE_snin/Results/miARma_DeNovo//miARma_stat.28902.log 2>&1) at /data/software/miARma/cbbio-miarma-ba7e51dd6d2e//lib//CbBio/RNASeq/Aligner.pm line 2210.

2017-07-11T06:20:34+00:00

Sebastien reporter
Well, I'll try using miARma to create the index files. That's the only thing we are not doing the same way.
- 2017-07-11T06:22:59+00:00
Eduardo Andres Leon
Yes, but this error appears because you are cancelling part of the miRDeep pipeline. Have you checked the miARma_stat file? This file shows the progress for each file while miRDeep is analysing it.
- 2017-07-11T06:23:54+00:00
Eduardo Andres Leon
In the meantime, you can access your results in the following link:

https://wetransfer.com/downloads/61df00054de25443d1a011fc490df07c20170711062603/76ebb846eace22d13cb980ada44bbeef20170711062603/2a214f
- 2017-07-11T06:27:52+00:00

Sebastien reporter

Thank you for the file.

Here is the end of the stats file:

FASTQCSTATS :: [Mon Jul 10 16:47:33 2017]
 Name    mirD2-200ng.PF.R1_cutadapt_randomsequences_fastqc

                                Total Sequences:         12916555        Sequence length:        15-42

                                Encoding:        Sanger / Illumina 1.9   GCcontent:      44%
FASTQCSTATS :: [Mon Jul 10 16:47:33 2017]
 Name    mirD2_test_smallRNA.PF.R1_cutadapt_randomsequences_fastqc

                                Total Sequences:         15515249        Sequence length:        15-42

                                Encoding:        Sanger / Illumina 1.9   GCcontent:      44%
FASTQCSTATS :: [Mon Jul 10 16:47:33 2017]
 Name    mirP2_test_smallRNA.PF.R1_cutadapt_randomsequences_fastqc

                                Total Sequences:         13928297        Sequence length:        15-42

                                Encoding:        Sanger / Illumina 1.9   GCcontent:      43%
FASTQCSTATS :: [Mon Jul 10 16:47:33 2017]
 Name    mirQ1_test_smallRNA.PF.R1_cutadapt_randomsequences_fastqc

                                Total Sequences:         16630999        Sequence length:        15-42

                                Encoding:        Sanger / Illumina 1.9   GCcontent:      44%
miRDeep :: File:/data/STAGE_snin/Results/cutadapt/fastq_traites_random//mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq

2017-07-11T06:33:27+00:00

Eduardo Andres Leon

Pretty different from mine. miRDeep is not able to align the sequences, so it must be bowtie related:

miRDeep :: File:/mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/trimmed_reads//mirQ1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq

Mapping statistics

#desc   total   mapped  unmapped        %mapped %unmapped
total: 16069390 14828539        1240851 0.923   0.077
seq: 16069390   14828539        1240851 0.923   0.077


#####################################
#                                   #
# miRDeep2.0.0.7                    #
#                                   #
# last change: 10/12/2014           #
#                                   #
#####################################

miRDeep2 started at 13:31:33


#Starting miRDeep2
/home/eandres/Projects/EduardoAndres/miARma_sebastien/miARma/bin/common/mirdeep/miRDeep2.pl /mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/DeNovo_results//miRDeep_results/mirQ1_test_smallRNA.PF.R1_cutadapt_randomsequences.fa /mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/Genomes/Mus_musculus.GRCm38.fa /mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/DeNovo_results//miRDeep_results/mirQ1_test_smallRNA.PF.R1_cutadapt_randomsequences_vs_genome.arf /mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/mature_mmu.fa none /mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/hairpin_mmu.fa -r mirQ1_test_smallRNA.PF.R1_cutadapt_randomsequences -P -d -c -v

miRDeep2 started at 13:31:33


mkdir mirdeep_runs/run_10_07_2017_t_13_31_33

#Starting miRDeep2
#testing input files
started: 13:31:40
sanity_check_mature_ref.pl /mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/mature_mmu.fa

#testing input files

ended: 13:31:40
total:0h:0m:0s

sanity_check_reads_ready_file.pl /mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/DeNovo_results//miRDeep_results/mirQ1_test_smallRNA.PF.R1_cutadapt_randomsequences.fa

started: 13:31:40

ended: 13:31:41
total:0h:0m:1s

started: 13:31:41
sanity_check_genome.pl /mnt/beegfs/eandres/Projects/EduardoAndres/miARma_sebastien/Genomes/Mus_musculus.GRCm38.fa

2017-07-11T06:36:48+00:00

Sebastien reporter
When I use miARma for known miRNAs I have no problem with Bowtie. Could the issue be caused by the fact that I have miRDeep2 installed? I mean, I installed miRDeep2 before miARma.
- 2017-07-11T06:46:35+00:00
Eduardo Andres Leon
Yes!!!! This could also be related. In miARma (due to reviewers' requests) we give preference to any software already installed on the system and found in the PATH. Only if the software is not installed is the copy shipped with miARma used. Could you try removing your miARma installation from your PATH?
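To see which copy of each tool would actually be picked up, a small sketch like the following may help. It assumes a POSIX-like shell; `resolve_tool` is a name invented for this example, and the list of tools is only a guess at what matters here (`bowtie-build` ships with bowtie, the two `.pl` scripts with miRDeep2).

```shell
# Print the location the shell would use for a command (the first match
# on PATH, or an alias definition), or a note if nothing is found.
resolve_tool() {
    command -v "$1" || echo "not on PATH"
}

# Tool names are illustrative; adjust the list as needed.
for tool in bowtie bowtie-build mapper.pl miRDeep2.pl; do
    printf '%-14s %s\n' "$tool" "$(resolve_tool "$tool")"
done
```

If a path outside the miARma folder is printed, that copy is the one the shell finds first.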
- 2017-07-11T06:48:48+00:00
Sebastien reporter
I'll look at this with my tutor today!!! Thanks, I'll come back when I have the results ;)
- 2017-07-11T06:51:53+00:00

Sebastien reporter

Hi Eduardo, I removed miARma from my PATH and tried to run it. The mapper starts, creates a dir_mapper directory for my first file, and then runs in what looks like an infinite loop. To explain: I get the directory for "mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq", and the pipeline creates the files mapper.log_bak and mapper.log_tmp. The mapper.log_tmp file contains the command lines run by the mapper.pl script.
The last command in this file is:

bowtie -p 10 -f -n 0 -e 80 -l 18 -a -m 5 --best --strata /data/STAGE_snin/Results/miARma_DeNovo/Bowtie1_index/mm10  --al dir_mapper_seq_mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq_2971042749_12_07_2017_t_09_02_54/mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq_mapped --un dir_mapper_seq_mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq_2971042749_12_07_2017_t_09_02_54/mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq_not_mapped  dir_mapper_seq_mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq_2971042749_12_07_2017_t_09_02_54/reads_nr.fa dir_mapper_seq_mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq_2971042749_12_07_2017_t_09_02_54/mappings.bwt 2>bowtie.log

I ran it like this to see the printing of results in my terminal instead of a file:

bowtie -p 10 -f -n 0 -e 80 -l 18 -a -m 5 --best --strata /data/STAGE_snin/Results/miARma_DeNovo/Bowtie1_index/mm10  --al dir_mapper_seq_mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq_2971042749_12_07_2017_t_09_02_54/mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq_mapped --un dir_mapper_seq_mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq_2971042749_12_07_2017_t_09_02_54/mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq_not_mapped  dir_mapper_seq_mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq_2971042749_12_07_2017_t_09_02_54/reads_nr.fa

I got some alignment lines and then suddenly got this:

seq_14493788_x1 +   chr10   96617005    TTCGCCCCTCGGAGCTGG  IIIIIIIIIIIIIIIIII  0   
seq_14493777_x1 +   chr5    147742581   AAACATGAAGCCCTGCAACAC   IIIIIIIIIIIIIIIIIIIII   0   18:G>C
seq_14493684_x1 +   chr8    92827433    ATTGCCAGGACCTGCAAGCACCCGCGGC    IIIIIIIIIIIIIIIIIIIIIIIIIIII    0   
seq_14493767_x1 +   chr12   78350933    GTGTCCTAAGGCGAGCTCAGGGAGG   IIIIIIIIIIIIIIIIIIIIIIIII   2   
seq_14493731_x1 +   chr12   8330210 CAGGACTTCTGGGTCCTAGGGAATTGT IIIIIIIIIIIIIIIIIIIIIIIIIII 0

And then no more lines are printed in my terminal. I don't know where to look now. Bowtie runs well when I run it on its own, but here it seems to hang. I checked the integrity of the fasta file created by mapper.pl, and the indexes were created by miARma. I left the alignment running overnight, but when I came back to the office the first fastq file was still being analysed. I'm using 10 threads, and when I run miARma on known miRNAs the alignment is fast (<1 hour for my 8 files). I also removed the aliases for the miRDeep2 scripts, so there is no conflict between miARma and miRDeep2.

If I don't find a solution, I think I'll run miRDeep2 with some shell scripts instead of through miARma, but I'll still use miARma to run the known miRNA pipeline.

Do you have any idea or advice that could help me with this issue?

Thank you

2017-07-12T07:23:10+00:00

Eduardo Andres Leon
Hi, have you checked the miARma_log and miARma_stat files for an error? In these files you can also check which miRDeep version you are using. Most parts of miRDeep are not multithreaded, but a whole night for just one sample is too much. With the same samples on my computers, miARma takes 186 minutes.
- 2017-07-12T07:53:07+00:00

Sebastien reporter

Hi,

Here is my log for miRDeep

miRDeep :: [Wed Jul 12 09:02:54 2017] Executing mkdir -p /data/STAGE_snin/Results/miARma_DeNovo//miRDeep_results/ ;export PERL5LIB=/data/software/miARma/cbbio-miarma-ba7e51dd6d2e//lib/Perl/; mapper.pl /data/STAGE_snin/Results/cutadapt/fastq_traites_random//mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq -e -h -i -j -n -m -o 10 -p /data/STAGE_snin/Results/miARma_DeNovo/Bowtie1_index/mm10 -s /data/STAGE_snin/Results/miARma_DeNovo//miRDeep_results/mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fa -t /data/STAGE_snin/Results/miARma_DeNovo//miRDeep_results/mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences_vs_genome.arf >> /data/STAGE_snin/Results/miARma_DeNovo//miARma_stat.31234.log 2>&1
miRDeep :: [Wed Jul 12 09:02:54 2017] Executing mkdir -p /data/STAGE_snin/Results/miARma_DeNovo//miRDeep_results/ ; export PERL5LIB=/data/software/miARma/cbbio-miarma-ba7e51dd6d2e//lib/Perl/;miRDeep2.pl /data/STAGE_snin/Results/miARma_DeNovo//miRDeep_results/mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fa /data/Genomes/mm10_GRCm38.68/multifasta/mm10_multifasta.fa /data/STAGE_snin/Results/miARma_DeNovo//miRDeep_results/mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences_vs_genome.arf /data/Genomes/mirbase/mature_mmu.fa none /data/Genomes/mirbase/hairpin.fa -r mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences -P -d -c -v >> /data/STAGE_snin/Results/miARma_DeNovo//miARma_stat.31234.log 2>&1

And the stat for miRDeep

miRDeep :: File:/data/STAGE_snin/Results/cutadapt/fastq_traites_random//mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq

2017-07-12T08:03:50+00:00

Sebastien reporter
Where can I see the miRDeep2 version in these files? I can't find it.
- 2017-07-12T08:04:28+00:00

Sebastien reporter

I found the problem.

I ran the command line that miARma uses to call the mapper.pl script.

And I got this:

/data/software/miARma/cbbio-miarma-ba7e51dd6d2e//bin/common/mirdeep/mapper.pl /data/STAGE_snin/Results/cutadapt/fastq_traites_random//mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fastq -e -h -i -j -n -m -o 10 -p /data/STAGE_snin/Results/miARma_DeNovo/Bowtie1_index/mm10 -s /data/STAGE_snin/Results/miARma_DeNovo//miRDeep_results/mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences.fa -t /data/STAGE_snin/Results/miARma_DeNovo//miRDeep_results/mirCE1_test_smallRNA.PF.R1_cutadapt_randomsequences_vs_genome.arf

sh: rna2dna.pl: command not found
sh: collapse_reads_md.pl: command not found
sh: convert_bowtie_output.pl: command not found
sh: parse_mappings.pl: command not found
Mapping statistics

#desc   total   mapped  unmapped    %mapped %unmapped
Use of uninitialized value $count2 in subtraction (-) at /data/software/miARma/cbbio-miarma-ba7e51dd6d2e//bin/common/mirdeep/mapper.pl line 704.
Use of uninitialized value $count in subtraction (-) at /data/software/miARma/cbbio-miarma-ba7e51dd6d2e//bin/common/mirdeep/mapper.pl line 704.
total: Use of uninitialized value $count in print at /data/software/miARma/cbbio-miarma-ba7e51dd6d2e//bin/common/mirdeep/mapper.pl line 704.
    Use of uninitialized value $count2 in print at /data/software/miARma/cbbio-miarma-ba7e51dd6d2e//bin/common/mirdeep/mapper.pl line 704.
    0   Use of uninitialized value $count in division (/) at /data/software/miARma/cbbio-miarma-ba7e51dd6d2e//bin/common/mirdeep/mapper.pl line 705.
Use of uninitialized value $count2 in division (/) at /data/software/miARma/cbbio-miarma-ba7e51dd6d2e//bin/common/mirdeep/mapper.pl line 705.
Illegal division by zero at /data/software/miARma/cbbio-miarma-ba7e51dd6d2e//bin/common/mirdeep/mapper.pl line 705.

But I have no idea about how to fix it

2017-07-12T08:11:27+00:00

Eduardo Andres Leon
Hi, you can't run a single command without first adding the whole miRDeep bin directory to your PATH and the miARma Perl library to PERL5LIB.

You should try something similar to:

1) export PATH=$PATH:/data/software/miARma/cbbio-miarma-ba7e51dd6d2e//bin/common/mirdeep/
2) export PERL5LIB=$PERL5LIB:/data/software/miARma/cbbio-miarma-ba7e51dd6d2e/lib/Perl/
3) Run the command that you typed above
- 2017-07-12T08:54:18+00:00
Sebastien reporter
The command still runs for many hours without results. I'll try installing miARma on my local computer and running it on a small dataset.
- 2017-07-13T06:09:15+00:00
Eduardo Andres Leon
That would be a really good idea. I've tested your files in a CentOS 7.3 machine and in an Ubuntu 16.04. In both they worked properly

Edu
- 2017-07-13T07:25:10+00:00
Sebastien reporter
I succeeded in running it on my computer with a small dataset (20 reads per file). I found an error in the hairpin.fa file: it contains the character Y, which stopped the execution with an error printed in the stat file. I didn't get this error on the server, and even after correcting it on the server I'm still unable to make it run correctly. I'll continue testing and I hope to have some results by the end of next week.
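For reference, a minimal sketch of how such characters can be spotted before a run. It assumes that only A, C, G, T, U and N (upper or lower case) are acceptable in sequence lines, which is an assumption based on the error and not something checked against the miRDeep2 code; `fasta_bad_lines` is a hypothetical helper name.

```shell
# List the sequence lines of a FASTA file that contain any character other
# than A, C, G, T, U or N (either case). Header lines starting with '>' are
# skipped. Output format: file:line_number: offending line
fasta_bad_lines() {
    awk '!/^>/ && /[^ACGTUNacgtun]/ { print FILENAME ":" NR ": " $0 }' "$1"
}

# Hypothetical usage:
#   fasta_bad_lines /data/Genomes/mirbase/hairpin.fa
```

Note that files with Windows line endings would have every sequence line reported, because of the trailing carriage return.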

Thank you for all your help.

Sebastien
- 2017-07-13T10:33:53+00:00
Eduardo Andres Leon
I'm pretty sure that the error on the server is bowtie related. Is bowtie also installed in the PATH? If so, I recommend using the bowtie provided by miARma. This is the only thing that explains everything you are experiencing.

Edu
- 2017-07-13T10:37:15+00:00
Sebastien reporter
Oh yeah, it is!!!!! How did I not think of that before! How can I tell miARma to use its own bowtie?
- 2017-07-13T10:51:04+00:00
Eduardo Andres Leon
Umm, this has to be done the other way around: you have to prevent your own bowtie from being used. One way is to modify the PATH in the terminal where miARma is going to be executed. For instance, first print your PATH:
```
echo $PATH
```
In my case I have this: /usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/lib64/openmpi/bin:/home/eandres/bowtie1/bin/

So I type the following (removing the path where bowtie is installed):
```
export PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/lib64/openmpi/bin
```
And that's all
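The same idea as a more general sketch, assuming a POSIX-like shell: drop every PATH directory that contains an executable with a given name. `path_without` is a hypothetical helper, not part of miARma. Be careful if the tool lives in a shared directory such as /usr/bin, because the whole directory would be dropped.

```shell
# Print the current PATH with every directory removed that holds an
# executable file named $1. The body runs in a subshell, so the changes
# to IFS and globbing do not leak into the calling shell.
path_without() (
    tool=$1
    new=
    set -f          # no filename expansion while splitting PATH
    IFS=:
    for dir in $PATH; do
        if [ -f "$dir/$tool" ] && [ -x "$dir/$tool" ]; then
            continue    # this directory provides the tool: leave it out
        fi
        new=${new:+$new:}$dir
    done
    printf '%s\n' "$new"
)

# Hypothetical usage, affecting only the current terminal session:
#   PATH=$(path_without bowtie); export PATH
```

Since the change only applies to the terminal where it is typed, other sessions keep the original PATH.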
- 2017-07-13T11:31:02+00:00

Sebastien reporter

Thanks for the advice, I tried this:

PATH=`echo $PATH | sed 's/\/data\/software\/bowtie2-2.2.9\/:\/data\/software\/bowtie-1.2\/://'`

I'll see if it works now.

2017-07-13T11:54:45+00:00

Sebastien reporter
It seems to run correctly. I'll have the results on Monday; tomorrow is a public holiday here.

Thank you for all your help.

Sebastien
- 2017-07-13T13:02:08+00:00
Sebastien reporter
Hi Eduardo, I have my results, and everything ran well once I removed bowtie from the PATH. Thank you for all your help!

Sebastien
- 2017-07-17T06:11:18+00:00
Eduardo Andres Leon
Yes !!! We did it. I hope other users find this thread useful

Edu
- 2017-07-19T06:34:29+00:00
Eduardo Andres Leon
- changed status to resolved
- 2017-07-19T06:34:36+00:00

Assignee: –

Type: bug

Priority: major

Status: resolved
