how to run miARmaSeq with bam file
Hello,
I am trying miARma, but the data I get is sorted bam file, could I jump to the readcount part directly? Indeed, when I try to set the file direction to the file that contain bam files, the software always said bam files not found.
Kyra
Comments (9)
-
-
reporter Hi Eduardo, here is what I plan to do.
;General parameters [General] ; type of analysis (miRNA, mRNA or circRNA) type=miRNA ; Folder for miRNA reads (only *.bam files inside) read_dir=bam/ ; label for the analsysis label=Hypoxia ; Folder where miARma has been instaled miARmaPath=. ; Folder to store results output_dir=results/ ; organism used organism=mouse ; Number of process to run at the same time threads=4 ; Whether the data is from a strand-specific assay (yes, no or reverse, yes by default) for featureCounts analysis strand=no stats_file=results//miARma_stat.43178.log logfile=results//miARma_logfile.43178.log [ReadCount] #GFF file used to calculate the number of reads in featureCounts analysis database=Mus_musculus_miRNA.gtf ;GFF attribute to be used as feature ID (default: gene_id) for featureCounts analysis seqid=transcript_id ;Feature type (3rd column in GFF file) to be used, all features of other type are ignored (default:exon) for featureCounts analysis featuretype= [DEAnalysis] ; Complete path of the target file. targetfile=targets.txt ; Path of the contrast file. contrastfile=contrast.txt #This value refers to filter processing in the reads (Should be "yes" or "no"). filter=yes ;Specific software to perform the Differential Expression Analysis (Allowed values: edger, noiseq or edger-noiseq) desoft=EdgeR ;Provide a file with normalized reads cpm=yes ;Provide a file with RPKM values rpkm=yes
Indeed I also fail to use the aligner, always said the reference could not be open, but the reference have no problem in other programe, such as tophat2.
-
Hi Kyra, The way miARma works is by creating folder to store results inside. As you have your own bam files, you have to create a folder where miARma can search inside. So inside results/ you have to create a folder called Bowtie2_results. (you can creare a link among bam/ and results/Bowtie2_results/ . Then in order to avoid future problems, I encourage you to rename your bam files including _cut_bw2 at the end. So if you have a file called WTRep1.bam, this should be renamed to WTRep1_cut_bw2.bam
Once this is done, you can use the ReadCount module. Remember that both folder must exists: bam and results/Bowtie2_results
Regarding the index. Tophat2 uses bowtie1 and bowtie2 so you have to take care of what index are you using related with the aligner in tophat
Let me know
Eduardo
-
reporter Hi Eduardo
I was able to do the readcount, but then fail in the DEAnaltsis
[Thu Nov 10 21:42:39 2016] Readcount Analysis finished. [Thu Nov 10 21:42:39 2016] Starting a differential expression analysis using EdgeR software(s) Can't use string ("停止执行") as an ARRAY ref while "strict refs" in use at lib/CbBio/RNASeq/DEAnalysis.pm line 1061.
The system language is Chinese. string ("停止执行") = string("stop execution")
About the index, I wrote like this
[Aligner] ; Aligner (Bowtie1, Bowtie2, BWA or miRDeep) aligner=Bowtie2 ; Bowtie 2 index bowtie2index=/Users/labxi/Desktop/RNA seq/mm10Bowtie2Index/genome
In the index file, include the genome.fa and genome.1.bt2, genome.2.bt2....
-
Hi Kyra, Regarding the DEAnalysis, I need some information: * perl version * R version * bioconductor version
Then I'll need the miARma_stat.43178.log, miARma_logfile.43178.log, contrast.txt and targets.txt
And about the bowtie2index, try to remove the space, unix machines don't like spaces in path : /Users/labxi/Desktop/RNAseq/mm10Bowtie2Index/genome
Regards
-
reporter Hi Eduardo,
I solve the DEAnalysis problem, is the system language problem. I just need to change the system language to English. Then another failure came out.
[Fri Nov 11 19:17:57 2016] Starting a differential expression analysis using EdgeR software(s) Problem while running this R command: print(resultsfiles) Error: print(resultsfiles) : object 'resultsfiles' not found Execution halted
The logfile is like this
SEQCOUNT :: [Fri Nov 11 19:17:57 2016] Please check the folder: results//Readcount_results/. QC_EdgeR :: [Fri Nov 11 19:17:58 2016] Executing source("./lib/CbBio/RNASeq/R-Scripts/QC_EdgeR.R") setwd("/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/Readcount_results") resultsfiles<-QC_EdgeR(projectdir="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results",dir="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/Readcount_results", file="cut_bw2-ReadCount.tab", targetfile="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/targets.txt", label="DKO_cut_bw2", filter="yes", cpmvalue=2, repthreshold=1, normethod="TMM") QC_EdgeR :: [Fri Nov 11 19:18:03 2016] The file /Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/DKO_cut_bw2_QC_Report.pdf has been generated DE_EDGER :: [Fri Nov 11 19:18:03 2016] Executing source("./lib/CbBio/RNASeq/R-Scripts/DE_EdgeR.R") setwd("/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/Readcount_results") resultsfiles<-DE_EdgeR(projectdir="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/EdgeR_results",dir="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/Readcount_results", file="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/Readcount_results/cut_bw2-ReadCount.tab", targetfile="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/targets.txt", label="DKO_cut_bw2", contrastfile="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/contrast.txt", filter="yes", cpmvalue=2, repthreshold=1, normethod="TMM", replicates="yes", rpkm="yes", file_size="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/Readcount_results/cut_bw2-Size.tab", cpm="yes")
Stat.log have nothing.
The perl version is v5.18.2, R is 3.3.1 bioconductor is 3.3 targets.tex
Filename Name Condition 1-1 Control_1 WT 1-2 Control_2 WT 1-3 Control_3 WT 2-1 DKO_1 Bf-Mf-lyzcre 2-2 DKO_2 Bf-Mf-lyzcre 2-3 DKO_3 Bf-Mf-lyzcre
contrast.txt
"Name" "Compared=WT-Bf-Mf0lyzcre“
Indeed, I get some result, but I think that is not I really want. My DEAnalysis part is like this, what should I add would give me a file contain p-value and tell me if the gene have significant change.
[DEAnalysis] ; Complete path of the target file. targetfile=targets.txt ; Path of the contrast file. contrastfile=contrast.txt #This value refers to filter processing in the reads (Should be "yes" or "no"). filter=yes ;Specific software to perform the Differential Expression Analysis (Allowed values: edger, noiseq or edger-noiseq) desoft=EdgeR ;Provide a file with normalized reads cpm=yes ;Provide a file with RPKM values rpkm=yes ;Cutoff for the counts per million value to be used in filter processing with EdgeR cpmvalue=2 ;Number of replicates that have to contains at least a defined number of reads per million to perform the filter with EdgeR software repthreshold=1 ;Normalization method to perform the DE analysis with EdgeR. edger_normethod=TMM ;Providing replicates replicates=yes
Thank you for your help!
-
Hi, The result you want (with pvalues and FDR) should be inside EdgeR_results folder, but first the DEAnalysis must end correctly, so I see two errors in your setup:
First. R do not support the minus symbol in a variable name, so it interprets Bf-Mf-lyzcre as Bf minus Mf minus lyzcre. So you should rename this to Bf_Mf_lyzcre
Second I see in your targets two different condition: WT Bf-Mf-lyzcre
But in the contrast you use WT and Bf-Mf0lyzcre with is different from Bf-Mf-lyzcre
May be your problem is related to this. Then as a suggestion if WT is your "control", the comparison should be "Compared=Bf_Mf_lyzcre-WT“
Let me know
Edu
-
reporter Hi Edu,
Thank you for your help, I get my result! And thank you for making this brilliant program. It saves me so much time.
-
- changed status to resolved
- Log in to comment
Dear Kyra, Thank you for using miARma. If you start from a bam file, you can use miARma as well. In order to give you the detailed steps, could you send me (or paste) the ini file you are using ?
Kind regards