how to run miARmaSeq with bam file

Issue #65 resolved

Kyra Yang created an issue 2016-11-09

Hello,

I am trying miARma, but the data I get is sorted bam file, could I jump to the readcount part directly? Indeed, when I try to set the file direction to the file that contain bam files, the software always said bam files not found.

Kyra

Comments (9)

Eduardo Andres Leon
Dear Kyra, Thank you for using miARma. If you start from a bam file, you can use miARma as well. In order to give you the detailed steps, could you send me (or paste) the ini file you are using ?

Kind regards
- 2016-11-09T14:42:05+00:00

Kyra Yang reporter

Hi Eduardo, here is what I plan to do.

;General parameters
[General]
; type of analysis (miRNA, mRNA or circRNA)
type=miRNA
; Folder for miRNA reads (only *.bam files inside)
read_dir=bam/
; label for the analsysis
label=Hypoxia
; Folder where miARma has been instaled
miARmaPath=.
; Folder to store results
output_dir=results/
; organism used
organism=mouse
; Number of process to run at the same time
threads=4
; Whether the data is from a strand-specific assay (yes, no or reverse, yes by default) for featureCounts analysis
strand=no
stats_file=results//miARma_stat.43178.log
logfile=results//miARma_logfile.43178.log

[ReadCount]
#GFF file used to calculate the number of reads in featureCounts analysis
database=Mus_musculus_miRNA.gtf
;GFF attribute to be used as feature ID (default: gene_id) for featureCounts analysis
seqid=transcript_id
;Feature type (3rd column in GFF file) to be used, all features of other type are ignored (default:exon) for featureCounts analysis
featuretype=

[DEAnalysis]
; Complete path of the target file.
targetfile=targets.txt
; Path of the contrast file.
contrastfile=contrast.txt
#This value refers to filter processing in the reads (Should be "yes" or "no").
filter=yes
;Specific software to perform the Differential Expression Analysis (Allowed values: edger, noiseq or edger-noiseq)
desoft=EdgeR
;Provide a file with normalized reads
cpm=yes
;Provide a file with RPKM values
rpkm=yes

Indeed I also fail to use the aligner, always said the reference could not be open, but the reference have no problem in other programe, such as tophat2.

2016-11-10T06:19:02+00:00

Eduardo Andres Leon
Hi Kyra, The way miARma works is by creating folder to store results inside. As you have your own bam files, you have to create a folder where miARma can search inside. So inside results/ you have to create a folder called Bowtie2_results. (you can creare a link among bam/ and results/Bowtie2_results/ . Then in order to avoid future problems, I encourage you to rename your bam files including _cut_bw2 at the end. So if you have a file called WTRep1.bam, this should be renamed to WTRep1_cut_bw2.bam

Once this is done, you can use the ReadCount module. Remember that both folder must exists: bam and results/Bowtie2_results

Regarding the index. Tophat2 uses bowtie1 and bowtie2 so you have to take care of what index are you using related with the aligner in tophat

Let me know

Eduardo
- 2016-11-10T07:19:42+00:00

Kyra Yang reporter

Hi Eduardo

I was able to do the readcount, but then fail in the DEAnaltsis

[Thu Nov 10 21:42:39 2016] Readcount Analysis finished.
[Thu Nov 10 21:42:39 2016] Starting a differential expression analysis using EdgeR software(s)
Can't use string ("停止执行") as an ARRAY ref while "strict refs" in use at lib/CbBio/RNASeq/DEAnalysis.pm line 1061.

The system language is Chinese. string ("停止执行") = string("stop execution")

About the index, I wrote like this

[Aligner]
; Aligner (Bowtie1, Bowtie2, BWA or miRDeep)
aligner=Bowtie2
; Bowtie 2 index
bowtie2index=/Users/labxi/Desktop/RNA seq/mm10Bowtie2Index/genome

In the index file, include the genome.fa and genome.1.bt2, genome.2.bt2....

2016-11-10T14:04:49+00:00

Eduardo Andres Leon
Hi Kyra, Regarding the DEAnalysis, I need some information: * perl version * R version * bioconductor version

Then I'll need the miARma_stat.43178.log, miARma_logfile.43178.log, contrast.txt and targets.txt

And about the bowtie2index, try to remove the space, unix machines don't like spaces in path : /Users/labxi/Desktop/RNAseq/mm10Bowtie2Index/genome

Regards
- 2016-11-11T07:18:15+00:00

Kyra Yang reporter

Hi Eduardo,

I solve the DEAnalysis problem, is the system language problem. I just need to change the system language to English. Then another failure came out.

[Fri Nov 11 19:17:57 2016] Starting a differential expression analysis using EdgeR software(s)
Problem while running this R command:
print(resultsfiles)

Error:
print(resultsfiles) : object 'resultsfiles' not found
Execution halted

The logfile is like this

SEQCOUNT :: [Fri Nov 11 19:17:57 2016] Please check the folder: results//Readcount_results/.
QC_EdgeR :: [Fri Nov 11 19:17:58 2016] Executing        source("./lib/CbBio/RNASeq/R-Scripts/QC_EdgeR.R")
        setwd("/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/Readcount_results")
        resultsfiles<-QC_EdgeR(projectdir="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results",dir="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/Readcount_results", file="cut_bw2-ReadCount.tab", targetfile="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/targets.txt", label="DKO_cut_bw2", filter="yes", cpmvalue=2, repthreshold=1, normethod="TMM")

QC_EdgeR :: [Fri Nov 11 19:18:03 2016] The file /Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/DKO_cut_bw2_QC_Report.pdf has been generated 
DE_EDGER :: [Fri Nov 11 19:18:03 2016] Executing        source("./lib/CbBio/RNASeq/R-Scripts/DE_EdgeR.R")
        setwd("/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/Readcount_results")
        resultsfiles<-DE_EdgeR(projectdir="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/EdgeR_results",dir="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/Readcount_results", file="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/Readcount_results/cut_bw2-ReadCount.tab", targetfile="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/targets.txt", label="DKO_cut_bw2", contrastfile="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/contrast.txt", filter="yes", cpmvalue=2, repthreshold=1, normethod="TMM", replicates="yes", rpkm="yes", file_size="/Users/labxi/Desktop/RNA_seq/Software/cbbio-miarma/results/Readcount_results/cut_bw2-Size.tab", cpm="yes")

Stat.log have nothing.

The perl version is v5.18.2, R is 3.3.1 bioconductor is 3.3 targets.tex

Filename    Name    Condition
1-1 Control_1   WT
1-2 Control_2   WT
1-3 Control_3   WT
2-1 DKO_1   Bf-Mf-lyzcre
2-2 DKO_2   Bf-Mf-lyzcre
2-3 DKO_3   Bf-Mf-lyzcre

contrast.txt

"Name"
"Compared=WT-Bf-Mf0lyzcre“

Indeed, I get some result, but I think that is not I really want. My DEAnalysis part is like this, what should I add would give me a file contain p-value and tell me if the gene have significant change.

[DEAnalysis]
; Complete path of the target file.
targetfile=targets.txt
; Path of the contrast file.
contrastfile=contrast.txt
#This value refers to filter processing in the reads (Should be "yes" or "no").
filter=yes
;Specific software to perform the Differential Expression Analysis (Allowed values: edger, noiseq or edger-noiseq)
desoft=EdgeR
;Provide a file with normalized reads
cpm=yes
;Provide a file with RPKM values
rpkm=yes
;Cutoff for the counts per million value to be used in filter processing with EdgeR
cpmvalue=2
;Number of replicates that have to contains at least a defined number of reads per million to perform the filter with EdgeR software
repthreshold=1
;Normalization method to perform the DE analysis with EdgeR.
edger_normethod=TMM
;Providing replicates
replicates=yes

Thank you for your help!

2016-11-11T12:39:56+00:00

Eduardo Andres Leon
Hi, The result you want (with pvalues and FDR) should be inside EdgeR_results folder, but first the DEAnalysis must end correctly, so I see two errors in your setup:

First. R do not support the minus symbol in a variable name, so it interprets Bf-Mf-lyzcre as Bf minus Mf minus lyzcre. So you should rename this to Bf_Mf_lyzcre

Second I see in your targets two different condition: WT Bf-Mf-lyzcre

But in the contrast you use WT and Bf-Mf0lyzcre with is different from Bf-Mf-lyzcre

May be your problem is related to this. Then as a suggestion if WT is your "control", the comparison should be "Compared=Bf_Mf_lyzcre-WT“

Let me know

Edu
- 2016-11-11T13:00:56+00:00
Kyra Yang reporter
Hi Edu,

Thank you for your help, I get my result! And thank you for making this brilliant program. It saves me so much time.
- 2016-11-11T14:48:47+00:00
Eduardo Andres Leon
- changed status to resolved
- 2017-03-02T17:26:27+00:00
Log in to comment

Assignee: –

Type: proposal

Priority: major

Status: resolved

Votes: 0

Watchers: 1