A problem with ROSE execution (using the example data set provided with ROSE)
Dear ‘ROSE’ developers,
Thank you very much for developing ‘ROSE’! We are very interested in using your software to discover super-enhancers in our system, and we would be very grateful if you could advise us on what we should change in our procedure in order to complete a successful ‘ROSE’ run.
We recently downloaded the current ROSE package from your website: https://bitbucket.org/young_computation/rose/downloads (the download gave us the file young_computation-rose-15819b1ee94a.zip, which we then extracted). We also downloaded the example library that goes along with ‘ROSE’, which you deposited under ‘example data for ROSE’: http://younglab.wi.mit.edu/super_enhancer_code.html
We have made several attempts to execute ‘ROSE’ using the command line provided in the example.sh file (a file contained in the ROSE_DATA folder). This is the command line we used:
python ROSE_main.py -g HG18 -i ./data/HG18_MM1S_MED1.gff -r ./data/MM1S_MED1.hg18.bwt.sorted.bam -c ./data/MM1S_WCE.hg18.bwt.sorted.bam -o example -s 12500 -t 2500
‘ROSE’ appears to start normally: it creates the ‘example’ output folder, which contains two subfolders, mappedGFF and gff. The gff folder is populated with two output files, HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL.gff and HG18_MM1S_MED1.gff. However, according to the output printed on the screen, the program then produces 4 error messages, each reading "TypeError: float() argument must be a string or a number". It continues until it reaches the line: WAITING FOR MAPPING TO COMPLETE. ELAPSED TIME (MIN): 0 At this point the program appears to hang for many minutes without any change.
So far we have tried executing ROSE on Linux, Cygwin, and Mac (in Terminal), but in all cases the output contained the 4 error messages: "TypeError: float() argument must be a string or a number"
Please advise us on what we should do in order to run ROSE successfully with the example files provided under ‘example data for ROSE’. Is the message "TypeError: float() argument must be a string or a number" important, or should we ignore it? Why does the program seem to hang after reaching the line: WAITING FOR MAPPING TO COMPLETE. ELAPSED TIME (MIN)? How long should ‘ROSE’ take to complete the analysis of the ‘example data’? On our Unix server the program hung for more than 60 minutes without any change.
Thanks a lot in advance!
Roy
Here are the output lines that we get after executing the command line from the example.sh file that you provided:
Macintosh-5:ROSE royblum$ python ROSE_main.py -g HG18 -i ./data/HG18_MM1S_MED1.gff -r ./data/MM1S_MED1.hg18.bwt.sorted.bam -c ./data/MM1S_WCE.hg18.bwt.sorted.bam -o example -s 12500 -t 2500
folder example/ does not exist
folder example/gff/ does not exist
folder example/mappedGFF/ does not exist
USING ./data/HG18_MM1S_MED1.gff AS THE INPUT GFF
USING HG18 AS THE GENOME
MAKING START DICT
LOADING IN GFF REGIONS
STITCHING REGIONS TOGETHER
PERFORMING REGION STITCHING
REMOVED 7865 LOCI BECAUSE THEY WERE CONTAINED BY A TSS
REMOVED 16 STITCHED LOCI BECAUSE THEY OVERLAPPED MULTIPLE TSSs
ADDED BACK 42 ORIGINAL LOCI
MAKING GFF FROM STITCHED COLLECTION
WRITING STITCHED GFF TO DISK AS example/gff/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL.gff
OUTPUT WILL BE WRITTEN TO example/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL_ENHANCER_REGION_MAP.txt
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b ./data/MM1S_MED1.hg18.bwt.sorted.bam -i example/gff/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL.gff -o example/mappedGFF/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL_MM1S_MED1.hg18.bwt.sorted.bam_MAPPED.gff &
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b ./data/MM1S_MED1.hg18.bwt.sorted.bam -i ./data/HG18_MM1S_MED1.gff -o example/mappedGFF/HG18_MM1S_MED1_MM1S_MED1.hg18.bwt.sorted.bam_MAPPED.gff &
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b ./data/MM1S_WCE.hg18.bwt.sorted.bam -i example/gff/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL.gff -o example/mappedGFF/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL_MM1S_WCE.hg18.bwt.sorted.bam_MAPPED.gff &
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b ./data/MM1S_WCE.hg18.bwt.sorted.bam -i ./data/HG18_MM1S_MED1.gff -o example/mappedGFF/HG18_MM1S_MED1_MM1S_WCE.hg18.bwt.sorted.bam_MAPPED.gff &
PAUSING TO MAP
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'example/mappedGFF/HG18_MM1S_MED1_MM1S_WCE.hg18.bwt.sorted.bam_MAPPED.gff', 'bam': './data/MM1S_WCE.hg18.bwt.sorted.bam', 'rpm': True, 'input': './data/HG18_MM1S_MED1.gff'}
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'example/mappedGFF/HG18_MM1S_MED1_MM1S_MED1.hg18.bwt.sorted.bam_MAPPED.gff', 'bam': './data/MM1S_MED1.hg18.bwt.sorted.bam', 'rpm': True, 'input': './data/HG18_MM1S_MED1.gff'}
[]
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'example/mappedGFF/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL_MM1S_MED1.hg18.bwt.sorted.bam_MAPPED.gff', 'bam': './data/MM1S_MED1.hg18.bwt.sorted.bam', 'rpm': True, 'input': 'example/gff/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL.gff'}
[]
[]
mapping to GFF and making a matrix with fixed bin number
mapping to GFF and making a matrix with fixed bin number
mapping to GFF and making a matrix with fixed bin number
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'example/mappedGFF/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL_MM1S_WCE.hg18.bwt.sorted.bam_MAPPED.gff', 'bam': './data/MM1S_WCE.hg18.bwt.sorted.bam', 'rpm': True, 'input': 'example/gff/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL.gff'}
[]
mapping to GFF and making a matrix with fixed bin number
Traceback (most recent call last):
  File "ROSE_bamToGFF.py", line 247, in <module>
Traceback (most recent call last):
    main()
  File "ROSE_bamToGFF.py", line 247, in <module>
  File "ROSE_bamToGFF.py", line 238, in main
    newGFF = mapBamToGFF(bamFile,gffFile,options.sense,int(options.extension),options.floor,options.rpm,options.matrix)
  File "ROSE_bamToGFF.py", line 40, in mapBamToGFF
    MMR= round(float(bam.getTotalReads('mapped'))/1000000,4)
TypeError: float() argument must be a string or a number
    main()
  File "ROSE_bamToGFF.py", line 238, in main
    newGFF = mapBamToGFF(bamFile,gffFile,options.sense,int(options.extension),options.floor,options.rpm,options.matrix)
  File "ROSE_bamToGFF.py", line 40, in mapBamToGFF
    MMR= round(float(bam.getTotalReads('mapped'))/1000000,4)
TypeError: float() argument must be a string or a number
Traceback (most recent call last):
  File "ROSE_bamToGFF.py", line 247, in <module>
    main()
  File "ROSE_bamToGFF.py", line 238, in main
    newGFF = mapBamToGFF(bamFile,gffFile,options.sense,int(options.extension),options.floor,options.rpm,options.matrix)
  File "ROSE_bamToGFF.py", line 40, in mapBamToGFF
    MMR= round(float(bam.getTotalReads('mapped'))/1000000,4)
TypeError: float() argument must be a string or a number
Traceback (most recent call last):
  File "ROSE_bamToGFF.py", line 247, in <module>
    main()
  File "ROSE_bamToGFF.py", line 238, in main
    newGFF = mapBamToGFF(bamFile,gffFile,options.sense,int(options.extension),options.floor,options.rpm,options.matrix)
  File "ROSE_bamToGFF.py", line 40, in mapBamToGFF
    MMR= round(float(bam.getTotalReads('mapped'))/1000000,4)
TypeError: float() argument must be a string or a number
WAITING FOR MAPPING TO COMPLETE. ELAPSED TIME (MIN): 0
When we let the program wait, it just keeps counting up the minutes, and the lower part of the output ends up looking like this:
WAITING FOR MAPPING TO COMPLETE. ELAPSED TIME (MIN): 0 30 60 90
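For reference, the line that fails in the traceback is `MMR= round(float(bam.getTotalReads('mapped'))/1000000,4)`. Python raises this kind of TypeError whenever `float()` is handed `None`, so one plausible reading (an assumption on our side, not something we confirmed in the ROSE source) is that `getTotalReads` returned nothing. A minimal sketch of the failure and of a guard that would report it more clearly; `mapped_reads_in_millions` is a hypothetical helper, not part of ROSE:

```python
def mapped_reads_in_millions(total_reads):
    """Convert a mapped-read count to millions, mirroring line 40 of the traceback.

    total_reads stands in for the value of bam.getTotalReads('mapped').
    Assumption: that value can be None when the read count could not be obtained.
    """
    if total_reads is None:
        # Fail with an explicit message instead of the bare float() TypeError.
        raise ValueError("mapped-read count is missing; could the BAM file be read?")
    return round(float(total_reads) / 1000000, 4)


# float(None) raises the same class of error that appears in the traceback.
try:
    float(None)
    reproduced = False
except TypeError:
    reproduced = True

print("TypeError reproduced:", reproduced)
print(mapped_reads_in_millions("2500000"))
```

If that reading is right, the error is not harmless: every mapping job stops at this line, so the main script would be waiting for output files that are never written.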
Comments (3)
Hi Roy,
This looks to be the standard error output reporting the progress of bamToGFF. This is a counter to show how long the script has been running. If your output is created successfully, you don't need to worry about this.
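To illustrate why a counter can keep printing after the prompt comes back, here is a rough Python sketch. It assumes the numbers come from mapping jobs that were started in the background (the logged commands end in `&`) and are still running; `run_and_wait` and the stand-in jobs are hypothetical, not ROSE code. Waiting on each child process means nothing is left to print once control returns, and the exit codes show whether any job crashed:

```python
import subprocess
import sys


def run_and_wait(commands):
    """Start every command at once, then block until all of them have exited.

    Returns the exit codes in the same order as the commands.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


# Hypothetical stand-ins for the four ROSE_bamToGFF.py jobs: three succeed,
# one exits with a non-zero status the way a job ending in a traceback would.
jobs = [[sys.executable, "-c", "pass"] for _ in range(3)]
jobs.append([sys.executable, "-c", "raise SystemExit(3)"])

exit_codes = run_and_wait(jobs)
print(exit_codes)
```

A job started with `&` is detached from the prompt, so pressing Control-C at the shell would not necessarily stop it, which would match the counter coming back after a couple of seconds.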
- changed status to resolved
Dear ‘Rose’ developers,
I would like to let you know that I solved my problem after reading an earlier reply (from ‘chaplain’) on your site: https://bitbucket.org/young_computation/rose/issue/2/error-list-index-out-of-range
I simply added samtools-0.1.19 to my Terminal PATH, and the ROSE execution now seems to work fine. The program now produces all the output files, including the .png figure file generated by R!
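In case it helps others, this is a small check that can be run before launching ROSE. It is only a sketch: it assumes Python 3.3 or later for `shutil.which` (so it is meant to be run on its own, separately from ROSE), and `missing_tools` is a name I made up. The tool list comes from this thread: samtools from the fix above, and R because the ROSE output shows it calling `R --no-save`.

```python
import shutil


def missing_tools(tools, path=None):
    """Return the names in `tools` that cannot be found on the search path.

    With path=None, shutil.which searches the PATH of the current environment.
    """
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


# External programs that ROSE appears to rely on, judging from this thread.
required = ["samtools", "R"]

not_found = missing_tools(required)
if not_found:
    print("Not on PATH: " + ", ".join(not_found))
else:
    print("All required tools were found on PATH.")
```

If samtools is reported as missing, adding its folder to PATH in the same Terminal session that runs ROSE was enough in my case.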
However, something strange is still happening, and I hope that you can advise me on what the reason could be and how I could solve it:
Here is the output from my ROSE execution. As you will see, the program runs and at some point seems to complete the analysis, as it releases the command line back to the user. Nevertheless, a few seconds after that point, some counting appears to continue (!!), without any further command from me. The post-analysis section is the part that follows the last command prompt at the end of the output below, so that you can identify it more easily. This counting seems to continue endlessly and I am wondering why. It even continues after I press Control-C again to get the command line back: I get the command line for maybe two seconds, but then the counting just continues!
Would you be able to provide an explanation please? I would like to be sure that ROSE is running well on my computer.
Thanks a lot in advance!
Roy
Here is the output:
Macintosh-5:ROSE royblum$ python ROSE_main.py -g HG18 -i HG18_MM1S_MED1_1000.gff -r MM1S_MED1.hg18.bwt.sorted.bam -c MM1S_WCE.hg18.bwt.sorted.bam -o example2 -s 12500 -t 2500
USING HG18_MM1S_MED1_1000.gff AS THE INPUT GFF
USING HG18 AS THE GENOME
MAKING START DICT
LOADING IN GFF REGIONS
STITCHING REGIONS TOGETHER
PERFORMING REGION STITCHING
REMOVED 405 LOCI BECAUSE THEY WERE CONTAINED BY A TSS
REMOVED 1 STITCHED LOCI BECAUSE THEY OVERLAPPED MULTIPLE TSSs
ADDED BACK 1 ORIGINAL LOCI
MAKING GFF FROM STITCHED COLLECTION
WRITING STITCHED GFF TO DISK AS example2/gff/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL.gff
OUTPUT WILL BE WRITTEN TO example2/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL_ENHANCER_REGION_MAP.txt
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b MM1S_MED1.hg18.bwt.sorted.bam -i example2/gff/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL.gff -o example2/mappedGFF/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL_MM1S_MED1.hg18.bwt.sorted.bam_MAPPED.gff &
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b MM1S_MED1.hg18.bwt.sorted.bam -i HG18_MM1S_MED1_1000.gff -o example2/mappedGFF/HG18_MM1S_MED1_1000_MM1S_MED1.hg18.bwt.sorted.bam_MAPPED.gff &
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b MM1S_WCE.hg18.bwt.sorted.bam -i example2/gff/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL.gff -o example2/mappedGFF/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL_MM1S_WCE.hg18.bwt.sorted.bam_MAPPED.gff &
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b MM1S_WCE.hg18.bwt.sorted.bam -i HG18_MM1S_MED1_1000.gff -o example2/mappedGFF/HG18_MM1S_MED1_1000_MM1S_WCE.hg18.bwt.sorted.bam_MAPPED.gff &
PAUSING TO MAP
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'example2/mappedGFF/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL_MM1S_WCE.hg18.bwt.sorted.bam_MAPPED.gff', 'bam': 'MM1S_WCE.hg18.bwt.sorted.bam', 'rpm': True, 'input': 'example2/gff/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL.gff'}
[]
mapping to GFF and making a matrix with fixed bin number
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'example2/mappedGFF/HG18_MM1S_MED1_1000_MM1S_MED1.hg18.bwt.sorted.bam_MAPPED.gff', 'bam': 'MM1S_MED1.hg18.bwt.sorted.bam', 'rpm': True, 'input': 'HG18_MM1S_MED1_1000.gff'}
[]
mapping to GFF and making a matrix with fixed bin number
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'example2/mappedGFF/HG18_MM1S_MED1_1000_MM1S_WCE.hg18.bwt.sorted.bam_MAPPED.gff', 'bam': 'MM1S_WCE.hg18.bwt.sorted.bam', 'rpm': True, 'input': 'HG18_MM1S_MED1_1000.gff'}
[]
mapping to GFF and making a matrix with fixed bin number
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'example2/mappedGFF/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL_MM1S_MED1.hg18.bwt.sorted.bam_MAPPED.gff', 'bam': 'MM1S_MED1.hg18.bwt.sorted.bam', 'rpm': True, 'input': 'example2/gff/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL.gff'}
[]
mapping to GFF and making a matrix with fixed bin number
WAITING FOR MAPPING TO COMPLETE. ELAPSED TIME (MIN): 0
using a MMR value of 17.4141
using a MMR value of 17.4141
using a MMR value of 20.5167
using a MMR value of 20.5167
using a MMR value of 20.5167
using a MMR value of 20.5167
using a MMR value of 17.4141
using a MMR value of 17.4141
has chr Number lines processed 0
has chr Number lines processed 0
has chr Number lines processed 0
has chr Number lines processed 0
has chr Number lines processed 0
has chr Number lines processed 0
has chr Number lines processed 0
has chr Number lines processed 0
100 100 100 100 100 100 200 100 100 200 200 200 200 200 200 300 300 300 200 300 300 300 400 300 400 400 300 400 400 400 500 500 500 400 400 500 500 500 600 600 500 500 600 600 600 700 700 700 600 700 700 800 800 800 900 900 700 800 800 900 1000 900 900 800 1100 1000 1000 900 1200 1100 1100 1000 1300 1200 1200
MAPPING TOOK 25 MINUTES
BAM MAPPING COMPLETED NOW MAPPING DATA TO REGIONS
FORMATTING TABLE
GETTING MAPPED DATA
GETTING MAPPING DATA FOR MM1S_MED1.hg18.bwt.sorted.bam
OPENING example2/mappedGFF/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL_MM1S_MED1.hg18.bwt.sorted.bam_MAPPED.gff
MAKING SIGNAL DICT FOR MM1S_MED1.hg18.bwt.sorted.bam
GETTING MAPPING DATA FOR MM1S_WCE.hg18.bwt.sorted.bam
OPENING example2/mappedGFF/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL_MM1S_WCE.hg18.bwt.sorted.bam_MAPPED.gff
MAKING SIGNAL DICT FOR MM1S_WCE.hg18.bwt.sorted.bam
CALLING AND PLOTTING SUPER-ENHANCERS
R --no-save example2/ example2/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL_ENHANCER_REGION_MAP.txt HG18_MM1S_MED1_1000 MM1S_WCE.hg18.bwt.sorted.bam < ROSE_callSuper.R
ARGUMENT 'example2/' ignored
ARGUMENT 'example2/HG18_MM1S_MED1_1000_12KB_STITCHED_TSS_DISTAL_ENHANCER_REGION_MAP.txt' ignored
ARGUMENT 'HG18_MM1S_MED1_1000' ignored
ARGUMENT 'MM1S_WCE.hg18.bwt.sorted.bam' ignored
1400 1100
R version 3.1.0 (2014-04-10) -- "Spring Dance" Copyright (C) 2014 The R Foundation for Statistical Computing Platform: x86_64-apple-darwin13.1.0 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R.
Macintosh-5:ROSE royblum$ 5500 5700 6100 5000 6200 5600 5800 5100 6300 5700 5900 5200