Error: list index out of range

Issue #2 resolved
Choon Sim created an issue

Hi, I received the following error when I ran ROSE_main.py,

USING BAT-k27ac-1.gff AS THE INPUT GFF
USING mm9 AS THE GENOME
MAKING START DICT
LOADING IN GFF REGIONS
Traceback (most recent call last):
  File "ROSE_main.py", line 471, in <module>
    main()
  File "ROSE_main.py", line 331, in main
    referenceCollection = ROSE_utils.gffToLocusCollection(inputGFFFile)
  File "/home/simck1/Documents/rose/ROSE_utils.py", line 512, in gffToLocusCollection
    if len(line[2]) > 0:
IndexError: list index out of range

I use Ubuntu 12.04, Python 2.7.6, Samtools 0.1.19. Any idea how to resolve?

Comments (6)

  1. Choon Sim reporter

    I also used Python 2.7.6 on Windows 7 to run the same commands with the same files. There, the stitched GFF file was created, but the following error then appeared:

    C:\K27ac\Rose>python ROSE_main.py -g mm9 -i BAT-k27ac-1.gff -r bat1_k27ac.sorted.bam -o results -t 2500 -c bat1_input.sorted.bam
    'cp' is not recognized as an internal or external command,
    operable program or batch file.
    USING BAT-k27ac-1.gff AS THE INPUT GFF
    USING mm9 AS THE GENOME
    MAKING START DICT
    LOADING IN GFF REGIONS
    STITCHING REGIONS TOGETHER
    PERFORMING REGION STITCHING
    REMOVED 9640 LOCI BECAUSE THEY WERE CONTAINED BY A TSS
    REMOVED 42 STITCHED LOCI BECAUSE THEY OVERLAPPED MULTIPLE TSSs
    ADDED BACK 348 ORIGINAL LOCI
    MAKING GFF FROM STITCHED COLLECTION
    WRITING STITCHED GFF TO DISK AS results/gff/BAT-k27ac-1_12KB_STITCHED_TSS_DISTAL.gff
    OUTPUT WILL BE WRITTEN TO results/BAT-k27ac-1_12KB_STITCHED_TSS_DISTAL_ENHANCER_REGION_MAP.txt
    python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b bat1_k27ac.sorted.bam -i results/gff/BAT-k27ac-1_12KB_STITCHED_TSS_DISTAL.gff -o results/mappedGFF/BAT-k27ac-1_12KB_STITCHED_TSS_DISTAL_bat1_k27ac.sorted.bam_MAPPED.gff &
    {'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'results/mappedGFF/BAT-k27ac-1_12KB_STITCHED_TSS_DISTAL_bat1_k27ac.sorted.bam_MAPPED.gff', 'bam': 'bat1_k27ac.sorted.bam', 'rpm': True, 'input': 'results/gff/BAT-k27ac-1_12KB_STITCHED_TSS_DISTAL.gff'}
    []
    Traceback (most recent call last):
      File "ROSE_bamToGFF.py", line 247, in <module>
        main()
      File "ROSE_bamToGFF.py", line 196, in main
        fileList = os.listdir(pathFolder)
    WindowsError: [Error 3] The system cannot find the path specified: ''
    python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b bat1_k27ac.sorted.bam -i BAT-k27ac-1.gff -o results/mappedGFF/BAT-k27ac-1_bat1_k27ac.sorted.bam_MAPPED.gff &
    {'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'results/mappedGFF/BAT-k27ac-1_bat1_k27ac.sorted.bam_MAPPED.gff', 'bam': 'bat1_k27ac.sorted.bam', 'rpm': True, 'input': 'BAT-k27ac-1.gff'}
    []
    Traceback (most recent call last):
      File "ROSE_bamToGFF.py", line 247, in <module>
        main()
      File "ROSE_bamToGFF.py", line 196, in main
        fileList = os.listdir(pathFolder)
    WindowsError: [Error 3] The system cannot find the path specified: ''
    python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b bat1_input.sorted.bam -i results/gff/BAT-k27ac-1_12KB_STITCHED_TSS_DISTAL.gff -o results/mappedGFF/BAT-k27ac-1_12KB_STITCHED_TSS_DISTAL_bat1_input.sorted.bam_MAPPED.gff &
    {'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'results/mappedGFF/BAT-k27ac-1_12KB_STITCHED_TSS_DISTAL_bat1_input.sorted.bam_MAPPED.gff', 'bam': 'bat1_input.sorted.bam', 'rpm': True, 'input': 'results/gff/BAT-k27ac-1_12KB_STITCHED_TSS_DISTAL.gff'}
    []
    Traceback (most recent call last):
      File "ROSE_bamToGFF.py", line 247, in <module>
        main()
      File "ROSE_bamToGFF.py", line 196, in main
        fileList = os.listdir(pathFolder)
    WindowsError: [Error 3] The system cannot find the path specified: ''
    python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b bat1_input.sorted.bam -i BAT-k27ac-1.gff -o results/mappedGFF/BAT-k27ac-1_bat1_input.sorted.bam_MAPPED.gff &
    {'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'results/mappedGFF/BAT-k27ac-1_bat1_input.sorted.bam_MAPPED.gff', 'bam': 'bat1_input.sorted.bam', 'rpm': True, 'input': 'BAT-k27ac-1.gff'}
    []
    Traceback (most recent call last):
      File "ROSE_bamToGFF.py", line 247, in <module>
        main()
      File "ROSE_bamToGFF.py", line 196, in main
        fileList = os.listdir(pathFolder)
    WindowsError: [Error 3] The system cannot find the path specified: ''
    PAUSING TO MAP
    WAITING FOR MAPPING TO COMPLETE. ELAPSED TIME (MIN): 0

  2. young_computation repo owner

    Choon,

    It looks like your issues are with paths of input files and the input file formats. The first error seems to be an issue with the format of your input gff.

    The second issue seems to be a path problem: results/mappedGFF/BAT-k27ac-1_bat1_input.sorted.bam_MAPPED.gff cannot be found. Please give absolute paths rather than relative paths, e.g. /DATA_FOLDER/results/mappedGFF/BAT-k27ac-1_bat1_input.sorted.bam_MAPPED.gff
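
As an illustration of the absolute-path advice, here is a minimal Python sketch that converts relative file arguments to absolute ones before they are handed to a command. The helper `to_absolute` and the `args` dictionary are hypothetical names made up for this example (they are not part of ROSE); the file names are the ones used earlier in this thread.

```python
import os


def to_absolute(path):
    """Return an absolute, normalised version of a possibly relative path."""
    return os.path.abspath(os.path.expanduser(path))


# Hypothetical arguments, named after the files mentioned in this thread.
args = {
    "input_gff": "BAT-k27ac-1.gff",
    "rankby_bam": "bat1_k27ac.sorted.bam",
    "control_bam": "bat1_input.sorted.bam",
    "output_dir": "results",
}

# Resolve every argument against the current working directory.
absolute_args = {name: to_absolute(value) for name, value in args.items()}

for name in sorted(absolute_args):
    print("%s = %s" % (name, absolute_args[name]))
```

The printed paths depend on the directory the script is run from, so run it from the folder that actually contains the data files.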

    Please download the example data and example runs here

    http://younglab.wi.mit.edu/ROSE/ROSE_DATA.zip

    to make sure that your install of ROSE is working correctly. Also, please see our example GFFs to make sure your inputs are correctly formatted.
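
One way to look for a formatting problem of the kind suggested above is to scan the GFF for lines with too few tab-separated fields. The sketch below is only a rough check, not ROSE's own parser: the function name `find_short_gff_lines` is hypothetical, and the default of nine fields is an assumption based on the standard nine-column GFF layout. A blank line or a space-delimited line yields a single field, which could plausibly lead to an IndexError like the one on `line[2]` in the first traceback.

```python
import io


def find_short_gff_lines(path, min_fields=9):
    """Return (line_number, field_count) for each line with too few tab-separated fields."""
    problems = []
    with io.open(path, encoding="utf-8", errors="replace") as handle:
        for number, raw in enumerate(handle, start=1):
            # Strip only the line ending so that trailing empty columns are kept.
            line = raw.rstrip("\r\n")
            fields = line.split("\t")
            if len(fields) < min_fields:
                problems.append((number, len(fields)))
    return problems
```

Calling `find_short_gff_lines("BAT-k27ac-1.gff")` returns an empty list when every line has at least nine tab-separated fields; otherwise each entry gives a line number to inspect in a text editor.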

  3. Choon Sim reporter

    Hi, thanks for the example. What is the command that you run at the terminal window?

    I used:

    python ROSE_main.py -g hg18 -i /home/simck1/Documents/ROSE_DATA/HG18_MM1S_MED1.gff -r MM1S_MED1.hg18.bwt.sorted.bam -o example -t 2500 -c MM1S_WCE.hg18.bwt.sorted.bam

    and the output error is:

    USING /home/simck1/Documents/ROSE_DATA/HG18_MM1S_MED1.gff AS THE INPUT GFF
    USING hg18 AS THE GENOME
    MAKING START DICT
    LOADING IN GFF REGIONS
    STITCHING REGIONS TOGETHER
    PERFORMING REGION STITCHING
    REMOVED 7865 LOCI BECAUSE THEY WERE CONTAINED BY A TSS
    REMOVED 16 STITCHED LOCI BECAUSE THEY OVERLAPPED MULTIPLE TSSs
    ADDED BACK 42 ORIGINAL LOCI
    MAKING GFF FROM STITCHED COLLECTION
    WRITING STITCHED GFF TO DISK AS example/gff/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL.gff
    OUTPUT WILL BE WRITTEN TO example/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL_ENHANCER_REGION_MAP.txt
    python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b MM1S_MED1.hg18.bwt.sorted.bam -i example/gff/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL.gff -o example/mappedGFF/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL_MM1S_MED1.hg18.bwt.sorted.bam_MAPPED.gff &
    python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b MM1S_MED1.hg18.bwt.sorted.bam -i /home/simck1/Documents/ROSE_DATA/HG18_MM1S_MED1.gff -o example/mappedGFF/HG18_MM1S_MED1_MM1S_MED1.hg18.bwt.sorted.bam_MAPPED.gff &
    python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b MM1S_WCE.hg18.bwt.sorted.bam -i example/gff/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL.gff -o example/mappedGFF/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL_MM1S_WCE.hg18.bwt.sorted.bam_MAPPED.gff &
    python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b MM1S_WCE.hg18.bwt.sorted.bam -i /home/simck1/Documents/ROSE_DATA/HG18_MM1S_MED1.gff -o example/mappedGFF/HG18_MM1S_MED1_MM1S_WCE.hg18.bwt.sorted.bam_MAPPED.gff &
    {'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'example/mappedGFF/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL_MM1S_MED1.hg18.bwt.sorted.bam_MAPPED.gff', 'bam': 'MM1S_MED1.hg18.bwt.sorted.bam', 'rpm': True, 'input': 'example/gff/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL.gff'}
    []
    mapping to GFF and making a matrix with fixed bin number
    PAUSING TO MAP
    {'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'example/mappedGFF/HG18_MM1S_MED1_MM1S_MED1.hg18.bwt.sorted.bam_MAPPED.gff', 'bam': 'MM1S_MED1.hg18.bwt.sorted.bam', 'rpm': True, 'input': '/home/simck1/Documents/ROSE_DATA/HG18_MM1S_MED1.gff'}
    []
    mapping to GFF and making a matrix with fixed bin number
    Traceback (most recent call last):
      File "ROSE_bamToGFF.py", line 247, in <module>
        main()
      File "ROSE_bamToGFF.py", line 238, in main
        newGFF = mapBamToGFF(bamFile,gffFile,options.sense,int(options.extension),options.floor,options.rpm,options.matrix)
      File "ROSE_bamToGFF.py", line 40, in mapBamToGFF
        MMR= round(float(bam.getTotalReads('mapped'))/1000000,4)
    TypeError: float() argument must be a string or a number
    Traceback (most recent call last):
      File "ROSE_bamToGFF.py", line 247, in <module>
        main()
      File "ROSE_bamToGFF.py", line 238, in main
        newGFF = mapBamToGFF(bamFile,gffFile,options.sense,int(options.extension),options.floor,options.rpm,options.matrix)
      File "ROSE_bamToGFF.py", line 40, in mapBamToGFF
    {'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'example/mappedGFF/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL_MM1S_WCE.hg18.bwt.sorted.bam_MAPPED.gff', 'bam': 'MM1S_WCE.hg18.bwt.sorted.bam', 'rpm': True, 'input': 'example/gff/HG18_MM1S_MED1_12KB_STITCHED_TSS_DISTAL.gff'}
        MMR= round(float(bam.getTotalReads('mapped'))/1000000,4)
    []
    TypeError: float() argument must be a string or a number
    mapping to GFF and making a matrix with fixed bin number
    {'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'example/mappedGFF/HG18_MM1S_MED1_MM1S_WCE.hg18.bwt.sorted.bam_MAPPED.gff', 'bam': 'MM1S_WCE.hg18.bwt.sorted.bam', 'rpm': True, 'input': '/home/simck1/Documents/ROSE_DATA/HG18_MM1S_MED1.gff'}
    []
    mapping to GFF and making a matrix with fixed bin number
    Traceback (most recent call last):
      File "ROSE_bamToGFF.py", line 247, in <module>
        main()
      File "ROSE_bamToGFF.py", line 238, in main
        newGFF = mapBamToGFF(bamFile,gffFile,options.sense,int(options.extension),options.floor,options.rpm,options.matrix)
      File "ROSE_bamToGFF.py", line 40, in mapBamToGFF
        MMR= round(float(bam.getTotalReads('mapped'))/1000000,4)
    TypeError: float() argument must be a string or a number
    Traceback (most recent call last):
      File "ROSE_bamToGFF.py", line 247, in <module>
        main()
      File "ROSE_bamToGFF.py", line 238, in main
        newGFF = mapBamToGFF(bamFile,gffFile,options.sense,int(options.extension),options.floor,options.rpm,options.matrix)
      File "ROSE_bamToGFF.py", line 40, in mapBamToGFF
        MMR= round(float(bam.getTotalReads('mapped'))/1000000,4)
    TypeError: float() argument must be a string or a number
    WAITING FOR MAPPING TO COMPLETE. ELAPSED TIME (MIN): 0

  4. chazlin

    Hi Choon,

    The code requires samtools to be installed and in the PATH.

    That is, you should be able to run samtools directly from the command line by typing just "samtools".

    You can fix this either by adding samtools to your PATH variable or by editing the code in ROSE_utils.py to use the correct path to samtools. The relevant calls are around these lines:

    588: command = 'samtools view %s | head -n 1' % (bamFile)
    618: command = 'samtools flagstat %s' % (self._bam)
    643: command = 'samtools view %s %s' % (self._bam,locusLine)
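
A quick way to confirm that samtools is reachable before running ROSE is to search the directories on PATH for it. The sketch below does this with only the standard library, so it should run on both Python 2.7 and Python 3; `find_on_path` is a hypothetical helper written for this example, not a ROSE function. On Windows the executable normally carries an extension, so the name to look for would be "samtools.exe".

```python
import os


def find_on_path(program, search_path=None):
    """Return the full path of an executable found on PATH, or None if absent."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, program)
        # Accept only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


location = find_on_path("samtools")
if location is None:
    print("samtools not found on PATH")
else:
    print("samtools found at %s" % location)
```

If the script reports that samtools is missing, add the samtools directory to PATH (or use the full path in the commands listed above) and run it again.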

  5. Choon Sim reporter

    Hi chazlin,

    Your suggestion works wonders. After I added Samtools to the PATH and installed R, the code runs to completion without errors. I am a happy man today.
