ERROR: no associated .bai file found with bam.
Issue #46
resolved
Hi, I am trying to run ROSE on some H3K27Ac peak sets, but I get an error stating that there is no BAM index, even though the index files are present.
ls *bam*
E3381K27WT-H3K27Ac.dedupe.bam E3387K27WT-H3K27Ac.dedupe.bam ROSE_bamToGFF.py
E3381K27WT-H3K27Ac.dedupe.bam.bai E3387K27WT-H3K27Ac.dedupe.bam.bai ROSE_bamToGFF_turbo.py
E3381K27WT-Input.dedupe.bam E3387K27WT-Input.dedupe.bam
E3381K27WT-Input.dedupe.bam.bai E3387K27WT-Input.dedupe.bam.bai
The script seems to be continuing but ROSE_bamToGFF is throwing these errors.
python ROSE_main.py -i Wt_H3K27Ac_peaks_filtered_intersected.gff -r E3381K27WT-H3K27Ac.dedupe.bam -o WT_AC_ROSE -g HG38 -b E3387K27WT-H3K27Ac.dedupe.bam -c E3387K27WT_Input.dedupe.bam -t 2500
USING Wt_H3K27Ac_peaks_filtered_intersected.gff AS THE INPUT GFF
USING HG38 AS THE GENOME
MAKING START DICT
LOADING IN GFF REGIONS
CHECKING INPUT TO MAKE SURE EACH REGION HAS A UNIQUE IDENTIFIER
REFERENCE COLLECTION PASSES QC
STITCHING REGIONS TOGETHER
PERFORMING REGION STITCHING
REMOVED 8321 LOCI BECAUSE THEY WERE CONTAINED BY A TSS
REMOVED 92 STITCHED LOCI BECAUSE THEY OVERLAPPED MULTIPLE TSSs
ADDED BACK 242 ORIGINAL LOCI
MAKING GFF FROM STITCHED COLLECTION
WRITING STITCHED GFF TO DISK AS WT_AC_ROSE/gff/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL.gff
OUTPUT WILL BE WRITTEN TO WT_AC_ROSE/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL_ENHANCER_REGION_MAP.txt
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b E3381K27WT-H3K27Ac.dedupe.bam -i WT_AC_ROSE/gff/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL.gff -o WT_AC_ROSE/mappedGFF/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL_E3381K27WT-H3K27Ac.dedupe.bam_MAPPED.gff &
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b E3381K27WT-H3K27Ac.dedupe.bam -i Wt_H3K27Ac_peaks_filtered_intersected.gff -o WT_AC_ROSE/mappedGFF/Wt_H3K27Ac_peaks_filtered_intersected_E3381K27WT-H3K27Ac.dedupe.bam_MAPPED.gff &
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b E3387K27WT_Input.dedupe.bam -i WT_AC_ROSE/gff/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL.gff -o WT_AC_ROSE/mappedGFF/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL_E3387K27WT_Input.dedupe.bam_MAPPED.gff &
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'WT_AC_ROSE/mappedGFF/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL_E3381K27WT-H3K27Ac.dedupe.bam_MAPPED.gff', 'bam': 'E3381K27WT-H3K27Ac.dedupe.bam', 'rpm': True, 'input': 'WT_AC_ROSE/gff/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL.gff'}
[]
mapping to GFF and making a matrix with fixed bin number
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b E3387K27WT_Input.dedupe.bam -i Wt_H3K27Ac_peaks_filtered_intersected.gff -o WT_AC_ROSE/mappedGFF/Wt_H3K27Ac_peaks_filtered_intersected_E3387K27WT_Input.dedupe.bam_MAPPED.gff &
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'WT_AC_ROSE/mappedGFF/Wt_H3K27Ac_peaks_filtered_intersected_E3381K27WT-H3K27Ac.dedupe.bam_MAPPED.gff', 'bam': 'E3381K27WT-H3K27Ac.dedupe.bam', 'rpm': True, 'input': 'Wt_H3K27Ac_peaks_filtered_intersected.gff'}
[]
mapping to GFF and making a matrix with fixed bin number
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b E3387K27WT-H3K27Ac.dedupe.bam -i WT_AC_ROSE/gff/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL.gff -o WT_AC_ROSE/mappedGFF/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL_E3387K27WT-H3K27Ac.dedupe.bam_MAPPED.gff &
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'WT_AC_ROSE/mappedGFF/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL_E3387K27WT_Input.dedupe.bam_MAPPED.gff', 'bam': 'E3387K27WT_Input.dedupe.bam', 'rpm': True, 'input': 'WT_AC_ROSE/gff/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL.gff'}
[]
ERROR: no associated .bai file found with bam. Must use a sorted bam with accompanying index file
Usage: ROSE_bamToGFF.py [options] -b [SORTED BAMFILE] -i [INPUTFILE] -o [OUTPUTFILE]
Options:
-h, --help show this help message and exit
-b BAM, --bam=BAM Enter .bam file to be processed.
-i INPUT, --input=INPUT
Enter .gff or ENRICHED REGION file to be processed.
-o OUTPUT, --output=OUTPUT
Enter the output filename.
-s SENSE, --sense=SENSE
Map to '+','-' or 'both' strands. Default maps to
both.
-f FLOOR, --floor=FLOOR
Sets a read floor threshold necessary to count towards
density
-e EXTENSION, --extension=EXTENSION
Extends reads by n bp. Default value is 200bp
-r, --rpm Normalizes density to reads per million (rpm)
-m MATRIX, --matrix=MATRIX
Outputs a variable bin sized matrix. User must specify
number of bins.
python ROSE_bamToGFF.py -f 1 -e 200 -r -m 1 -b E3387K27WT-H3K27Ac.dedupe.bam -i Wt_H3K27Ac_peaks_filtered_intersected.gff -o WT_AC_ROSE/mappedGFF/Wt_H3K27Ac_peaks_filtered_intersected_E3387K27WT-H3K27Ac.dedupe.bam_MAPPED.gff &
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'WT_AC_ROSE/mappedGFF/Wt_H3K27Ac_peaks_filtered_intersected_E3387K27WT_Input.dedupe.bam_MAPPED.gff', 'bam': 'E3387K27WT_Input.dedupe.bam', 'rpm': True, 'input': 'Wt_H3K27Ac_peaks_filtered_intersected.gff'}
[]
ERROR: no associated .bai file found with bam. Must use a sorted bam with accompanying index file
Usage: ROSE_bamToGFF.py [options] -b [SORTED BAMFILE] -i [INPUTFILE] -o [OUTPUTFILE]
Options:
-h, --help show this help message and exit
-b BAM, --bam=BAM Enter .bam file to be processed.
-i INPUT, --input=INPUT
Enter .gff or ENRICHED REGION file to be processed.
-o OUTPUT, --output=OUTPUT
Enter the output filename.
-s SENSE, --sense=SENSE
Map to '+','-' or 'both' strands. Default maps to
both.
-f FLOOR, --floor=FLOOR
Sets a read floor threshold necessary to count towards
density
-e EXTENSION, --extension=EXTENSION
Extends reads by n bp. Default value is 200bp
-r, --rpm Normalizes density to reads per million (rpm)
-m MATRIX, --matrix=MATRIX
Outputs a variable bin sized matrix. User must specify
number of bins.
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'WT_AC_ROSE/mappedGFF/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL_E3387K27WT-H3K27Ac.dedupe.bam_MAPPED.gff', 'bam': 'E3387K27WT-H3K27Ac.dedupe.bam', 'rpm': True, 'input': 'WT_AC_ROSE/gff/Wt_H3K27Ac_peaks_filtered_intersected_12KB_STITCHED_TSS_DISTAL.gff'}
[]
mapping to GFF and making a matrix with fixed bin number
PAUSING TO MAP
{'matrix': '1', 'extension': '200', 'floor': '1', 'sense': 'both', 'output': 'WT_AC_ROSE/mappedGFF/Wt_H3K27Ac_peaks_filtered_intersected_E3387K27WT-H3K27Ac.dedupe.bam_MAPPED.gff', 'bam': 'E3387K27WT-H3K27Ac.dedupe.bam', 'rpm': True, 'input': 'Wt_H3K27Ac_peaks_filtered_intersected.gff'}
[]
mapping to GFF and making a matrix with fixed bin number
using a MMR value of 10.1709
has chr
Number lines processed
0
using a MMR value of 10.1709
has chr
Number lines processed
0
using a MMR value of 11.1455
has chr
Number lines processed
0
using a MMR value of 11.1455
has chr
Number lines processed
0
WAITING FOR MAPPING TO COMPLETE. ELAPSED TIME (MIN):
Comments (2)
- changed status to resolved
I solved this issue by putting the files in the same directory structure as the example data and giving the BAM file paths with a leading ./
For example:
python ROSE_main.py -g HG38 -i ./data/Wt_H3K27Ac_peaks_filtered_intersected.gff -r ./data/E3381K27WT-H3K27Ac.dedupe.sorted.bam -c ./data/E3381K27WT-Input.dedupe.sorted.bam -o WT_AC_ROSE/ -s 12500 -t 2500
This program is very finicky about how it is called.
I think a small but hugely beneficial improvement would be to add a check at the start of ROSE_main.py that verifies all input files exist and can be found, and to kill the script before it launches the subscripts if any are missing.
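A minimal sketch of what such a check could look like follows. This is not code from ROSE: the function names `find_missing_inputs` and `check_inputs_or_exit` are hypothetical, and the index lookup assumes the index sits next to the BAM as either `name.bam.bai` or `name.bai`, which may not match what ROSE_bamToGFF.py actually tests for.

```python
import os
import sys


def find_missing_inputs(gff_path, bam_paths):
    """Return a list of problem descriptions; the list is empty if every input is found.

    Hypothetical helper, not part of ROSE.
    """
    problems = []
    if not os.path.isfile(gff_path):
        problems.append("input GFF not found: %s" % gff_path)
    for bam in bam_paths:
        if not os.path.isfile(bam):
            problems.append("BAM not found: %s" % bam)
            continue
        # Assumption: the index is next to the BAM, named name.bam.bai or name.bai.
        candidates = [bam + ".bai", os.path.splitext(bam)[0] + ".bai"]
        if not any(os.path.isfile(c) for c in candidates):
            problems.append("no .bai index found for: %s" % bam)
    return problems


def check_inputs_or_exit(gff_path, bam_paths):
    """Print every problem to stderr and exit before any subscript is launched."""
    problems = find_missing_inputs(gff_path, bam_paths)
    if problems:
        for problem in problems:
            sys.stderr.write("ERROR: %s\n" % problem)
        sys.exit(1)
```

Calling `check_inputs_or_exit` with the `-i` path and the `-r`, `-b` and `-c` paths right after option parsing would stop the run before any mapping jobs start. As an illustration, the `-c` argument in the original command is spelled `E3387K27WT_Input.dedupe.bam`, while the directory listing shows `E3387K27WT-Input.dedupe.bam`; a check like this would presumably report that path as not found instead of letting the background jobs fail with the index error.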