About bam2chicago.sh error

Issue #46 new
Junko Tomikawa created an issue

Hi,

I am trying to generate a .chinput file using the .rmap and .baitmap files from CapHiCdata and the attached BAM file, which I generated with HiCUP as recommended. When I run the following command:

$ bam2chicago.sh CapHiC_BJ4ES_R1_2.hicup.genome1.bam designDir/S3175602_Covered_intersect_annotate2.baitmap designDir/mm9_MboI_fragment.rmap BJ4ES_genome1

Checking rmap and baitmap files...

Rmap and baitmap files checked successfully

Processing sample BJ4ES_genome1...

Using bam file CapHiC_BJ4ES_R1_2.hicup.genome1.bam

Using baitmap file designDir/S3175602_Covered_intersect_annotate2.baitmap

Using digest map (rmap) file designDir/mm9_MboI_fragment.rmap

Baitmap file contains >4 columns. Checking if designDir/S3175602_Covered_intersect_annotate2.baitmap_4col.txt exists...

Found designDir/S3175602_Covered_intersect_annotate2.baitmap_4col.txt

Intersecting with bait fragments (using min overhang of 0.6)...

Flipping all reads that overlap with the bait on to the right-hand side...

Intersecting with bait fragments again to produce a list of bait-to-bait interactions that can be used separately; note they will also be retained in the main output...

Error: Type checker found wrong number of fields while tokenizing data line.

I am not sure why I am getting this error. Could you please advise?

Best regards.
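The error message suggests that at least one line in an intermediate tab-separated file has a different number of columns than the rest. A minimal sketch for checking this, assuming the file is tab-delimited; `field_count_tally` is a hypothetical helper name and is not part of bam2chicago.sh:

```shell
# Hypothetical helper: tally how many tab-separated fields each line has.
# Reads the files given as arguments, or stdin if none are given.
# Prints one "<field count><TAB><number of lines>" row per distinct count.
field_count_tally() {
    awk -F'\t' '
        { n[NF]++ }                                   # count lines per field count
        END { for (k in n) print k "\t" n[k] }
    ' "$@" | sort -n
}
```

For example, `field_count_tally some_intermediate.bedpe` (a placeholder file name) should print a single row if every line has the same number of columns; more than one row points to inconsistent lines.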

Comments (7)

  1. Junko Tomikawa reporter

    I tried again to generate the .chinput files, following the suggestion in #33 as a guide:

    “awk 'BEGIN{ OFS="\t" } { minRight=$13<$3?$13:$3; maxLeft=$12>$2?$12:$2; if($1==$11 && (minRight-maxLeft)/($3-$2)>=0.6){ print $4,$5,$6,$1,$2,$3,$7,$8,$10,$9,$11,$12,$13,$14,$15 } else { print $0 } }' ${samplename}/${bamname}_mappedToBaits.bedpe > ${samplename}/${bamname}_mappedToBaits_baitOnRight.bedpe

    I have removed the last $15 from the print statement.”

    With this, I was able to generate the .chinput files!

    Thanks!

  2. Junko Tomikawa reporter

    Sorry, when I ran bam2chicago.sh (the modified version) on another BAM file, the following error occurred:

    $ ./bam2chicago_modified.sh CapHiC_JB2ES_R1_2.hicup.genome2.bam designDir/S3175602_Covered_intersect_annotate2.baitmap designDir/mm9_MboI_fragment.rmap JB2ES_genome2

    Checking rmap and baitmap files...

    Rmap and baitmap files checked successfully

    Processing sample JB2ES_genome2...

    Using bam file CapHiC_JB2ES_R1_2.hicup.genome2.bam

    Using baitmap file designDir/S3175602_Covered_intersect_annotate2.baitmap

    Using digest map (rmap) file designDir/mm9_MboI_fragment.rmap

    Baitmap file contains >4 columns. Checking if designDir/S3175602_Covered_intersect_annotate2.baitmap_4col.txt exists...

    Found designDir/S3175602_Covered_intersect_annotate2.baitmap_4col.txt

    Intersecting with bait fragments (using min overhang of 0.6)...

    *****WARNING: Query NB551733:5:HFCHVBGXB:1:11101:7827:8813 is marked as paired, but it's mate does not occur next to it in your BAM file. Skipping.

    *****WARNING: Query NB551733:5:HFCHVBGXB:1:11101:26809:10628 is marked as paired, but it's mate does not occur next to it in your BAM file. Skipping.

    *****WARNING: Query NB551733:5:HFCHVBGXB:1:11101:25176:12818 is marked as paired, but it's mate does not occur next to it in your BAM file. Skipping.

    *****WARNING: Query NB551733:5:HFCHVBGXB:1:11101:22480:13563 is marked as paired, but it's mate does not occur next to it in your BAM file. Skipping….

    The BAM format is the same as in the previous files. Why am I getting this error? Could you please advise?

    Best regards.

  3. Mikhail Spivakov

    Happy to hear you’ve sorted the first issue. Regarding the second issue: how many messages like this did you get? If they appear throughout the output, it simply means that you need to re-sort your file so that mate pairs are kept together (you can do this easily with samtools). If you only have a very small number of such warnings, I’d probably just disregard them.
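    One way to answer the "how many" question is to count the records whose mate is not on the neighbouring line. The sketch below assumes that a pair means two consecutive records with the same read name (first SAM column), which is an interpretation of the warning text rather than something confirmed in this thread; `count_unpaired` is a hypothetical helper name.

    ```shell
    # Hypothetical helper: read SAM text on stdin and count records that do
    # not sit next to another record with the same read name (column 1).
    count_unpaired() {
        awk -F'\t' '
            /^@/ { next }                       # skip SAM header lines, if any
            {
                total++
                if (have_prev && $1 == prev) {
                    have_prev = 0               # pairs with the previous record
                } else {
                    if (have_prev) orphans++    # previous record had no adjacent mate
                    prev = $1
                    have_prev = 1
                }
            }
            END {
                if (have_prev) orphans++        # last record was left unpaired
                printf "%d of %d records have no adjacent mate\n", orphans, total
            }
        '
    }
    ```

    Usage would be along the lines of `samtools view CapHiC_JB2ES_R1_2.hicup.genome2.bam | count_unpaired`. Note that secondary or supplementary alignments, if present, would also be counted as unpaired by this simple rule.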

  4. Junko Tomikawa reporter

    Thank you for your reply.

    Following your advice, I sorted my BAM file with samtools as follows:

    samtools sort CapHiC_JB2ES_R1_2.hicup.genome1.bam > CapHiC_JB2ES_R1_2.hicup.genome1_sort.bam

    But I still got the same error for every line.

    Could you check my BAM file if you have time?

  5. Mikhail Spivakov

    Samtools sort with default parameters won't give you what you want. Please refer to samtools docs for the correct command line. I think it’s -n but please double check.
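    A minimal sketch of a name-sorted invocation, assuming a reasonably recent samtools on the PATH; `-n` sorts by read name instead of by coordinate and `-o` sets the output file. The wrapper name `name_sort_bam` and the default output name are hypothetical.

    ```shell
    # Hypothetical wrapper: sort a BAM file by read name so that records
    # sharing a name end up on adjacent lines.
    # Usage: name_sort_bam input.bam [output.bam]
    name_sort_bam() (
        in_bam=$1
        # Default output name (an assumption): input.bam -> input.namesorted.bam
        out_bam=${2:-${in_bam%.bam}.namesorted.bam}
        samtools sort -n -o "$out_bam" "$in_bam"
    )
    ```

    Name sorting only places records with the same name next to each other; it cannot restore a mate that is absent from the file.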

  6. Junko Tomikawa reporter

    Thanks. I re-sorted my BAM file using the -n option. The file is now sorted by read ID, but the reads are still not paired with the next line as they are in my other, working files.

    $ samtools sort -n CapHiC_JB2ES_R1_2.hicup.genome1.bam > CapHiC_JB2ES_R1_2.hicup.genome1_sort.bam

    $ samtools view CapHiC_JB2ES_R1_2.hicup.genome1_sort.bam | head

    NB551733:5:HFCHVBGXB:1:11101:1070:6925163chr79698101742133Mchr922064319TACTAATACCATGTTATAACAGAATCCCAAGTGTGAGAGAGCATACAGCCTTGCAAGACTGTTGGAAAAGTAGTGGCCCCAGGGGACAGCTAAATTTTAACTAAGACAGGATAGGGAGCAGAGTAATGGGATCAAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEE<AEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEAS:i:-3XN:i:3XM:i:3XO:i:0XG:i:0NM:i:3MD:Z:45N8N10N67YT:Z:UUCT:Z:TRANSXX:Z:G1

    NB551733:5:HFCHVBGXB:1:11101:1082:558483chr1576343273234M3I143Mchr1337070680GCAAAGATCCTGTAGTATCCACTGACTCCTTCCCTCAGGTCACACTTTCTTCACGACACATCTCATGATGAGCAATCTGGGCTGCCCTGCAGGTGGTGTCTTTGTACATATGCAGAGAAAAGCGAACCCAGGCTTGGACTTTTGTGGATC/EEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAAAAAS:i:-36XN:i:4XM:i:8XO:i:1XG:i:3NM:i:11MD:Z:0C2C1T38N0N6N1N32C59YT:Z:UUCT:Z:TRANSXX:Z:G1

    NB551733:5:HFCHVBGXB:1:11101:1087:17754179chr713807668242150Mchr1171895797TGAGAGGGAGTCTTTTTGGAGGATTTGTTGGGACCTGCTCTGGTTTGGGTATGATTGGAGTGCAAGGTAGAATTTTGGTTCAGTGTTGTAAAAATGGAGGCCTAAGGCAGGTCACACATGATCATAGGACTGCAGGTCTCAGGGAGACTT//EAAA/EEEEEEEEEEE<<EAAEE<<////E/</EE</EEEE/EEE/A</EA<E<E<6//EEAEA<<E/EE//E6/E<<AA/E/E/AEEAEEEE//A/EEEEEEE/EEEE/EEAEEEEEEEEEEEEEEEAA<EAEE/EEEEEEEAAAAAAS:i:-7XN:i:1XM:i:XO:i:0XG:i:0NM:i:3MD:Z:72G13G33N29YT:Z:UUCT:Z:TRANSXX:Z:G1

    NB551733:5:HFCHVBGXB:1:11101:1087:17754179chr713807668242150Mchr1723701470TGAGAGGGAGTCTTTTTGGAGGATTTGTTGGGACCTGCTCTGGTTTGGGTATGATTGGAGTGCAAGGTAGAATTTTGGTTCAGTGTTGTAAAAATGGAGGCCTAAGGCAGGTCACACATGATCATAGGACTGCAGGTCTCAGGGAGACTT//EAAA/EEEEEEEEEEE<<EAAEE<<////E/</EE</EEEE/EEE/A</EA<E<E<6//EEAEA<<E/EE//E6/E<<AA/E/E/AEEAEEEE//A/EEEEEEE/EEEE/EEAEEEEEEEEEEEEEEEAA<EAEE/EEEEEEEAAAAAAS:i:-7XN:i:1XM:i:XO:i:0XG:i:0NM:i:3MD:Z:72G13G33N29YT:Z:UUCT:Z:TRANSXX:Z:G1

    I'm not sure what happened here…
