bam2chicago.sh wrong number of fields

Issue #33 new
Former user created an issue

Kind regards!

I am trying to generate a .chinput file using the .rmap and .baitmap files from PCHiCdata and the attached BAM file, which I generated with HiCUP as recommended. This is the error:

$ ./bam2chicago.sh hicup/output_hicup/SRR3535023_1_2.hicup.bam h19_chr20and21.baitmap h19_chr20and21.rmap out_bam2chicago2

Checking rmap and baitmap files...
Rmap and baitmap files checked successfully
Processing sample out_bam2chicago2...
Using bam file hicup/output_hicup/SRR3535023_1_2.hicup.bam
Using baitmap file h19_chr20and21.baitmap
Using digest map (rmap) file h19_chr20and21.rmap
Baitmap file contains >4 columns. Checking if h19_chr20and21.baitmap_4col.txt exists...
Found h19_chr20and21.baitmap_4col.txt
Intersecting with bait fragments (using min overhang of 0.6)...
Flipping all reads that overlap with the bait on to the right-hand side...
Intersecting with bait fragments again to produce a list of bait-to-bait interactions that can be used separately; note they will also be retained in the main output...
Error: Type checker found wrong number of fields while tokenizing data line.

I have noticed that the error is produced by this command:

bedtools intersect -a ${samplename}/${bamname}_mappedToBaits_baitOnRight.bedpe -wo -f 0.6 -b $baitfendsid >> ${samplename}/${samplename1}_bait2bait.bedpe

I have checked the ${samplename}/${bamname}_mappedToBaits_baitOnRight.bedpe file, and some of its lines end with a trailing tab, which is what causes the issue. I do not know why that happens; if you could shed some light on this I would be very happy.
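The check described above can be reproduced with a small sketch like the following. The function names `field_count_summary` and `trailing_tab_lines` are hypothetical helpers, not part of bam2chicago.sh, and the sketch assumes the .bedpe file is tab-separated. With a tab as the separator, a line that ends in a tab is counted as having an empty last field.

```shell
# Hypothetical helper: print "<number of fields> <number of lines>" pairs,
# so that lines with an unexpected field count stand out.
field_count_summary() {
    awk -F'\t' '{ n[NF]++ } END { for (k in n) print k, n[k] }' "$1" | sort -n
}

# Hypothetical helper: print the line numbers of lines that end in a tab.
trailing_tab_lines() {
    awk '/\t$/ { print NR }' "$1"
}
```

For example, `trailing_tab_lines ${samplename}/${bamname}_mappedToBaits_baitOnRight.bedpe | head` would list the first few offending lines, which can then be compared with the corresponding rows of the .baitmap file.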

Thank you very much for your time!!

Comments (2)

  1. Pablo Acera

    Hey, I have found a workaround for this issue by modifying this awk command:

    awk 'BEGIN{ OFS="\t" } { minRight=$13<$3?$13:$3; maxLeft=$12>$2?$12:$2; if($1==$11 && (minRight-maxLeft)/($3-$2)>=0.6){ print $4,$5,$6,$1,$2,$3,$7,$8,$10,$9,$11,$12,$13,$14,$15 } else { print $0 } }' ${samplename}/${bamname}_mappedToBaits.bedpe > ${samplename}/${bamname}_mappedToBaits_baitOnRight.bedpe

    I have removed the last $15 from the print statement. Apparently some lines do not have 15 fields, so printing $15 adds an empty field after a trailing tab, which causes the issue.

    Will that be problematic for future analysis? Thanks,

  2. Mikhail Spivakov

    I don't have the data at hand right now, so as a guess - is $15 the bait name from the .baitmap file? It will be expected by Chicago further down the line. I would check that .baitmap contains some kind of bait annotation in each row.

    Also, by any chance are you using bedtools 2.26+? Chicago currently doesn't support it exactly because the type checker causes problems.
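Both suggestions in this reply can be checked from the command line. The sketch below is only illustrative: the function names are hypothetical, the baitmap check assumes the bait annotation sits in the fifth tab-separated column, and the version check assumes a string shaped like `bedtools v2.26.0`, as printed by `bedtools --version`.

```shell
# Hypothetical helper: list baitmap rows whose annotation is missing or empty.
# Assumption: rows are tab-separated and the annotation is column 5.
baitmap_missing_annotation() {
    awk -F'\t' 'NF < 5 || $5 == "" { print NR ": " $0 }' "$1"
}

# Hypothetical helper: succeed (exit status 0) when the given version string
# is 2.26 or newer. Expects text such as "bedtools v2.26.0".
bedtools_is_2_26_plus() {
    ver=${1##*v}        # drop everything up to the last "v": "2.26.0"
    major=${ver%%.*}    # "2"
    rest=${ver#*.}      # "26.0"
    minor=${rest%%.*}   # "26"
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 26 ]; }
}
```

A possible usage is `baitmap_missing_annotation h19_chr20and21.baitmap` followed by `bedtools_is_2_26_plus "$(bedtools --version)" && echo "bedtools 2.26 or newer detected"`. Any rows printed by the first command would be candidates for the lines that lack a 15th field.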
