Different problems running replicates

Issue #3 resolved
Noboru Sakabe created an issue

After running CHiCAGO successfully on one replicate, I ran into problems running the same pipeline on other replicates. I processed them with bam2chicago.sh, just as I did for the first sample, so I don't know what's happening. The errors differ between replicates, so I believe it's not just something I'm doing wrong.

---------------------------------------------------------------------

Problem 1

---------------------------------------------------------------------

This is a different replicate and it doesn't run successfully.

bam2chicago.sh 3.bam probes-MboI_fragments.baitmap MboI.rmap 3

*** Running estimateTechnicalNoise...

Estimating technical noise based on trans-counts...
Binning baits based on observed trans-counts...
Defining interaction pools and gathering the observed numbers of trans-counts per pool...
Computing the total number of possible interactions per pool...
Preparing the data.....
Error in vecseq(f__, len__, if (allow.cartesian || notjoin || !anyDuplicated(f__, :
  Join results in 170243 rows; more than 142318 = nrow(x)+nrow(i). Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and datatable-help for advice.
Calls: chicagoPipeline ... estimateTechnicalNoise -> [ -> [.data.table -> vecseq
In addition: Warning messages:
1: In fread(s$nperbinfile) :
  Starting data input on line 2 and discarding line 1 because it has too few or too many items to be column names or data: # minFragLen=150 maxFragLen=40000 maxLBrownEst=1500000 binsize=20000 removeb2b=True removeAdjacent=True rmapfile=./MboI.rmap baitmapfile=./probes-MboI_fragments.baitmap
2: In fread(s$nbaitsperbinfile) :
  Starting data input on line 2 and discarding line 1 because it has too few or too many items to be column names or data: # maxLBrownEst=1500000 binsize=20000 rmapfile=./MboI.rmap baitmapfile=./probes-MboI_fragments.baitmap
Execution halted
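
The data.table error above asks to check for duplicate key values, so a quick sanity check on the baitmap may be worthwhile. Below is a minimal Python sketch, not part of CHiCAGO, that reports bait coordinates appearing more than once. It assumes the baitmap is tab-separated with chromosome, start and end in the first three columns; the function name `find_duplicate_baits` is hypothetical.

```python
from collections import Counter


def find_duplicate_baits(lines):
    """Return {(chrom, start, end): count} for coordinates seen more than once.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Assumption: tab-separated fields with chrom, start, end in columns 1-3.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Only the coordinates are used as the key; IDs and annotations are ignored.
        counts[tuple(fields[:3])] += 1
    # Keep only the coordinates that occur at least twice.
    return {key: n for key, n in counts.items() if n > 1}
```

For the file in this report it could be called as `find_duplicate_baits(open("probes-MboI_fragments.baitmap"))`. An empty result would mean that no coordinate is repeated, in which case the cause of the join error probably lies elsewhere.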

---------------------------------------------------------------------

Problem 2

---------------------------------------------------------------------

This is a different replicate and it doesn't run successfully for a different reason.

bam2chicago.sh 4.bam probes-MboI_fragments.baitmap MboI.rmap 4

*** Running normaliseOtherEnds...

Preprocessing input...
Computing trans-counts...
Filtering out 1 other ends with top 0.01% number of trans-interactions
Binning...
Error in cut.default(transLenB2B$transLength, breaks = cutsB2B, include.lowest = T) :
  invalid number of intervals
Calls: chicagoPipeline ... normaliseOtherEnds -> .addTLB -> set -> cut -> cut.default
In addition: Warning messages:
1: In fread(s$nperbinfile) :
  Starting data input on line 2 and discarding line 1 because it has too few or too many items to be column names or data: # minFragLen=150 maxFragLen=40000 maxLBrownEst=1500000 binsize=20000 removeb2b=True removeAdjacent=True rmapfile=./MboI.rmap baitmapfile=./probes-MboI_fragments.baitmap
2: In min(diff(x.unique)) :
  no non-missing arguments to min; returning Inf
Execution halted
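
The second warning (`no non-missing arguments to min`) suggests that the binning step had fewer than two distinct values to work with. In the comments below, this failure is attributed to small data sets where each bait has no more than 1 trans read. A rough way to check an input file for that condition is sketched here in Python. The sketch is not CHiCAGO code, and it assumes a tab-separated input with a header row naming `baitID`, `N` and `distSign` columns, with trans pairs marked by `distSign` equal to `NA`. Both function names are hypothetical.

```python
import csv
from collections import defaultdict


def trans_reads_per_bait(lines):
    """Sum read counts of trans pairs for each bait.

    Assumptions: tab-separated text, '#' comment lines, then a header row
    with 'baitID', 'N' and 'distSign'; trans pairs have distSign == 'NA'.
    """
    # Drop blank lines and comments so the header row comes first.
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    totals = defaultdict(int)
    for row in reader:
        if row["distSign"] == "NA":
            totals[row["baitID"]] += int(row["N"])
    return dict(totals)


def too_sparse_for_binning(totals):
    """True when no bait has more than one trans read (or none have any)."""
    return not totals or max(totals.values()) <= 1
```

If `too_sparse_for_binning` returns `True` for a replicate, that replicate matches the condition described for Problem 2. It does not by itself prove that this is the cause of the failure.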

Comments (5)

  1. Jonathan Cairns

    Hi Noboru,

    I'm running your data now. I'm afraid I can't reproduce Problem 1 right now. This is possibly because I'm using the development version of CHiCAGO, but I also note that the baitmap has duplicate entries. Perhaps that is causing problems? Continuing to investigate...

  2. Noboru Sakabe reporter

    It does! I removed them and I'm re-running Chicago. However, one replicate ran fine with this redundant file so I suspect this is not the culprit. Thanks!

  3. Jonathan Cairns

    OK - I can reproduce Problem 2, which is an issue that occurs in small data sets where each bait has no more than 1 trans read associated with it. I will look into a fix for this and keep you updated.

  4. Jonathan Cairns

    Hi, apologies for the delayed response. I believe I have fixed Problem 2, and the latest versions of CHiCAGO should be able to handle it. Please let me know if the issue recurs.
