Different problems running replicates
After running CHiCAGO successfully on one replicate, I ran into problems running the same pipeline on the other replicates. I processed them with bam2chicago.sh in the same way as the first sample, so I don't know what's happening. The errors differ between replicates, so I believe it's not just something I'm doing wrong.
---------------------------------------------------------------------
Problem 1
---------------------------------------------------------------------
This replicate doesn't run successfully:
bam2chicago.sh 3.bam probes-MboI_fragments.baitmap MboI.rmap 3
*** Running estimateTechnicalNoise...
Estimating technical noise based on trans-counts...
Binning baits based on observed trans-counts...
Defining interaction pools and gathering the observed numbers of trans-counts per pool...
Computing the total number of possible interactions per pool...
Preparing the data.....
Error in vecseq(f__, len__, if (allow.cartesian || notjoin || !anyDuplicated(f__, :
  Join results in 170243 rows; more than 142318 = nrow(x)+nrow(i). Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and datatable-help for advice.
Calls: chicagoPipeline ... estimateTechnicalNoise -> [ -> [.data.table -> vecseq
In addition: Warning messages:
1: In fread(s$nperbinfile) :
  Starting data input on line 2 and discarding line 1 because it has too few or too many items to be column names or data: # minFragLen=150 maxFragLen=40000 maxLBrownEst=1500000 binsize=20000 removeb2b=True removeAdjacent=True rmapfile=./MboI.rmap baitmapfile=./probes-MboI_fragments.baitmap
2: In fread(s$nbaitsperbinfile) :
  Starting data input on line 2 and discarding line 1 because it has too few or too many items to be column names or data: # maxLBrownEst=1500000 binsize=20000 rmapfile=./MboI.rmap baitmapfile=./probes-MboI_fragments.baitmap
Execution halted
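The data.table error above specifically suggests checking for duplicate key values. Since the baitmap is one of the inputs being joined, one quick thing to rule out is duplicate entries in it. Below is a hypothetical diagnostic helper (not part of the CHiCAGO pipeline), assuming the usual tab-separated .baitmap layout of chr, start, end, fragmentID, baitName:

```shell
# Hypothetical helper: report duplicate lines and duplicate fragment IDs
# in a .baitmap file (assumed tab-separated: chr, start, end, fragmentID, baitName).
check_baitmap_dups() {
    baitmap=$1
    echo "Fully duplicated rows:"
    sort "$baitmap" | uniq -d
    echo "Fragment IDs appearing more than once:"
    cut -f4 "$baitmap" | sort | uniq -d
}
# Usage: check_baitmap_dups probes-MboI_fragments.baitmap
```

If either command prints anything, deduplicating the baitmap (e.g. with `sort -u`) before rerunning would tell you whether the duplicates are responsible.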
---------------------------------------------------------------------
Problem 2
---------------------------------------------------------------------
Another replicate fails with a different error:
bam2chicago.sh 4.bam probes-MboI_fragments.baitmap MboI.rmap 4
*** Running normaliseOtherEnds...
Preprocessing input...
Computing trans-counts...
Filtering out 1 other ends with top 0.01% number of trans-interactions
Binning...
Error in cut.default(transLenB2B$transLength, breaks = cutsB2B, include.lowest = T) :
  invalid number of intervals
Calls: chicagoPipeline ... normaliseOtherEnds -> .addTLB -> set -> cut -> cut.default
In addition: Warning messages:
1: In fread(s$nperbinfile) :
  Starting data input on line 2 and discarding line 1 because it has too few or too many items to be column names or data: # minFragLen=150 maxFragLen=40000 maxLBrownEst=1500000 binsize=20000 removeb2b=True removeAdjacent=True rmapfile=./MboI.rmap baitmapfile=./probes-MboI_fragments.baitmap
2: In min(diff(x.unique)) : no non-missing arguments to min; returning Inf
Execution halted
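This error turns out to be triggered by data sets in which each bait has at most one trans read, leaving cut() with too few distinct break points. A hypothetical way to check how sparse the trans counts are in a replicate, assuming the standard 5-column .chinput layout written by bam2chicago.sh (baitID, otherEndID, N, otherEndLen, distSign, with trans pairs marked by a distSign of "NA"):

```shell
# Hypothetical sanity check (not part of the pipeline): tally trans read
# counts per bait from a .chinput file, assuming 5 tab-separated columns
# (baitID, otherEndID, N, otherEndLen, distSign) where trans interactions
# have distSign == "NA". Header and "#" parameter lines are skipped.
trans_counts_per_bait() {
    chinput=$1
    grep -v '^#' "$chinput" \
      | awk -F'\t' '$5 == "NA" { trans[$1] += $3 }
                    END { for (b in trans) print b "\t" trans[b] }' \
      | sort -k2,2nr
}
# Usage: trans_counts_per_bait 4/4.chinput | head
```

If the largest per-bait total printed is 1, the replicate matches the failure mode described in the comments below.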
Comments (5)

Hi Noboru,
I'm running your data now. I'm afraid I can't reproduce Problem 1 right now. While this is possibly because I'm using the development version of CHiCAGO, I also note that the baitmap has duplicate entries; perhaps that is causing problems? Continuing to investigate...

reporter: It does! I removed them and I'm re-running CHiCAGO. However, one replicate ran fine with this redundant file, so I suspect this is not the culprit. Thanks!

OK, I can reproduce Problem 2, which is an issue that occurs in small data sets where each bait has no more than 1 trans read associated with it. I will look into a fix for this and keep you updated.

Hi, apologies for the delayed response. I believe I fixed Problem 2, and the latest versions of CHiCAGO should be able to handle it; please let me know if the issue reoccurs.

- changed status to resolved