correct workflow for weight recalibration
Hello, I have 2 biological replicates and would like to use them to recalibrate the weights. I use the runChicago.R wrapper and would like to check whether the following is the correct (and most efficient) workflow:
1. Run runChicago.R separately on each of the 2 replicates, i.e. two runs: one with only rep1.chinput as input and one with only rep2.chinput as input
2. Take the 2 resulting rds objects, rep1.rds and rep2.rds, and run fitDistCurve.R with --input rep1.rds,rep2.rds
3. Update the settings file with the new weights from (2)
4. Use the updated settings file for one more run of runChicago.R, this time providing the comma-separated list rep1.chinput,rep2.chinput
Also, if the above is valid, can you advise on whether the attached diagnostics indicate that the estimates from (2) are reliable?
Thanks
Comments (7)
-
Hi Elisabetta,
Yes, what you've done is the correct and most efficient workflow at this time.
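For the step of updating the settings file with the new weights, here is a minimal Python sketch. It rests on assumptions not stated in this thread: that the settings file is plain text with one tab-separated key/value pair per line, and that the four weight parameters are named weightAlpha, weightBeta, weightGamma and weightDelta. The function name update_settings, the key otherSetting and all numeric values are hypothetical placeholders, not real estimates.

```python
def update_settings(lines, new_weights):
    """Return settings lines with the given keys replaced, or appended if absent.

    Assumes each line is "key<TAB>value"; lines with other keys pass through unchanged.
    """
    remaining = dict(new_weights)
    out = []
    for line in lines:
        stripped = line.rstrip("\n")
        key = stripped.split("\t", 1)[0].strip()
        if key in remaining:
            # Overwrite the old value with the recalibrated one.
            out.append(f"{key}\t{remaining.pop(key)}")
        else:
            out.append(stripped)
    # Keys that were not present in the original file are appended at the end.
    for key, value in remaining.items():
        out.append(f"{key}\t{value}")
    return out


if __name__ == "__main__":
    # Placeholder contents and values, for illustration only.
    old = ["otherSetting\t150", "weightAlpha\t34.1"]
    new = {"weightAlpha": 30.0, "weightBeta": -2.5}
    print("\n".join(update_settings(old, new)))
```

Writing the result to a new file rather than overwriting the original keeps the default-weight settings available for a side-by-side run.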
weights_curveFit.pdf is a bit concerning, though not terrible. Most likely this is due to having only 2 replicates, or perhaps to undersampling, though it could also reflect e.g. an unusual cell type that CHiCAGO's strategy isn't suitable for. For example, comparing weights_curveFit.pdf with Figures 5A and Supplementary 5A in the CHiCAGO paper shows a worse fit in your data.
weights_medianCurveFit.pdf looks fine overall (though some of the data subsets, i.e. the coloured lines, also show an unusual pattern).
-
reporter Thanks for your reply! Elisabetta
-
reporter I've run Chicago both with the recalibrated weights as above and with the default weights, and I noticed an increase in the percentage of significant trans interactions when I use the recalibrated weights. I find this somewhat worrisome. What's your take on it? Thanks.
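The comparison described above can be quantified with a short Python sketch. It assumes each interaction is available as a record with the fields bait_chr, otherEnd_chr and score, and that a score cutoff of 5 defines "significant"; both the field names and the cutoff are assumptions for illustration, and the function name trans_fraction is hypothetical.

```python
def trans_fraction(interactions, score_cutoff=5.0):
    """Fraction of significant interactions whose two ends lie on different chromosomes."""
    significant = [r for r in interactions if float(r["score"]) >= score_cutoff]
    if not significant:
        return 0.0
    # An interaction counts as trans when bait and other end are on different chromosomes.
    trans = sum(1 for r in significant if r["bait_chr"] != r["otherEnd_chr"])
    return trans / len(significant)


if __name__ == "__main__":
    # Made-up records, not real data.
    example = [
        {"bait_chr": "chr1", "otherEnd_chr": "chr1", "score": 6.0},
        {"bait_chr": "chr1", "otherEnd_chr": "chr2", "score": 7.0},
        {"bait_chr": "chr1", "otherEnd_chr": "chr1", "score": 2.0},
    ]
    print(trans_fraction(example))
```

Running it once on the default-weight output and once on the recalibrated-weight output gives two directly comparable numbers.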
-
It looks like the curve in weights_curveFit.pdf doesn't reach as low as it perhaps should, which means that distal/trans interactions aren't penalized as much as they should be. I suspect that you'll struggle to get a better curve out of this data set without more replicates or more coverage. You could try decreasing the "--subsets" parameter, but I imagine that will have bad side effects.
-
reporter Thanks for your feedback. I'm really only interested in cis interactions, so I wouldn't mind throwing away the trans as long as the cis I get are reliable. Do you think the latter is the case?
-
- changed status to resolved