I am trying to run CHiCAGO to incorporate it in our capture HiC analysis pipelines. I am having some trouble with my input files and would like to understand how to get the files in the correct format.
Here is my current understanding: I will need 6 files, as follows:
1- restriction map file (.rmap)
2- baitmap file (.baitmap)
3- nperbin file (.npb)
4- nbaitsperbin file (.nbpb)
5- proxOE file (.poe)
6- input file (.chinput)
What I have:
- BAM file aligned by HiCUP
- HiCUP digest file
My understanding is that I can get files 3, 4, and 5 by running makeDesignFiles.py, but that requires the .rmap and .baitmap files.
I can also get file 6 by using bam2chicago.sh, but that also requires the .rmap and .baitmap files.
So I am missing files 1 and 2 (.rmap and .baitmap) and am wondering how to get them. Are there scripts to do so?
I thought I could make the .rmap file by taking the first 3 columns of the HiCUP digest file and assigning an arbitrary "numeric ID" to each fragment, and make the .baitmap file from the first 3 columns of the chip bait coordinates plus an arbitrary "numeric ID". However, I am confused by the following comment in the README.md file: "[bait coordinates] (should be a subset of the fragments listed in rmapfile), their numeric IDs (should match those listed in rmapfile for the corresponding fragments)".
Is there a better way to produce the .rmap and .baitmap files, and how do I coordinate the IDs in the .rmap file with those in the .baitmap file?
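To make the question concrete, here is a minimal Python sketch of how I currently read that README comment. It is only my guess, not a CHiCAGO script. It assumes that the digest file is tab-separated with chromosome, start, and end in the first 3 columns, that any header lines can be recognised because their 2nd and 3rd fields are not integers, and that a fragment counts as baited if it overlaps a bait region. The function names `build_rmap` and `build_baitmap` are hypothetical. If the .baitmap format needs further columns (for example an annotation), they are not handled here.

```python
def build_rmap(digest_lines):
    """Turn digest lines into (chrom, start, end, frag_id) rows.

    Assumption: data rows are tab-separated with integer start/end in
    columns 2 and 3; anything else is treated as a header and skipped.
    """
    rmap = []
    for line in digest_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue  # header or malformed line
        # Arbitrary numeric ID: running count of fragments, starting at 1.
        rmap.append((fields[0], int(fields[1]), int(fields[2]), len(rmap) + 1))
    return rmap


def build_baitmap(rmap, bait_regions):
    """Keep the rmap fragments that overlap a bait region.

    Rows are copied from rmap, so the coordinates are a subset of the
    rmap fragments and the IDs match by construction.
    """
    baitmap = []
    for chrom, start, end, frag_id in rmap:
        overlaps = any(
            b_chrom == chrom and b_start <= end and b_end >= start
            for b_chrom, b_start, b_end in bait_regions
        )
        if overlaps:
            baitmap.append((chrom, start, end, frag_id))
    return baitmap


# Toy input (made up for illustration, not real data).
digest_lines = [
    "Genome:example\tRestriction_Enzyme1:example\n",
    "Chromosome\tFragment_Start_Position\tFragment_End_Position\n",
    "chr1\t1\t1000\n",
    "chr1\t1001\t2500\n",
    "chr1\t2501\t4000\n",
]
bait_regions = [("chr1", 1200, 1320)]  # hypothetical probe coordinates

rmap = build_rmap(digest_lines)
baitmap = build_baitmap(rmap, bait_regions)

for row in baitmap:
    print("\t".join(str(value) for value in row))
```

In this sketch the .baitmap rows are taken from the .rmap rows rather than from the bait coordinates directly, which is the only way I can see for both the coordinates and the IDs to agree. Is that the intended approach?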
Thank you, Rola