also a-ctgs in input?
Hello, I am wondering whether you include both p- and a-contigs in the input files. Since we already know that a-contigs are allelic, is it meaningful to include them again? Will they mask hits between p-contigs? Thanks, Dario
Comments (4)
-
repo owner -
repo owner - changed status to resolved
When I merge the new 'lastz' branch into 'master', the documentation will be explicit about this. Also, the pipeline now iterates until convergence by default, so there is no limit to the number of nested haplotigs that it can reassign.
-
reporter Thanks Mike, I am wondering if, after running the pipeline on the p-contigs of an Unzip assembly, it would be worth running Unzip again, since some p-contigs may now have become haplotigs.
Would it be worth it?
Thanks,
Dario
-
repo owner I don't think Unzip will run on the assembly once it's been through purge haplotigs. I'm guessing it would throw an error that it's missing some 'primary' contigs.
If you want to identify the syntenic blocks between the new primary contigs and haplotigs, you could pull the dev branch of purge haplotigs and try the NCBI placement script.
purge_haplotigs ncbiplace
will give you the help message. It will produce an NCBI 'placement' file (discussed here: http://www.pacb.com/wp-content/uploads/Sarah-Kingan-PacBio-East-Coast-Bioinformatics-Workshop-2017.pdf). I haven't had a chance to test the NCBI placement files in a submission yet, though, and you'll need the latest MUMmer package.
Hi Dario,
The original intention was to remove allelic primary contigs, and hence just use the primaries. The a-contigs (from FALCON, not Unzip) are sometimes large bubbles that are not present in their primary contig, so if you want a minimal unique haploid assembly then including them isn't a bad idea. If they do align well to the primary contig, they'll just be flagged as haplotigs anyway. It shouldn't mask hits between primary contigs as long as you set the maximum passes high enough to reach convergence in STEP 3 (10 passes should be plenty).
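To illustrate why the number of passes matters, here is a toy sketch in Python. It is not the purge haplotigs code: the contig names, lengths, ranked hit lists, and the rule "flag a contig when its best remaining hit is a longer contig" are all assumptions made up for the example. It only shows how an a-contig can hide a hit between two primaries in the first pass, and how iterating until nothing new is flagged recovers it.

```python
# Toy model only: all names, lengths and hits below are hypothetical.
LENGTHS = {"p1": 1000, "p2": 600, "a1": 200}

# Ranked hits per contig, best hit first (assumed, not real alignments).
HITS = {
    "p1": ["p2"],
    "p2": ["a1", "p1"],  # the a-contig outranks the allelic primary p1
    "a1": ["p2"],
}


def flag_pass(current, lengths, hits):
    """Flag contigs whose best hit among the remaining contigs is longer."""
    flagged = set()
    for name in current:
        remaining = [h for h in hits.get(name, []) if h in current]
        if remaining and lengths[remaining[0]] > lengths[name]:
            flagged.add(name)
    return flagged


def purge_until_convergence(lengths, hits, max_passes=10):
    """Repeat flag_pass until a pass flags nothing or max_passes is used up.

    Returns (kept, haplotigs, converged). converged is True only if a pass
    that flagged nothing was actually observed.
    """
    kept = set(lengths)
    haplotigs = set()
    for _ in range(max_passes):
        flagged = flag_pass(kept, lengths, hits)
        if not flagged:
            return kept, haplotigs, True
        kept -= flagged
        haplotigs |= flagged
    return kept, haplotigs, False


if __name__ == "__main__":
    print(purge_until_convergence(LENGTHS, HITS, max_passes=1))
    print(purge_until_convergence(LENGTHS, HITS, max_passes=10))
```

With a single pass, only a1 is flagged and p2 stays in the kept set, because its best hit was the shorter a1. With enough passes, a1 is removed first, p2's best remaining hit becomes p1, and p2 is flagged in the following pass.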
If you have a FALCON-unzip assembly, then including the haplotigs can be beneficial as occasionally a haplotig will actually be larger than its primary contig.
Cheers, Mike