Multi-fasta format doesn't work

Issue #10 new
Andrew Hess created an issue

I have fasta files that contain multiple regions, and I would like to use them as input to get a list of sgRNAs for all of my ROIs. This format seems to run in chopchop just fine, but it only generates an output for the first region in the fasta file. Considering that one of the functions of chopchop is sgRNA design for Cas9 enrichment, and Cas9 enrichment boasts being able to target up to 100 different ROIs, it would be useful to be able to use a multi-fasta to get the sgRNAs for all ROIs in a single job. Even further, it would be useful to be able to use multi-threading (e.g. using the parameter already found in config_local.json) to run multiple ROIs through chopchop.py simultaneously.
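Until multi-fasta input is supported, one possible workaround is to split the file into individual records and submit each region as its own job. The snippet below is only a minimal sketch of that splitting step. It is plain Python with no dependency on chopchop, and the function name `read_multi_fasta` and the example headers `roi_1` and `roi_2` are made up for illustration.

```python
from typing import Iterable, Iterator, List, Tuple


def read_multi_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) for each record in a multi-FASTA stream."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect them all.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-region input, standing in for a real multi-fasta file.
example = """>roi_1
ACGTACGT
ACGT
>roi_2
TTTTGGGG
""".splitlines()

records = list(read_multi_fasta(example))
for name, seq in records:
    print(name, len(seq))
```

Each `(header, sequence)` pair could then be written to its own single-record fasta file and passed to chopchop separately, one job per ROI.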

Comments (2)

  1. Kornel Labun

    And in the case of multi-fasta, is the idea to find guides that can target all of those fasta inputs at the same time? Or is it more like batch processing for multiple genes, with a separate output for each record of the multi-fasta?

  2. Andrew Hess reporter

    Finding guides that would target all ROIs simultaneously would be interesting, but that wasn’t what I had in mind. I was thinking about batch processing for multiple regions. I think this would best be handled by collating all the results into a single file, in which results can be distinguished by the “Genomic Location” field of the generated output (which, in the case of fasta files, is the header line + the position at which the sgRNA starts, so it should be unique for each ROI provided the headers are distinguishable).
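The collation described above could look roughly like the following. This is a hedged sketch, not chopchop code: it assumes each per-ROI result is a tab-separated table with an identical header row, and the column names and the `roi_1:12` style location values in the example are invented for illustration only.

```python
import csv
import io
from typing import Iterable


def collate_tables(tables: Iterable[str]) -> str:
    """Concatenate tab-separated tables, keeping the header row only once."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    header = None
    for text in tables:
        rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
        if not rows:
            continue  # ignore empty results
        if header is None:
            header = rows[0]
            writer.writerow(header)
        elif rows[0] != header:
            raise ValueError("tables have different header rows")
        writer.writerows(rows[1:])
    return out.getvalue()


# Hypothetical per-ROI outputs; real column names may differ.
roi_1 = "Rank\tTarget sequence\tGenomic location\n1\tACGT\troi_1:12\n"
roi_2 = "Rank\tTarget sequence\tGenomic location\n1\tTTGG\troi_2:3\n"

merged = collate_tables([roi_1, roi_2])
print(merged, end="")
```

Because every row keeps its own location value, rows from different ROIs remain distinguishable in the merged file as long as the fasta headers are distinct.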
