Efficiency algorithms don't work with "N" in sequence

Issue #5 resolved
Former user created an issue

Hi, I've noticed that the efficiency algorithms return zeros when the region provided contains an unknown nucleotide ("N"). I have confirmed that this is the cause: the test example works fine, and replacing every "N" with "A" in my FASTA file results in efficiency estimates being returned. My workaround works (and I can subsequently exclude any regions that contained "N"s), but I was wondering if you could build in a solution? Perhaps just adding a step that removes any gRNAs containing an "N" would suffice, since these probably aren't very useful anyway. Cheers, Andrew

Comments (2)

  1. Kornel Labun

    It is impossible for me to retrain the efficiency scoring models to handle Ns; these models are taken from the relevant papers and reproduced as closely to the originals as possible. Filtering out guides with Ns seems like a more specific task that users can do themselves once they have all the results from chopchop.

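The user-side filtering suggested in the thread could look roughly like the sketch below. It assumes the guide sequences have already been read from the results into a plain list of strings; the function name `split_guides_by_ambiguity` and the example sequences are hypothetical and are not part of chopchop.

```python
def split_guides_by_ambiguity(guides, ambiguous="N"):
    """Separate guides containing an ambiguous base from those that do not.

    `guides` is assumed to be an iterable of guide sequences as strings;
    this is an illustrative helper, not a chopchop function.
    """
    marker = ambiguous.upper()
    clean, flagged = [], []
    for seq in guides:
        # Compare case-insensitively so soft-masked (lowercase) input is handled.
        if marker in seq.upper():
            flagged.append(seq)
        else:
            clean.append(seq)
    return clean, flagged


# Made-up example sequences, for illustration only.
guides = [
    "ACGTACGTACGTACGTACGTAGG",
    "ACGTNCGTACGTACGTACGTAGG",
    "ttgacnttgacattgacattgg",
]
clean, flagged = split_guides_by_ambiguity(guides)
```

Keeping the flagged guides in a separate list, rather than discarding them, makes it easy to report which regions were excluded because of unknown nucleotides.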