Can't add other decoy label

Issue #30 closed
Anonymous created an issue

Hi,

I have performed an entrapment experiment with X!Tandem. All target labels start with ">generic|T_" and all decoy labels start with ">generic|D_". I read the results with Pyteomics using tandem.TandemXML('result.t.xml'), and now I need to filter them at an FDR threshold. To do that I call tandem.filter('result.t.xml', fdr=0.05, is_decoy="D_"), but it raises KeyError: 'D_'.

How can I solve this?

Thanks in advance!

Comments (2)

  1. Lev Levitsky repo owner

    Hi,

    Short answer: you probably want to use tandem.filter('result.t.xml', fdr=0.05, decoy_prefix="generic|D_").

    Long answer:

    When you specify a string as is_decoy, it is used as a key. In this case you are filtering an array of dicts, so the corresponding key must be present in each dict for this code to work. "D_" is not a key in the PSM dicts, hence the KeyError. The other use case for a string as is_decoy is filtering a record array or a DataFrame, where the string names a field or column.

    Otherwise, is_decoy is supposed to be a function that takes a PSM (a dict in this case) and returns a boolean.
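The two forms of is_decoy described above can be sketched without Pyteomics itself. This is only an illustration: the PSM dict layout (a 'protein' list whose items carry a 'label') is an assumption about how X!Tandem results are parsed, and lookup_by_key and is_entrapment_decoy are hypothetical names, not part of the library.

```python
# Hypothetical PSM dict; the 'protein' -> 'label' layout is an assumption
# about the parsed X!Tandem output, not something confirmed in this thread.
psm = {"protein": [{"label": "generic|D_P12345 example decoy entry"}]}


def lookup_by_key(psm, key):
    """Mimic the string form of is_decoy: the string is used as a dict key."""
    return psm[key]


def is_entrapment_decoy(psm, prefix="generic|D_"):
    """Callable form: take a PSM dict and return a boolean.

    Here a PSM counts as a decoy only if every matched protein label
    starts with the prefix (an assumed, conservative rule).
    """
    return all(p["label"].startswith(prefix) for p in psm["protein"])


# The string "D_" is not a key of the PSM dict, so the lookup fails.
try:
    lookup_by_key(psm, "D_")
    key_error_raised = False
except KeyError:
    key_error_raised = True

decoy_flag = is_entrapment_decoy(psm)

# With Pyteomics installed, the callable would be passed like this:
# tandem.filter('result.t.xml', fdr=0.05, is_decoy=is_entrapment_decoy)
```

A custom callable is mainly worth writing when the decoy rule is more involved than a simple prefix; for a plain prefix, decoy_prefix is the shorter route.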

    tandem.filter has a default value for is_decoy, which looks for a decoy prefix in all proteins of a given PSM; the default prefix is "DECOY_". You can keep that function and override only the prefix by providing decoy_prefix. However, the prefix is matched against the beginning of the FASTA header, which is why I specified "generic|D_" and not just "D_".
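Why the full "generic|D_" is needed can be seen with plain string matching. The header below is made up for illustration, and the assumption that the label is compared without the leading ">" is inferred from the prefix suggested in this answer.

```python
# Made-up FASTA header in the style described in the question.
header = ">generic|D_P12345 example decoy entry"

# Assumption: the protein label is stored without the leading '>'.
label = header.lstrip(">")

# A prefix test only looks at the beginning of the label.
matches_short = label.startswith("D_")            # "D_" is not at the start
matches_full = label.startswith("generic|D_")     # full prefix matches
matches_default = label.startswith("DECOY_")      # default prefix does not apply
```

Here only matches_full is true, so "D_" alone would leave every decoy unrecognized.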

    Let me know if you have any follow-up questions.

    Best regards,

    Lev
