I'm trying to implement support for loading IdentiPy pep.xml files in PeptideShaker (http://compomics.github.io/projects/peptide-shaker.html), but I've come across a few issues with the pep.xml files that I hope you can take a look at.
1) The information about the PTMs is missing in the search_summary. We use this information to figure out which PTMs are fixed. Here's an example of how this is annotated in Comet (http://comet-ms.sourceforge.net):
<aminoacid_modification aminoacid="M" massdiff="15.994915" mass="147.035400" variable="Y" symbol="*"/>
<aminoacid_modification aminoacid="C" massdiff="57.021464" mass="160.030649" variable="N"/>
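To illustrate why the variable attribute matters, here is a minimal Python sketch of how a reader could split such elements into fixed and variable modifications. This is not PeptideShaker's actual code; the function name and the wrapping search_summary element are made up for the example, and a real pep.xml file declares an XML namespace that the tag lookup would need to handle.

```python
import xml.etree.ElementTree as ET

# The two Comet elements quoted above, wrapped in a bare search_summary
# element (no namespace) purely so the snippet parses on its own.
SNIPPET = """<search_summary>
<aminoacid_modification aminoacid="M" massdiff="15.994915" mass="147.035400" variable="Y" symbol="*"/>
<aminoacid_modification aminoacid="C" massdiff="57.021464" mass="160.030649" variable="N"/>
</search_summary>"""


def split_modifications(xml_text):
    """Return (fixed, variable) dicts mapping residue -> list of mass shifts."""
    fixed, variable = {}, {}
    root = ET.fromstring(xml_text)
    for mod in root.iter("aminoacid_modification"):
        # variable="Y" marks a variable modification; anything else is
        # treated as fixed in this sketch.
        target = variable if mod.get("variable") == "Y" else fixed
        residue = mod.get("aminoacid")
        target.setdefault(residue, []).append(float(mod.get("massdiff")))
    return fixed, variable


fixed, variable = split_modifications(SNIPPET)
print("fixed:", fixed)
print("variable:", variable)
```

Without these elements in the search_summary, a reader has no way to tell that the shift on C is a fixed modification.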
2) The name of the spectrum file should (according to the pep.xml schema) be written without the file ending, i.e. base_name="./my_spectra.pep.xml" should be changed to base_name="./my_spectra".
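A possible way to do this on the writing side, sketched in Python. The helper name and the list of file endings are assumptions for the example, not taken from the IdentiPy code. Note that os.path.splitext alone would only remove ".xml" and leave "./my_spectra.pep" behind, so the sketch matches whole endings instead.

```python
# Hypothetical list of file endings to remove; extend as needed.
KNOWN_ENDINGS = (".pep.xml", ".pepxml", ".mgf", ".mzml", ".mzxml")


def strip_file_ending(path):
    """Return path without a known ending, matched case-insensitively."""
    lowered = path.lower()
    for ending in KNOWN_ENDINGS:
        if lowered.endswith(ending):
            return path[: -len(ending)]
    # Unknown or missing ending: leave the path untouched.
    return path


print(strip_file_ending("./my_spectra.pep.xml"))  # ./my_spectra
```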
3) I don't see how to map back to the originating spectrum from the information provided in the spectrum_query tag. At first I assumed that spectrum="Spectrum_35528" referred to spectrum number 35528 in the mgf file used as input, but this number seems to go higher than the number of spectra in the mgf file. Would it be possible to instead include the spectrumNativeID attribute, referring to the spectrum title in the mgf file? This is how it is implemented in Comet, and it makes the mapping straightforward.
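To show what the title-based mapping would look like on the reading side, here is a minimal Python sketch. The function name, the example titles and the peak values are invented for the illustration; only the BEGIN IONS / TITLE= / END IONS layout comes from the mgf format itself.

```python
def index_mgf_titles(lines):
    """Map each spectrum TITLE to its 0-based position in the mgf file."""
    index = {}
    position = -1
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN IONS":
            # Count every spectrum, even one that has no TITLE line.
            position += 1
        elif line.startswith("TITLE=") and position >= 0:
            index[line[len("TITLE="):]] = position
    return index


# Invented two-spectrum mgf content, for illustration only.
EXAMPLE_MGF = """BEGIN IONS
TITLE=run1.35.35.2
PEPMASS=500.25
100.0 10
END IONS
BEGIN IONS
TITLE=run1.36.36.3
PEPMASS=612.30
120.5 7
END IONS
"""

titles = index_mgf_titles(EXAMPLE_MGF.splitlines())
# A spectrumNativeID holding the title can then be looked up directly.
print(titles["run1.36.36.3"])
```

With the title available in the pep.xml file, the lookup is a single dictionary access and does not depend on how the search engine numbers the spectra internally.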
4) Have you considered introducing CV terms for the IdentiPy scores? These will be needed, for example, when converting the results to mzIdentML for submission to PRIDE (https://www.ebi.ac.uk/pride). Here's an example, again from Comet: https://www.ebi.ac.uk/ols/ontologies/ms/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FMS_1002252
Hopefully these things should not be too hard to fix?
Please let me know if you need more details.