Can't read a non-proteinogenic amino acid

Issue #738 new
Roxana Maria created an issue

Hello,

I have a few simulations that contain a 3-letter coded non-proteinogenic amino acid.

I read the pdb file and printed its sequence with:

pdbseq(pdb, aa1 = FALSE)

However, this amino acid is not printed. Is this a limitation of pdbseq() only, or is the residue skipped when the file is read and therefore also excluded from later analyses (RMSF and PCA)?

Also, print(pdb) does detect the residue, but it is not shown in the sequence:

Protein Atoms#: 7242 (residues/Calpha atoms#: 445)
Nucleic acid Atoms#: 0 (residues/phosphate atoms#: 0)

Non-protein/nucleic Atoms#: 13 (residues: 1)
Non-protein/nucleic resid values: [ AIB (1) ]

I tried the same thing with a PDB file from RCSB that contains this type of amino acid, and there the residues were read and shown.

If anyone can help me with an explanation or suggestion I would very much appreciate it. Thank you.

Comments (2)

  1. Xinqiu Yao

    The residue was recognized as a “ligand”. So, yes, it will be skipped in analyses that focus on protein residues and in the output of pdbseq().

    One quick way to solve it is to modify the topology (usually a single PDB file) so that the residue is recognized as protein. For example, change the residue name to that of a standard amino acid and, if necessary, rename the atom equivalent to C-alpha to “CA”.

    After making the modification, double-check it by reading the file with read.pdb() and printing the object in R.

    This won’t change the results, because most analyses only use the XYZ coordinates. You just need to remember where the residue is located (by residue number, for example).
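
The renaming suggested in the comment above could be sketched in R roughly as follows. This is only an illustration under stated assumptions, not the package's own method: rename_residue() and residue_numbers() are hypothetical helpers, "topology.pdb" and "topology_renamed.pdb" are placeholder file names, and ALA is an arbitrary choice of standard residue. The helpers assume the object has an atom table with type, resid, resno and elety columns, as read.pdb() produces. The records are also switched from HETATM to ATOM, on the assumption that the residue was written as HETATM.

```r
# Hypothetical helper: residue numbers at which a given residue name occurs,
# so the position can be remembered after renaming.
residue_numbers <- function(pdb, resid) {
  unique(pdb$atom$resno[pdb$atom$resid == resid])
}

# Hypothetical helper: rename a non-standard residue to a standard one.
# ca_name is only needed if the C-alpha equivalent is not already named "CA".
rename_residue <- function(pdb, from = "AIB", to = "ALA", ca_name = NULL) {
  sel <- pdb$atom$resid == from
  if (!any(sel)) stop("No atoms found with residue name ", from)
  if (!is.null(ca_name)) {
    pdb$atom$elety[sel & pdb$atom$elety == ca_name] <- "CA"
  }
  pdb$atom$resid[sel] <- to
  pdb$atom$type[sel] <- "ATOM"  # assumption: the residue was stored as HETATM
  pdb
}

# Usage sketch: runs only if bio3d is installed and the placeholder file exists.
if (requireNamespace("bio3d", quietly = TRUE) && file.exists("topology.pdb")) {
  pdb <- bio3d::read.pdb("topology.pdb")
  print(residue_numbers(pdb, "AIB"))  # note where the residue sits
  fixed <- rename_residue(pdb, from = "AIB", to = "ALA")
  bio3d::write.pdb(fixed, file = "topology_renamed.pdb")
  # Re-read and print to double-check, as the comment recommends.
  pdb2 <- bio3d::read.pdb("topology_renamed.pdb")
  print(pdb2)
  print(bio3d::pdbseq(pdb2, aa1 = FALSE))
}
```

Writing the modified file and reading it back, rather than continuing with the edited object, lets read.pdb() recompute its protein and C-alpha bookkeeping from the new residue names.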
