dssp parsing results in extra residue being added

Issue #366 resolved
Mark Sun created an issue

Calling the dssp function results in an extra residue.

Using R 3.3. To reproduce the error with PDB ID 207l:

p = read.pdb("207l", rm.insert=TRUE, rm.alt=TRUE)

p_dssp = dssp(p)

length(p_dssp$acc) # result:131

length(which(p$calpha)) # result:130

When you look at the raw dssp output, only 130 residues were processed by dssp, indicating a bio3d parsing error.

Comments (14)

  1. Xinqiu Yao

    Well, dssp actually processed 131 residues on my desktop (see the last row):

      128  128 A C  S <  S-     0   0    5     -3,-1.8    -2,-0.1     2,-0.1  -118,-0.1   0.571  82.9-119.6 -90.7 -11.9   18.4   28.7   16.1
      129  129 A G              0   0   80      1,-0.3    -3,-0.1    -4,-0.2    -4,-0.0   0.770 360.0 360.0  79.2  28.2   20.1   31.9   17.5
      130  130 A V              0   0   69     -5,-0.4    -1,-0.3  -117,-0.0    -3,-0.1  -0.734 360.0 360.0-116.1 360.0   21.0   30.2   20.8
      131        !              0   0    0      0, 0.0     0, 0.0     0, 0.0     0, 0.0   0.000 360.0 360.0 360.0 360.0    0.0    0.0    0.0
      132  200 A X              0   0   75    -34,-0.1   -67,-0.1   -33,-0.0   -52,-0.1   0.000 360.0 360.0 360.0 360.0    9.7   15.1   40.9
    

    I am using DSSP version 2.2.0.

    I think the extra one is an "amino acid like" ligand. Try p <- trim(p, 'protein') and you will get correct results.

  2. Lars Skjærven

    Maybe we should trim the pdb inside the dssp function, before launching dssp, to avoid this? Potentially also add an argument trim=TRUE.

  3. Mark Sun reporter

    Using trim.pdb did the trick.

    I forgot that dssp only worked with amino acids.

    If an extra default argument is added to the dssp function, it would be great to see a warning.

    Anyways, thanks once again for the rapid response.

  4. Xinqiu Yao

    I guess simply trimming internally would be fine (possibly with a warning message if non-protein atoms exist). Is there any situation in which trim=FALSE is needed?

  5. Lars Skjærven

    Only backbone atoms are needed for dssp. Using trim(pdb, "protein") might remove non-standard amino acid residues that are not in our list, so a warning should be there I guess.

  6. Barry Grant

    A warning for non-protein residues would be helpful - we can add this to the 'ToDo list' for the next version. I note that we already have a warning for residues with missing backbone atoms.

    Not sure the user needs to ever set a trim argument in the function call. We should however consider labelling the output vectors with their residue number (and chain) from the input PDB object to make tracing unexpected output easier for users.

  7. Barry Grant

    ToDo:

    • Add check/warning for non-protein residues but continue without trimming as this could lead to further confusion.
    • Add residue number and chain labels to output vectors.
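
Until the check and labels from the ToDo list exist in bio3d, the workaround discussed above can be wrapped on the user side. The following is a minimal sketch only, not the bio3d implementation: `dssp_protein` is a hypothetical helper name, and unlike the ToDo item it trims before calling dssp (the workaround suggested in comment 1). It assumes the DSSP executable is installed where `dssp()` can find it, and that `atom.select(pdb, "protein")` excludes the "amino acid like" ligand, as reported in this thread. Water and other hetero residues will also appear in the warning.

```r
library(bio3d)

# Hypothetical user-side helper (NOT part of bio3d): warn about residues that
# atom.select() does not classify as protein, run dssp() on the protein-only
# structure, and label the accessibility vector by chain and residue number.
dssp_protein <- function(pdb, ...) {
  prot.inds <- atom.select(pdb, "protein")

  # Rows of pdb$atom that fall outside the "protein" selection
  other.rows <- setdiff(seq_len(nrow(pdb$atom)), prot.inds$atom)
  if (length(other.rows) > 0) {
    warning("Non-protein residues removed before running dssp: ",
            paste(unique(pdb$atom$resid[other.rows]), collapse = ", "))
  }

  prot <- trim.pdb(pdb, inds = prot.inds)
  out  <- dssp(prot, ...)

  # Label output only if it lines up with the C-alpha atoms of the trimmed PDB
  ca <- prot$atom[atom.select(prot, "calpha")$atom, ]
  if (length(out$acc) == nrow(ca)) {
    names(out$acc) <- paste(ca$chain, ca$resno, sep = "_")
  }
  out
}

# Usage, following the reproduction steps in the issue
p      <- read.pdb("207l", rm.insert = TRUE, rm.alt = TRUE)
p_dssp <- dssp_protein(p)
length(p_dssp$acc)
```

The length guard before labelling is deliberate: if dssp still returns a different number of entries than there are C-alpha atoms, the vector is left unlabelled rather than labelled incorrectly.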