dssp parsing results in extra residue being added
Calling the dssp function results in an extra residue.
R 3.3. To reproduce the error using PDB ID 207l:
p = read.pdb("207l", rm.insert=TRUE, rm.alt=TRUE)
p_dssp = dssp(p)
length(p_dssp$acc) # result:131
length(which(p$calpha)) # result:130
When you look at the raw dssp output, only 130 residues were processed by dssp, indicating a bio3d parsing error.
Comments (14)
-
Well, dssp actually processed 131 residues on my desktop (see the last row):
128 128 A C S < S- 0 0 5 -3,-1.8 -2,-0.1 2,-0.1 -118,-0.1 0.571 82.9-119.6 -90.7 -11.9 18.4 28.7 16.1
129 129 A G 0 0 80 1,-0.3 -3,-0.1 -4,-0.2 -4,-0.0 0.770 360.0 360.0 79.2 28.2 20.1 31.9 17.5
130 130 A V 0 0 69 -5,-0.4 -1,-0.3 -117,-0.0 -3,-0.1 -0.734 360.0 360.0-116.1 360.0 21.0 30.2 20.8
131 ! 0 0 0 0, 0.0 0, 0.0 0, 0.0 0, 0.0 0.000 360.0 360.0 360.0 360.0 0.0 0.0 0.0
132 200 A X 0 0 75 -34,-0.1 -67,-0.1 -33,-0.0 -52,-0.1 0.000 360.0 360.0 360.0 360.0 9.7 15.1 40.9
I am using DSSP version 2.2.0.
I think the extra one is an "amino acid like" ligand. Try p <- trim(p, 'protein') and you will get correct results.
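To make the count explicit, here is a minimal base-R sketch that tallies the DSSP records quoted in this comment. It assumes the rows are whitespace-collapsed as pasted here (so it splits on spaces rather than on DSSP's fixed-width columns), that a second token of "!" marks a chain-break record, and that the one-letter code "X" marks a residue DSSP does not recognise as a standard amino acid. The variable names are illustrative and are not part of bio3d.

```r
# Leading fields of the DSSP records quoted in this thread, cut off after the
# solvent accessibility value. Whitespace-collapsed, so NOT fixed-width.
rows <- c(
  "128 128 A C S < S- 0 0 5",
  "129 129 A G 0 0 80",
  "130 130 A V 0 0 69",
  "131 ! 0 0 0",
  "132 200 A X 0 0 75"
)

tokens <- strsplit(trimws(rows), "\\s+")

# Assumption: in this collapsed form a chain-break record has "!" as token 2.
is_break <- vapply(tokens, function(tk) tk[2] == "!", logical(1))

# For residue records, tokens 2-4 are residue number, chain, one-letter code.
res <- do.call(rbind, lapply(tokens[!is_break], function(tk) {
  data.frame(resno = as.integer(tk[2]), chain = tk[3], aa = tk[4],
             stringsAsFactors = FALSE)
}))

n_break   <- sum(is_break)        # chain-break records
n_residue <- nrow(res)            # residue records
unknown   <- res[res$aa == "X", ] # residues reported with code "X"

print(res)
print(unknown)
```

Under these assumptions, the record for residue 200 is the one extra entry beyond residue 130, and the "!" row separates it from the protein chain.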
-
Maybe we should trim the PDB before launching dssp in the dssp function to avoid this? Potentially also add an argument trim=TRUE.
-
reporter Using trim.pdb did the trick.
I forgot that dssp only worked with amino acids.
If an extra default argument is added to the dssp function, it would be great to see a warning.
Anyways, thanks once again for the rapid response.
-
I guess simply trimming internally would be fine (possibly with a warning message if non-protein atoms exist). Is there any situation where trim=FALSE is needed?
-
reporter I don't foresee that case existing as dssp ignores non-protein atoms.
-
Only backbone atoms are needed for dssp. Using trim(pdb, "protein") might remove non-standard amino acid residues that are not in our list, so a warning should be there I guess.
-
A warning for non-protein residues would be helpful - we can add this to the 'ToDo list' for the next version. I note that we already have a warning for residues with missing backbone atoms.
Not sure the user ever needs to set a trim argument in the function call. We should, however, consider labelling the output vectors with their residue number (and chain) from the input PDB object to make tracing unexpected output easier for users.
-
- marked as task
- marked as minor
- changed component to ToDo
- changed version to v2.3 [future]
ToDo:
- Add check/warning for non-protein residues but continue without trimming as this could lead to further confusion.
- Add residue number and chain labels to output vectors.
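The two ToDo items can be sketched in base R as follows. This is only an illustration, not the code from the eventual commit: the helper names warn_nonprotein and label_residues, the toy residue table, and the 20-name standard_aa list are all hypothetical (bio3d's own residue list may be broader). The accessibility values are the ones from the DSSP rows quoted earlier in this thread, and assigning SC2 to residue 200 is an assumption. The label format resno_chain_insert is also an assumption about how such names could be built.

```r
# Hypothetical per-residue table; a real PDB object would supply these fields.
residues <- data.frame(
  resno  = c(129L, 130L, 200L),
  chain  = c("A", "A", "A"),
  insert = c(NA, NA, NA),
  resid  = c("GLY", "VAL", "SC2"),  # SC2 at residue 200 is an assumption
  stringsAsFactors = FALSE
)

# Assumption: only the 20 standard amino acids count as "protein" here.
standard_aa <- c("ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY",
                 "HIS", "ILE", "LEU", "LYS", "MET", "PHE", "PRO", "SER",
                 "THR", "TRP", "TYR", "VAL")

# Warn about residue names outside the known list, but do not trim them.
warn_nonprotein <- function(residues, known = standard_aa) {
  odd <- unique(residues$resid[!residues$resid %in% known])
  if (length(odd) > 0) {
    warning("Non-protein residues detected in input PDB: ",
            paste(odd, collapse = ", "), call. = FALSE)
  }
  invisible(odd)
}

# Build one label per residue from residue number, chain and insertion code.
label_residues <- function(residues) {
  paste(residues$resno, residues$chain, residues$insert, sep = "_")
}

# Capture the warning text instead of letting it print, for demonstration.
msg <- tryCatch(warn_nonprotein(residues),
                warning = function(w) conditionMessage(w))
print(msg)

# Accessibility values taken from the DSSP rows quoted in this thread.
acc <- c(80, 69, 75)
names(acc) <- label_residues(residues)
print(acc)
```

With labelled output, an unexpected entry such as residue 200 can be traced back to the input structure by name rather than by position in the vector.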
-
looks ok? https://bitbucket.org/Grantlab/bio3d/commits/0bf222df343df863a401c0173f5bf7e09f82693c
> p = read.pdb("207l", rm.insert=TRUE, rm.alt=TRUE)
  Note: Accessing on-line PDB file
   HEADER    HYDROLASE/HYDROLASE INHIBITOR           26-MAR-96   207L
> sse = dssp(p)
Warning message:
In dssp.pdb(p) : Non-protein residues detected in input PDB: SC2, HOH
> tail(sse$sse)
126_A_NA 127_A_NA 128_A_NA 129_A_NA 130_A_NA 200_A_NA
     "T"      "T"      "S"      " "      " "      " "
-
Looks good, thanks!
-
- changed status to resolved
-
- changed version to v2.3 [devel]