BUG: read.pdb() or maybe other related functions
library(bio3d)
read.pdb("1a7l")
Note: Accessing on-line PDB file
HEADER TRANSPORT 16-MAR-98 1A7L
Call: read.pdb(file = "1a7l")
Total Models#: 1
Total Atoms#: 8750, XYZs#: 26250 Chains#: 3 (values: A B C)
Protein Atoms#: 8634 (residues/Calpha atoms#: 1114)
Nucleic acid Atoms#: 0 (residues/phosphate atoms#: 0)
Non-protein/nucleic Atoms#: 116 (residues: 51)
Non-protein/nucleic resid values: [HOH (48), MAL (3) ]
Error in rep(pdb$helix$chain, (pdb$helix$end - pdb$helix$start + 1)) :
invalid 'times' argument
Note that this error occurs only for specific PDB entries, e.g. the one used above. The bug has been present since the released version 2.1.
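For reference, base R raises this same message whenever a value passed as `times` to rep() is negative or NA. The sketch below uses made-up helix records (not the actual 1a7l annotations) in which one record has end < start, which is one plausible way the expression in the traceback could fail:

```r
# Hypothetical helix records; the values are illustrative only,
# they are NOT taken from 1a7l.
helix <- list(chain = c("A", "A"),
              start = c(10, 52),
              end   = c(20, 50))   # second record: end < start

# Same arithmetic as in the failing call: residues per helix.
len <- helix$end - helix$start + 1   # 11 and -1

# rep() rejects a negative 'times' value; capture the message instead of stopping.
res <- tryCatch(rep(helix$chain, len),
                error = function(e) conditionMessage(e))
print(res)
```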
Comments (5)
-
reporter -
I guess we need to add the 'insert' code (if present) to the SSE records as an extra vector. Thanks for catching this!
-
reporter Or maybe just add a 'names' attribute to the $start and $end vectors? Otherwise, we need two additional "insert" vectors, one for the start residue and one for the end.
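A minimal sketch of the 'names' idea, assuming the insert code of each boundary residue is stored as the name of the corresponding element (empty string when there is none). The `helix` list here is a hand-built stand-in, not the structure returned by read.pdb():

```r
# Hand-built stand-in for pdb$helix; values are illustrative only.
helix <- list(chain = c("A", "A"),
              start = c(10, 52),
              end   = c(20, 52))

# Carry the insert codes as names: "" means no insert code.
names(helix$start) <- c("", "")
names(helix$end)   <- c("", "A")   # second helix ends at residue 52A

# Residue numbers stay numeric, so existing arithmetic keeps working ...
len <- unname(helix$end - helix$start + 1)

# ... while the insert codes can be recovered when needed.
end_id <- paste0(helix$end, names(helix$end))
print(end_id)
```

The appeal of this variant is that no new list components are needed; the cost is that any code that rebuilds or subsets $start and $end has to preserve the names.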
Another question: I found that the internal function pdb2sse() is used in many places in almost identical form (e.g. plot.bio3d(), plot.cmap(), read.fasta.pdb(), and the new clean.pdb()). It would be handy to make it public and call it where needed instead of copying it each time. That would also help a lot with the debugging above, which is related to the SSE trimming in trim.pdb().
Does it make sense?
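To illustrate the "one shared helper" idea, here is a rough sketch of a function that expands helix and sheet records into a per-residue label vector. The name `sse_labels()` and its interface are hypothetical; this is not the actual pdb2sse() code, and it ignores insert codes:

```r
# Hypothetical shared helper (NOT bio3d's pdb2sse()): expands SSE records
# into a character vector of "H"/"E" labels named by "chain_resno".
sse_labels <- function(helix, sheet) {
  expand <- function(x, label) {
    if (is.null(x) || length(x$start) == 0) return(character(0))
    # One id per residue in each start..end range.
    ids <- unlist(Map(function(ch, s, e) paste(ch, seq(s, e), sep = "_"),
                      x$chain, x$start, x$end), use.names = FALSE)
    setNames(rep(label, length(ids)), ids)
  }
  c(expand(helix, "H"), expand(sheet, "E"))
}

# Illustrative input, not taken from a real PDB entry.
sse <- sse_labels(
  helix = list(chain = "A", start = 3, end = 5),
  sheet = list(chain = c("A", "B"), start = c(8, 1), end = c(9, 2))
)
print(sse)
```

With a single helper like this, the plotting and trimming functions could all call the same code path, so a fix for insert codes would only have to be made once.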
-
Yes, both make sense to me.
-
- changed status to resolved
Fixed with pdb2sse() function and updated trim.pdb()
Okay, I found that it is related to residues with insert codes. It is the same problem we discussed before (See issue): in some PDB files, residues are identified not only by "chain ID" and "resno" but also by "insert".
In the current read.pdb(), atom records store all of these fields, so we can renumber in this situation (e.g. with the new clean.pdb() function). SSEs, however, are annotated by chain and resno only.
One suggestion: add 'insert' to pdb$helix and pdb$sheet. Then clean.pdb() can renumber these annotations, too.
What do you think?
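A small sketch of why chain and resno alone are not enough, and how an added 'insert' field could be used. The atom table and the `start_insert`/`end_insert` fields are assumptions made up for illustration; they are not existing components of the object returned by read.pdb():

```r
# Illustrative residue table: residue 52 is followed by 52A and 52B.
atoms <- data.frame(chain  = "A",
                    resno  = c(50, 51, 52, 52, 52, 53),
                    insert = c("", "", "", "A", "B", ""),
                    stringsAsFactors = FALSE)

# Identify each residue by chain + resno + insert.
key <- function(chain, resno, insert) paste(chain, resno, insert, sep = "_")
res_keys <- unique(key(atoms$chain, atoms$resno, atoms$insert))

# Hypothetical helix record with insert codes for both ends (51 to 52B).
helix <- list(chain = "A", start = 51, end = 52,
              start_insert = "", end_insert = "B")

# Locate the helix by position in the residue list, not by resno arithmetic.
i <- match(key(helix$chain, helix$start, helix$start_insert), res_keys)
j <- match(key(helix$chain, helix$end,   helix$end_insert),   res_keys)
helix_res <- res_keys[i:j]

print(helix_res)                      # residues covered by the helix
print(helix$end - helix$start + 1)    # count implied by resno alone
```

Here the helix spans four residues (51, 52, 52A, 52B), while the resno difference suggests only two, which is the kind of mismatch that renumbering with the insert code included would remove.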