more than 100,000 atoms notation
The atom ID / "eleno" (element number) field (2nd column in a PDB file) is only 5 digits wide. Thus, it can accommodate at most 99,999 atoms using "conventional" decimal numbering (e.g. 1, 2, 3, etc.).
But PDB files with large molecules (e.g. 28S ribosomal RNA) can contain more than 99,999 atoms. In these cases, the PDB files resort to alphanumeric numbering, i.e. after atom 99,999 the atom number / "eleno" continues with, for example: 186a0, 186a1, 186a2, etc. (186a0 is 100,000 written in hexadecimal).
The read.pdb function detects these non-numeric "eleno" entries and throws an error:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : scan() expected 'a real', got '186a0'
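For illustration, the numbering described above can be decoded in base R. This is only a minimal sketch, not how bio3d handles it: parse_serial() is a hypothetical helper, and it assumes that the serial sits in columns 7-11 of an ATOM record and that every entry from the first non-decimal serial onwards is hexadecimal.

```r
# Hypothetical helper (not part of bio3d): decode atom serials that are
# decimal up to 99999 and hexadecimal from the first non-decimal entry on.
parse_serial <- function(x) {
  x <- trimws(x)
  # Position of the first serial that is not purely decimal (NA if none).
  first_hex <- which(!grepl("^[0-9]+$", x))[1]
  # Decimal parse; non-decimal entries become NA here and are fixed below.
  out <- suppressWarnings(as.integer(x))
  if (!is.na(first_hex)) {
    # Assumption: once the file switches to hex, it stays in hex, so an
    # all-digit string such as "18700" after the switch is read as base 16.
    idx <- first_hex:length(x)
    out[idx] <- strtoi(x[idx], base = 16L)
  }
  out
}

# The serial occupies columns 7-11 of an ATOM/HETATM record.
line <- sprintf("ATOM  %5s", "186a0")
substr(line, 7, 11)
# "186a0"

parse_serial(c("99999", "186a0", "186a1", "18700"))
# 99999 100000 100001 100096
```

The positional rule matters because a per-value check would misread all-digit hex strings such as "18700" as decimal numbers.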
Comments (9)
-
-
- changed version to v2.3 [future]
- changed component to ToDo
-
Hi,
Your question is related to this issue. The answer seems to be either to convert the PDB file into an Amber topology file and use read.prmtop(), or to use an alternative PDB reading function called read.pdb2() with the option hex=TRUE. Note that read.pdb2() only exists in the feature_cpp branch. See the issue for more details.
[To developers] Btw, should we merge feature_cpp asap, or leave it for longer testing?
-
Also finish the implementation of read.cif. The old PDB format will become obsolete at some point anyway.
-
We need better tests in general for the functions in this branch (as well as others). I am in favor of getting these into master once we have these tests in place.
-
- marked as major
-
-
assigned issue to
See issue #306, where Lars volunteered to look after these things.
-
assigned issue to
-
- changed version to v2.3 [devel]
-
- changed status to resolved
Resolved, right?
Thanks for reporting this - we will fix this...