pdb object

Issue #450 resolved
Diego Gallego created an issue

Hi! I've been playing around with the package for a while and found that "read.cif" and "read.pdb" do not produce entirely equivalent outputs. The difference is in the "object$atom" data frame: the 14th column is called "entid" in the output of "read.cif" but "segid" in the output of "read.pdb" (I could not figure out what the difference between the two is).

Example code:

cif <- read.cif("1A1T")
pdb <- read.pdb("1A1T")

str(cif$atom)

'data.frame': 1516 obs. of 16 variables:
 [...]
 $ entid : int 1 1 1 1 1 1 1 1 1 1 ...
 $ elesy : chr "P" "O" "O" "O" ...
 $ charge: chr NA NA NA NA ...

str(pdb$atom)

'data.frame': 1516 obs. of 16 variables:
 [...]
 $ segid : chr NA NA NA NA ...
 $ elesy : chr "P" "O" "O" "O" ...
 $ charge: chr NA NA NA NA ...

My real problem came when I tried to run the "dssp" function. It works fine for the pdb object but raises an error for the cif object:

dssp(cif)
Error in `[.data.frame`(pdb$atom, , "segid") : undefined columns selected

I looked at the code and found that the problem arises when the "dssp.pdb" function calls the "write.pdb" function. For now it's easy for me to work with the pdb instead of the cif, but I thought it would be useful to report the problem. Many thanks in advance!
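A quick way to spot this kind of mismatch without reading through the str() output is to compare the column names directly. The sketch below is only an illustration: it uses two small stand-in data frames (hypothetical, reproducing just the three columns quoted above) instead of the real "read.cif" and "read.pdb" results.

```r
# Stand-in atom tables; only the columns quoted in the report are reproduced.
# These are assumptions for illustration, not real bio3d output.
cif_atom <- data.frame(entid = c(1L, 1L),
                       elesy = c("P", "O"),
                       charge = NA_character_,
                       stringsAsFactors = FALSE)
pdb_atom <- data.frame(segid = NA_character_,
                       elesy = c("P", "O"),
                       charge = NA_character_,
                       stringsAsFactors = FALSE)

# Columns present in one table but not in the other.
only_in_cif <- setdiff(names(cif_atom), names(pdb_atom))
only_in_pdb <- setdiff(names(pdb_atom), names(cif_atom))

print(only_in_cif)
print(only_in_pdb)
```

With the real objects, the same comparison would be setdiff(names(cif$atom), names(pdb$atom)) and the reverse.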

Comments (7)

  1. Barry Grant

    Thanks for the very helpful report Diego. We should fix that (perhaps with a call to as.pdb() internally) along with some better tests of these functions. Cheers! Barry

  2. Diego Gallego reporter

    Hi Barry! Thanks for the quick answer. I tried what you suggested:

    dssp.pdb(as.pdb(cif))

    ...but it generated the same error. The function "as.pdb" simply removes the column "entid", which is not enough for the "write.pdb" function to generate the pdb file. I added a piece of code to the function "write.pdb" (then reinstalled the package) and now the "dssp.pdb" function works fine. I'm going to attach my updated function ("write.pdb") so that you can see the new code and maybe add it to a future version of the package.
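The attached patch is not reproduced in this thread. As a rough idea of the kind of fix involved, the sketch below adds an all-NA "segid" column to an atom table when it is missing, so that a lookup such as atom[, "segid"] no longer fails. The helper name ensure_segid and the small test table are hypothetical and not part of bio3d; whether this alone is enough for "write.pdb" is an assumption.

```r
# Hypothetical helper (not part of bio3d): make sure an atom table has a
# "segid" column, filling it with NA when it is absent.
ensure_segid <- function(atom) {
  if (!"segid" %in% names(atom)) {
    atom$segid <- rep(NA_character_, nrow(atom))
  }
  atom
}

# Stand-in atom table that has "entid" but no "segid".
atom <- data.frame(resid = c("G", "G"),
                   entid = c(1L, 1L),
                   stringsAsFactors = FALSE)

fixed <- ensure_segid(atom)

# This lookup raises "undefined columns selected" on `atom`
# but succeeds on `fixed`.
segid_values <- fixed[, "segid"]
```

Applying the helper to a table that already has "segid" leaves it unchanged, so it can be called unconditionally.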

    On the other hand, I'm pretty sure that the "dssp.pdb" function will not work for very large structures (those with more than 99999 atoms), since the function always writes a temporary file in pdb format (the input for "dssp"), and in some cases a cif file will be needed as input instead. I don't know what you are currently working on for the next bio3d release, but if I may suggest something, I think a "write.cif" function would be pretty useful! Cheers!
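The 99999 figure comes from the PDB file format, where the atom serial number field of an ATOM record is five characters wide. A minimal sketch of a pre-flight check is shown below; the names PDB_MAX_SERIAL and fits_pdb_serial are hypothetical and not part of bio3d, and how "write.pdb" itself behaves beyond this limit is not something this thread confirms.

```r
# Largest serial number that fits in the five-character serial field
# of a PDB ATOM record.
PDB_MAX_SERIAL <- 99999L

# Hypothetical check: TRUE when every atom can get its own serial number
# in the fixed-width PDB format.
fits_pdb_serial <- function(n_atoms) {
  n_atoms <= PDB_MAX_SERIAL
}

small_ok <- fits_pdb_serial(1516)    # the 1A1T example above
large_ok <- fits_pdb_serial(150000)  # an illustrative large structure
```

With a real object, the atom count would be nrow(pdb$atom).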
