trim.pdb should provide a pdb object

Issue #15 resolved
Lars Skjærven created an issue

pdb$calpha is commented out in the function. Should I make this function more complete, or is there a reason why only the xyz and atom components are returned?

Comments (10)

  1. Xinqiu Yao

    I agree that it would be better if the return value of trim.pdb is a complete PDB object. What do you think, Barry?

  2. Lars Skjærven reporter

    I first made a version where I iterate over the helix and sheet components to provide SSE information consistent with the new PDB object. However, I'm not too convinced about the need for it, so I removed it in favor of a cleaner version.

    Instead I return sheet, helix, seqres, and het unmodified. Only components atom, xyz, xyz.models, and calpha are trimmed. Let me know if this is insufficient.

  3. Barry Grant

    I am also not sure there is a need for the sse records. However, if we return them for consistency then they probably need to be trimmed also.

    Having the $het unmodified is fine but returning the $helix and $sheet un-trimed will mess up "plot.bio3d(x,sse=pdb)".

    I am not sure how often the 'sse' gets used in this way, as you need resno and the $helix and $sheet components to match for these plots to be correct, and thats currently not as easy as it should be (this is the purpose of the "resno=FALSE" options to dssp() and stride() functions).

    Still I think if we return $helix and $sheet then they need to be trimmed. Have you got code for this see trimming already Lars?

  4. Lars Skjærven reporter

    uhm yes. but it's quite crappy. I did a for-loop for each element in pdb$helix and used atom.select to check if the helix was there :D. So far large proteins it was too slow.

         if(!is.null(pdb$helix)) {
            for ( i in 1:length(pdb$helix$start) ) {
    
              st <- paste("//", pdb$helix$chain[i], "/",
                          pdb$helix$start[i], ":", pdb$helix$end[i], "////", sep="")
              sele <- atom.select(new.pdb, st, verbose=verbose)
    
              if(length(sele$xyz)>0) {
                helix$start <- c(helix$start, as.numeric(new.pdb$atom[sele$atom[1], "resno"]))
                helix$end <- c(helix$end, as.numeric(new.pdb$atom[sele$atom[length(sele$atom)], "resno"]))
                helix$chain <- c(helix$chain, pdb$helix$chain[i])
                helix$type <- c(helix$type, pdb$helix$type[i])
              }
            }
          }
    

    Great? I'll see what changes I can do to make it better.

  5. Barry Grant

    Maybe a combination of unbound() and %in% and bounds() would do the trick? Thanks for looking into this!

    e.g. trim helix to resno 20:40

    pdb<-read.pdb("4q21")

    h <- unbound(pdb$helix$start, pdb$helix$end)

    ht <- bounds( h[h %in% 20:40] )

    new.helix$start = ht[,"start"] # etc... for "end" and "length"

  6. Lars Skjærven reporter

    New version committed. It's a bit clumsy written, but it should do the trick. I had problems getting the above code to work with multi-chain PDB files. Let me know if there is a more elegant way.

  7. Log in to comment