Grantlab / bio3d / issues / #15 - trim.pdb should provide a pdb object — Bitbucket

Issue #15 resolved

Lars Skjærven created an issue 2013-07-28

pdb$calpha is commented out in the function. Should I make this function more complete, or is there a reason why only the xyz and atom components are returned?

Comments (10)

Xinqiu Yao
I agree that it would be better if the return value of trim.pdb is a complete PDB object. What do you think, Barry?
- 2013-07-31T17:13:39+00:00
Lars Skjærven reporter
- assigned issue to
  
  Lars Skjærven
- 2013-07-31T18:26:37+00:00
Barry Grant
Sounds good, thanks guys!
- 2013-08-01T18:23:13+00:00
Lars Skjærven reporter
- changed status to resolved
I first made a version where I iterate over the helix and sheet components to provide SSE information consistent with the new PDB object. However, I'm not too convinced about the need for it, so I removed it in favor of a cleaner version.

Instead I return sheet, helix, seqres, and het unmodified. Only components atom, xyz, xyz.models, and calpha are trimmed. Let me know if this is insufficient.
- 2013-08-14T14:33:11+00:00
Barry Grant
I am also not sure there is a need for the sse records. However, if we return them for consistency then they probably need to be trimmed also.

Having the $het unmodified is fine but returning the $helix and $sheet un-trimed will mess up "plot.bio3d(x,sse=pdb)".

I am not sure how often the 'sse' gets used in this way, as you need resno and the $helix and $sheet components to match for these plots to be correct, and thats currently not as easy as it should be (this is the purpose of the "resno=FALSE" options to dssp() and stride() functions).

Still I think if we return $helix and $sheet then they need to be trimmed. Have you got code for this see trimming already Lars?
- 2013-08-14T17:41:02+00:00
Barry Grant
- changed status to open
trim $helix and $sheet components also...
- 2013-08-14T17:42:12+00:00

Lars Skjærven reporter

uhm yes. but it's quite crappy. I did a for-loop for each element in pdb$helix and used atom.select to check if the helix was there :D. So far large proteins it was too slow.

     if(!is.null(pdb$helix)) {
        for ( i in 1:length(pdb$helix$start) ) {

          st <- paste("//", pdb$helix$chain[i], "/",
                      pdb$helix$start[i], ":", pdb$helix$end[i], "////", sep="")
          sele <- atom.select(new.pdb, st, verbose=verbose)

          if(length(sele$xyz)>0) {
            helix$start <- c(helix$start, as.numeric(new.pdb$atom[sele$atom[1], "resno"]))
            helix$end <- c(helix$end, as.numeric(new.pdb$atom[sele$atom[length(sele$atom)], "resno"]))
            helix$chain <- c(helix$chain, pdb$helix$chain[i])
            helix$type <- c(helix$type, pdb$helix$type[i])
          }
        }
      }

Great? I'll see what changes I can do to make it better.

2013-08-14T18:40:17+00:00

Barry Grant
Maybe a combination of unbound() and %in% and bounds() would do the trick? Thanks for looking into this!

e.g. trim helix to resno 20:40

pdb<-read.pdb("4q21")

h <- unbound(pdb$helix$start, pdb$helix$end)

ht <- bounds( h[h %in% 20:40] )

new.helix$start = ht[,"start"] # etc... for "end" and "length"
- 2013-08-15T02:23:49+00:00
Lars Skjærven reporter
New version committed. It's a bit clumsy written, but it should do the trick. I had problems getting the above code to work with multi-chain PDB files. Let me know if there is a more elegant way.
- 2013-08-23T12:07:20+00:00
Barry Grant
- changed status to resolved
Works well in my tests!
- 2013-09-06T16:12:56+00:00
Log in to comment

Assignee: Lars Skjærven

Type: enhancement

Priority: minor

Status: resolved

Component: –

Version: –

Votes: 0

Watchers: 3

Jira: the preferred issue tracker for Bitbucket. Join the team!