comparative structural analysis --computing distance in ensemble of distinct structures

Lars Skjærven

Jepps. You'll need to dig into the xyz component of your pdbs object. (pdbs is just a multiple sequence alignment together with aligned coordinate data and associated attributes). Thus to fetch the coordinates of individual residues you can access the pdbs$xyz matrix like this:

# structure 1-5, residue/column 50
pdbs$xyz[1:5, atom2xyz(50)]

# structure 1-5, residue/column 51
pdbs$xyz[1:5, atom2xyz(51)]

You can then use dist.xyz(). e.g.:

dist.xyz(pdbs$xyz[1:5, atom2xyz(50)], pdbs$xyz[1:5, atom2xyz(50)])

          [,1]      [,2]       [,3]       [,4]       [,5]
[1,] 0.0000000 0.2084326 0.17615631 0.14990663 0.17016530
[2,] 0.2084326 0.0000000 0.23850228 0.25581043 0.25021297
[3,] 0.1761563 0.2385023 0.00000000 0.08628878 0.04510184
[4,] 0.1499066 0.2558104 0.08628878 0.00000000 0.04512605
[5,] 0.1701653 0.2502130 0.04510184 0.04512605 0.00000000

which will give you the pair-wise distances between residues 50 for the chosen structures (here structure 1-5).

Basically, by understanding how the pdbs$xyz matrix is composed you should be able to do the things you aim for here.

2015-02-10T13:46:31+00:00

Lars Skjærven

marked as task
marked as trivial

2015-02-10T14:43:14+00:00

Barry Grant

Sorry I am a bit late to this thread. Lars's response is excellent. Below I add just a little more detail on how to find the particular residues mentioned in the question (via first looking-up their residue numbers in the original PDB files, which are also stored in the pdbs named object returned from pdbaln() and related functions). I have also added an alternative way that exposes writing your own simple functions to do these sorts of things more easily.

library(bio3d)

##- Load the example data 
data(transducin)
attach(transducin)

##-- Find the alignment positions of RESNO 112 and 178
i.112 <- which(pdbs$resno["1TAG_A",] == 112)
i.178 <- which(pdbs$resno["1TAG_A",] == 178)

##- Convert these Calpha 'atom' indices to 'xyz' indices for coordinate access
i.112.xyz <- atom2xyz(i.112)
i.178.xyz <- atom2xyz(i.178)

##- Use 'xyz' indices to access coresponding coordinates
coords.112 <- pdbs$xyz[,i.112.xyz]
coords.178 <- pdbs$xyz[,i.178.xyz]

##- Calculate the requested distances 
##  (note the effect of the 'all.pairs=FALSE' option)
answer <- dist.xyz(coords.112, coords.178, all.pairs=FALSE)

> answer
 [1] 19.52610 19.72753 19.60729 20.76520 20.89739 21.12701 18.53349 20.43393
 [9] 20.99625 19.75095 20.05707 20.29599 20.12525 20.24655 18.82836 19.71828
[17] 19.80987 20.55976 20.96552 20.20745 20.32955 21.61822 22.89586 18.44744
[25] 20.81348 20.74089 21.81194 22.22241 21.50130 21.95177 22.15232 21.53123
[33] 20.00782 21.53460 18.28714 19.26866 20.41942 21.26434 21.90934 21.53309
[41] 21.70067 21.96786 21.52007 19.76650 21.11249 23.15418 23.73335 22.30739
[49] 21.78643 23.29267 22.70449 22.91519 21.53917

And a shorter alternative, but perhaps initially harder to read, approach that I include here to demonstrate a general way to calculate values across these types of pdbs objects.

xyz <- pdbs$xyz[, atom2xyz(which(pdbs$resno["1TAG_A",] %in% c(112,178))) ]

##- A simple distance function
mydist <-function(xyz) { dist(matrix(xyz, ncol = 3, byrow = TRUE)) }

##- Apply the function to each structure (row)
answer2 <- apply(xyz,1, mydist)

> answer2
  1TND_A   1TND_B   1TND_C   1TAD_A   1TAD_B   1TAD_C   1TAG_A   3V00_C
19.52610 19.72753 19.60729 20.76520 20.89739 21.12701 18.53349 20.43393
  3V00_B   3V00_A   1FQJ_A   1FQJ_D   1FQK_A   1FQK_C   1GOT_A   2XNS_A
20.99625 19.75095 20.05707 20.29599 20.12525 20.24655 18.82836 19.71828
  2XNS_B   1KJY_A   1KJY_C   2OM2_A   2OM2_C   4G5Q_A   4G5Q_D   1GP2_A
19.80987 20.55976 20.96552 20.20745 20.32955 21.61822 22.89586 18.44744
  1AGR_A   1AGR_D   1CIP_A   1GFI_A   1GIA_A   2GTP_B   2ZJY_A   3ONW_A
20.81348 20.74089 21.81194 22.22241 21.50130 21.95177 22.15232 21.53123
  3ONW_B   1BH2_A   1GG2_A   1GIT_A   3QI2_A   3QI2_B   1SVK_A   1SVS_A
20.00782 21.53460 18.28714 19.26866 20.41942 21.26434 21.90934 21.53309
  3FFA_A   1GIL_A   1AS0_A   1AS2_A   3QE0_B   2ODE_A   2ODE_C   2V4Z_A
21.70067 21.96786 21.52007 19.76650 21.11249 23.15418 23.73335 22.30739
  4G5R_A   2IHB_A   4G5O_A   4G5O_D   3C7K_C
21.78643 23.29267 22.70449 22.91519 21.53917

2015-02-12T18:33:48+00:00

Barry Grant

changed status to resolved

please re-open if related questions arise.

2015-02-15T01:12:21+00:00

Comments (4)