comparative structural analysis --computing distance in ensemble of distinct structures
I am following standard structural analysis section of the PCA tutorial on Bio3d website: http://thegrantlab.org/bio3d/tutorials/principal-component-analysis
My question is that besides RMSD and RMSF is it possible to compute distance or angle for equivalent positions for all structures in the ensemble even though sequence is different?
Specifically, for example what would be an easy way to compute distance between residue 178 and residue 112 of 1TAG and for all other 103 structures in the ensemble at the equivalent position? I am hoping to get a distance vector of length 103 as an output.
Comments (4)
-
-
-
Sorry I am a bit late to this thread. Lars's response is excellent. Below I add just a little more detail on how to find the particular residues mentioned in the question (via first looking-up their residue numbers in the original PDB files, which are also stored in the pdbs named object returned from pdbaln() and related functions). I have also added an alternative way that exposes writing your own simple functions to do these sorts of things more easily.
library(bio3d) ##- Load the example data data(transducin) attach(transducin) ##-- Find the alignment positions of RESNO 112 and 178 i.112 <- which(pdbs$resno["1TAG_A",] == 112) i.178 <- which(pdbs$resno["1TAG_A",] == 178) ##- Convert these Calpha 'atom' indices to 'xyz' indices for coordinate access i.112.xyz <- atom2xyz(i.112) i.178.xyz <- atom2xyz(i.178) ##- Use 'xyz' indices to access coresponding coordinates coords.112 <- pdbs$xyz[,i.112.xyz] coords.178 <- pdbs$xyz[,i.178.xyz] ##- Calculate the requested distances ## (note the effect of the 'all.pairs=FALSE' option) answer <- dist.xyz(coords.112, coords.178, all.pairs=FALSE) > answer [1] 19.52610 19.72753 19.60729 20.76520 20.89739 21.12701 18.53349 20.43393 [9] 20.99625 19.75095 20.05707 20.29599 20.12525 20.24655 18.82836 19.71828 [17] 19.80987 20.55976 20.96552 20.20745 20.32955 21.61822 22.89586 18.44744 [25] 20.81348 20.74089 21.81194 22.22241 21.50130 21.95177 22.15232 21.53123 [33] 20.00782 21.53460 18.28714 19.26866 20.41942 21.26434 21.90934 21.53309 [41] 21.70067 21.96786 21.52007 19.76650 21.11249 23.15418 23.73335 22.30739 [49] 21.78643 23.29267 22.70449 22.91519 21.53917
And a shorter alternative, but perhaps initially harder to read, approach that I include here to demonstrate a general way to calculate values across these types of pdbs objects.
xyz <- pdbs$xyz[, atom2xyz(which(pdbs$resno["1TAG_A",] %in% c(112,178))) ] ##- A simple distance function mydist <-function(xyz) { dist(matrix(xyz, ncol = 3, byrow = TRUE)) } ##- Apply the function to each structure (row) answer2 <- apply(xyz,1, mydist) > answer2 1TND_A 1TND_B 1TND_C 1TAD_A 1TAD_B 1TAD_C 1TAG_A 3V00_C 19.52610 19.72753 19.60729 20.76520 20.89739 21.12701 18.53349 20.43393 3V00_B 3V00_A 1FQJ_A 1FQJ_D 1FQK_A 1FQK_C 1GOT_A 2XNS_A 20.99625 19.75095 20.05707 20.29599 20.12525 20.24655 18.82836 19.71828 2XNS_B 1KJY_A 1KJY_C 2OM2_A 2OM2_C 4G5Q_A 4G5Q_D 1GP2_A 19.80987 20.55976 20.96552 20.20745 20.32955 21.61822 22.89586 18.44744 1AGR_A 1AGR_D 1CIP_A 1GFI_A 1GIA_A 2GTP_B 2ZJY_A 3ONW_A 20.81348 20.74089 21.81194 22.22241 21.50130 21.95177 22.15232 21.53123 3ONW_B 1BH2_A 1GG2_A 1GIT_A 3QI2_A 3QI2_B 1SVK_A 1SVS_A 20.00782 21.53460 18.28714 19.26866 20.41942 21.26434 21.90934 21.53309 3FFA_A 1GIL_A 1AS0_A 1AS2_A 3QE0_B 2ODE_A 2ODE_C 2V4Z_A 21.70067 21.96786 21.52007 19.76650 21.11249 23.15418 23.73335 22.30739 4G5R_A 2IHB_A 4G5O_A 4G5O_D 3C7K_C 21.78643 23.29267 22.70449 22.91519 21.53917
-
- changed status to resolved
please re-open if related questions arise.
- Log in to comment
Jepps. You'll need to dig into the xyz component of your pdbs object. (pdbs is just a multiple sequence alignment together with aligned coordinate data and associated attributes). Thus to fetch the coordinates of individual residues you can access the pdbs$xyz matrix like this:
You can then use dist.xyz(). e.g.:
which will give you the pair-wise distances between residues 50 for the chosen structures (here structure 1-5).
Basically, by understanding how the pdbs$xyz matrix is composed you should be able to do the things you aim for here.