RMSD of several pdbs from data path

Barry Grant

changed status to resolved

I am marking this as resolved. As Xinqiu points out the answer to the first question is to use the list.files() function, e.g.

> list.files(mypath, ".pdb$")
 [1] "1RPI.pdb" "1RQ9.pdb" "1RV7.pdb" "1TW7.pdb" "3OQ7.pdb" "3OQA.pdb"
 [7] "3OQD.pdb" "3OTS.pdb" "3OTY.pdb" "3OU1.pdb" "3OU3.pdb" "3OU4.pdb"
[13] "3OUA.pdb" "3OUB.pdb" "3OUC.pdb" "3OUD.pdb" "3PJ6.pdb" "3R0W.pdb"
[19] "3R0Y.pdb" "3SO9.pdb" "3SPK.pdb" "4EYR.pdb" "4FAE.pdb" "4FAF.pdb"
[25] "4GYE.pdb" "4GZF.pdb"

The 'full.names=TRUE' might be required if the files live in a different directory, e.g.

> list.files(mypath, ".pdb$", full.names=TRUE)
 [1] "hivp/raw_pdbs//1RPI.pdb" "hivp/raw_pdbs//1RQ9.pdb"
 [3] "hivp/raw_pdbs//1RV7.pdb" "hivp/raw_pdbs//1TW7.pdb"
 [5] "hivp/raw_pdbs//3OQ7.pdb" "hivp/raw_pdbs//3OQA.pdb"
 [7] "hivp/raw_pdbs//3OQD.pdb" "hivp/raw_pdbs//3OTS.pdb"
 [9] "hivp/raw_pdbs//3OTY.pdb" "hivp/raw_pdbs//3OU1.pdb"
[11] "hivp/raw_pdbs//3OU3.pdb" "hivp/raw_pdbs//3OU4.pdb"
[13] "hivp/raw_pdbs//3OUA.pdb" "hivp/raw_pdbs//3OUB.pdb"
[15] "hivp/raw_pdbs//3OUC.pdb" "hivp/raw_pdbs//3OUD.pdb"
[17] "hivp/raw_pdbs//3PJ6.pdb" "hivp/raw_pdbs//3R0W.pdb"
[19] "hivp/raw_pdbs//3R0Y.pdb" "hivp/raw_pdbs//3SO9.pdb"
[21] "hivp/raw_pdbs//3SPK.pdb" "hivp/raw_pdbs//4EYR.pdb"
[23] "hivp/raw_pdbs//4FAE.pdb" "hivp/raw_pdbs//4FAF.pdb"
[25] "hivp/raw_pdbs//4GYE.pdb" "hivp/raw_pdbs//4GZF.pdb"

Then if the PDB files are of different composition (i.e. contain a different number of atoms, residues etc. ), the pdbaln() function will find equivalent positions and produce an object suitable for further analysis.

> myfiles <- list.files(mypath, ".pdb$", full.names=TRUE)
> pdbs <- pdbaln(files)

If the PDB files have the same number of atoms then you can skip the alignment step of the pdbaln() call and use a simple loop, perhaps something like this:

myfiles <- list.files(mypath, ".pdb$", full.names=TRUE)
xyz <- NULL
for(i in 1:length(myfiles)) {
    xyz <- rbind(xyz, read.pdb(myfiles[i])$xyz)
}
pdb <- read.pdb(myfiles[1])
ca.inds <- atom.select(pdb, "calpha")
rmsd(xyz[,ca.inds$xyz], fit=TRUE)

2014-05-14T21:10:52+00:00

Comments (2)