How to get per residue rmsf including side chain contributions?

Issue #198 resolved
Former user created an issue

How can I calculate average RMSF per residue that includes side chain contributions and not just rmsf of CA atoms?

For example for transducin example below. How can I get RMSF that includes side chain contributions or backbone fluctuations summed into per residue fluctuations?

attach(transducin)

# Ignore Gaps
gaps <- gap.inspect(pdbs$ali)

r <- rmsf(pdbs$xyz)
plot(r[gaps$f.inds], typ="h", ylab="RMSF (A)")

detach(transducin)

Comments (7)

  1. Lars Skjærven

    hmm.. you'll have to re-read all PDBs, and go through them to fetch side chain beta factors:

    # fetch files
    ids <- c("1a70_A", "1czp_A", "1frd_A")
    files <- get.pdb(ids, split = TRUE)
    
    # Sequence Alignement
    pdbs <- pdbaln(files)
    
    # new matrix to store results
    fluct <- pdbs$b * 0
    
    # go through each PDB
    for( i in 1:length(pdbs$id) ) {
        # read it
        pdb <- read.pdb(pdbs$id[i])
    
        # find residues
        ca.inds <- atom.select(pdb, "calpha")
    
        # residue identifyer
        resid <- paste(pdb$atom$resno[ca.inds$atom],
                       pdb$atom$chain[ca.inds$atom],
                       sep="-")
    
       # sum over side chains within a residue
        bsum <- unlist(lapply(resid, function(x) sum(pdb$atom$b[ resid == x ])))
    
        # store data in matrix
        fluct[i, !is.na(fluct[i,])] <- bsum
    }
    
  2. Xinqiu Yao

    Hey,

    You can also check out the read.all() function, although it is still experimental. Remember that for heterogeneous sequences, alignment for side-chain heavy atoms is simply based on atom names, which may or may not represent the chemical "equivalence". With this function, you can calculate e.g. average rmsf of backbone atoms:

    # use the example alignment file containing 4 kinesin structures
    file <- system.file("examples/kif1a.fa",package="bio3d")
    aln  <- read.fasta(file)
    
    # read backbone atoms for each residue
    sel <- c("N", "CA", "C", "O")
    pdbs <- read.all(aln, sel=sel)
    
    # fit structures based on all backbone atoms
    xyz <- fit.xyz(pdbs$all[1, ], pdbs$all)
    
    # calculate rmsf for each atom of non-gapped residue position
    r <- rmsf(xyz[, gap.inspect(xyz)$f.inds])
    
    # group atoms based on residue number
    resno <- as.numeric(pdbs$all.resno[1, gap.inspect(pdbs$all.resno)$f.inds])
    
    # calculate mean rmsf for each residu
    r.mean <- tapply(r, as.factor(resno), mean)
    
    # plot
    plot(r.mean, typ='h', ylab="Mean RMSF of backbone atoms (A)")
    
  3. Lars Skjærven

    Nice. This is indeed cuter than my suggestion. Perhaps read.all could be renamed to make it more visible? This could perhaps also take a place in one of the vignettes(?).

  4. Xinqiu Yao

    Thanks! Not sure which name is more suitable; it basically does similar work as read.fasta.pdb and so maybe read.fasta.all?

    This function hasn't been updated for a long time (that is why I said 'experimental' considering we have a lot updates for other 'core' functions. But it still can run without problem). We probably should pay more attention to it and illustrating it in vignette is a good idea. Will see what we can do.

  5. Barry Grant

    The read.all() function dates back to the very early days of the package and updates and simplifications would be welcome to make it's output more consistent with recent updates elsewhere.

    Its main use was to enable equivalent 'all-atom' reading and subsequent analysis of heterogenous structure sets - just as asked here.

    Of course if you have homogenous structures (e.g. the same atoms across your set, as in a simulation output for example or an NMR ensemble) the answer to this question is straightforward - just calculate the RMSF of all your chosen atoms and take the mean based on residue number.

    I would resist renaming until function updates have been finalized.

  6. Log in to comment