Homodimer ensemble NMA

Issue #536 resolved
jay created an issue

For a homodimer enzyme, I have 8 crystal structures.

I wanted to compare the flexibilty profiles by way of ensemble NMA.

I did the following steps :

  1. ids<-c("pdb1", "pdb2", "pdb3", "pdb4", "pdb5","pdb6","pdb7","pdb8")
  2. files<-get.pdb(ids,"path_to_pdb_files")

  3. pdbs=pdbaln(files)

  4. pdbs$xyz = pdbfit(pdbs, outpath="pdbfit")
  5. modes<-nma(pdbs,rtb=FALSE,rm.gaps=FALSE)

  6. print(modes$fluctuations)

I copied the fluctuations to an excel worksheet intending to plot graphs with Matplotlib.(Fluctuations Vs. Alignment position)

Before proceeding, I wanted to check out the multiple sequence alignment. I got the alignment printed by saying in the R console,

  1. pdbs

I copied this MSA (multiple sequence alignment) to a text file and found that the end of the A chains (C-terminal) and the starting of the B chains (N-terminal) had errors in alignment. And since the protein is a homo-dimer, A and B chains have identical sequences.(Please refer to attached file) I edited the MSA manually by using the starting alignment of A chains (N-terminals of A chains) and ending of B chains (C-terminals of B chains) which were aligned correctly. I adjusted the fluctuations data in the fluctuation excel sheet accordingly.

Now my question is, whether I can use the fluctuations data which has been adjusted in columns as per the edited MSA as elaborated above to plot a graph (with Matplotlib) ?

This question arises as normal modes have been calculated by Bio3D based on the multiple sequence alignment that it generated without manual corrections.

Suppose if these fluctuation values cannot be used, then how to read in the manually edited multiple sequence alignment(in which format) and suppose if I store the edited MSA in obj, pdbs2, then

are the following steps correct ?

pdbs2$xyz = pdbfit(pdbs2, outpath="pdbfit") pc.xray = pca(pdbs2) modes1<-nma(pdbs2,rtb=FALSE,rm.gaps=FALSE) print(modes1$fluctuations)

Now, I also would like to filter out the faster modes and retain only the slowest modes across the ensemble. How to specify this during NMA?

Please revert back in case of any queries. I intend including this analysis in a paper.

Thank you for your time.

Comments (10)

  1. Lars Skjærven

    This relates to a similar issue (https://bitbucket.org/Grantlab/bio3d/issues/527/pca-of-multichain-protein) where I suggested the following work around for alignment errors:

    files <- get.pdb(ids)
    
    # alignment of monomers
    splitfiles <- pdbsplit(files)
    pdbs0 <- pdbaln(splitfiles)
    
    # combine monomers to dimers
    ali = NULL
    for( i in seq(1, nrow(pdbs0$ali), by=nrow(pdbs0$ali)/length(files))) {
      f = c(pdbs0$ali[i,], pdbs0$ali[i+1,])
      ali = rbind(ali, f)
    }
    
    # convert matrix to fasta format
    ali = as.fasta(ali)
    ali$id = files
    gaps = gap.inspect(ali$ali)
    ali$ali = ali$ali[, gaps$col < nrow(ali$ali)]
    
    # read PDB files with corrected alignement
    pdbs = read.fasta.pdb(aln = ali)
    
    # redo NMA 
    modes = nma(pdbs)
    

    Note that you can also correct your alignment manually, and then use read.fasta.pdb() to read pdbs:

    files <- get.pdb(ids)
    pdbs <- pdbaln(files)
    write.fasta(pdbs, file="pdbs.fasta")
    
    aln2 <- read.fasta("pdbs_corrected.fasta")
    pdbs2 <- read.fasta.pdb(aln2)
    

    Hope it helps.

  2. jay reporter

    Thank you for your detailed reply. I have one more question:

    What is the units of fluctuations ?

    Thank you for your help.

  3. Xinqiu Yao

    Ideally it should be 'angstrom square' but you can't take it seriously because ENMs do not predict the global amplitude accurately. Instead, compare patterns or relative fluctuations. A convenient way is to normalize the fluctuation vector with the function normalize.vector() (See also help(normalize.vector) for more detail).

  4. Xinqiu Yao

    It is unusual to get negative fluctuations. Could you provide example data and codes that we can reproduce your errors?

  5. Yazhini arangas

    Is fluctuation value represent root mean square fluctuation of an atom?. If not, how is it calculated? Also, does normalize.vector() function normalize a vector V as standard vector normalization V/|V| or doing some other type of normalization?

    Please respond as soon as possible.

    Thank you in advance

  6. Xinqiu Yao

    Fluctuation is mean square fluctuation and so is just square of RMSF. Please read relevant references for how it is calculated, e.g. Hinsen, K. et al. (2000) Chemical Physics 261, 25–37. Note that this issue tracker is for questions/suggestions/bug reports directly related to bio3d. It is not a right place for too general questions.

    'V/|V|' is the way of normalization in normalize.vector().

  7. Log in to comment