Problem when loading NetCDF trajectories

Issue #243 resolved
Juan Eiros created an issue

Hi,

Sorry for posting my problem here, but I haven't found a maling list for users of you program so I figured this was the best place to post my problem.

I have done MD runs with AMBER, and have several files that contain 50 ns of simulation (4000 frames each). I have managed to load into R both the prmtop and the trajectory files correctly, I think.

I'm trying to follow the following tutorial on trajectory analysis: http://thegrantlab.org/bio3d/tutorials/trajectory-analysis

But when I issue this command:

ca.inds <- atom.select(prmtop, elety ="CA")

I get the following error:

Error in as.pdb.default(xyz = crd$xyz, elety = prmtop$ATOM_NAME, resno = resno, : 'xyz' must be a numeric vector/matrix

I think it's failing to properly recognize the prmtop and the trajectory files, as in it's not recognizing some of the attributes those objects should have. Here's what I get when I print my prmtop and run1 (the trajectories) objects:

prmtop

Call: read.prmtop(file = "comphowcut_c_14Anowat.prmtop")

Class: amber, prmtop

System information: Total number of atoms: 7057 Solute residues: 667 Solute molecules: 3

Sequence: MDDIYKAAVEQLTEEQKNEFKAAFDIFVLGAEDGCISTKELGKVMRMLGQNPTPEELQEM IDEVDEDGSGTVDFDEFLVMMVRCMKDDSKGKSEEELSDLFRMFDKNADGYIDLDELKIM LQATGETITEDDIEELMKDGDKNNDGRIDYDEFLEFMKGVEQTEREKKKKILAERRKVLA IDHLNEDQLREKAKELWQSIYNLEAEKFDLQEKFKQQKYEINVLR...<cut>...XXXX

There were 50 or more warnings (use warnings() to see the first 50)

run1

Total Frames#: 20000 Total XYZs#: 21171, (Atoms#: 7057)

[1]  73.34  75.799  39.2  <...>  67.328  108.489  78.117  [423420000]
  • attr: Matrix DIM = 20000 x 21171

I think the warnings of the prmtop are due to it (and the trajectories) having Na+ and Cl- atoms in them.

Can anyone suggest me what to do to manage to do analysis on these MD runs I have?

Thank you for your time,

Juan

Comments (10)

  1. Barry Grant

    Thanks for posting Juan. This is indeed the place for issues and questions. I typically work with PDB format files that correspond to my trajectories so I will let others dig into your prmtop issues.

    To determine if it is really the prmtop that is leading to these problems can you perhaps try your selection with a AMBER generated PDB file? I will note that you can generate a PDB from your prmtop and inpcrd files with:

    $AMBERHOME/exe/ambpdb -p input.prmtop < input.inpcrd > output.pdb
    

    Thanks!

  2. Juan Eiros reporter

    Dear Barry,

    I generated a pdb file from one frame of my simulation, and loaded it using read.pdb. Now it seems to work, as I'm being able to follow the steps of the tutorial. And it's correctly identyfing the non-protein/nucleic residue values.

    > pdb
    
     Call:  read.pdb(file = "output.pdb")
    
       Total Models#: 1
         Total Atoms#: 7057,  XYZs#: 21171  Chains#: 1  (values: NA)
    
         Protein Atoms#: 6809  (residues/Calpha atoms#: 419)
         Nucleic acid Atoms#: 0  (residues/phosphate atoms#: 0)
    
         Non-protein/nucleic Atoms#: 248  (residues: 248)
         Non-protein/nucleic resid values: [ CAL (3), Cl- (122), Na+ (123) ]
    
       Protein sequence:
          MDDIYKAAVEQLTEEQKNEFKAAFDIFVLGAEDGCISTKELGKVMRMLGQNPTPEELQEM
          IDEVDEDGSGTVDFDEFLVMMVRCMKDDSKGKSEEELSDLFRMFDKNADGYIDLDELKIM
          LQATGETITEDDIEELMKDGDKNNDGRIDYDEFLEFMKGVEQTEREKKKKILAERRKVLA
          IDHLNEDQLREKAKELWQSIYNLEAEKFDLQEKFKQQKYEINVLR...<cut>...DLRA
    
    + attr: atom, helix, sheet, seqres, xyz,
            calpha, remark, call
    

    Thanks for your help! My guess is that there's some trouble with the prmtop when it tries to identify my Calcium atoms (which are named CAL) and the Na+ and Cl- atoms.

    > warnings()
    Warning messages:
    1: In FUN(c("MET", "ASP", "ASP", "ILE", "TYR", "LYS", "ALA",  ... :
      Unknown 3-letters code for aminoacid: CAL
    4: In FUN(c("MET", "ASP", "ASP", "ILE", "TYR", "LYS", "ALA",  ... :
      Unknown 3-letters code for aminoacid: Na+
    40: In FUN(c("MET", "ASP", "ASP", "ILE", "TYR", "LYS", "ALA",  ... :
      Unknown 3-letters code for aminoacid: Cl-
    

    Thanks for your suggestion,

    Juan

  3. Barry Grant

    Great thanks Juan, if you get a chance could you mail me your prmtop file so we can make sure we fix this bjgrant<at>umich.edu

  4. Juan Eiros reporter

    Hi,

    I'll send you the prmtop file right away. I have one last question though. At the end of the PCA section in the manual, I am not able to visualize the pdb structures that are generated with the following commands:

    p1 <- mktrj.pca(pc, pc=1, b=pc$au[,1], file="pc1.pdb")
    p2 <- mktrj.pca(pc, pc=2,b=pc$au[,2], file="pc2.pdb")
    

    How can I manage to achieve a similar viewing as what is shown in Figure 7 of the tutorial? There seem to be very large unusual bonds in my generated pdbs. Also, what is the color gradient applied in VMD to highlight the most flexible parts of the protein in red? I would be very interested in achieving such a representation for visualisation. I'm attaching an image of what I see when I open the pc1.pdb and pc2.pdb files and display them with the tube representation. PCA.png

    Thanks again for your time

  5. Barry Grant

    Here VMD is just linking up sequential consecutive residue numbers so if you have a gap region, or other type of chain break but with +1 following residue numbers then VMD will draw straight lines that appear to link the last residue of one segment with the first of another (even though they are not within 'bonding' distance.).

    If you pass a corrected 'resno' input to the mktraj.pca() functions that has these regions numbers not be consecutive (e.g. from the original PDB file) then VMD should not link these atoms.

    I will note in passing that the next version of the package (current development version) has inbuilt interactive 3D visualization included so you wont need VMD anymore to view these results (unless you want it of course).

    For your color gradient question you can map any numeric vector to the B-factor field of your PDB trajectory and then use VMD to color based on that with the "Beta" coloring option. You can change color pallets and ranges to some extent within VMD itself. Hope this helps!

  6. Barry Grant

    Indeed, thanks Lars, this looks to have been fixed already in the latest version:

    > p <- read.prmtop("~/Downloads/protein_ions.prmtop")
    > atom.select(p, elety="CA")
    
     Summary of PDB generation:
     .. number of atoms in PDB determined by 'xyz'
    
     .. 00000419 atom(s) from 'string' selection
     .. 00000419 atom(s) in final combined selection
    
     .. number of atoms in PDB:  7057
     .. number of calphas in PDB: 419
     .. number of residues in PDB: 667
    
    
     Call:  atom.select.prmtop(prmtop = p, elety = "CA")
    
       Atom Indices#: 419  ($atom)
       XYZ  Indices#: 1257  ($xyz)
    
    + attr: atom, xyz, call
    
  7. jbabula

    Hello,

    I have recently experienced the same error as above with the C terminal of one monomer connecting with the N terminal of the other monomer although they are not connected in my PDB file: pdb.png 3865,1 49% dimer.png

    Can you advise what other issues may cause this? Thank you!

  8. Log in to comment