How to extract part of protein complex with two molecules with Bio3D

Issue #853 resolved
Gundala Viswanath created an issue

I have a PDB file which contain two molecules (receptor and ligand).
I have also attached the file with this post.
Each molecule will have its own header. All in one PDB file

HEADER rec.pdb
HEADER lig.000.00.pdb

My question is how can I extract each molecule and store it into two Bio3D objects?

G.V.

Comments (7)

  1. Xinqiu Yao

    You don’t need to extract them. Just read it as a whole and use atom.select()function of bio3d to separate them. See help(atom.select) for more detail.

  2. Gundala Viswanath reporter

    Hi Xinqiu,

    Thank you.

    The header of receptor section looks like this (line 1-6 of the PDB file):

    HEADER rec.pdb
    REMARK original generated coordinate pdb file
    ATOM      1  N   GLY A   1     -51.221 -13.970  37.091  1.00  0.00      RA0  N
    ATOM      2  H   GLY A   1     -50.383 -13.584  37.482  1.00  0.00      RA0  H
    ATOM      3  CA  GLY A   1     -50.902 -15.071  36.197  1.00  0.00      RA0  C
    ATOM      4  C   GLY A   1     -49.525 -15.659  36.443  1.00  0.00      RA0  C
    

    and this ligand section (line 11435 to 11440) of the PDB file

    HEADER lig.000.00.pdb
    ATOM      1  N   MET A   1      27.318 -26.957  12.663  1.00  0.00      LA0  N
    ATOM      2  H   MET A   1      27.313 -27.570  11.870  1.00  0.00      LA0  H
    ATOM      3  CA  MET A   1      28.374 -27.102  13.668  1.00  0.00      LA0  C
    ATOM      4  CB  MET A   1      28.531 -28.564  14.090  1.00  0.00      LA0  C
    ATOM      5  CG  MET A   1      27.224 -29.154  14.628  1.00  0.00      LA0  C
    

    Note that the receptor and ligand section also contain the string RA0 and LA0
    at the 11th column of PDB file.

    Then, I tried with this code, where I tried to capture based on RA0 and LA0 column.

    pdb <- read.pdb(model.000.000.pdb)
    
    receptor_segment.sele <- atom.select(pdb, segid = "RA0",  verbose = TRUE)
    receptor_pdb <- trim.pdb(pdb, receptor_segment.sele)
    ligand_segment.sele <- atom.select(pdb, segid = "LA0", inverse = TRUE,  verbose = TRUE)
    ligand_pdb <- trim.pdb(pdb, ligand_segment.sele)
    
    # ligand_pdb show nothing
    

    But ligand_pdb show nothing:

    Call:  trim.pdb(pdb = pdb, ligand_segment.sele)
    
       Total Models#: 1
         Total Atoms#: 0,  XYZs#: 0  Chains#: 0  (values: )
    
         Protein Atoms#: 0  (residues/Calpha atoms#: 0)
         Nucleic acid Atoms#: 0  (residues/phosphate atoms#: 0)
    
         Non-protein/nucleic Atoms#: 0  (residues: 0)
         Non-protein/nucleic resid values: [ none ]
    
    + attr: atom, helix, sheet, seqres, xyz,
            calpha, call
    

    What’s the right way to do it?

    G.V.

  3. Xinqiu Yao

    You have END in the middle of PDB. Bio3D will stop reading at the keyword. Delete END and try it again.

  4. Xinqiu Yao

    You can use R base functions to do it. For example,

    lines <- readLines("my.pdb")
    inds <- grep("^END", lines)
    inds <- inds[-length(inds)] # keep the last END
    lines <- lines[-inds]
    cat(lines, file="my-modified.pdb", sep="\n")
    

  5. Log in to comment