read.pdb gives error in scan

Issue #220 resolved
Former user created an issue

Sometimes atom names contain a comma. For example, PDB entry 1H5T contains the ligand DAU with atom names like "C5,". This causes an error in read.pdb with the message "Error in scan", because read.pdb also uses the comma as a delimiter. I would suggest changing the comma delimiter to something else so that these errors are avoided.
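The failure mode described above can be reproduced in isolation. The following is a minimal sketch in base R, not the actual read.pdb code: the `fields` vector is a hypothetical set of values cut from a fixed-width HETATM record, and the tab character is just one possible replacement delimiter chosen for illustration.

```r
# Hypothetical fields cut from a fixed-width HETATM record; the atom name
# "C5," mimics the DAU atom names reported for 1H5T.
fields <- c("HETATM", "1", "C5,", "DAU", "A", "1")

# Joining on "," and splitting again yields one field too many,
# because the comma inside the atom name is read as a separator.
comma_line <- paste(fields, collapse = ",")
comma_fields <- strsplit(comma_line, ",", fixed = TRUE)[[1]]
length(comma_fields)

# A separator that does not occur in the data round-trips cleanly.
tab_line <- paste(fields, collapse = "\t")
tab_fields <- strsplit(tab_line, "\t", fixed = TRUE)[[1]]
identical(tab_fields, fields)
```

Six fields go in, but the comma-joined line splits into seven, which is the kind of column mismatch a delimiter-based parser then trips over. Any replacement delimiter only helps if it cannot appear in the fields themselves.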

Comments (4)

  1. Xinqiu Yao

    That is really something we never noticed. Thanks for reporting! There are two related places in 'read.pdb()': split.fields() and read.table(..., sep=",", ...). We may need to replace the comma with something very unlikely to appear in PDB files. What about "$"? Any other suggestions?

  2. Lars Skjærven

    The latest bugfix in the releases branch should work for you now. To install this version, run:

    install.packages("devtools")
    library(devtools)
    install_bitbucket("Grantlab/bio3d", ref="releases", subdir = "ver_devel/bio3d/")
    
    > library(bio3d)
    > read.pdb("1H5T")
      Note: Accessing on-line PDB file
      HEADER    TRANSFERASE                             25-MAY-01   1H5T               
    
     Call:  read.pdb(file = "1H5T")
    
       Total Models#: 1
         Total Atoms#: 9839,  XYZs#: 29517  Chains#: 4  (values: A B C D)
    
         Protein Atoms#: 9089  (residues/Calpha atoms#: 1160)
         Nucleic acid Atoms#: 0  (residues/phosphate atoms#: 0)
    
         Non-protein/nucleic Atoms#: 750  (residues: 510)
         Non-protein/nucleic resid values: [ DAU (4), HOH (501), SO4 (1), TYD (4) ]
    
       Protein sequence:
          KMRKGIILAGGSGTRLYPVTMAVSKQLLPIYDKPMIYYPLSTLMLAGIRDILIISTPQDT
          PRFQQLLGDGSQWGLNLQYKVQPSPDGLAQAFIIGEEFIGGDDCALVLGDNIFYGHDLPK
          LMEAAVNKESGATVFAYHVNDPERYGVVEFDKNGTAISLEEKPLEPKSNYAVTGLYFYDN
          DVVQMAKNLKPSARGELEITDINRIYLEQGRLSVAMMGRGYAWLD...<cut>...MTKD
    
    + attr: atom, helix, sheet, seqres, xyz,
            calpha, remark, call
    