Convert.pdb incorrectly assesses chain when converting helix/sheet

Issue #755 resolved
Former user created an issue

convert.pdb() introduces NAs during helix/sheet conversion. I believe this occurs because unique(pdb$sheet$chain) and pdb$atom$chain[s.ind] list the chains in different orders (see below; pdb = 6pwu). I think the most straightforward fix is to assign chs <- pdb$atom$chain[s.ind][i] instead of chs <- unique(pdb$sheet$chain)[i].

> unique(pdb$sheet$chain)
[1] "A" "H" "C" "E" "L"

> pdb$atom$chain[s.ind]
[1] "A" "C" "E" "H" "L"
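The mismatch described above can be illustrated with a minimal base-R sketch. It does not reproduce the internals of convert.pdb(); the two chain vectors are copied from the output quoted in this report, while `renumber_map` and its values are hypothetical and exist only to show how looking up a residue under the wrong chain label can yield NA.

```r
# Chain orders as quoted in the report.
sheet_chain_order <- c("A", "H", "C", "E", "L")  # unique(pdb$sheet$chain)
atom_chain_order  <- c("A", "C", "E", "H", "L")  # pdb$atom$chain[s.ind]

# Positions where the same index i refers to different chains.
mismatch <- which(sheet_chain_order != atom_chain_order)
print(mismatch)

# Hypothetical per-chain lookup (made-up values): old residue number -> new one.
renumber_map <- list(
  C = c("35" = 4L, "46" = 15L),
  H = c("3" = 1L, "19" = 17L)
)

# At i = 2 the atom-based label is "C", but the sheet-based label is "H".
i <- 2
old_start <- "35"  # a sheet start residue that exists only in chain "C" here

# Looking up under the sheet-based label searches chain "H" and finds nothing.
wrong <- unname(renumber_map[[sheet_chain_order[i]]][old_start])
# Looking up under the atom-based label searches chain "C" and finds the residue.
right <- unname(renumber_map[[atom_chain_order[i]]][old_start])

print(c(wrong = wrong, right = right))
```

Under these assumptions, the first and last chains ("A" and "L") line up in both orderings, which would be consistent with NAs appearing only in part of the renumbered output.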

Comments (8)

  1. Xinqiu Yao

    Hi,

    Can you provide the convert.pdb() command you used so that I can reproduce the error? What were you “converting” here?

  2. Nicholas Garcia

    I was trying to renumber the structure file 6pwu.pdb, using the following commands:

    pdb <- read.pdb('6pwu.pdb', ATOM.only=FALSE)
    pdb_renum <- convert.pdb(pdb, type='original', renumber=TRUE, consecutive=FALSE)
    

    The output:

    > pdb$sheet$start
                                                                   E                                                                                                                                                                                                                                            
      35  490   46  482  226  245   83   91  239  130  155  168  100  100  181  194  204  428  418  274  286  449  462  360  382  408  332  297  438  308  317  489   35 1458   46  483  226  245   84   75   53  219  155  169  181  194  204  428  274  286  448  462  360  374  381  409  332  296  438  308 
    
     317   35  490   46  481  226  245   84   55   75   91  240  131  155  169  181  194  204  428  274  286  446  462  360  374  381  409  332  297  438  308  317    3   19   77   67   11  107   88   35   45   57  120  135  176  163  169  151  194  205    9  102   84   33   45   96   19   70   62  116 
    
     130  172  159  165  154  145  191  201
    
    > pdb_renum$sheet$start
                                                      E                                                                                                                                                                                                                                                         
      4 459  15 451 195 214  52  60 208  99 124 137  69  69 150 163 173 397 387 243 255 418 431 329 351 377 301 266 407 277 286 458   4 555  15 452 195 214  53  44  22 188 124 138 150 163 173 397 243 255 417 431 329 343 350 378 301 265 407 277 286  35  NA  46  NA  NA  NA  88  56  76  95  NA 155 179 193 
    
    205 218 228  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  46  36  NA  76  57   4  14  26  89 104 145 132 138 120 163 174   9 105  86  35  47  99  18  72  64 120 134 176 163 169 158 149 195 205
    

    As you can see, there are two issues. The first is that NAs are introduced during renumbering (I noted the suspected cause above), and the second is that the insertion codes are not removed.

  3. Nicholas Garcia

    Coming back to this after a few months away. As I understand it, insertion codes should not be retained when renumbering atoms/residues, because they are only needed when multiple residues share the same numeric ID. Since each residue should have a unique numeric identifier after renumbering, the insertion codes should be cleared in the modified pdb object.
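The reasoning about insertion codes can be sketched in base R on a toy table. This is an illustration under stated assumptions, not what convert.pdb() does: the data are invented, there is one row per residue rather than per atom for brevity, and the column names `resno` and `insert` are assumed to mirror those of pdb$atom.

```r
# Toy residue table (hypothetical data): residues 100 and 100E share a number.
atoms <- data.frame(
  chain  = c("H", "H", "H", "H"),
  resno  = c(99L, 100L, 100L, 101L),
  insert = c(NA, NA, "E", NA),
  stringsAsFactors = FALSE
)

# Before renumbering, a residue is identified by its number plus insertion code.
old_id <- paste(atoms$resno, ifelse(is.na(atoms$insert), "", atoms$insert))

# Assign one new number per distinct residue, in order of first appearance.
atoms$resno <- match(old_id, unique(old_id))

# Each residue now has its own number, so the insertion code is redundant.
atoms$insert <- NA_character_

print(atoms)
```

After this step no two residues share a `resno`, so keeping the old "E" code would only attach stale information to a residue that no longer needs disambiguation.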

  4. Xinqiu Yao

    We will look into it. As an alternative solution, can you try clean.pdb() and let me know if it works?
