clean.pdb()

Issue #178 resolved
Lars Skjærven created an issue

not working as expected:

pdb1 = trim.pdb(read.pdb("2pah"), chain="A")
pdb2 = read.pdb("1j8u")

# align 1j8u to 2pah
pdb2$xyz = struct.aln(pdb1, pdb2)$xyz

# get rid of non-protein atoms
ligs = trim.pdb(pdb2, resid=c("FE2", "H4B", "HOH"))
cd1 = trim.pdb(pdb2, resid=c("FE2", "H4B", "HOH"), inverse=TRUE)
t1 = trim.pdb(pdb1, resno=425:452)

# merge cd1, t1, and ligands
new = cat.pdb(cd1, t1, ligs, renumber=TRUE, rechain=FALSE)

# residue re-numering is not performed
new$atom$resno[2717:2750]

this is due to the try statement in cat.pdb() (line ~70). omitting try gives the following error:

Error in vec2resno(1:length(unique(res)), res) : 
  Length miss-match of 'vec' and concetive 'resno'

Comments (4)

  1. Xinqiu Yao

    Thanks for spotting this and sorry for the problem!

    I think the reason is without 'rechain=TRUE', some residues will have the same identifier, which will be a problem for 'clean.pdb()' to do renumbering. I would like to suggest we always do 'rechain' internally (in 'cat.pdb()') and, after renumbering, bring back original chain ID if rechain=FALSE. Will try and be back soon.

  2. Xinqiu Yao

    Hi,

    I've made a update to fix this bug, see commit.

    Basically, I wrote a simple internal function to do updating of chain labels in a pdb object. In the main function, we always do 'rechain' for the input pdb objects. After renumbering (if renumber=TRUE), we copy back the original chain IDs if rechain=FALSE.

    Here is the effect:

    new = cat.pdb(cd1, t1, ligs, renumber=TRUE, rechain=FALSE)
    new$atom$resno[2717:2750]
    
     [1] 334 334 334 334 335 335 335 335 335 335 335 335 335 335 336 337 337 337 337
    [20] 337 337 337 337 337 337 337 337 337 337 337 337 337 338 339
    
    unique(new$atom$chain)
    [1] "A"
    

    Let me know if there is still problem. Thanks!

  3. Log in to comment