cmap appears to force renumbering of atoms

Issue #445 new
Ron Ayoub created an issue

I would like to create many pairs of contacts for many PDB files, and I would like the indices in the cmap matrix to correspond to the resno fields in the atom records. As it stands, cmap appears to force a renumbering of the indices so that they start at 1. I'm not sure whether it also compresses the numbering where there are gaps. Currently, I have to force a renumbering of the pdb by trimming and then cleaning with the force.renumber option. Is there a better way to approach this?

Comments (2)

  1. Xinqiu Yao

    Hi,

    The cmap() function takes care of gaps but ignores the resno field. Actually, I don't think it is a good idea to build a cmap matrix that matches 'resno'. In many pdb files, 'resno' is not a unique identifier for residues. For example, residues in different chains can have the same resno. Residues can even have the same chain ID and resno (with different 'insert' codes).

    I suggest naming the rows/columns of the cmap matrix by the corresponding resno (or by a combination of chain ID, resno, and insert code). Then you can map residues to the contact map without renumbering the pdb.
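
    A minimal sketch of that labelling in base R, using invented toy data in place of real C-alpha records and a placeholder matrix in place of real cmap() output. The column names chain, resno, and insert mirror the fields named above; the label format "chain:resno+insert" is only one possible choice, and in real data missing insert codes may be NA and would need replacing with an empty string first.

    ```r
    # Toy C-alpha records (invented for illustration), one row per residue.
    ca <- data.frame(
      chain  = c("A", "A", "A", "B", "B"),
      resno  = c(4, 5, 5, 4, 8),
      insert = c("", "", "A", "", ""),
      stringsAsFactors = FALSE
    )

    # Placeholder contact matrix with one row/column per residue, in the
    # same order as the records above.
    cm <- matrix(0, nrow = nrow(ca), ncol = nrow(ca))
    cm[1, 5] <- cm[5, 1] <- 1

    # Combine chain ID, resno, and insert code so that every label is unique.
    labels <- paste0(ca$chain, ":", ca$resno, ca$insert)
    stopifnot(!anyDuplicated(labels))
    dimnames(cm) <- list(labels, labels)

    # Look up a contact by residue label instead of by matrix index.
    cm["A:4", "B:8"]
    ```

    With labels attached, the matrix can be indexed by residue identity, and the original numbering in the pdb stays untouched.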

    Let me know if it works for you.

  2. Ron Ayoub reporter

    Thanks for the response.

    For the discussion below, assume I ignore inserts and have already handled multiple chains.

    I'm interested in extracting contact differentials between residues. I have selected only the alpha carbon atoms and hence have only 1 atom per residue. When residues are missing, it is important to preserve the distance between residues as numbered. For instance, if resno 4 contacts resno 8, even if resno 6 and 7 are absent, that differential should still be 8 - 4 = 4, since those intervening residues do exist in the chain whether or not their positions were resolved by crystallography.

    After melting the cmap matrix, I store the result in a data frame called cmap.pairs. I then convert the serial indices in the cmap to the actual resnos obtained from the alpha carbon selection above.

        cmap.pairs$Var1 <- cmap.props$resno[cmap.pairs$Var1]
        cmap.pairs$Var2 <- cmap.props$resno[cmap.pairs$Var2]
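
    The same idea as a self-contained toy example, as a rough sketch only: the resno vector and contact matrix are invented, base R's which(..., arr.ind = TRUE) stands in for the melt step, and the column names sep.index and sep.resno are hypothetical.

    ```r
    # Toy data: residues 4, 5 and 8 were resolved; 6 and 7 are missing.
    resno <- c(4, 5, 8)

    # Placeholder symmetric contact matrix indexed 1..3, as a contact map
    # without residue labels would be.
    cm <- matrix(0, 3, 3)
    cm[1, 3] <- cm[3, 1] <- 1

    # "Melt" the upper triangle into (row, col) index pairs.
    idx <- which(cm == 1 & upper.tri(cm), arr.ind = TRUE)
    pairs <- data.frame(Var1 = idx[, "row"], Var2 = idx[, "col"])

    # Separation by matrix index versus separation by residue number.
    pairs$sep.index <- pairs$Var2 - pairs$Var1
    pairs$sep.resno <- resno[pairs$Var2] - resno[pairs$Var1]
    pairs
    ```

    Here the index-based separation is 2, while the resno-based separation is 8 - 4 = 4, which is the differential that should be kept.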
    

    I think this is sufficient for now. Inserts are an issue, and occasionally I see an old pdb where the resnos are not consecutive for reasons other than missing residues. I have not yet determined how often that happens or whether I should worry about it.
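
    One way to gauge how often this happens is to flag insert codes, repeated resnos, and jumps in numbering per chain. The following is a sketch on invented single-chain data; the variable names has.insert, dup.resno, and jump.after are hypothetical. A jump found this way may be a genuine gap or irregular numbering, and resno alone cannot distinguish the two.

    ```r
    # Toy C-alpha records for one chain (values invented for illustration).
    ca <- data.frame(
      resno  = c(1, 2, 2, 3, 7, 8),
      insert = c("", "", "A", "", "", ""),
      stringsAsFactors = FALSE
    )

    # Residues that carry an insert code.
    has.insert <- ca$insert != ""

    # resno values that occur more than once within the chain.
    dup.resno <- unique(ca$resno[duplicated(ca$resno)])

    # resno values after which the numbering jumps by more than one.
    step <- diff(ca$resno)
    jump.after <- ca$resno[which(step > 1)]

    list(n.insert = sum(has.insert), dup.resno = dup.resno, jump.after = jump.after)
    ```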

    I do think that the way I solved my issue may be the recommended way to solve it using bio3d, and I'm not sure it is the responsibility of bio3d to handle this.

    Thanks again for a great tool to work with.
