Distance Matrix for antibody with insertion sites (H100A, H100B...)
Hi,
When I calculate a distance matrix as below, it seems to miss the insertion sites in an antibody PDB. There are 227 residues in total, but nrow only shows 222; the 5 missing residues are the insertion sites:
library(bio3d)
pdbInterest <- read.pdb("pdb-files/AB0194.pdb")
k <- dm(pdbInterest, mask.lower=FALSE)
print(ncol(k))  # gives 222 here, expected 227
I have attached AB0194.pdb for you to look at. I essentially just want the C-alpha atoms, which should give 227 in total.
Thanks, Daniel
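As a later reply in this thread notes, the fix was to include the insert record alongside residue number and chain when grouping atoms. The following is a minimal base-R sketch of why that matters, not bio3d's actual code: the data frame and its column names (`chain`, `resno`, `insert`) are made up for illustration.

```r
# Toy C-alpha records: residue 100 also appears with insertion codes A and B
# (as in H100A, H100B). All values here are invented for illustration.
ca <- data.frame(
  chain  = c("H", "H", "H", "H", "H"),
  resno  = c(99, 100, 100, 100, 101),
  insert = c("", "", "A", "B", ""),
  stringsAsFactors = FALSE
)

# Key built from chain and residue number only:
# the inserted residues collapse into residue 100.
key_without_insert <- paste(ca$chain, ca$resno, sep = "_")

# Key that also includes the insertion code: every residue stays distinct.
key_with_insert <- paste(ca$chain, ca$resno, ca$insert, sep = "_")

n_without <- length(unique(key_without_insert))
n_with    <- length(unique(key_with_insert))
print(c(without_insert = n_without, with_insert = n_with))
```

In this toy case the first key yields 3 groups and the second yields 5, which mirrors the 222 versus 227 discrepancy reported above.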
Comments (9)
-
Yes, it is indeed a bug and we will fix it soon. Thanks for the report! Please watch the releases or the master branch for the update, and refer to this page for how to download and install the development version.
Renumbering all residues before the calculation is an alternative and should work well. I recommend using the clean.pdb() function, which can do the renumbering and also performs many other checks on your PDB. Let me know if you have any questions or problems.
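A rough sketch of that workaround, assuming bio3d is installed and the file from the report is present. The helper name `dm_after_renumbering` is hypothetical, and the use of `force.renumber = TRUE` is an assumption about how to make clean.pdb() renumber unconditionally; check `?clean.pdb` for your version.

```r
# Hypothetical helper: renumber residues, then compute the distance matrix
# with the same dm() call as in the original report.
dm_after_renumbering <- function(pdb_file) {
  pdb <- bio3d::read.pdb(pdb_file)
  # Assumption: force.renumber = TRUE makes clean.pdb() renumber atoms and
  # residues, so inserted residues should receive their own residue numbers.
  cleaned <- bio3d::clean.pdb(pdb, force.renumber = TRUE)
  bio3d::dm(cleaned, mask.lower = FALSE)
}

# Run only when bio3d and the PDB file from the report are available.
pdb_file <- "pdb-files/AB0194.pdb"
if (requireNamespace("bio3d", quietly = TRUE) && file.exists(pdb_file)) {
  k <- dm_after_renumbering(pdb_file)
  print(dim(k))
}
```

Note that renumbering discards the original numbering scheme, so keep a copy of the original residue labels if you need them afterwards.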
-
- changed component to ToDo
- changed version to v2.2 [devel]
-
Ok. Thanks! Another thing to think about: when computing a difference matrix (subtracting one dmat from another), should bio3d have a default behaviour in place if the matrices are of different sizes?
Currently, I have been calculating distance matrices between antibodies of different lengths, so to get around the error that ncol and nrow are not equal, I add extra columns/rows so that the dmats have the same dimensions. Maybe this is something that bio3d could do: add columns and rows filled with 'NA' to indicate that a calculation couldn't be made. This would be very helpful.
Daniel
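The padding approach described here could look roughly like the base-R sketch below. The function names `pad_dmat` and `diff_dmat` are hypothetical and not part of bio3d; the sketch assumes extra rows/columns are appended at the end, so residue i in one matrix is taken to correspond to residue i in the other.

```r
# Pad a square matrix with NA rows/columns at the end so it becomes n x n.
pad_dmat <- function(m, n) {
  stopifnot(is.matrix(m), nrow(m) == ncol(m), n >= nrow(m))
  out <- matrix(NA_real_, nrow = n, ncol = n)
  k <- nrow(m)
  if (k > 0) out[seq_len(k), seq_len(k)] <- m
  out
}

# Difference of two distance matrices that may differ in size;
# cells that exist in only one matrix come out as NA.
diff_dmat <- function(a, b) {
  n <- max(nrow(a), nrow(b))
  pad_dmat(a, n) - pad_dmat(b, n)
}

# Toy example: points on a line, 3 "residues" versus 4 "residues".
a <- as.matrix(dist(c(0, 1, 3)))
b <- as.matrix(dist(c(0, 2, 3, 7)))
d <- diff_dmat(a, b)
print(d)
```

Padding at the end is only meaningful when the shorter structure matches the start of the longer one; otherwise the subtracted cells do not refer to equivalent residue pairs.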
-
Thanks for catching this, Daniel. Looks like Xinqiu fixed this bug here (d05f82c) by including the insert record along with the residue number and chain entries in our grpby command.
I like the suggestion of not failing by default when distance matrices of different sizes are to be compared. However, to me this would only make sense if one protein had a C-terminal extension relative to the other. In most other cases such a comparison would likely be a mistake, as we would probably not be subtracting elements for equivalent residue pairs.
If you are only interested in C-alpha distance matrices, then perhaps using an aligned pdbs object as input for the dm calculation would be best. Should we have a new dm.pdbs() function to look after this and thus have the NAs in the correct gap positions etc.?
Note that this dm.pdbs might help when it comes to assessing the correctness of our sequence-based structural alignment columns.
-
Yes, I agree. Having a new dm.pdbs() function is a good idea for dealing with such comparisons. I will add it to the ToDo list.
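To illustrate the idea of NAs in the correct gap positions, here is a small base-R sketch. It is not the dm.pdbs() implementation; the function `expand_to_alignment` and the `is_gap` vectors are hypothetical, standing in for the gap pattern of each structure in an alignment.

```r
# Place a per-structure distance matrix into alignment-column coordinates.
# `is_gap` is a logical vector over alignment columns, TRUE where this
# structure has a gap. Gap rows and columns are left as NA.
expand_to_alignment <- function(dmat, is_gap) {
  stopifnot(is.matrix(dmat), nrow(dmat) == ncol(dmat),
            sum(!is_gap) == nrow(dmat))
  n <- length(is_gap)
  out <- matrix(NA_real_, nrow = n, ncol = n)
  cols <- which(!is_gap)
  out[cols, cols] <- dmat
  out
}

# Toy alignment with 4 columns: structure A has a gap at column 2,
# structure B has no gaps. Coordinates are points on a line.
gap_a <- c(FALSE, TRUE, FALSE, FALSE)
gap_b <- c(FALSE, FALSE, FALSE, FALSE)
dm_a <- as.matrix(dist(c(0, 3, 7)))
dm_b <- as.matrix(dist(c(0, 2, 3, 7)))

# Both matrices are now indexed by alignment column, so subtraction
# compares equivalent residue pairs and yields NA wherever A has a gap.
diff_ab <- expand_to_alignment(dm_a, gap_a) - expand_to_alignment(dm_b, gap_b)
print(diff_ab)
```

Unlike padding at the end, this keeps the NA rows and columns where the gap actually is, so the remaining cells line up by alignment column.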
-
- changed version to v2.3 [future]
- marked as minor
- marked as enhancement
-
- changed status to resolved
Closed because a simple S3 method for dm.pdbs() fulfills the purpose. (See this commit).
-
- changed version to v2.3 [devel]
Do you recommend simply using
?
So:
Then if I export I can re-add the proper row and column names to the matrix later?