Searching through proteins (PDB files) with R

Issue #330 resolved
Former user created an issue

I am working on a project in which we are attempting to generate frequency plots for the Chi1, Chi2, and Chi3 torsional angles of disulfide bridges. To do this, we need to write a program that can automatically search through proteins known to contain disulfide bridges and locate the cysteines that are disulfide bonded.

We know how to download each file with one or more disulfide bridges, and we plan to store this data on an external hard drive. We know how to change the working directory so R can read these files from the drive and place each PDB file's data into a vector, and we know how to apply a function to each file in that vector. We hope that this function will determine the chi torsional angles of each protein and place them in chi1, chi2, and chi3 vectors, respectively.

Although we can find the cysteines in a protein, and even their sulfur atoms, we don't know how to check whether a given cysteine is forming a disulfide bridge. Once we can do this, we believe we can locate the alpha carbon of that residue and start recording torsional angles. Is it possible to determine which atoms a given atom is bonded to, or which amino acids a given residue is bonded to?

R version 3.2.3 (2015-12-10)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 8.1 x64 (build 9600)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

Comments (3)

  1. Lars Skjærven

    Hi there, Bio3d does not currently have functionality to directly determine disulfide bridges, but I imagine this could be done using various geometrical measures.

    Is the SSBOND record in the PDB files not sufficient for you? We do not read these into the pdb object, but they can easily be fetched with

    > raw = readLines("4fc1.pdb")
    > inds = grep("SSBOND", raw)
    > inds
    [1] 230 231 232
    > raw[inds]
    [1] "SSBOND   1 CYS A    3    CYS A   40                          1555   1555  2.03  "
    [2] "SSBOND   2 CYS A    4    CYS A   32                          1555   1555  2.10  "
    [3] "SSBOND   3 CYS A   16    CYS A   26                          1555   1555  2.09  "
    

    You can then parse these lines and couple this to the bio3d pdb object.
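As a rough sketch of the parsing step, the SSBOND records can be split by fixed column positions, since the PDB format places the chain IDs, residue numbers, and bond length at fixed offsets. The helper name `parse_ssbond` and the data frame column names are hypothetical, chosen only for illustration; the input lines are the three records shown above.

```r
# SSBOND records copied from the 4fc1.pdb output above
ssbond_lines <- c(
  "SSBOND   1 CYS A    3    CYS A   40                          1555   1555  2.03  ",
  "SSBOND   2 CYS A    4    CYS A   32                          1555   1555  2.10  ",
  "SSBOND   3 CYS A   16    CYS A   26                          1555   1555  2.09  "
)

# Hypothetical helper (not part of bio3d): parse fixed-width SSBOND records.
# Column positions follow the PDB format description of the SSBOND record.
parse_ssbond <- function(lines) {
  data.frame(
    chain1 = substr(lines, 16, 16),                        # chain of first CYS
    resno1 = as.integer(trimws(substr(lines, 18, 21))),    # residue number 1
    chain2 = substr(lines, 30, 30),                        # chain of second CYS
    resno2 = as.integer(trimws(substr(lines, 32, 35))),    # residue number 2
    length = as.numeric(trimws(substr(lines, 74, 78))),    # S-S distance
    stringsAsFactors = FALSE
  )
}

ss <- parse_ssbond(ssbond_lines)
print(ss)
```

Each row of `ss` then names one pair of bonded cysteines. Coupling this to the pdb object could presumably go through bio3d's `atom.select()` with its `chain`, `resno`, and `elety` arguments to pick out the matching atoms, though that step is not shown or tested here.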

  2. Christien Williams

    This is very helpful! I didn't realize you could access that SSBOND information like that.

    Also, as we are trying to calculate the torsional angles across that SSBOND, when you use tor <- torsion.pdb(pdb), it gives you the following result (image: pdb$tor.png).

    Are the values in the far-left column the residue/amino acid numbers in the chain? If so, we could use the code you provided, go to the rows corresponding to the cysteines that form a disulfide bond (for example, rows 3 and 40, per the data you provided), and record the chi values from those rows.
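One point worth checking before relying on per-residue rows: for a disulfide, chi1 (N-CA-CB-SG) lies within one cysteine, but the angles across the bond involve atoms from both residues (for example CB-SG-SG'-CB'), so a per-residue table may not contain them. As a minimal sketch in base R, a dihedral angle can be computed directly from four atom positions. The function names `cross3` and `dihedral` are hypothetical, and the coordinates are made up for illustration; they are not taken from 4fc1.

```r
# Cross product of two 3-vectors (base R has no built-in for this)
cross3 <- function(a, b) {
  c(a[2] * b[3] - a[3] * b[2],
    a[3] * b[1] - a[1] * b[3],
    a[1] * b[2] - a[2] * b[1])
}

# Hypothetical helper: signed dihedral angle in degrees defined by four
# points, using the atan2 form so the result lies in (-180, 180].
dihedral <- function(p1, p2, p3, p4) {
  b1 <- p2 - p1
  b2 <- p3 - p2
  b3 <- p4 - p3
  n1 <- cross3(b1, b2)              # normal to the plane of p1, p2, p3
  n2 <- cross3(b2, b3)              # normal to the plane of p2, p3, p4
  y <- sqrt(sum(b2^2)) * sum(b1 * n2)
  x <- sum(n1 * n2)
  atan2(y, x) * 180 / pi
}

# Made-up coordinates standing in for CB, SG, SG', CB' of a bonded pair
cb1 <- c(1, 0, 0)
sg1 <- c(0, 0, 0)
sg2 <- c(0, 0, 1)
cb2 <- c(0, 1, 1)

chi3 <- dihedral(cb1, sg1, sg2, cb2)
print(chi3)
```

With real data, the four positions would come from the x, y, z coordinates of the corresponding atoms in the two cysteines listed in an SSBOND record.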
