Create a data file for residue/value mappings
This would be used by aa2mass() and possibly aa321() and aa123().
The file should be called resid.mat and contain "aa1", "aa3", and "aaMass" columns. It should be placed in the "bio3d/inst/matrices/" directory.
We then need to update the previously mentioned functions: aa2mass() and possibly aa321() and aa123().
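A minimal sketch of how a mass lookup against such a table might work. The three-row data frame is a stand-in for the proposed file, and lookup_mass() is a hypothetical name used for illustration, not the actual aa2mass() implementation.

```r
# Stand-in for the proposed table (values copied from the aa.mass listing
# pasted later in this thread).
resid <- data.frame(
  aa3    = c("ALA", "GLY", "SER"),
  aa1    = c("A", "G", "S"),
  aaMass = c(71.078, 57.051, 87.077),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not the real aa2mass()): look up residue masses by
# three-letter code; unknown codes give NA plus a warning.
lookup_mass <- function(codes, table = resid) {
  idx <- match(toupper(codes), table$aa3)
  if (anyNA(idx)) {
    warning("unknown residue(s): ",
            paste(unique(codes[is.na(idx)]), collapse = ", "))
  }
  table$aaMass[idx]
}

lookup_mass(c("GLY", "ALA", "SER"))
```

Keeping the masses in a table rather than in the function body means new residues can be added without touching the code.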
Comments (11)
-
-
I tried with a version in inst/matrices. Are we sure we shouldn't put it in data/?
Also, I previously put atom.index.R and sdENM.RData in data/. Is this OK?
-
reporter It depends. If they are binary .rda (or, as I call them, .RData) files, then put them in data/. If they are scripts, tables, text, etc., put them in matrices/. Does this not sound sensible?
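For illustration, a sketch of how a plain-text table under inst/matrices/ could be located at run time. Files under inst/ are installed at the package's top level and found with system.file(), whereas binary files in data/ are loaded with data(). The file name resid.mat is the one proposed in this issue, and the assumption that it has a header row is mine.

```r
# system.file() returns "" when the package or the file is not available,
# so the lookup is guarded rather than assumed to succeed.
path <- system.file("matrices", "resid.mat", package = "bio3d")

if (nzchar(path)) {
  # Assumes a whitespace-delimited text table with a header row.
  resid <- read.table(path, header = TRUE, stringsAsFactors = FALSE)
} else {
  message("resid.mat not found (bio3d not installed, or file not shipped)")
}
```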
-
aa.mass was included some time ago for this purpose. It's looked up by aa2mass(), but not by aa321() and aa123().
> aa.mass
    aa3 aa1  aaMass formula       name
ALA ALA   A  71.078 C3 H5 N O1    Alanine
ARG ARG   R 157.194 C6 H13 N4 O1  Arginine
ASN ASN   N 114.103 C4 H6 N2 O2   Asparagine
ASP ASP   D 114.079 C4 H4 N O3    Aspartic Acid
CYS CYS   C 103.143 C3 H5 N O1 S  Cystein
GLN GLN   Q 117.126 C4 H9 N2 O2   Glutamine
GLU GLU   E 128.106 C5 H6 N O3    Glutamic Acid
GLY GLY   G  57.051 C2 H3 N O1    Glycine
HIS HIS   H 137.139 C6 H7 N3 O1   Histidine
ILE ILE   I 113.158 C6 H11 N O1   Isoleucine
LEU LEU   L 113.158 C6 H11 N O1   Leucine
LYS LYS   K 129.180 C6 H13 N2 O1  Lysine
MET MET   M 131.196 C5 H9 N O1 S  Methionine
PHE PHE   F 147.174 C9 H9 N O1    Phenylalanine
PRO PRO   P  97.115 C5 H7 N O1    Proline
SER SER   S  87.077 C3 H5 N O2    Serine
THR THR   T 101.104 C4 H7 N O2    Threonine
TRP TRP   W 186.210 C11 H10 N2 O1 Tryptophan
TYR TYR   Y 163.173 C9 H9 N O2    Tyrosine
VAL VAL   V  99.131 C5 H9 N O1    Valine
ABA ABA   C  85.104 C4 H7 N1 O1   alpha-aminobutyric acid
ASH ASH   X 115.087 C4 H5 N O3    Aspartic acid Neutral
CME CME   C 179.260 C5 H9 N O2 S2 s,s-(2-hydroxyethyl)thiocysteine
CMT CMT   C 117.169 C4 H7 N O1 S  o-methylcysteine
CSD CSD   C 133.126 C3 H3 N O3 S  s-cysteinesulfinic acid
CSO CSO   C 119.142 C3 H5 N O2 S  s-hydroxycysteine
CSW CSW   X 135.142 C3 H5 N O3 S  cysteine-s-dioxide
CSX CSX   C 119.142 C3 H5 N O2 S  s-oxy cysteine
CYM CYM   C 102.135 C3 H4 N O1 S  Cystein Negative
CYX CYX   C 102.135 C3 H4 N O1 S  Cystein SSbond
GLH GLH   X 129.114 C5 H7 N O3    Glutatmic acid Neutral
HID HID   H 137.139 C6 H7 N3 O1   Histidine
HIE HIE   H 137.139 C6 H7 N3 O1   Histidine
HIP HIP   H 138.147 C6 H8 N3 O1   Histidine Positive
HSD HSD   H 137.139 C6 H7 N3 O1   Histidine
HSE HSE   H 137.139 C6 H7 N3 O1   Histidine
HSP HSP   H 138.147 C6 H8 N3 O1   Histidine Positive
IAS IAS   D 115.087 C4 H5 N O3    beta-aspartyl
KCX KCX   X 172.182 C7 H12 N2 O3  lysine nz-carboxylic acid
LYN LYN   X 128.172 C6 H12 N2 O1  Lysine Neutral
MHO MHO   M 147.195 C5 H9 N O2 S  s-oxymethionine
MLY MLY   K 156.225 C8 H16 N2 O1  n-dimethyl-lysine
MSE MSE   M 131.196 C5 H9 N O1 SE selenomethionine
OCS OCS   X 169.156 C3 H7 N O5 S  cysteinesulfonic acid
PFF PFF   Y 165.164 C9 H8 F N O1  4-fluoro-l-phenylalanine
PTR PTR   X 243.153 C9 H10 N O5 P o-phosphotyrosine
SEP SEP   S 167.057 C3 H6 N O5 P  phosphoserine
TPO TPO   T 181.084 C4 H8 N O5 P  phosphothreonine
-
reporter I guess they should all point to the same file/table. However, aa321() etc. are not currently broken, so I suggest we leave this migration on the back burner until we next need to update these functions for other reasons. One reason to hold off is that, for aa123(), the table pasted above contains multiple "X" entries mapping to different residues, where it should return UNK, I think.
-
Right, but can't we map KCX to K, LYN to K, etc. in this table (as we apparently do here for TPO to T)?
-
- marked as minor
-
reporter Yes, if we want to use this for aa321(), we should have KCX to K and all the other common non-standard modified residue mappings we currently have in aa321().
Note that we can do the 3-to-1 mapping like this, but not uniquely the other way (1-to-3), unless we demand that only the first instance of a one-letter code maps to the standard three-letter version (i.e. H to HIS before H to HSD etc., as in the table pasted in the earlier message).
I see you have linked this issue to issue #82. For that, are you proposing that atom.select() also use this list of protein residue names in place of the hardcoded 'prot.aa' variable on line 71 of that function?
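The first-instance rule for 1-to-3 mapping can be sketched with match(), which returns the position of the first matching row. The function names and the five-row table below are hypothetical stand-ins, not the current aa321()/aa123() code. The rows are taken from the table pasted earlier, with the standard residues listed first.

```r
tbl <- data.frame(
  aa3 = c("HIS", "LYS", "HSD", "HIE", "MLY"),
  aa1 = c("H", "K", "H", "H", "K"),
  stringsAsFactors = FALSE
)

# Hypothetical 3-to-1 conversion: each three-letter code has exactly one
# one-letter code; anything not in the table becomes "X".
three_to_one <- function(x, table = tbl) {
  out <- table$aa1[match(toupper(x), table$aa3)]
  out[is.na(out)] <- "X"
  out
}

# Hypothetical 1-to-3 conversion: match() picks the FIRST matching row, so
# "H" resolves to HIS only because HIS precedes HSD and HIE in the table.
one_to_three <- function(x, table = tbl) {
  out <- table$aa3[match(toupper(x), table$aa1)]
  out[is.na(out)] <- "UNK"
  out
}

three_to_one(c("HSD", "MLY", "FOO"))  # "H" "K" "X"
one_to_three(c("H", "K", "X"))        # "HIS" "LYS" "UNK"
```

With this approach the row order of the table becomes part of its contract, so reordering rows would silently change what aa123()-style conversion returns.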
-
The list of residue names should perhaps primarily be used for sequence work (e.g. 3-to-1 mapping), with the VMD approach used for atom.select(). In any case, we should aim to have only one list of residue names, which I was hoping would solve some of these issues and at the same time be easier to maintain. In the current version there are three slightly different lists (atom.select(), aa321()/aa123(), and the aa.mass table).
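One way to see where such lists diverge before merging them is a set comparison. The three vectors below are made-up stand-ins for illustration only. The real lists in atom.select(), aa321()/aa123(), and aa.mass have different contents.

```r
# Illustrative stand-ins for the three residue lists (not the real contents).
lists <- list(
  atom.select = c("ALA", "GLY", "HIS", "HSD", "MSE"),
  aa321       = c("ALA", "GLY", "HIS", "HSD", "KCX"),
  aa.mass     = c("ALA", "GLY", "HIS", "KCX", "MSE")
)

# Residue names present in every list.
common <- Reduce(intersect, lists)

# For each list, the names it lacks relative to the union of all three.
all_res <- Reduce(union, lists)
missing <- lapply(lists, function(x) setdiff(all_res, x))

common
missing
```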
-
reporter Let's start by using this list for atom.select() and 3-to-1 conversion. Migrating atom.select() to the VMD approach, which is based on atom names within a residue, will likely cause other issues. Having a single consistent list, as you say, will be an improvement.
-
- changed status to resolved
I think this is solved with the aa.table data file. aa321() has been updated.