PCA-app, tab #4: PCA

Issue #233 resolved
Lars Skjærven created an issue
  • The regular PC1vPC2 plot should have increased xlim and ylim values so labels are not clipped.
  • The interactive scree plot needs the cumulative variance as labels for the points (or potentially better as a separate axis in position 4).

  • We should implement PC subspace clustering here (in addition to RMSD clustering).

  • We could have toggling between RMSD clustering and PC based clustering.
  • The initial implementation could be confined to distances in PC1-2 or PC1-3 space, with a later update to include the PCs covering 90% of the original variance.

  • The Residue loadings plot should be called 'Residue Contributions'.

  • It should have RMSF on by default and have the axis labeled.
  • The dropdown could have a '1,2,3' option to see the top 3 PCs together in one plot (and a '1 to 10' option as well).

  • We should explore having an interactive alternative so positions can be identified more easily.

  • See previous comments about the DT table (https://bitbucket.org/Grantlab/bio3d/issue/230/pca-app-tab-1-search). The "RMSD cluster" column should be called just "Cluster" or "RMSD/PCA cluster".
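As a rough illustration of the two variance-related points above (cumulative variance labels for the scree plot, and choosing the PCs that cover 90% of the original variance), here is a minimal sketch. It is written in Python for brevity and is not the app's actual code; the function names `cumulative_variance` and `n_pcs_for` and the eigenvalues are made up for the example.

```python
from itertools import accumulate


def cumulative_variance(eigenvalues):
    """Return the cumulative percentage of variance captured by PCs 1..k."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive value")
    return [100.0 * s / total for s in accumulate(eigenvalues)]


def n_pcs_for(eigenvalues, threshold=90.0):
    """Smallest number of leading PCs whose cumulative variance reaches threshold (%)."""
    for k, pct in enumerate(cumulative_variance(eigenvalues), start=1):
        if pct >= threshold:
            return k
    return len(eigenvalues)


# Made-up eigenvalues, sorted from largest to smallest, for illustration only.
eigenvalues = [50.0, 25.0, 15.0, 6.0, 4.0]

# One text label per scree-plot point.
labels = [f"{pct:.1f}%" for pct in cumulative_variance(eigenvalues)]

# Number of PCs to keep for subspace clustering at the 90% level.
n_keep = n_pcs_for(eigenvalues, threshold=90.0)
```

The same cumulative values could feed either the point labels or a secondary axis; which reads better is a plotting decision this sketch does not address.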

Comments (13)

  1. Barry Grant

    Nice set of recent updates, Lars. Have you come across this error when increasing the number of structure hits?

    Error: 'getCharCE' must be called on a CHARSXP
    
  2. Barry Grant

    I saw it previously with 2LUM on your version of the app when increasing the 'Limit hits' number.

    I was trying to reproduce it just now with fewer hits but got a different error. It turns out N=43 works OK.

    However, with N=66 and above I now get a different error on the Fit tab "Error: invalid 'times' argument".

    Also, using an old version of PyMol (v1.5), I see the attached interesting red background with the fit N=44 pse download.

    With the current PyMol 1.7.6.0 version I see a regular black background, but only two molecules are colored (red and green); the rest are gray. All molecules are shown as 'all atom' (what PyMol calls 'lines') and not as the calpha trace colored by molecule that I was expecting. It is easy to turn on 'ribbon' to see things more clearly, I guess, but how do you color by molecule?

    Basically, I think it would be good to have the 'color by molecule' (and perhaps 'ribbon' display) set up in the downloaded script.

    For VMD I was considering having the occupancy field changed to alignment position so the user could select and display equivalent positions across the ensemble more easily. Do you think this will work easily in PyMol also? Or should we leave the original files alone?
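To make the occupancy idea concrete, here is a minimal sketch in Python (not the app's code). It assumes each structure's ATOM records follow the same residue order as its gapped sequence in the alignment, and it overwrites the PDB occupancy field (columns 55-60) with the 1-based alignment column. The names `alignment_columns`, `set_occupancy_to_alignment` and `ca_line` are hypothetical.

```python
def alignment_columns(aligned_seq, gap="-"):
    """1-based alignment column of each non-gap residue, in sequence order."""
    return [i for i, ch in enumerate(aligned_seq, start=1) if ch != gap]


def set_occupancy_to_alignment(pdb_lines, aligned_seq):
    """Return PDB lines with occupancy replaced by alignment column.

    Assumes the ATOM records contain exactly the residues of aligned_seq,
    in the same order. Non-ATOM lines are passed through unchanged.
    """
    cols = alignment_columns(aligned_seq)
    out, res_index, last_key = [], -1, None
    for line in pdb_lines:
        if line.startswith("ATOM  ") and len(line) >= 60:
            key = line[21:27]  # chain ID + residue number + insertion code
            if key != last_key:
                res_index += 1
                last_key = key
            if res_index >= len(cols):
                raise ValueError("more residues in structure than in aligned sequence")
            # Occupancy occupies columns 55-60 (0-based slice 54:60).
            line = line[:54] + f"{cols[res_index]:6.2f}" + line[60:]
        out.append(line)
    return out


def ca_line(serial, resname, chain, resseq):
    """Build a fixed-width C-alpha ATOM record with dummy coordinates (example only)."""
    return (f"ATOM  {serial:5d}  CA  {resname:3s} {chain}{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{0.0:6.2f}")


# Three residues aligned as "A-GK": they sit in alignment columns 1, 3 and 4.
example = [ca_line(1, "ALA", "A", 10), ca_line(2, "GLY", "A", 11), ca_line(3, "LYS", "A", 12)]
relabelled = set_occupancy_to_alignment(example, "A-GK")
```

With files written this way, selecting on occupancy in either viewer would pick out equivalent positions across the ensemble, at the cost of losing the original occupancy values.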

    pymol_wierd.png

  3. Lars Skjærven reporter

    Your first error might have been a segmentation fault (?), or did the error show up with red text in the browser? I've seen the seg faults a couple of times when running over many structures, but I think I've managed to limit them now by adding a few calls to gc().

    The "invalid 'times' argument" error comes from the core.find() function when length(res.still.in) <= stop.at. I've now added a fix for this on the master branch (throwing an error), and added a more meaningful error message for the shiny app.
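A guard of the kind described might look like the sketch below. It is Python, not the actual bio3d fix; `check_core_inputs` is a hypothetical helper, and only the condition (residues still in <= stop.at) comes from the description above.

```python
def check_core_inputs(n_res_still_in, stop_at):
    """Fail early with a readable message when too few residues remain.

    Mirrors the condition length(res.still.in) <= stop.at from the comment above.
    """
    if n_res_still_in <= stop_at:
        raise ValueError(
            f"core finding needs more than {stop_at} residues, "
            f"but only {n_res_still_in} are available"
        )


# Enough residues: passes silently.
check_core_inputs(n_res_still_in=120, stop_at=15)

# Too few residues: caught and turned into a message the UI could show.
try:
    check_core_inputs(n_res_still_in=15, stop_at=15)
    message = None
except ValueError as err:
    message = str(err)
```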

    Pymol 1: I think we will have some version compatibility problems with these session files. However, you can probably just do a "bg black" and it will show as expected. We can also provide the pymol script files which are used to make the session files. These are less prone to compatibility issues.

    Pymol 2: They are colored by their cluster ID. You can choose this under the structure color setting in the PDBs Viewing Options panel.

    Pymol 3: We can probably use 'ribbon' as the default, as you suggest. We could also add this as an option in the pca-app (e.g. ribbon, cartoon, lines), but this is probably not needed since it's relatively straightforward to change the representation in pymol.

    VMD: Good idea. This should work in pymol as well! We can try it and see how it works in practice.
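For the script-file route, a downloadable PyMOL script with a black background, ribbon display and one color per molecule could be generated along these lines. This is a sketch in Python rather than the app's code; `pymol_script` and `PALETTE` are hypothetical names, the file names are placeholders, and object names are assumed to be the file names without extension. The emitted lines use the standard PyMOL commands `bg_color`, `load`, `color`, `hide`, `show` and `orient`.

```python
import os
from itertools import cycle

# Named PyMOL colors, reused in order if there are more structures than colors.
PALETTE = ["red", "green", "blue", "yellow", "magenta", "cyan", "orange"]


def pymol_script(pdb_files, representation="ribbon", background="black"):
    """Return the text of a .pml script that loads and styles the given files."""
    lines = [f"bg_color {background}"]
    for path, color in zip(pdb_files, cycle(PALETTE)):
        name = os.path.splitext(os.path.basename(path))[0]
        lines.append(f"load {path}, {name}")
        lines.append(f"color {color}, {name}")
    lines.append("hide everything")
    lines.append(f"show {representation}")
    lines.append("orient")
    return "\n".join(lines) + "\n"


# Placeholder file names, for illustration only.
script = pymol_script(["fitted/1abc_A.pdb", "fitted/2xyz_A.pdb"])
```

Because the script is plain text, it sidesteps the session-file version problem, and switching from per-molecule to per-cluster coloring would only change which color is paired with each object.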

  4. Barry Grant

    Minor things:

    On the PCA tab the conformer plot and the table with annotations need to be one after the other.

    Also, selecting rows in the table no longer highlights the corresponding points in the conformer plot. Was this removed on purpose, and if so, why?

    The loadings plots should have a default option to plot multiple PC 'Residue contributions' all on the same graph (just like the NMA 'Residue fluctuations') and a similar 'Spread lines' option but without the axis(2) being shown.

    I vote for dropping the '3D scatter (three-js)' option.

  5. Lars Skjærven reporter

    Fixed. Note that for multiple PC residue contributions on the same graph I removed the RMSF line.
