How to extract data from PCA?

Issue #840 resolved
Vinnarasi created an issue

I ran a PCA on my protein system. I could only save the bio3d plots as image files, but I am unable to save the underlying data to a file. Could you explain how to extract the PCA data from bio3d?

Comments (11)

  1. Xinqiu Yao

    It’s not clear what you are looking for. Are you talking about the file format of the figure or the format of the PCA results such as the conformational projections, scree, etc.?

  2. Vinnarasi reporter

    Thank you for your reply. I obtained the PC1 vs PC2, PC1 vs PC3, PC2 vs PC3, and eigenvalue rank vs proportion of variance figures from bio3d using the command below:
    "plot(pc, col=grps)"

    How can I save the data behind each plot (PC1 vs PC2, PC1 vs PC3, ...) to a file from R, for example in .dat or .csv format?

  3. Xinqiu Yao

    The raw data are stored in the ‘pc’ object. For example, to save the data for PC1 vs PC2, use write.table(pc$z[, 1:2], file='pc1_pc2.dat', col.names=FALSE, row.names=FALSE). The first column is for PC1 and the second for PC2. Columns are separated by spaces. If you want CSV format, use write.csv() instead.
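
The export described above can be sketched as follows. So that the example runs without bio3d, it builds a stand-in `pc` list from base R's prcomp() with the same `z` (projections) and `L` (eigenvalues) components; the random input matrix and the column names "PC1" and "PC2" are assumptions for illustration only. With real data, `pc` would be the object already used in plot(pc, col=grps).

```r
# Stand-in for a bio3d PCA object, built with base R so the sketch is runnable.
# With real data, skip this part and use your existing 'pc' object.
set.seed(1)
xyz <- matrix(rnorm(50 * 12), nrow = 50)  # 50 hypothetical frames, 12 coordinates
fit <- prcomp(xyz)
pc  <- list(z = fit$x, L = fit$sdev^2)    # z: projections, L: eigenvalues

# Projections onto PC1 and PC2, one row per structure/frame
scores <- pc$z[, 1:2]
colnames(scores) <- c("PC1", "PC2")       # hypothetical labels for the CSV header

# Space-separated file without header or row names
write.table(scores, file = "pc1_pc2.dat", col.names = FALSE, row.names = FALSE)

# Comma-separated file with a header row
write.csv(scores, file = "pc1_pc2.csv", row.names = FALSE)
```

Changing the column index (for example pc$z[, c(1, 3)]) gives the data for the other panels.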

  4. Vinnarasi reporter

    Thank you for your fast reply. It is working fine now. Can I get raw data for Eigenvalue Rank vs Proportion of variance (%)?

  5. Xinqiu Yao

    Eigenvalues are in ‘pc$L’, and the proportion is simply the eigenvalue divided by the sum of all eigenvalues. For example, you could write a table with the eigenvalues in the first column and the proportions in the second using write.table(cbind(pc$L, pc$L/sum(pc$L)), file='eigen.dat', row.names=FALSE, col.names=FALSE)

  6. Vinnarasi reporter

    I used the command you provided (write.table(cbind(pc$L, pc$L/sum(pc$L)), file='eigen.dat', row.names=FALSE, col.names=FALSE)) to get the raw data for eigenvalue rank vs proportion of variance, but I got an entirely different graph, and the values do not match the original. I have attached both plots for your reference.

  7. Xinqiu Yao

    Modify the code so that the first column is the eigenvalue rank rather than the eigenvalue itself: write.table(cbind(1:length(pc$L), pc$L/sum(pc$L)), file='eigen.dat', row.names=FALSE, col.names=FALSE)
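
A slightly fuller version of this table might look like the sketch below. It uses a stand-in `pc` list built with prcomp() (an assumption, so the code runs without bio3d) and writes rank, eigenvalue, proportion, and percent side by side. The `percent` column is included on the assumption that the plotted y-axis, labelled "Proportion of variance (%)" in this thread, shows the proportion multiplied by 100; the column names are illustrative.

```r
# Stand-in for the eigenvalues of a bio3d PCA object (pc$L).
# With real data, use your existing 'pc' object instead.
set.seed(1)
fit <- prcomp(matrix(rnorm(50 * 12), nrow = 50))
pc  <- list(L = fit$sdev^2)

scree <- data.frame(
  rank       = seq_along(pc$L),          # x-axis of the scree plot
  eigenvalue = pc$L,
  proportion = pc$L / sum(pc$L),         # fraction of total variance (0 to 1)
  percent    = 100 * pc$L / sum(pc$L)    # assumed scale of the plotted y-axis
)

# Space-separated file with a header row
write.table(scree, file = "eigen.dat", row.names = FALSE,
            col.names = TRUE, quote = FALSE)
```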

  8. Vinnarasi reporter

    Thank you for your reply. I used the command you just provided, but I am still getting different values. I have attached the output for your reference. Thank you

  9. Xinqiu Yao

    Hi,

    It is actually the same; only the x-scale is different. I don’t know what plotting software you use. You should adjust the x-scale (0 to 20) and possibly plot a mixture of lines and points. Then you will probably get the same or a similar figure. On the other hand, if you want the same plot, why not just use the one output by bio3d? You can always generate a PDF file and edit it with Illustrator, for example.
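
For anyone re-plotting the exported file in R itself, the adjustments suggested above (x-scale from 0 to 20, lines plus points, PDF output) could be sketched like this. The file is first generated from stand-in eigenvalues so the example is self-contained; the file names, figure size, and the factor of 100 (assuming the original axis is in percent) are illustrative assumptions, not bio3d's own plotting code.

```r
# Create a stand-in 'eigen.dat' (rank, proportion) so the sketch is runnable.
# With real data, this file comes from the write.table() call in the thread.
set.seed(1)
L <- prcomp(matrix(rnorm(50 * 25), nrow = 50))$sdev^2
write.table(cbind(seq_along(L), L / sum(L)), file = "eigen.dat",
            row.names = FALSE, col.names = FALSE)

# Read the two columns back and plot them into a PDF
d <- read.table("eigen.dat", col.names = c("rank", "proportion"))

pdf("scree.pdf", width = 5, height = 4)
plot(d$rank, 100 * d$proportion,   # 100 * assumes the original axis is in percent
     type = "b",                   # "b" draws both lines and points
     xlim = c(0, 20),              # same x-scale as suggested in the thread
     xlab = "Eigenvalue Rank",
     ylab = "Proportion of Variance (%)")
dev.off()
```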
