Can PCA be used to draw a phylogenetic tree?

Issue #601 resolved
Cheng created an issue

I combined the DCD trajectory data of multiple mutants into one set, then ran PCA.

> plot(pc, col=bwr.colors(nrow(xyz)))

Picture1.png

There are 5454 dots, and each consecutive block of 303 dots comes from one mutant. Can I use the PCA information to draw a phylogenetic tree? e.g.

e2f14087f4afec5fd573b68f8c03cd4b0b35c60d.jpg

Because there are so many dots, I think the PCA plot cannot capture their similarity very well.

Thank you!

Comments (8)

  1. Xinqiu Yao

    Hi Cheng,

    What do you want to do a phylogenetic analysis over? Are "A", "B", "C", etc. individual conformations or conformational ensembles from different mutant/WT simulations?

  2. Cheng reporter

    Hi Xin-Qiu, in my case, I would like "A", "B", "C" to each be one block of 303 dots, as those dots belong to the same mutant. The 303 frames of a mutant come from 3 repeats x 101 frames.

  3. Cheng reporter

    I replotted it with one colour per mutant, so 303 dots * 18 mutants = 5454 dots. Can I analyse the similarities among the mutants (e.g. as a phylogenetic tree)?

    overall.png

  4. Cheng reporter

    I have already assigned a unique colour to each mutant, as shown in the last plot. Is that what you meant by "clustering in the PC space"?

    I agree the scattering of different colours could already provide some information about the similarity among the mutants.

  5. Barry Grant

    I mean cluster on the pc$z component of the PCA output. Read some of the online vignettes to get a feel for this.

    Use standard R functions such as dist(), hclust() with the "ward.D2" method, and plot() to get your tree-like dendrogram, if that is what you really want ;-)

  6. Cheng reporter

    Yes, plot(pc$z[,c(1,2)]) is the same as the first panel of plot(pc).

    I will read up on those standard R functions, thank you!
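
The suggestion in comment 5 (dist(), hclust() with "ward.D2", then plot()) can be sketched roughly as below. This is only an illustration under stated assumptions, not the Bio3D authors' workflow: the score matrix `z` is simulated so the block runs without trajectory data, and with real data you would presumably use `z <- pc$z` instead. The frames are assumed to be stored mutant by mutant in blocks of 303, the choice of 5 PCs is arbitrary, and averaging each mutant's frames into a centroid is one possible way (not mentioned in the thread) to get a tree with one leaf per mutant rather than one per frame.

```r
# Stand-in for pc$z: simulated PC scores, NOT real trajectory data.
set.seed(1)
n_mut <- 18    # number of mutants (from the thread)
n_frm <- 303   # frames per mutant (from the thread)
n_pc  <- 5     # PCs to keep; an arbitrary choice for this sketch

# Give each simulated mutant its own centre in PC space, plus per-frame noise
centres <- matrix(rnorm(n_mut * n_pc, sd = 10), nrow = n_mut)
z <- centres[rep(seq_len(n_mut), each = n_frm), ] +
  matrix(rnorm(n_mut * n_frm * n_pc), ncol = n_pc)

# Assumption: rows are ordered mutant by mutant, 303 rows each
mut_names <- paste0("mut", seq_len(n_mut))   # hypothetical labels
mutant <- factor(rep(mut_names, each = n_frm), levels = mut_names)

# PC1 vs PC2 scatter with one colour per mutant
plot(z[, 1:2], col = rainbow(n_mut)[mutant], pch = 16, cex = 0.5,
     xlab = "PC1", ylab = "PC2")

# One row per mutant: the mean position of its frames in PC space
centroids <- do.call(rbind, lapply(split(as.data.frame(z), mutant), colMeans))

# Euclidean distances between mutants, Ward clustering, dendrogram
d  <- dist(centroids)
hc <- hclust(d, method = "ward.D2")
plot(hc, main = "Mutants clustered in PC space", xlab = "", sub = "")
```

Clustering the individual frames instead uses the same calls on `dist(z)`, followed by for example `cutree()` and `table()` against the mutant labels, but the distance matrix for 5454 frames is much larger. Note that the result is a dendrogram of conformational similarity, not a phylogenetic tree in the evolutionary sense.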
