"Overlap" function to calculate the similarity of two principal components
Hello, I would like to calculate the similarity between two PCs from a PCA. Here is my code:
pc <- pca.xyz(xyz[,ca.inds$xyz])
combine <- rbind(pc$au[, 1],pc$au[, 2]) #combine the eigenvector1 and eigenvector2
dv <- difference.vector(combine)
nor_dv <- dv / sqrt(sum(dv^2)) #normalize dv
o <- overlap(pc$au, nor_dv,nmodes=30)
But when I checked the “overlap” function carefully in the tutorial, I found that this approach might be wrong. The documentation says it returns: “a numeric vector of the squared dot products (overlap values) between the (normalized) vector (dv) and each mode in mode.”
Is there a way to calculate the similarity of two PC eigenvectors for a single ensemble system? Does “overlap” calculate the squared dot products of pc$au and dv in my code?
Comments (4)
-
reporter Hi Xinqiu,
Thank you so much for clarifying it! I misunderstood pc$au and thought it contained the normalized eigenvectors. Are the vectors in “U” normalized?
If I edit it as:
pc <- pca.xyz(xyz[,ca.inds$xyz])
combine <- rbind(pc$U[, 1],pc$U[, 2]) #combine the eigenvector1 and eigenvector2
dv <- difference.vector(combine)
nor_dv <- dv / sqrt(sum(dv^2)) #normalize dv
o <- overlap(pc$au, nor_dv,nmodes=20)
Is this the correct way to calculate the similarity of two PCs? Also, may I ask what formula the “overlap” function uses?
-
I don’t see the point of calling difference.vector(). If you simply want to compare all eigenvectors to PC1, type overlap(pc, pc$U[, 1]). But, of course, the result is trivial, because all eigenvectors are orthogonal, so you will see either 1 or very tiny numbers close to 0.
I think the function calculates a squared dot product between two vectors. You can type overlap at the R prompt and press Return to check the code.
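To illustrate the two points above, here is a minimal base R sketch, not bio3d's actual source. It assumes the overlap with mode i is (u_i . dv / |dv|)^2, as the documentation quote suggests. The helper name squared_overlap and the toy data are hypothetical, and eigen(cov(X))$vectors stands in for pc$U.

```r
# Hypothetical helper (not bio3d's code): squared overlap between a
# vector dv and each column of an eigenvector matrix U.
squared_overlap <- function(U, dv) {
  dv <- dv / sqrt(sum(dv^2))        # normalize dv to unit length
  as.vector(crossprod(U, dv))^2     # (u_i . dv)^2 for every column u_i
}

set.seed(1)
X <- matrix(rnorm(40 * 6), nrow = 40)   # toy data: 40 frames, 6 coordinates
U <- eigen(cov(X))$vectors              # orthonormal eigenvectors in columns

o <- squared_overlap(U, U[, 1])         # compare PC1 with every eigenvector
print(round(o, 10))
```

Because the eigenvectors are orthonormal, the first value is 1 and the rest are 0 up to rounding, which is the trivial result described above.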
First, “pc$au” holds the atom-wise loadings, not the eigenvectors. “U” contains the eigenvectors (columns 1, 2, etc. are PC axes 1, 2, etc.).
Second, to calculate the overlap, pass the whole “pc” object as the first argument, not one of its components. For example,
overlap(pc, dv, nmodes=XXX).
Your definition of “dv” is a bit confusing. What are you seeking here? Normally, “dv” is some vector you are interested in, not necessarily always a difference between two vectors.
Hope it helps.
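As a rough sketch of what a more meaningful “dv” could look like, the base R example below uses the displacement between two frames of a toy ensemble and computes its squared overlap with each PC. The data are synthetic, eigen(cov(X))$vectors is only a stand-in for pc$U, and the calculation assumes the squared-dot-product definition quoted from the documentation rather than reproducing bio3d's code.

```r
set.seed(2)
X <- matrix(rnorm(50 * 6), nrow = 50)   # toy ensemble: 50 frames, 6 coordinates
U <- eigen(cov(X))$vectors              # stand-in for pc$U (eigenvectors in columns)

dv <- X[2, ] - X[1, ]                   # vector of interest: displacement between two frames
dv <- dv / sqrt(sum(dv^2))              # normalize to unit length

ov <- as.vector(crossprod(U, dv))^2     # squared overlap of dv with each PC
ov_cum <- cumsum(ov)                    # cumulative squared overlap

print(round(ov, 3))
print(round(ov_cum, 3))
```

Here the cumulative overlap reaches 1 once all six PCs are included, because the full set of eigenvectors spans the whole coordinate space; with fewer modes it shows how much of dv the leading PCs capture.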