How to identify which PDBs are which on PCA and NMA plots

Issue #756 new
Former user created an issue

Hi everyone, I was just wondering if there is an easy way to show which PDB IDs correspond to which points in my PCA plot.

I've tried using the command: "identify(pc.xray$z[,1:2], labels=basename.pdb(pdbs$id))". The command does work, but the points in my PCA plot are tightly clustered, so after clicking on each point the labels pile up into a mess and it is hard to tell which PDB is where.

Please feel free to reach out and let me know.

Comments (2)

  1. Xinqiu Yao

    Hi,

    You might want to try the ggplot2 and ggrepel packages for this purpose. The following example labels all structures automatically in a “self-repulsive” manner, so that the labels do not overlap:

    # An example PCA
    library(bio3d)
    attach(transducin)
    pc <- pca(pdbs, fit=TRUE)
    
    # First, create a data frame for plotting.
    dat <- data.frame(x=pc$z[, 1], y=pc$z[, 2], 
       lab=pdbs$id, col=annotation[, "color"])
    
    # Load packages. Need to install first if absent.
    library(ggplot2)
    library(ggrepel)
    
    # Plot
    ggplot(dat, aes(x=x, y=y, color=col, label=lab)) +
       geom_point() +
       geom_text_repel(size=3)
    

    It may not always be perfect, but after some tweaking you may get a pretty decent result. See the manuals of both packages for more detail.
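
    As a rough sketch of what such tweaking might look like: the variant below assumes the "color" column holds valid R color names, so scale_color_identity() is used to draw those colors literally rather than treating them as categories. It also sets a few geom_text_repel() options (max.overlaps, box.padding, seed); the values are arbitrary starting points, not recommendations.

    ```r
    library(bio3d)
    library(ggplot2)
    library(ggrepel)

    # Same example PCA as above
    attach(transducin)
    pc <- pca(pdbs, fit=TRUE)

    # Shorter labels via basename.pdb(); colors kept as plain strings
    dat <- data.frame(x=pc$z[, 1], y=pc$z[, 2],
       lab=basename.pdb(pdbs$id), col=annotation[, "color"],
       stringsAsFactors=FALSE)

    p <- ggplot(dat, aes(x=x, y=y, color=col, label=lab)) +
       geom_point() +
       # Use the color names in 'col' as-is (assumes they are valid R colors)
       scale_color_identity() +
       geom_text_repel(
          size=3,
          max.overlaps=Inf,   # never drop a label, even in dense clusters
          box.padding=0.5,    # extra space around each label
          seed=1              # reproducible label placement
       ) +
       labs(x="PC1", y="PC2")

    print(p)
    ```

    If the plot is still too crowded, one option is to set dat$lab to "" for the points you do not need labeled; those points stay on the plot but get no text.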

    Hope it helps.
