Is it possible to do charge and/or hydrophobicity analysis of multiple aligned sequences in Bio3d?

Issue #175 resolved
Former user created an issue

No description provided.

Comments (3)

  1. Barry Grant

    Yes you can look at over 500 published physicochemical and biological property scales using the aa.index data and bio3d.

    For example below we examine the hydrophobicity of aligned sequences (using the Kyte-Doolittle Hydropathy index). For simple net charge you could use the scale from Klein et al., (1984) i.e. aa.index[["KLEP840101"]] See http://www.genome.jp/aaindex/ for more details on available scales/indexes

    library(bio3d)
    
    ## Read an example alignment
    file <- system.file("examples/hivp_xray.fa",package="bio3d")
    aln <- read.fasta( file )
    
    ## Load AAindex data
    data(aa.index)
    
    ## Find Kyte-Doolittle Hydropathy index (Kyte-Doolittle, 1982)
    ##  grep for author "Kyte" to loacte this entry 
    ind <- which(sapply(aa.index, function(x) length(grep("Kyte", x$A)) != 0))
    
    ## Store the 'index' values for the 20 standard amino acids
    index <- aa.index[[ind]]$I
    
    # print these values
    index
    
    ## Plot vector of hydropathy vales for the first sequence
    plot( index[ aln$ali[1,] ], typ="l", 
         xlab="Alignment position", ylab="Hydropathy (Kyte-Doolittle, 1982)")
    
    ## Store values for all alignment sequences
    hydro.aln <- NULL
    for(i in 1:nrow(aln$ali)) {
      hydro.aln <- rbind(hydro.aln, index[ aln$ali[i,] ])
    }
    
    ## Plot mean value per alignment position
    hydro.ave <- apply(hydro.aln,2,mean,na.rm=TRUE)
    plot(hydro.ave, typ="o",
         xlab="Alignment position", ylab="Hydropathy (Kyte-Doolittle, 1982)")
    

    Looking at larger alignments can be helpful, e.g.

    ## Example 2. Look at a larger PFAM alignment and remove gap positions from plot
    file <- "http://pfam.xfam.org/family/PF00503/alignment/seed/format?format=fasta"
    aln <- read.fasta( file )
    
    hydro.aln <- NULL
    for(i in 1:nrow(aln$ali)) {
      hydro.aln <- rbind(hydro.aln, index[ aln$ali[i,] ])
    }
    
    hydro.ave <- apply(hydro.aln,2,mean,na.rm=TRUE)
    ngaps <- gap.inspect(aln)$f.inds
    
    plot(hydro.ave[ngaps], typ="o")
    
  2. Log in to comment