Calculating abundance for each bin

Issue #25 resolved
Camilla Nesbø created an issue

Is there a way to get median depth for each bin? To use as abundance? (i.e. is there already a script or function for this) This would be a great addition to the output.

Comments (3)

  1. Don Kang

    Thanks for the suggestion. Meanwhile, you can calculate it with depth file as shown below (R code). Make it sure to have correct depth file and bin directory.

    library(foreach)
    library(matrixStats)
    
    depth <- read.table('depth.txt', as.is=T, header=T)
    depth <- depth[,-which(grepl('var$',colnames(depth)))][,-c(2,3)]
    bins <- list.files('bins_dir', pattern='.fa', full.name=T)
    mems <- foreach(f = bins) %do% {
        sub('>','',system(sprintf("grep '^>' %s", f), intern=T))
    }
    names(mems) <- sub('.fa','',basename(bins)) 
    
    bins.cv <- do.call(rbind, foreach(m=mems) %do% {
        colMedians(as.matrix(depth[match(m, depth$contigName),-1]))
    })
    dimnames(bins.cv) <- list(names(mems), colnames(depth)[-1])
    write.table(bins.cv, file='MedianCoverage.txt', row.names=T, col.names=T, quote=F, sep='\t')
    
  2. Log in to comment