Weighted RMSF
Hello, I would like to know if it is possible to calculate RMSF by taking into account a weight-variable associated with each frame. I have performed accelerated molecular dynamics simulations, where a boost potential is applied to lower energy barriers and increase conformational sampling. A re-weighting process (https://github.com/jeeberhardt/reweight) is applied to output a weight for each frame and I want to take this into account when calculating the RMSF for each carbon alpha as in the tutorial: rf <- rmsf(xyz[,ca.inds$xyz]). Please let me know if there is a way to do this.
Comments (3)
-
-
Hi Michael,
Thanks for the help.
In my.sd(),
sqrt(sum(w * (x - mean(x))^2) / sum(w))
Shouldn’t the “mean” also be weighted?
-
Hi Xinqiu,
You are correct. It should be:
weighted_mean <- sum(x * w) / sum(w)
sqrt(sum(w * (x - weighted_mean)^2) / sum(w))
- Log in to comment
I believe you can modify the RMSF source code to include weights in the RMSF calculation through a new function wrmsf:
wrmsf<-function (xyz, grpby = NULL, average = FALSE, weights = NULL)
{
if (is.null(dim(xyz)))
stop("input 'xyz' has NULL dimension")
if (!is.null(grpby)) {
if (is.pdb(grpby))
grpby <- paste(grpby$atom$resno, grpby$atom$chain,
sep = "-")
if (length(grpby) != ncol(xyz)/3)
stop("Length of 'grpby' doesn't match 'xyz'")
}
if (!is.null(weights)) {
if (length(weights) != nrow(xyz))
stop("Length of 'weights' doesn't match 'xyz'")
}
my.sd <- function(x, na.rm = FALSE, w = NULL) {
if (is.matrix(x))
apply(x, 2, my.sd, na.rm = na.rm, w = w)
else if (is.vector(x)) {
if (na.rm) {
x <- x[!is.na(x)]
w <- w[!is.na(x)]
}
if (length(x) == 0)
return(NA)
sqrt(sum(w * (x - mean(x))^2) / sum(w))
}
else if (is.data.frame(x))
sapply(x, my.sd, na.rm = na.rm, w = w)
else {
x <- as.vector(x)
my.sd(x, na.rm = na.rm, w = w)
}
}
if (is.null(weights)) {
fluct = rowSums(matrix(my.sd(xyz, na.rm = TRUE), ncol = 3,
byrow = TRUE)^2, na.rm = TRUE)
}
else {
fluct = rowSums(matrix(my.sd(xyz, na.rm = TRUE, w = weights), ncol = 3,
byrow = TRUE)^2, na.rm = TRUE)
}
if (average) {
if (ncol(xyz)%%3 == 0)
d = ncol(xyz)/3
else d = ncol(xyz)
return(sqrt(sum(fluct)/d))
}
else {
if (is.null(grpby))
return(sqrt(fluct))
else return(as.vector(tapply(sqrt(fluct), as.factor(grpby),
mean)))
}
}