DCCM (dccm.xyz.R and cov2dccm.R) - Mean substraction

Hello all,

I have been looking at the code which generates the DCCM plot from cartesian coordinates. I noticed that in dccm.xyz.R, there is an early call for subtraction of the mean of each column to each individual entry in that column.

  if(is.null(reference)) ref = colMeans(xyz)
  else ref = reference
  dxyz  <- sweep(xyz, 2, ref)

The matrix dxyz is then fed to the cov() function, generating variable covmat, which in turn is feed into the function cov2dccm as ccmat.

  covmat <- cov(dxyz)

  ccmat <- cov2dccm(covmat, ncore = ncore)

If what I have stated above is correct, then I have two questions:

1) This question pertains to the use of the R cov() function on matrix dxyz. I have been looking at the R code and what functions are called by cov(). It appears to me that cov() (in R/cov.c) goes ahead and performs a mean subtraction on the data set. If so, does this imply that the mean of each column is subtracted twice? Once generating dxyz (in dccm.xyz.R) and a second time by the cov() call? (I've included the code from cov.c that I found at the end of this message, or you can find it here: https://github.com/SurajGupta/r-source/blob/56becd21c75d104bfec829f9c23baa2e144869a2/src/library/stats/src/cov.c)

2) I noticed that the syntax for :

  if(is.null(reference)) ref = colMeans(xyz)
  else ref = reference
  dxyz  <- sweep(xyz, 2, ref)

is missing { } (after the if statement, else, and closing). Is this a problem? I can't seem to run that part of the code on my end without the brackets.

Lastly, hats off to you guys for the invaluable work you are doing for the community. I have written my own collection of scripts in different languages and programs to do analyses that are restricted to a gazillion limitations (trajectory type, file requirements, etc, etc), it is extremely nice to find such an efficient and elegant analysis suite. Many thanks again.

Carlos Andres, Sham Lab, Gao Lab, U of Minnesota

#define COV_SUM_UPDATE              \
            sum += xm * ym;     \
            if(cor) {           \
            xsd += xm * xm;     \
            ysd += ym * ym;     \
            }

#define ANS(I,J)  ans[I + J * ncx]

/* Note that "if (kendall)" and  "if (cor)" are used inside a double for() loop;
   which makes the code better readable -- and is hopefully dealt with
   by a smartly optimizing compiler
*/

/** Compute   Cov(xx[], yy[])  or  Cor(.,.)  with n = length(xx)
 */
#define COV_PAIRWISE_BODY                       \
    LDOUBLE sum, xmean = 0., ymean = 0., xsd, ysd, xm, ym;  \
        int k, nobs, n1 = -1;   /* -Wall initializing */        \
                                    \
        nobs = 0;                           \
        if(!kendall) {                      \
        xmean = ymean = 0.;                 \
        for (k = 0 ; k < n ; k++) {             \
            if(!(ISNAN(xx[k]) || ISNAN(yy[k]))) {       \
            nobs ++;                    \
            xmean += xx[k];                 \
            ymean += yy[k];                 \
            }                           \
        }                           \
        } else /*kendall*/                      \
        for (k = 0 ; k < n ; k++)               \
            if(!(ISNAN(xx[k]) || ISNAN(yy[k])))         \
            nobs ++;                    \
                                    \
        if (nobs >= 2) {                        \
        xsd = ysd = sum = 0.;                   \
        if(!kendall) {                      \
            xmean /= nobs;                  \
            ymean /= nobs;                  \
            n1 = nobs-1;                    \
        }                           \
        for(k=0; k < n; k++) {                  \
            if(!(ISNAN(xx[k]) || ISNAN(yy[k]))) {       \
            if(!kendall) {                  \
                xm = xx[k] - xmean;             \
                ym = yy[k] - ymean;             \
                                    \
                COV_SUM_UPDATE              \
            }                       \
            else { /* Kendall's tau */          \
                for(n1=0 ; n1 < k ; n1++)           \
                if(!(ISNAN(xx[n1]) || ISNAN(yy[n1]))) { \
                    xm = sign(xx[k] - xx[n1]);      \
                    ym = sign(yy[k] - yy[n1]);      \
                                    \
                    COV_SUM_UPDATE          \
                }                   \
            }                       \
            }                           \
}

Comments (4)