Parallelization: Several things to do to improve the code

Issue #54 resolved
Xinqiu Yao created an issue
  • The 32-bit integer limit on vector length was removed in R 3.0.0. So we could remove the part that splits the data matrix to avoid the serialization problem, which would simplify the code substantially. See the following message from CRAN:

LONG VECTORS

This section applies only to 64-bit platforms. There is support for vectors longer than 2^31 - 1 elements. This applies to raw, logical, integer, double, complex and character vectors, as well as lists. (Elements of character vectors remain limited to 2^31 - 1 bytes.)...serialize() to a raw vector is unlimited in size (except by resources).

  • Write a function (possibly named setup.ncore) that checks package installation, the number of CPU cores, and the user's input argument, and sets the ncore value automatically. Other functions would call this function for multicore stuff instead of repeating the same code.

  • Combining multicore with bigmemory has the potential to further enhance performance. However, some issues need to be resolved first, including how to deal with non-matrix data.

  • The "parallel" package is shipped with R and has the same "mclapply()" function. Hopefully we can safely remove the requirement on the "multicore" package.

  • Display a nicer message when ncore > 1.
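
A minimal sketch of what such a helper might look like, assuming the "parallel" package replaces "multicore". Only the name setup.ncore comes from this issue; the argument handling, the fallback to a single core, and the warning text are assumptions for illustration, not the package's actual implementation.

```r
# Hypothetical helper: the name comes from the issue, the behaviour is assumed.
setup.ncore <- function(ncore = NULL) {
  # "parallel" ships with R, but check anyway so the failure mode is explicit
  have.parallel <- requireNamespace("parallel", quietly = TRUE)

  # mclapply() relies on forking, which is not available on Windows
  can.fork <- have.parallel && .Platform$OS.type != "windows"

  if (!can.fork) {
    if (!is.null(ncore) && ncore > 1)
      warning("Multicore support is unavailable; using ncore = 1")
    return(1L)
  }

  # detectCores() may return NA when the core count cannot be determined
  avail <- parallel::detectCores()
  if (is.na(avail)) avail <- 1L

  # No user input: use every core that was detected
  if (is.null(ncore)) return(as.integer(avail))

  ncore <- as.integer(ncore)
  if (is.na(ncore) || ncore < 1L)
    stop("ncore must be a positive integer")

  # Never request more cores than the machine reports
  min(ncore, as.integer(avail))
}
```

A calling function could then start with `ncore <- setup.ncore(ncore)` and use `parallel::mclapply(..., mc.cores = ncore)` when `ncore > 1`, falling back to `lapply()` otherwise.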

Comments (5)

  1. Barry Grant

    This is good news and removing the multicore package dependency is desirable.

    A single setup function for this stuff would be a good idea but perhaps we could again keep this on a feature branch for a later release as it sounds like a substantial revision???

    Not sure how many 32-bit users we might have but the easy route might be to turn off multicore for these systems and just support 64-bit thus simplifying the setup... Thanks!

  2. Xinqiu Yao reporter

    Thanks Barry! I think it should be in the next release. The current version has been tested and it works well. If we change too much, we will need extra tests, which will possibly delay the 2.0.0 release.

  3. Lars Skjærven

    Note that the function dccm.nma() uses the

    multicore::parallel()
    

    function. Thus, before switching to package parallel we should recode this bit if package parallel does not have a parallel() function (or an equivalent).

  4. Xinqiu Yao reporter

    Thanks for mentioning this. We may consider recoding with mclapply, but not for this release.

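
As a rough illustration of the recoding discussed in comments 3 and 4: the "parallel" package provides mcparallel() and mccollect() as counterparts to multicore's parallel() and collect(), and a loop over independent tasks can often be written with mclapply() instead. The sketch below uses a made-up task function as a stand-in; it is not the code in dccm.nma().

```r
library(parallel)

# Hypothetical stand-in for the per-task work; not taken from dccm.nma()
task <- function(i) i^2

# Assumed core count for the example; forking is unavailable on Windows,
# where mclapply() only accepts mc.cores = 1
ncore <- 2L
if (.Platform$OS.type == "windows") ncore <- 1L

# One task per element; results come back as a list in input order
res <- mclapply(1:4, task, mc.cores = ncore)
res <- unlist(res)
```

With this pattern there is no explicit job handle to collect, so the code that gathers results after multicore::parallel() would no longer be needed.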