multiple calls to setup.ncore()

Issue #98 resolved
Lars Skjærven created an issue

I noticed that after commit cf07030 nma.pdbs() runs only on 1 core. The reason seems to be be that fit.xyz() is called after setup.ncore() in nma.pdbs(). fit.xyz() calls setup.ncore() over again and resets the mc.cores to 1. The workaround is to call fit.xyz() with argument ncore=ncore which makes nma.pdbs() run with the desired number of cores. Nonetheless, I don't think this behaviour is any good since we can't adjust mc.cores for individual function calls. Any thoughts?

# will run only with 1 core
modes <- nma.pdbs(pdbs, ncore=4, fit=TRUE)

# runs on 4 since fit.xyz() is not called
# setup.ncore() is called twice
modes <- nma.pdbs(pdbs, ncore=4, fit=FALSE)

Comments (9)

  1. Barry Grant

    Do you think we need the flexibility to set 'ncores' to different values within a single function like nma.pdbs()? I am fan of keeping things simple and using the 'ncore=ncore' call to fit.xyz() within nma.pdbs(). If someone wants to use a different number of cores for each step then they can call the individual functions in turn as desired. Perhaps I am missing your point here - is it the second call to setup.ncore() that we need to disable and force the second function to respect the earlier call? If so we could check the local environment for a existing 'mc.cores' and use it if present.

  2. Xinqiu Yao

    In future release, we will use ncore=NULL as default for all multicore-supported function. So, what about in setup.ncore() we check ncore value: if ncore==NULL and mc.cores is existing, we set ncore=mc.cores; otherwise, reset ncore as before.

  3. Lars Skjærven reporter

    Ok. let's keep it the way it is. in any case we could just use option(mc.cores) again after the call to fit.xyz(). In this case ncore=ncore works fine, but specifying the ncore for a particular function can be useful where only a part of the calculation can be done in parallel due to memory (e.g. the RMSIP calculation in nma.pdbs() are in many cases too memory expensive in multicore).

  4. Xinqiu Yao
    • changed status to open

    Hi all,

    I reopen this issue because I found it is still worth further discussion.

    Recently, we had a similar problem to Lars' in the new function dccm.local.xyz(): The multicore didn't work even if we specified ncore>1. It was found due to the calling of cmap() function, which reset mc.cores to 1. It was not resolved even with cmap(ncore=ncore), because the resetting occurred deeper in another function dm.xyz() called in cmap().

    The problem here is dm.xyz() has become supporting ncore only recently and it is usually hard to trace back "parent" functions and update accordingly before we actually encountered problems. So, although we can solve current problem by setting dm.xyz(ncore=ncore), we are likely to have similar bugs in future when more functions are parallelized, if we keep current strategy going.

    One possible solution is, instead of using global variable, we set explicitly mc.cores=ncore in each call of mclapply(). This will bring a little bit additional work for developers but make each function self-contained.

    Any other idea? Let me know what you think!

  5. Lars Skjærven reporter

    Agree.. these nested function calls argues in favor of your solution, i.e. providing mc.cores explicitly.

  6. Lars Skjærven reporter

    any conclusion on this issue? should we aim for setting mc.cores=ncore explicitly ?

  7. Xinqiu Yao

    Yes, that would be the proper way to solve the issue as far as I can figure. I think we can put it to the To-Do list and close this issue. Thanks!

  8. Log in to comment