use different names for the parallelized versions of functions

Issue #4 wontfix
Thomas Gilgenast created an issue

description of pattern

this pattern was first applied to lib5c.algorithms.qnorm.qnorm() and looks like this:

def qnorm(...):
    ...

qnorm_parallel = parallelize_regions(qnorm)

the original function's docstring may contain a note indicating that a parallelized version exists
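as a minimal sketch of the pattern end to end: the `parallelize_regions` below is a hypothetical serial stand-in (the real lib5c decorator is not reproduced here), and `scale` is an invented toy function used only for illustration

```python
from functools import wraps


def parallelize_regions(fn):
    """Hypothetical stand-in for the real decorator.

    Assumption: the wrapped version takes a dict keyed by region and
    applies fn to each value. This sketch runs serially.
    """
    @wraps(fn)
    def wrapper(data_by_region, *args, **kwargs):
        return {region: fn(data, *args, **kwargs)
                for region, data in data_by_region.items()}
    return wrapper


def scale(values, factor=2):
    """Multiply each value by factor (toy function for illustration).

    Notes
    -----
    A per-region version is available as ``scale_parallel``.
    """
    return [v * factor for v in values]


# the original stays undecorated; the wrapped version gets its own name
scale_parallel = parallelize_regions(scale)

single = scale([1, 2])
per_region = scale_parallel({'a': [1], 'b': [2, 3]}, factor=3)
```

here `scale` keeps its plain signature, and a client has to ask for `scale_parallel` by name to get the dict-in, dict-out behavior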

advantages

  • removes confusion about whether a call invokes the original function or an altered version with a completely different signature
  • puts the burden of explicitly choosing a version on the client, which prevents bugs caused by clients calling the wrong version and getting unexpected behavior
  • the original function never gets decorated, so its signature is never stripped from the docs, etc.

what needs to be done

  • for each function wrapped with @parallelize_regions
    • remove the decorator
    • add the new decorated function immediately below the original function definition
    • add a note to the docstring
    • edit the autodocs appropriately
  • for each potentially parallel usage of a function wrapped with @parallelize_regions
    • call the appropriate explicitly parallelized (or not) version, typechecking the inputs or integrating into logic ladders as appropriate
  • edit the scripting tutorial in the docs to explain the new behavior
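the call-site change could look roughly like the sketch below, which typechecks the input and then calls the matching version explicitly. `normalize`, `normalize_parallel`, and `run` are hypothetical names, and `normalize_parallel` is written by hand here as a serial stand-in for `parallelize_regions(normalize)`

```python
def normalize(counts):
    """Divide each count by the total (toy function for illustration)."""
    total = sum(counts)
    return [c / total for c in counts]


def normalize_parallel(counts_by_region):
    """Hand-written serial stand-in for parallelize_regions(normalize)."""
    return {region: normalize(counts)
            for region, counts in counts_by_region.items()}


def run(counts):
    """Hypothetical client code that picks a version explicitly."""
    if isinstance(counts, dict):
        # dict of regions: call the explicitly parallelized version
        return normalize_parallel(counts)
    # single region: call the original function
    return normalize(counts)


one_region = run([1, 3])
many_regions = run({'a': [1, 1], 'b': [2, 6]})
```

the branch makes the choice visible at the call site instead of hiding it inside a decorated function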

why this needs to be done

this will make the parallelization way less confusing and mysterious to the end user

Comments (4)

  1. Thomas Gilgenast reporter

    as an update to this, with the addition of @pretty_decorator, parallelizing a function no longer strips its signature from either the docs or from code completion help (e.g. in vim)

    this somewhat reduces the severity of the issue I think

  2. Thomas Gilgenast reporter

    over time, consensus is changing to suggest that the “modern” replacement for @parallelize_regions will be an explicit parallelization helper like that implemented in hic3defdr.util.parallelization

    this API simply exposes parallel_map() and parallel_apply() functions that can be used directly by clients specifically when they want their computations to be parallelized

    it makes parallelization easy-to-use in a generic context (not dependent on a dict-like data structure)

    old-style client code that relies on dict-like data structures can manually repackage the results of a parallel_map() into the values of a new dict without too much extra effort

    the syntactic sugar for per-region parameter specification provided by @parallelize_regions would be lost, but again clients can manually repackage the parameters before calling
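the manual repackaging described above might look like the sketch below. the `parallel_map` defined here is a hypothetical stand-in built on `concurrent.futures`, not the actual hic3defdr function, and its signature (a function plus a list of kwargs dicts) is an assumption based on the kwargs-only description in this comment

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(fn, kwargs_list, n_workers=2):
    """Hypothetical stand-in, NOT the real hic3defdr API.

    Calls fn(**kwargs) once per kwargs dict and returns the results
    in the same order as kwargs_list.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(fn, **kwargs) for kwargs in kwargs_list]
        return [future.result() for future in futures]


def scale(values, factor):
    """Multiply each value by factor (toy function for illustration)."""
    return [v * factor for v in values]


# old-style dict-keyed data, with a different parameter per region
counts = {'r1': [1, 2], 'r2': [3]}
factors = {'r1': 10, 'r2': 100}

# repackage the per-region parameters into one kwargs dict per call
regions = list(counts)
kwargs_list = [{'values': counts[r], 'factor': factors[r]} for r in regions]

# repackage the ordered results back into a dict keyed by region
scaled = dict(zip(regions, parallel_map(scale, kwargs_list)))
```

the repackaging costs two extra lines at the call site: one to build the kwargs list and one to zip the results back onto the region names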

    hic3defdr.util.parallelization is much simpler than @parallelize_regions: forcing clients to pass only kwargs is a simple and intuitive way to obviate the need for any function introspection, default parameter fill-in, functools.partial shenanigans, or limitations on what functions can be parallelized. it includes progress bar support, has a simple API, and can be applied to basically any parallelization use case. finally, since it is not implemented as a decorator, there is no need to use @pretty_decorator to repair function signatures and docstrings.

    in summary, we no longer recommend using @parallelize_regions for parallelization; instead we recommend using hic3defdr.util.parallelization, whose explicit parallelization semantics make this issue obsolete
