Assertion failure in segmap_cache::lookup_at_idx for multi-threaded CCS

Issue #573 resolved
Dan Bonachea created an issue

Internal testing has discovered a defect in the CCS support for multi-threaded, multi-segment applications in v2022.9.0. In debug codemode, it can trigger a runtime assertion failure of the form:

/////////////////////////////////////////////////////////////////////
UPC++ assertion failure:
 ...
 at /[redacted]/upcxx.assert1.optlev0.dbgsym1.gasnet_par.ibv/include/upcxx/ccs.hpp:400
 in function: static const upcxx::detail::segmap_cache::segment_lookup_idx& upcxx::detail::segmap_cache::lookup_at_idx(int16_t)

This operation requires the master persona to appear in the persona stack of the calling thread

This is believed to be a regression relative to 2022.3.0, and was missed due to insufficient test coverage.

Comments (3)

  1. Dan Bonachea reporter

    For users experiencing this problem, the following patch is believed to be a sufficient, though non-optimal, workaround. Applying it also requires a library rebuild:

    --- a/src/ccs.hpp
    +++ b/src/ccs.hpp
    @@ -397,7 +397,7 @@ namespace detail {
    
       inline const segmap_cache::segment_lookup_idx& segmap_cache::lookup_at_idx(int16_t idx)
       {
    -    UPCXXI_ASSERT_MASTER(); //potential data race on segment_vector_
    +    std::lock_guard<std::recursive_mutex> lock(mutex_); //potential data race on segment_vector_
         UPCXX_ASSERT(static_cast<int64_t>(idx) < static_cast<int64_t>(segment_vector_.size()));
         return segment_vector_[idx];
       }
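
The locking pattern used in the patch above can be illustrated outside of UPC++ with a minimal, standalone sketch. Everything in it is hypothetical and for illustration only: `segment_entry`, `segment_cache_sketch`, and `concurrent_lookup_demo` are invented names, not part of the UPC++ sources. It only shows the general idea of guarding a shared vector with a `std::recursive_mutex` so that any thread may perform the lookup, instead of asserting that the caller is a particular thread. Unlike the patched function, this sketch returns a copy rather than a reference; that is a choice made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical record type; stands in for whatever the real cache stores.
struct segment_entry {
  std::uintptr_t start;
  std::uintptr_t end;
};

// Hypothetical cache: every access to segments_ takes the same mutex,
// so lookups are safe from any thread.
class segment_cache_sketch {
public:
  void add_segment(segment_entry e) {
    std::lock_guard<std::recursive_mutex> lock(mutex_);
    segments_.push_back(e);
  }

  // Returns a copy, so the result stays valid after the lock is released
  // even if another thread grows the vector.
  segment_entry lookup_at_idx(std::int16_t idx) const {
    std::lock_guard<std::recursive_mutex> lock(mutex_);
    assert(idx >= 0 && static_cast<std::size_t>(idx) < segments_.size());
    return segments_[static_cast<std::size_t>(idx)];
  }

  std::size_t size() const {
    std::lock_guard<std::recursive_mutex> lock(mutex_);
    return segments_.size();
  }

private:
  mutable std::recursive_mutex mutex_;
  std::vector<segment_entry> segments_;
};

// One writer appends segments while four readers repeatedly look up index 0.
// Returns true if every lookup saw the expected entry and no append was lost.
bool concurrent_lookup_demo() {
  segment_cache_sketch cache;
  cache.add_segment({0x1000, 0x2000});
  std::atomic<int> ok{0};

  std::thread writer([&cache] {
    for (std::uintptr_t i = 2; i <= 100; ++i)
      cache.add_segment({i * 0x1000, (i + 1) * 0x1000});
  });

  std::vector<std::thread> readers;
  for (int t = 0; t < 4; ++t)
    readers.emplace_back([&cache, &ok] {
      for (int n = 0; n < 1000; ++n)
        if (cache.lookup_at_idx(0).start == 0x1000) ++ok;
    });

  writer.join();
  for (auto& r : readers) r.join();
  return ok.load() == 4000 && cache.size() == 100;
}
```

Calling `concurrent_lookup_demo()` from a program built with thread support exercises the lookup from several threads at once. Taking a lock on every lookup adds overhead to the read path, which is one general cost of this kind of approach.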
    