REGRESSION: SEQ mode incorrectly requires master as current_persona for shared allocation
Release 2021.3.0 strengthened the documented requirements and debug checking on use of the master persona in threadmode=seq mode, requiring it to be the current persona for most functions with internal or user-progress (ie all communication). However shared-heap allocation (which has progress-level none) was documented as subject to a weaker requirement, only requiring the calling thread to hold the master persona (not necessarily current). Unfortunately, the debug-mode assertions in the release incorrectly enforced the stronger requirement on threadmode=seq callers. Example program:
#include <upcxx/upcxx.hpp> #include <iostream> #include "../util.hpp" using namespace upcxx; int main() { upcxx::init(); print_test_header(); { persona p; persona_scope ps(p); auto gptr = upcxx::new_<int>(42); } print_test_success(); upcxx::finalize(); return 0; }
Output on 2021.3.0 with debug/smp:
*** FATAL ERROR (proc 0): ////////////////////////////////////////////////////////////////////// UPC++ assertion failure: on process 0 (pcp-d-10) at /tmp/upcxx-nightly-dirac-gcc-cuda11/bld/upcxx_install/upcxx-2021.3.0/src/backend/gasnet/runtime.cpp:1135 in function: void* upcxx::backend::gasnet::allocate(size_t, size_t, upcxx::backend::gasnet::sheap_footprint_t*) This operation requires the primordial thread using the master persona, when compiled in threadmode=seq. Multi-threaded applications should compile with `upcxx -threadmode=par` or `UPCXX_THREADMODE=par`. For details, please see `docs/implementation-defined.md` To have UPC++ freeze during these errors so you can attach a debugger, rerun the program with GASNET_FREEZE_ON_ERROR=1 in the environment. //////////////////////////////////////////////////////////////////////
This represents a regression in behavior relative to the previous 2020.{10,11},0 releases.
Comments (5)
-
-
reporter @Paul Hargrove The "good" fix I have in the works is not a small isolated change.
However I can additionally produce a hotfix-friendly patch to disable the problematic assertions entirely.
-
@Dan Bonachea Thanks for the clarification.
I mistakenly envisioned the resolution as localized: trading one assertion for a different one.
Unless this is a user-reported issue, please focus on the "good" fix for now.
We can produce a hotfix-friendly patch as a secondary output from this issue. -
reporter -
reporter - changed status to resolved
issue482: Split UPCXX_ASSERT_MASTER_IFSEQ into two flavors
In SEQ mode, we actually have two different preconditions wrt the master persona, so we need a macro for each.
Add comments to explain correct usage.
Update calls accordingly.
Fixes issue
#482.→ <<cset 879bcb057df1>>
- Log in to comment
The following may go without saying, but just in case:
Given the nature of the problem I'd very much like the resolution to include a patch that can be applied to the release by any end-user who encounters the problem. For instance the ChangeLog update should be a distinct commit from the code changes. I will plan to rebuild our stable installs at the DOE centers with such a fix applied.