REGRESSION: SEQ mode incorrectly requires master as current_persona for shared allocation
Release 2021.3.0 strengthened the documented requirements and debug checking on use of the master persona in threadmode=seq mode, requiring it to be the current persona for most functions with internal or user-progress (ie all communication). However shared-heap allocation (which has progress-level none) was documented as subject to a weaker requirement, only requiring the calling thread to hold the master persona (not necessarily current). Unfortunately, the debug-mode assertions in the release incorrectly enforced the stronger requirement on threadmode=seq callers. Example program:
#include <upcxx/upcxx.hpp>
#include <iostream>
#include "../util.hpp"
using namespace upcxx;
int main() {
upcxx::init();
print_test_header();
{ persona p; persona_scope ps(p);
auto gptr = upcxx::new_<int>(42);
}
print_test_success();
upcxx::finalize();
return 0;
}
Output on 2021.3.0 with debug/smp:
*** FATAL ERROR (proc 0):
//////////////////////////////////////////////////////////////////////
UPC++ assertion failure:
on process 0 (pcp-d-10)
at /tmp/upcxx-nightly-dirac-gcc-cuda11/bld/upcxx_install/upcxx-2021.3.0/src/backend/gasnet/runtime.cpp:1135
in function: void* upcxx::backend::gasnet::allocate(size_t, size_t, upcxx::backend::gasnet::sheap_footprint_t*)
This operation requires the primordial thread using the master persona, when compiled in threadmode=seq.
Multi-threaded applications should compile with `upcxx -threadmode=par` or `UPCXX_THREADMODE=par`.
For details, please see `docs/implementation-defined.md`
To have UPC++ freeze during these errors so you can attach a debugger,
rerun the program with GASNET_FREEZE_ON_ERROR=1 in the environment.
//////////////////////////////////////////////////////////////////////
This represents a regression in behavior relative to the previous 2020.{10,11},0 releases.
Comments (5)
-
-
reporter @Paul Hargrove The "good" fix I have in the works is not a small isolated change.
However I can additionally produce a hotfix-friendly patch to disable the problematic assertions entirely.
-
@Dan Bonachea Thanks for the clarification.
I mistakenly envisioned the resolution as localized: trading one assertion for a different one.
Unless this is a user-reported issue, please focus on the "good" fix for now.
We can produce a hotfix-friendly patch as a secondary output from this issue. -
reporter -
reporter - changed status to resolved
issue482: Split UPCXX_ASSERT_MASTER_IFSEQ into two flavors
In SEQ mode, we actually have two different preconditions wrt the master persona, so we need a macro for each.
Add comments to explain correct usage.
Update calls accordingly.
Fixes issue
#482.→ <<cset 879bcb057df1>>
- Log in to comment
The following may go without saying, but just in case:
Given the nature of the problem I'd very much like the resolution to include a patch that can be applied to the release by any end-user who encounters the problem. For instance the ChangeLog update should be a distinct commit from the code changes. I will plan to rebuild our stable installs at the DOE centers with such a fix applied.