- changed milestone to 2017.12.31 release
Implement gptr debug checking
Two early users have already separately expressed fundamental confusion about the behavior of shared memory and global pointers, and this confusion is being exacerbated by the lack of checks on global pointer downcasts that have undefined behavior. Lacking these checks, a new programmer is easily confused and doesn't understand if they did something wrong or if the implementation is malfunctioning.
Here is a trimmed-down example of a naive INCORRECT program:
#include <upcxx/upcxx.hpp>
int main() {
upcxx::init();
upcxx::global_ptr<int> gp;
// allocate a shared array on rank 0 and broadcast it
if (!upcxx::rank_me()) gp = upcxx::new_array<int>(upcxx::rank_n());
gp = upcxx::broadcast(gp,0).wait();
// downcast - undefined behavior on most non-zero ranks!
int *lp = gp.local();
// write a unique value to a unique element
lp += upcxx::rank_me();
*lp = 100+upcxx::rank_me();
// output result
if (!upcxx::rank_me()) {
for (int i=0; i < upcxx::rank_n(); i++) std::cout << lp[i] << std::endl;
}
upcxx::finalize();
return 0;
}
To make matters worse, the correctness of this type of program will eventually differ based on the job parameters and local_team layout - ie once we implement shared-memory downcasts, this program will be correct when run on a single-node/single-local_team, but incorrect for distributed memory.
We need to curtail this by generating fatal errors for this type of easily detectable UB, at least in debug mode and possibly also opt mode if we think it can be done without significant performance impact.
Some UB usage errors that can and should be detected:
- downcast of global pointer that is not
.is_local()
- upcast for a pointer outside the shared segment
- pointer arithmetic on null global pointers
Comments (12)
-
reporter -
reporter -
assigned issue to
In the 10/16 meeting, Brian agreed to do this.
-
assigned issue to
-
reporter - changed milestone to 2018.03.31 release
Mass roll-over of unresolved issues to the next milestone
-
reporter -
assigned issue to
see Pull request #14
-
assigned issue to
-
reporter Some of the UB gptr upcast and downcast operations are now diagnosed.
Would still be nice for assertions to detect the following in DEBUG mode:
- pointer arithmetic on null global pointers,
- (possibly) all arithmetic and accesses using invalid gptrs - those that do not point into a valid shared segment.
- This could provide a friendlier crash message than an attempt to use garbage gptrs (eg uninitialized variables), which otherwise will probably generate obscure segment bounds violations in GASNet
- We have a similar check in Berkeley UPC and it's historically been very effective at helping to diagnose such errors
-
reporter - changed milestone to 2018.09.30 release
Mass roll-over of unresolved issues to the next milestone.
-
reporter - marked as minor
This issue was triaged at the 2018-06-13 Pagoda meeting and assigned a new milestone/priority.
-
reporter - changed milestone to 2019.03.31 release
Mass roll-over of unresolved issues to the next milestone.
-
reporter - changed milestone to 2020.3.0 release
This issue was triaged at the 2019-07-24 Pagoda issue meeting and assigned a new milestone.
-
reporter -
assigned issue to
- changed milestone to 2020.9.0 release
This was discussed in the 2020-02-12 meeting and deferred to next release milestone.
-
assigned issue to
-
reporter - changed milestone to 2020.3.0 release
Proposed solution now in pull request #175
-
reporter - changed status to resolved
issue
#83: strengthen global_ptr correctness assertions-
Add an undocumented
global_ptr::check()
method that performs various sanity checks on the global pointer, to detect invariant violations that could be caused by mis-use or memory corruption. -
Adds a new
UPCXX_GPTR_CHK
macro that compiles away to nothing by default in production mode, but when assertions are enabled defaults to invokingglobal_ptr::check()
. -
Insert UPCXX_GPTR_CHK into all methods that create global_ptrs or manipulate their internal state.
-
The default checking behavior can be overriden (on a TU basis) with a new UPCXX_GPTR_CHECK_ENABLED define (undocumented for now).
Resolves issue
#83→ <<cset 6d2b9bb9b047>>
- Log in to comment