Extend PTS arithmetic to allow pointing to "one past" the end of a shared array
Issue #109
new
Originally reported on Google Code with ID 109
Steve's comments moved from issue 106:
1. Is pointing to one element past the end of a shared array object valid (as it is
for local objects by ISO/IEC 9899 6.5.6 8-9)? If so, we should be sure that we get
the expected behavior for those as well. Note that this is a much larger change, as
a lot of the spec assumes that any valid non-null pointer-to-shared points to an object.
This is trivial to express. The existing equations in 6.4.2 3 define the exact behavior
of upc_threadof() and upc_phaseof(). My proposal in comment 13 suffices to define
the behavior of upc_addrfield(), and can be trivially tweaked to define the local address
as well. Since you can't do pointer-to-shared arithmetic on generic pointers-to-shared,
nor on pointers-to-shared whose referenced type is incomplete, we don't need to worry
about what "one past" means in those cases, and it is well-defined for all others.
Reported by danbonachea
on 2013-03-01 18:25:33
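For readers following the argument above, here is a minimal sketch in plain C of what the 6.4.2 3 equations yield when they are simply evaluated for a "one past" index. It is not UPC and not proposed spec text. The `pts_model` struct and the `pts_add` helper are hypothetical names introduced for illustration, and the sketch assumes a definite block size B > 0 and a non-negative increment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a pointer-to-shared: only the two components
   that the equations in UPC 1.2 6.4.2 3 constrain. */
typedef struct {
    size_t thread; /* models upc_threadof(p) */
    size_t phase;  /* models upc_phaseof(p)  */
} pts_model;

/* Models p1 = p + i for block size B > 0 and i >= 0:
     upc_phaseof(p1)  == (upc_phaseof(p) + i) % B
     upc_threadof(p1) == (upc_threadof(p) + (upc_phaseof(p) + i) / B) % THREADS
   Negative increments are deliberately out of scope for this sketch. */
static pts_model pts_add(pts_model p, size_t i, size_t B, size_t threads)
{
    pts_model p1;
    p1.phase  = (p.phase + i) % B;
    p1.thread = (p.thread + (p.phase + i) / B) % threads;
    return p1;
}
```

For example, with B = 2 and 4 threads, adding 6 to a model pointer at thread 0, phase 0 (the start of a 6-element array) gives thread 3, phase 0. The equations produce a value, but that thread holds no element of the array, which is the case the rest of the spec does not currently cover.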
Comments (12)
> Not for this "outlier" case they do not.
> ...
> AND THE SHARED ARRAY IS LARGE ENOUGH, THE RESULT POINTS TO AN ELEMENT OF THE SHARED ARRAY.

I specifically said, "The existing equations in 6.4.2 3", not "UPC 6.4.2 2-3" specifically for that reason. I recognize that this is an extension. ;)

Reported by sdvormwa@cray.com on 2013-03-01 18:40:10
> Is pointing to one element past the end of a shared array object valid (as it is for local objects by ISO/IEC 9899 6.5.6 8-9)?
> I recognize that this is an extension. ;)

Sounds like we agree on the answer to your first question: this is clearly undefined in UPC 1.2. As such, proposals to add such semantics are orthogonal to a clarification to existing 1.2 semantics.

Reported by danbonachea on 2013-03-01 18:51:24
Reported by danbonachea on 2013-03-01 18:51:35
Labels added: Type-Enhancement
Labels removed: Type-Defect
> Sounds like we agree on the answer to your first question: this is clearly undefined in UPC 1.2.

As I mentioned in issue 106, only for definitely blocked shared arrays. For indefinitely blocked shared arrays, such a pointer is valid, and doesn't currently work with a lot of language features due to the use of "pointed-to shared object".

Reported by sdvormwa@cray.com on 2013-03-01 18:55:14
> As I mentioned in issue 106, only for definitely blocked shared arrays. For indefinitely blocked shared arrays, such a pointer is valid, and doesn't currently work with a lot of language features due to the use of "pointed-to shared object".

To clarify, I believe that supporting such pointers with definitely blocked shared objects is a new feature, which should NOT go in 1.3. However, I believe that they are permitted in 1.2 for indefinitely blocked shared objects, but don't work correctly, and thus we should fix those in 1.3.

Reported by sdvormwa@cray.com on 2013-03-01 19:04:12
> I believe that they are permitted in 1.2 for indefinitely blocked shared objects,
> but don't work correctly, and thus we should fix those in 1.3.

Please provide a concrete example of code and spec text from 1.2 to support this assertion.

Reported by danbonachea on 2013-03-01 19:22:12
UPC 1.2 6.4.2 2:

... If the shared array is declared with indefinite block size, the result of the pointer-to-shared arithmetic is identical to that described for normal C pointers in [ISO/IEC00 Sec. 6.5.6], except that the thread of the new pointer shall be the same as that of the original pointer and the phase component is defined to always be zero. ...

    shared [] int A[NELEMS];
    shared [] int *P = &A[NELEMS]; // Points to just past end of A, permitted by C99
    if ( MYTHREAD == 0 ) {
        int *lp = (int *)P; // Undefined, because P does not point to any object, and
                            // 6.4.3 does not define the results of such a cast
        if ( upc_threadof( P ) == 0 ) {
            // Implied by 6.4.2 2, but upc_threadof() only has a defined value for
            // non-null pointers that point to an actual object, which P does not
        }
    }

Reported by sdvormwa@cray.com on 2013-03-01 20:00:49
> shared [] int A[NELEMS];
> shared [] int *P = &A[NELEMS]; // Points to just past end of A, permitted by C99
> int *lp = (int *)P; // Undefined, because P does not point to any object, and
> // 6.4.3 does not define the results of such a cast

Agreed, but I don't really see this as a problem. Casting a PTS that points to unallocated space to a PTL does not have any behavior guaranteed by the spec, and therefore has undefined behavior. It might be nice if this was guaranteed, but since it's not I don't see how this constitutes something that "don't work correctly". Also, if you really want to construct such a pointer, it is already very easy to do so without straying into undefined behavior. Namely:

    int *lp = ((int *)A) + NELEMS;

> but upc_threadof() only has a defined value for
> non-null pointers that point to an actual object

I agree that upc_threadof(pts) where pts points to unallocated space has undefined behavior. I don't really see this as a major problem either - it's perhaps less than ideal, but currently it's just a corner case the library semantics render undefined. The only case where pts could be a well-defined pointer value under 1.2 is if its referenced type is indefinitely-blocked, in which case the LOGICAL thread affinity specified by 6.4.2-2 is trivially identical to that of the pointer used in the expression that created pts.

The library function isn't guaranteed to give you a correct answer for that corner case, but I'm having trouble seeing why a real code would care to execute that query in the first place. Even if it did, there's an obvious workaround when you know you're in this case, which is to simply call upc_threadof(pts-1) -- or alternatively allocate the original object with a trailing "fence" element so that pts meets the requirement of pointing to a shared object.

Reported by danbonachea on 2013-03-01 20:31:18
I'm just commenting on the rationale for extending the arithmetic to make computing the one-past address valid, not the details of how we would modify UPC to allow for it. It's the same rationale as in C, really, to allow the one-past address to act as the terminating value of a pointer loop...

You could have a function like this:

    void copy_to_me( int* local, shared [B] int* p, int count ) {
        while ( count-- > 0 ) *local++ = *p++;
    }

Or you might want this form:

    void copy_to_me( int* local, shared [B] int* start, shared [B] int* stop ) {
        while ( start < stop ) *local++ = *start++;
    }

The pure C99 local pointer analogues of these two functions are legal, but calling the second version in UPC:

    shared int A[THREADS];
    ...
    copy_to_me( my_buffer, &A[0], &A[THREADS] );

is currently questionable for arbitrary block size B != 0 because &A[THREADS] is a one-past address.

Reported by johnson.troy.a on 2013-03-01 21:32:27
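As a point of comparison, the "pure C99 local pointer analogues" mentioned in the comment above could look roughly like the sketch below. The names `copy_count` and `copy_range` are hypothetical, chosen only so both forms can coexist in one C file, and `size_t` is used for the count as a stylistic assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Count-based form: never forms or compares a one-past pointer explicitly. */
static void copy_count(int *dst, const int *src, size_t count)
{
    while (count-- > 0)
        *dst++ = *src++;
}

/* Range-based form: stop may be the one-past-the-end address, which
   C99 6.5.6 allows to be formed and compared, but not dereferenced. */
static void copy_range(int *dst, const int *start, const int *stop)
{
    while (start < stop)
        *dst++ = *start++;
}
```

Given `int a[4]`, the call `copy_range(buf, &a[0], &a[4])` is well-defined in C99 because `&a[4]` is only compared, never dereferenced. The open question in this issue is whether the shared-pointer equivalent can be given the same guarantee.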
> Is currently questionable for arbitrary block size B != 0 because &A[THREADS] is a one-past address.

Agreed - that code currently has undefined behavior in 1.2. One of my main concerns with modifying the spec to make this code well-defined for arbitrarily blocked arrays is outlined below. I'm not positive this is a "show-stopper" for this potential new feature, but it's at least food for thought.

In C99, arrays are strictly linear in memory, and given pointers to any two distinct elements, there is a natural total order on those elements defined by the linear memory address. It is therefore easy to unambiguously discuss the "last" element in the array object, because it is the one that is totally ordered after all the others. It doesn't matter if the array was declared statically or allocated dynamically, or the details of the pointer types used to access it: the "last byte" of the object is always a well-defined and unique location, and therefore so is "one past the end".

This basic property of heap memory is not true in UPC. Given pointers to any two distinct elements in a blocked shared array with affinity to *different* threads, there may be no unique order between those elements (see comment #1 in issue 104 for details). The ordering of the two elements depends upon the blocksize of the pointers used to ask the question. There are even obscure cases where the question returns an undefined answer (comparison of indefinitely-blocked PTS or dephased blocked PTS). As a consequence, I'm not convinced that the "last" element in a UPC array is always unique and always well-defined, let alone "one-past" that location.

For a statically-declared shared array, one could potentially fall back upon the blocksize in the declaration that creates the object and use that to uniquely (and somewhat arbitrarily) define the "last" element (although note that due to blocking, "one-past" that last element might actually have affinity to a different thread and have a lower "local address"). However, I don't think this works for dynamically-allocated arrays at all, because there is no *array type* in the code to provide a well-typed blocking factor for use in defining a canonical "last element" and "one past it". There are only pointers to shared data, which may alias that data using different blocking factors. If some of the pointers used to access slices of that shared array are indefinitely-blocked, one could make an argument that there is a "last element" on several threads, one corresponding to each such pointer. Thus "one past the last element" is no longer a unique location. I'm worried about the semantic implications of that, and the complexity of the spec-speak that would be needed to unambiguously define this feature.

Reported by danbonachea on 2013-03-01 22:51:03
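To make the concern above concrete, the following plain C sketch models where a logical element index lands under the usual block-cyclic layout. It illustrates the argument and is not spec text. The `elem_loc` struct and the `locate` helper are hypothetical names, and the mapping assumes a definite block size B > 0, with `local_index` counting elements within one thread's portion of the array.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t thread;      /* thread with affinity to the element */
    size_t phase;       /* position of the element within its block */
    size_t local_index; /* element offset within that thread's local portion */
} elem_loc;

/* Maps logical index i to its location for block size B across the
   given number of threads (block-cyclic distribution of blocks). */
static elem_loc locate(size_t i, size_t B, size_t threads)
{
    size_t block = i / B;
    elem_loc loc;
    loc.thread = block % threads;
    loc.phase = i % B;
    loc.local_index = (block / threads) * B + loc.phase;
    return loc;
}
```

With 6 elements, 4 threads and B = 2, the last element (index 5) sits on thread 2 at local index 1, while the one-past index 6 maps to thread 3 at local index 0: a different thread and a lower local offset, as the comment notes. Asking the same question with B = 1 puts index 6 on thread 2 at local index 1, so the candidate "one past" location changes with the blocking factor used to view the data.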
In the 3/15/13 telecon, we discussed this issue and decided that it should be deferred to a future spec revision, to allow further time for study.

Reported by danbonachea on 2013-03-16 01:16:40
Reported by
danbonachea
on 2013-03-01 18:37:22