Clarify contiguity of local slice of a shared array

Issue #106 new
Former user created an issue

Originally reported on Google Code with ID 106

This issue arose in email from Yili:

Are the blocks of a shared array with affinity to a thread required (or not required)
to be stored contiguously?

For example, shared [1] int a[4]; for 2 threads
Can the user expect that a[0] and a[2] are stored in contiguous physical memory or not?

I can't find the definite answer in the spec.

Reported by danbonachea on 2013-02-25 13:20:05

Comments (91)

  1. Former user Account Deleted
    Some relevant discussion from email replies below:
    
    Paul
    ----
    I did not find an explicit "the blocks assigned to a thread must be contiguous", but
    the following is what the current 1.2 spec says on p.25:
    
        Elements of shared arrays are distributed in a round robin fashion, by chunks
        of block-size elements, such that the i-th element has affinity with thread
        (floor (i/block size) mod THREADS).
    
    I believe the "contiguousness" is implied by that text.
    
    The text of paragraphs 4 and 5 on p18 also seems to imply that the elements must be
    contiguous, but upc_addrfield is so under-defined that I'll stop short of claiming that
    there is an unambiguous requirement here.
    
    In addition, there are numerous examples and tutorials outside of the spec that "privatize"
    a shared array with either of the following constructs:
        shared [1] int A[...]
        ...
        int *my_private_P = (int *)&A[MYTHREAD];
        shared [] int *my_shared_P = &A[MYTHREAD];
    Neither would be useful beyond the first element if the elements with affinity to a
    given thread were not contiguous.
    In fact, no casts between PTS of different blocksizes would make sense unless the elements
    are contiguous.
    
    I think we all "know" the layout is contiguous, but I agree with Yili that no clear
    statement of this fact is evident in the (1.2) spec.
    
    
    Steve
    -----
    I think this is one of those things we inherit from C. See ISO/IEC 9899 6.2.5 20:
    
    "An array type describes a contiguously allocated nonempty set of objects with a particular
    member object type..."
    
    Since we don't say otherwise (other than to describe how the elements are distributed
    amongst threads), this applies to shared arrays as well, and thus they must be contiguous.
    
    ISO/IEC 9899 6.2.5 20 states that array elements (objects) must be contiguous in memory.
     UPC 1.2 6.5.2.1 3 states that elements are distributed round-robin to threads, but
    does not EXPLICITLY permit the chunks with affinity to the same thread to not be contiguous
    in local memory.  Therefore, since C requires it and UPC does not explicitly call out
    that the C requirement does not apply, the C requirement must apply and thus they must
    be contiguous.
    
    Spelling this out explicitly would be nice (probably as another formula in 6.4.2 3),
    but I do believe that this behavior is indeed mandated by the current spec.
    
    Troy
    ----
    I agree with Paul that Paragraph 5 on Page 18 implies it, but loosely and indirectly.
     As applied to Yili's example, &a[0] and &a[2] would be the S1 and S2 -- they point
    to the same shared array object and have affinity to the same thread.  Paragraph 5
    goes on to say that P1 (the local pointer cast of S1) and S1 point to the same object,
    and P2 and S2 point to the same object.  Transitively, P1 and P2 must point to the
    same object.
    
    What is that object?  P1 and P2 are normal, local C pointers so normally one would
    look to the C standard to answer this question, but the C standard cannot answer this
    question because the only two reasonable answers that I can think of are (1) they point
    to the same shared array object (but the C standard doesn't cover shared objects) or
    (2) they point to a non-shared array object that is logically a slice of the original
    shared array object (but the C standard doesn't cover hypothetical arrays that aren't
    actually declared explicitly in a program). 
    
    The UPC spec really should say something here.  If we say that the local slice is or
    behaves like a normal C array, then and only then do I agree with Steve that we inherit
    the contiguous property from ISO/IEC 9899 6.2.5 20.  As it stands, I don't think shared
    arrays have this property as applied directly from the C standard; rather they are
    the antithesis of it because A[0] and A[1] may be contiguous in memory or A[0] and
    A[1] may be across the machine room from each other in different cabinets.  But I do
    think we want to extend this property to the local portions of shared arrays.
    
    Bill
    ----
    First I think Paul is correct that this does indicate the need for an explicit statement
    of some sort.
    
    In my view "contiguity" has to do with pointer arithmetic and casting.  If things are
    contiguous you can count on pointer arithmetic to work between them in the usual way.
     If they are not, you cannot.  Note that ISO uses contiguity to distinguish arrays
    and multiple calls to malloc.
    
    As long as one stays in pointer-to-shared land, I think everything in the draft spec
    works fine with respect to this.  But when you cast to pointer-to-local, it does seem
    we should add something related to 6.4.3, footnote 13 in the 1.3 draft to say that
    the "are local accesses and behaves accordingly" includes pointer arithmetic to access
    all portions of the object which have affinity to that thread.
    
    I think others here are more qualified than I to suggest the language, but it is definitely
    the desired (and relied upon) effect.
    
    Kathy
    -----
    I always agree with removing ambiguity from the spec, and that distributing across
    regions a previously-specified contiguous object raises such an ambiguity. 
    

    Reported by danbonachea on 2013-02-25 13:23:35
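
To make the round-robin rule quoted from p.25 concrete, here is a minimal sketch in plain C (not UPC). The function names owner_thread and local_index are hypothetical, and local_index encodes the layout that everyone in the thread says they "know": it assumes the blocks with affinity to one thread are stored back to back, which is exactly the property the spec does not yet state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration; not part of UPC or its spec. */

/* Thread with affinity to element i of `shared [B] T a[N]`,
   following the quoted rule: floor(i / B) mod THREADS. */
size_t owner_thread(size_t i, size_t B, size_t threads)
{
    assert(B > 0 && threads > 0);
    return (i / B) % threads;
}

/* Index of element i within its owner's local slice, ASSUMING the
   blocks with affinity to one thread are stored back to back. */
size_t local_index(size_t i, size_t B, size_t threads)
{
    assert(B > 0 && threads > 0);
    size_t block = i / B;                  /* global block number */
    size_t local_block = block / threads;  /* owner's blocks before this one */
    size_t phase = i % B;                  /* position inside the block */
    return local_block * B + phase;
}
```

For Yili's example (shared [1] int a[4] with 2 threads), owner_thread(0, 1, 2) and owner_thread(2, 1, 2) are both 0, and the local indices are 0 and 1, so a[0] and a[2] would be adjacent in thread 0's slice under this assumed layout.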

  2. Former user Account Deleted
    I think we have consensus on the intended behavior, and I suspect all current implementations
    already provide the contiguity property under discussion. I agree that inserting a
    clarification sentence somewhere is probably in order to "make it official", although
    we don't have any actual proposed language yet.
    
    Adding this to the 1.3 milestone initially, although we'll need to draft some language
    immediately for this change to make it into 1.3 (we're way past the "new issues" deadline
    for 1.3).
    

    Reported by danbonachea on 2013-02-25 13:33:00 - Labels added: Milestone-Spec-1.3

  3. Former user Account Deleted
    I propose adding the following to the end of 6.5.2.1 5:
    
    The local portion of a shared array shall be contiguous in a thread's memory.  All
    of the chunks distributed to the same thread shall appear consecutively in that thread's
    memory, with no space between chunks.
    

    Reported by sdvormwa@cray.com on 2013-02-25 18:47:31

  4. Former user Account Deleted
    I think Steve's proposed text is clear.  I assume the contiguity property holds true
    for dynamically allocated arrays by upc_all_alloc or upc_global_alloc, right?  
    

    Reported by yzheng@lbl.gov on 2013-02-25 19:04:09

  5. Former user Account Deleted
    > The local portion of a shared array shall be contiguous in a thread's memory.  All
    > of the chunks distributed to the same thread shall appear consecutively in that
    > thread's memory, with no space between chunks.
    
    It's a good first cut, but uses some terms and concepts not defined by the specification
    ("thread's memory"). Can we re-word it to discuss elements/chunks "with affinity to"
    a given thread?
    
    We also need to be careful not to prohibit internal padding in elements which may be
    required for alignment.
    

    Reported by danbonachea on 2013-02-25 20:02:37

  6. Former user Account Deleted
    How about
    
    Elements of data storage with affinity to a thread shall be contiguous in that thread's
    local address space.  All of the chunks distributed to the same thread shall appear
    consecutively, with no additional padding beyond the requirements of the ultimate element
    type [see issue 3] of the array.
    

    Reported by sdvormwa@cray.com on 2013-02-25 20:52:34

  7. Former user Account Deleted
    Since the padding (if any) between elements is already in the C definition of an array:
    
    "The elements of a shared array with affinity to any given thread shall appear in that
    thread's address space consecutively, as a single array object."
    
    One could add "with elements in increasing order" if we want to be pedantic.
    

    Reported by phhargrove@lbl.gov on 2013-02-25 21:04:13

  8. Former user Account Deleted
    We can't use elements, because an element of a shared array might itself be a shared
    array (for instance, the first element of an array declared 'shared int A[2][THREADS]'),
    and thus may not completely reside on a single thread.
    

    Reported by sdvormwa@cray.com on 2013-02-25 21:12:32

  9. Former user Account Deleted
    "address space" is another term not defined by the spec, and the type of implementation
    detail we should attempt to avoid.
    
    If possible, it might be nice to "fix" this within 6.4.2-6, which is already very closely
    related. Consider this proposal (1 line changed, 1 line added):
    
       T *P1, *P2;
    -  shared    T *S1, *S2;
    +  shared [] T *S1, *S2;
    
       P1 = (T*) S1; /* allowed if *S1 has affinity only to MYTHREAD */  
       P2 = (T*) S2; /* allowed if *S2 has affinity only to MYTHREAD */
    
      For all S1 and S2 that point to two distinct elements of the same shared
      array object which have affinity to the same thread:
    
      * S1 and P1 shall point to the same object.
      * S2 and P2 shall point to the same object.
      * The expression ((ptrdiff_t) upc_addrfield(S2) - (ptrdiff_t) upc_addrfield(S1))
        shall evaluate to the same value as ((P2 - P1) * sizeof(T)).
    + * The expression (S2 - S1) shall evaluate to the same value as (P2 - P1)
    
    The new constraint enforces that indefinitely-blocked PTS arithmetic (ie indexing with
    affinity to a single thread) increments in exactly the same way as a pointer-to-local,
    and I believe has the side-effect of disallowing the discontiguous layout we wish to
    prohibit.
    

    Reported by danbonachea on 2013-02-26 12:59:14

  10. Former user Account Deleted
    Amendment to the proposal in comment #9, change the new constraint line to read:
    
    + * The expression P1 + (S2 - S1) == P2 shall evaluate to 1.
    
    (This equation is equivalent in a correct implementation, but this form 
    additionally disallows a perverse implementation from passing the test via integer
    round-off.)
    

    Reported by danbonachea on 2013-02-26 14:28:18

  11. Former user Account Deleted
    Going back to the original comments, I thought we agreed that the current spec wording
    required contiguity of elements within a block.  The ambiguous case was contiguity
    across blocks with affinity to the same thread.  The proposed language in comments
    9 and 10 does not address the case of contiguity across blocks because there's no such
    concept for indefinitely blocked objects.  If we're going to clarify this with an example,
    I think we must use a definitely blocked shared array to do so.
    

    Reported by sdvormwa@cray.com on 2013-02-26 14:45:19

  12. Former user Account Deleted
    > The ambiguous case was contiguity across blocks with affinity to the same thread.
    
    Agreed.
    
    > The proposed language in comments 9 and 10 does not address the case of contiguity
    > across blocks because there's no such concept for indefinitely blocked objects.
    
    I believe it does. The key is in the setup phrase:
    
      For ALL S1 and S2 that point to two distinct elements of the same shared
      array object which have affinity to the same thread.
    
    This implies the equations must hold for ALL pairwise combinations of distinct elements
    with the same affinity that exist in every shared array object (with any blocking factor).
    In particular, it must hold for every pair of elements (e1, e2) with affinity to the
    same thread, INCLUDING those pairs where e1 and e2 were part of different blocks in
    the original array allocation. The use of indefinitely-blocked (instead of cyclically-blocked)
    pointers S1 and S2 to construct the constraints is just a convenience - the equations
    still govern the placement of elements in memory for all shared arrays, including those
    allocated with a definite blocking factor.
    
    I invite you to construct an example where the local blocks are discontiguous that
    still satisfies the equations for all pairwise combinations of elements with local
    affinity.
    

    Reported by danbonachea on 2013-02-26 15:37:14
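
The challenge in comment 12 can be explored with a small model in plain C (not UPC). This is only a sketch under stated assumptions: local_addr and constraint_holds are hypothetical names, addresses are counted in elements, and the model assumes that S2 - S1 for indefinitely blocked pointers equals the number of same-affinity elements separating the two targets. A layout that inserts pad unused slots after every block of B elements is then checked against P1 + (S2 - S1) == P2 for all pairs.

```c
#include <assert.h>
#include <stddef.h>

/* Local address (in elements) of the k-th element with affinity to one
   thread, for a hypothetical layout that leaves `pad` unused element
   slots after every block of B elements. pad == 0 is contiguous. */
size_t local_addr(size_t k, size_t B, size_t pad)
{
    assert(B > 0);
    return k + pad * (k / B);
}

/* Returns 1 if P1 + (S2 - S1) == P2 holds for every pair among the
   first n local elements, 0 otherwise.
   Modeling assumption: S2 - S1 equals k2 - k1. */
int constraint_holds(size_t n, size_t B, size_t pad)
{
    for (size_t k1 = 0; k1 < n; k1++) {
        for (size_t k2 = 0; k2 < n; k2++) {
            ptrdiff_t s_diff = (ptrdiff_t)k2 - (ptrdiff_t)k1;
            ptrdiff_t p1 = (ptrdiff_t)local_addr(k1, B, pad);
            ptrdiff_t p2 = (ptrdiff_t)local_addr(k2, B, pad);
            if (p1 + s_diff != p2)
                return 0;
        }
    }
    return 1;
}
```

In this model constraint_holds(20, 10, 0) returns 1, while constraint_holds(20, 10, 3) returns 0 because any pair that straddles the padded block boundary breaks the equation. A slice consisting of a single block never exposes the padding, so pairs from different blocks are the ones that matter.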

  13. Former user Account Deleted
    However, blocksize is part of the type compatibility, so casting a pointer to a definitely
    typed object to an indefinitely typed one is "fishy".  I'd propose that instead of
    putting things here, I think we should instead modify 6.4.2 3, by explicitly defining
    upc_addrfield's value for the case of the result having affinity to the same thread
    as the initial pointer [see proposed text of issue 3 for definition of 'elem_delta']:
    
      Additionally, if upc_threadof(p) == upc_threadof(p1), the following equation must hold:
    
      ptrdiff_t block_delta = (((upc_phaseof(p) + elem_delta) div B) div THREADS);
      ptrdiff_t local_elem_offset = (block_delta * B) - upc_phaseof(p) + upc_phaseof(p1);
    
      upc_addrfield(p1) == upc_addrfield(p) + local_elem_offset * upc_elemsizeof(*p)
    

    Reported by sdvormwa@cray.com on 2013-02-26 15:54:09
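
The formula in comment 13 can be checked numerically with a plain C sketch. Assumptions: the function name local_elem_offset mirrors the variable in the comment but is otherwise hypothetical, elem_delta is taken to be the element distance from p to p1 as defined in issue 3, and only elem_delta >= 0 is handled so that C's truncating division agrees with the 'div' in the formula. The phase of p1 is derived from the phase of p rather than passed in.

```c
#include <assert.h>
#include <stddef.h>

/* local_elem_offset from comment 13, restricted to elem_delta >= 0.
   phase_p: upc_phaseof(p); B: block size; threads: THREADS. */
ptrdiff_t local_elem_offset(ptrdiff_t phase_p, ptrdiff_t elem_delta,
                            ptrdiff_t B, ptrdiff_t threads)
{
    assert(B > 0 && threads > 0 && elem_delta >= 0);
    assert(phase_p >= 0 && phase_p < B);

    ptrdiff_t blocks_advanced = (phase_p + elem_delta) / B;
    /* The equation applies only when p1 has the same affinity as p. */
    assert(blocks_advanced % threads == 0);

    ptrdiff_t block_delta = blocks_advanced / threads;
    ptrdiff_t phase_p1 = (phase_p + elem_delta) % B;
    return block_delta * B - phase_p + phase_p1;
}
```

With B = 10 and THREADS = 4, moving from phase 3 by 44 elements lands at phase 7 of the same thread's next block, and the function returns 14. That matches the contiguous-layout picture (local index 3 to local index 17), so the proposed upc_addrfield equation would advance the address by 14 * upc_elemsizeof(*p).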

  14. Former user Account Deleted
    > casting a pointer to a definitely typed object to an indefinitely typed one is "fishy".
    
    There is no such cast. The only cast in the equations is from a PTS with local affinity
    to a PTL, which is perfectly kosher. 
    
    The setup text requires the equations to hold for every S1 and S2 pointing to distinct
    elements of the shared array with local affinity. It does not prescribe how those pointers
    are constructed, because it is irrelevant. All that matters is that it covers every
    pair of pointer values referencing local elements.
    

    Reported by danbonachea on 2013-02-26 16:03:06

  15. Former user Account Deleted
    Also, the new text is no "fishier" than the old text, which used a pair of cyclically-blocked
    pointers to reference every pairwise set of distinct local elements. I just changed
    cyclic to indefinite to make the equation cleaner.
    

    Reported by danbonachea on 2013-02-26 16:04:24

  16. Former user Account Deleted
    I'm sorry, I wasn't clear about the fishiness I was referring to.  The new text as written
    looks good to me from a technical standpoint, albeit redundant (see below).  The part
    that I find fishy is using the new text to answer the question "For a definitely blocked
    shared array object, are the blocks with affinity to a thread contiguous?"  Because
    the new text only addresses indefinitely blocked shared objects, I find its use to
    answer that question fishy.
    
    Moreover, because of the following text in 6.4.2 2 in the existing spec, it is completely
    redundant:
    
    "If the shared array is declared with indefinite block size, the result of the pointer-to-shared
    arithmetic is identical to that described for normal C pointers in [ISO/IEC00 Sec.
    6.5.6], except that the thread of the new pointer shall be the same as that of the
    original pointer and the phase component is defined to always be zero."
    
    As I said in comment 11, if we need to disambiguate this in the spec (which, as noted
    in my email quoted in comment 1, I don't believe is strictly necessary), then we must
    do so with definitely blocked arrays.  It is not sufficient to restate something about
    indefinitely blocked arrays that is already in the spec.
    

    Reported by sdvormwa@cray.com on 2013-02-26 16:40:25

  17. Former user Account Deleted
    > Because the new text only addresses indefinitely blocked shared objects, I find its
    > use to answer that question fishy.
    
    I'm sorry but this is completely false. The equations in 6.4.2-6 apply to ALL SHARED
    ARRAYS. They apply to arrays allocated with an indefinite, cyclic or definite blocking
    factor. The equations use two POINTERS with a particular blocksize (because every non-generic
    PTS must have a blocksize), but the constraints apply to ALL shared arrays, regardless
    of allocation layout.
    
    Please point out exactly the text from either the old or new text that you believe
    says the equations apply to only a subset of all shared arrays?
    

    Reported by danbonachea on 2013-02-26 16:50:42

  18. Former user Account Deleted
    What am I missing here?
    
    With four threads:
    
    shared int Arr[3*THREADS];
    shared int *S1 = &Arr[0];
    shared int *S2 = &Arr[THREADS]; // = &Arr[4]
    int *P1 = (int *)S1;
    int *P2 = (int *)S2;
    
    S2 - S1 = THREADS; // = 4
    P2 - P1 = 1;
    
    The proposed constraint:
    
    + * The expression P1 + (S2 - S1) == P2 shall evaluate to 1.
    
    But P2 = P1 + 1, and P1 + 4 != P1 + 1.  The constraint does enforce indefinite block
    size arithmetic, but it does not hold across blocks.  Is it expected to?
    

    Reported by brian.wibecan on 2013-02-26 17:10:15

  19. Former user Account Deleted
    Yes, I agree they apply to all shared arrays.  I agree that it technically says what
    we want.  That's why I said "fishy" and not "incorrect".
    
    My concern is that, given the declarations
    
    shared [] T *S1, *S2;
    
    it is natural to assume that they point at indefinitely blocked shared arrays, because
    that is what their referenced type is.  Because of 6.5.1.1 12, the ONLY way they could
    point at (part of) a definitely blocked shared array is with an EXPLICIT cast, which
    IS NOT PRESENT in the proposed text.
    
    I think my proposal in comment 13 says what we want to say in a manner that is less
    "fishy" and more direct--importantly, it doesn't require any hidden EXPLICIT casts
    to disambiguate the case that we want to disambiguate.
    

    Reported by sdvormwa@cray.com on 2013-02-26 17:10:17

  20. Former user Account Deleted
    "What am I missing here?"
    
    S1 and S2 should be 'shared [] int *' (and note, you need an explicit cast), not 'shared
    int *'.
    

    Reported by sdvormwa@cray.com on 2013-02-26 17:12:12

  21. Former user Account Deleted
    "it is natural to assume that they point at indefinitely blocked shared arrays,"
    
    I disagree that is a "natural assumption", but we can also add an amplification phrase
    like "The following property applies to all shared arrays".
    
    "the ONLY way they could point at (part of) a definitely blocked shared array is with
    an EXPLICIT cast,"
    
    This is also false. Simple concrete example:
    
    shared [] int *S1 = upc_all_alloc(2*THREADS, 100*sizeof(int));
    shared [] int *S2 = S1 + 5;
    
    there is no cast here. 
    
    The text in 6.4.2 is describing a mathematical property that must hold true for all
    elements in EVERY shared array with the same affinity. It does not prescribe an algorithm
    to construct the pointers to actually perform this check, because it's unnecessary
    to do so. The "old" text uses cyclic pointers to state the property, my proposed "new"
    text uses indefinite pointers to state the strengthened property. This in no way changes
    the fact these properties must be preserved for ALL shared arrays. If that's not sufficiently
    clear in the old text then we need to amplify this point, and that's orthogonal to
    this issue.
    
    Your proposed equations in comment #13 look correct to my casual inspection, but it
    is also significantly "denser", and in my opinion sacrifices clarity as a result. My
    change is minimalistic and I believe easier to understand.
    
    Brian said:
    > What am I missing here?
    
    When S1 and S2 are properly declared as indefinitely blocked, the expression (S2-S1)
    in your example evaluates to 1, exactly matching (P2 - P1). (assuming a compliant implementation)
    

    Reported by danbonachea on 2013-02-26 17:32:34
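
The disagreement over Brian's example comes down to which blocksize the pointers carry when they are subtracted. The plain C sketch below models both views for a cyclic array such as shared int Arr[3*THREADS]. The names diff_cyclic and diff_indefinite are hypothetical, and diff_indefinite assumes a compliant implementation with the contiguous local layout under discussion.

```c
#include <assert.h>
#include <stddef.h>

/* S2 - S1 when both are cyclic (`shared [1] T *`) pointers to elements
   i1 and i2: the difference counts elements on all threads. */
ptrdiff_t diff_cyclic(ptrdiff_t i1, ptrdiff_t i2)
{
    return i2 - i1;
}

/* S2 - S1 when both are `shared [] T *` pointers to elements i1 and i2
   of a cyclic array with the same affinity: only the elements on that
   one thread are counted (assumes the local slice is contiguous). */
ptrdiff_t diff_indefinite(ptrdiff_t i1, ptrdiff_t i2, ptrdiff_t threads)
{
    assert(threads > 0);
    assert(i1 >= 0 && i2 >= 0);
    assert(i1 % threads == i2 % threads);  /* same affinity */
    return i2 / threads - i1 / threads;
}
```

For Arr[0] and Arr[THREADS] with four threads, diff_cyclic(0, 4) gives 4, which is the value Brian computed, while diff_indefinite(0, 4, 4) gives 1, which is the value that matches P2 - P1.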

  22. Former user Account Deleted
    Here's a concrete example of applying the OLD declarations to a definitely-blocked array,
    based on Brian's example:
    
    shared [10] int Arr[20*THREADS];  // assume THREADS == 4
    shared int *S1 = (shared int *)&Arr[0];
    shared int *S2 = (shared int *)&Arr[10*THREADS]; 
    int *P1 = (int *)S1;
    int *P2 = (int *)S2;
    
    Note the "old" text ALSO requires a cast in order to "check" the property on any statically-allocated
    array which is not cyclic. The only time a cast is not required is when the array in
    question happens to be cyclic. A similar cast is required under the proposed text.
    This is nothing new.
    

    Reported by danbonachea on 2013-02-26 17:42:26

  23. Former user Account Deleted
    Ah, you're right.  I thought C99 required an explicit cast if the referenced types are
    not compatible, but I see that's not the case.  Forget about that. ;)
    
    That said, I think we still run into potential issues with alignment.  If an implementation
    defines that the alignment of 'shared [] int' is different than the alignment of 'shared
    [B] int' for any positive B, then the conversion results in undefined behavior.  My
    proposed text in comment 13 still works and allows this, while the proposed text in
    comments 9-10 does not.
    

    Reported by sdvormwa@cray.com on 2013-02-26 18:00:10

  24. Former user Account Deleted
    > If an implementation defines that the alignment of 'shared [] int' is different than
    > the alignment of 'shared [B] int' for any positive B, then the conversion results in
    > undefined behavior.  My proposed text in comment 13 still works and allows this,
    > while the proposed text in comments 9-10 does not.
    
    Let's back up and make sure we agree on the goals of this clarification. I think there
    are two goals:
    1) Clarify that users can create a pointer-to-local to the slice of an array with affinity
    to one thread, and access it as a contiguous local array.
    I would also add:
    2) Clarify that users can construct different "views" of their array using PTS's with
    different blocksizes and access the elements using the most natural indexing arithmetic
    for the current piece of code. This practice is already commonplace in deployed UPC
    codes, and I believe we must allow it. 
    
    Requirement #2 has also long been explicitly encoded (to some extent) in the UPC bulk
    transfer libraries, collectives and now nb transfers, many of which contain text like:
       The upc_memcpy function treats the dst and src pointers as if they had type:
       shared [] char[n]
    This implies the elements in the "local slice" of any shared array that can be passed to
    these functions must be valid to access using indefinite blocking.
    The shared array dynamic allocation functions also rely upon this "reblocking" property:
    
       The upc_global_alloc allocates shared space compatible with the declaration:
       shared [nbytes] char[nblocks * nbytes].
    Here [nbytes] is a typeless blocking factor, whose numerical value will differ from
    the typed blocking factor the user uses to access the array (for any type with
    sizeof() > 1).
    
    For all these reasons, I believe an implementation that uses different element alignment
    constraints based on the blocksize used in the allocation is simply an invalid implementation
    that must be prohibited. 
    
    My proposal directly enforces requirement #2 for "reblocking" any array to indefinite
    (because the equations would be false if the implementation did not comply for that
    case). By transitivity, it also enforces it for reblocking to any block size.
    

    Reported by danbonachea on 2013-02-26 18:19:18
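
The "typeless blocking factor" point in comment 24 can be illustrated with a plain C sketch. It compares the owner of element i under a typed view shared [B] T with the owner of that element's first byte under the typeless view shared [B * sizeof(T)] char, applying the same round-robin rule to both. The function names are hypothetical, and the sketch says nothing about how any real implementation places the bytes within a thread.

```c
#include <assert.h>
#include <stddef.h>

/* Owner of byte `byte` in the typeless view shared [nbytes] char[...]. */
size_t owner_of_byte(size_t byte, size_t nbytes, size_t threads)
{
    assert(nbytes > 0 && threads > 0);
    return (byte / nbytes) % threads;
}

/* Owner of element i in the typed view shared [B] T[...]. */
size_t owner_of_elem(size_t i, size_t B, size_t threads)
{
    assert(B > 0 && threads > 0);
    return (i / B) % threads;
}

/* Returns 1 if both views agree on the owner of each of the first n
   elements, where sizeof(T) == elem_size and nbytes == B * elem_size. */
int views_agree(size_t n, size_t B, size_t elem_size, size_t threads)
{
    size_t nbytes = B * elem_size;
    for (size_t i = 0; i < n; i++) {
        if (owner_of_elem(i, B, threads) !=
            owner_of_byte(i * elem_size, nbytes, threads))
            return 0;
    }
    return 1;
}
```

For an allocation like upc_all_alloc(2*THREADS, 100*sizeof(int)) accessed as shared [100] int, the numeric blocking factors differ (100 versus 100*sizeof(int)), yet views_agree(800, 100, sizeof(int), 4) returns 1: the two views distribute the same storage to the same threads, which is why reblocking between them is expected to work.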

  25. Former user Account Deleted
    Actually, I suppose that [comment 23] is exactly what we're trying to prevent, so never
    mind. ;)
    
    I still think that using a pointer whose referenced type is indefinitely blocked to
    clarify something about definitely typed shared objects is a bit fishy.  However, if
    the general consensus is that it works, I'll go with it.
    

    Reported by sdvormwa@cray.com on 2013-02-26 18:22:29

  26. Former user Account Deleted
    s/definitely typed/definitely blocked/
    

    Reported by sdvormwa@cray.com on 2013-02-26 18:29:24

  27. Former user Account Deleted
    For completeness, below is a Latex diff of my proposed resolution for this issue, relative
    to the current working draft.
    I invite comment from other members of the committee.
    
    --- upc-language.tex    (revision 204)
    +++ upc-language.tex    (working copy)
    @@ -289,12 +289,12 @@
    
     \begin{verbatim}
         T *P1, *P2;
    -    shared T *S1, *S2;  
    +    shared [] T *S1, *S2;  
    
         P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */
         P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */
     \end{verbatim}
    -    
    +\xchangenote[id=DB]{106}{Declaration of S1/S1 pointers changed to indefinite blocksize}
    
     \np For all S1 and S2 that point to two distinct elements of
        the same shared array object which have affinity to the same
    @@ -305,6 +305,9 @@
     \item S2 and P2 shall point to the same object.
     \item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t) upc\_addrfield(S1))} shall
        evaluate to the same value as ((P2 - P1) * sizeof(T)).
    +\xadded[id=DB]{106}{
    +\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.
    +}
     \end{itemize}
    
     \np Two compatible pointers-to-shared which point to the same
    

    Reported by danbonachea on 2013-02-26 20:20:56

  28. Former user Account Deleted
    Can we delete the third item in the list if we make this change?  I believe the new
    expression makes it redundant.
    

    Reported by sdvormwa@cray.com on 2013-02-26 20:37:34

  29. Former user Account Deleted
    That is, remove
    
     \item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t) upc\_addrfield(S1))} shall
        evaluate to the same value as ((P2 - P1) * sizeof(T)).
    

    Reported by sdvormwa@cray.com on 2013-02-26 20:38:00

  30. Former user Account Deleted
    I think we've begun to converge on a "spec speak" version of "blocks with same affinity
    are contiguous in local memory".  In other words: we've required implementations to
    do something we "just knew" they must.
    
    If a user were to ask herself the same question today that Yili asked a couple days
    ago, do you think she'd find the answer in the spec as clarified by the proposed change?
     I doubt it.  So, would it be acceptable to add a footnote with the "plain English"
    conclusion?  I normally argue against such things, but this is a case where I think
    the conclusion is sufficiently non-obvious that my feelings on this are "neutral".
     So I am ASKING: do others think such a footnote is appropriate?
    

    Reported by phhargrove@lbl.gov on 2013-02-26 20:48:33

  31. Former user Account Deleted
    Paul: Yes, I think a footnote would be helpful here.  (Then again, I'm always pro explanatory
    footnote.  :) )
    

    Reported by johnson.troy.a on 2013-02-26 21:00:12

  32. Former user Account Deleted
    > Can we delete the third item in the list if we make this change?  I believe the new
    > expression makes it redundant.
    
    No. The third item constrains the allowable behavior of the upc_addrfield function,
    which otherwise just returns an "implementation-defined value". The library function
    semantics cross-reference to this section as part of its definition.
    
    > would it be acceptable to add a footnote with the "plain English" conclusion?  I
    > normally argue against such things, but this is a case where I think the conclusion
    > is sufficiently non-obvious that my feelings on this are "neutral".  So I am ASKING:
    > do others think such a footnote is appropriate?
    
    I would not be against inserting such a clarification footnote, PROVIDED it can be
    stated in a way that doesn't resort to undefined terms or operational implementation
    details (eg "thread's address space", "thread's memory"), or imply any requirement
    stronger than the one we're trying to impose.
    
    Perhaps just a very high-level clue like this?:
    \footnote{This implies there is no padding inserted between shared array elements with
    affinity to a thread}
    

    Reported by danbonachea on 2013-02-26 21:05:57

  33. Former user Account Deleted
    Dan,
    
    I fear "no padding" could be misunderstood to mean that the padding NORMALLY present
    between elements in arrays of structs might be omitted.  So, would the following work
    for you (with the emphasis NOT intended for inclusion in the spec):
    
    \footnote{This implies there is no padding inserted between BLOCKS OF shared array
    elements with affinity to a thread}
    

    Reported by phhargrove@lbl.gov on 2013-02-26 21:13:21

  34. Former user Account Deleted
    > \footnote{This implies there is no padding inserted between BLOCKS OF shared array
    > elements with affinity to a thread}
    
    Sounds reasonable to me.
    

    Reported by danbonachea on 2013-02-26 21:31:56

  35. Former user Account Deleted
    Official proposal mailed 2/26/13:
    
    --- upc-language.tex    (revision 204)
    +++ upc-language.tex    (working copy)
    @@ -289,12 +289,12 @@
    
     \begin{verbatim}
         T *P1, *P2;
    -    shared T *S1, *S2;  
    +    shared [] T *S1, *S2;  
    
         P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */
         P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */
     \end{verbatim}
    -    
    +\xchangenote[id=DB]{106}{Declaration of S1/S1 pointers changed to indefinite blocksize}
    
     \np For all S1 and S2 that point to two distinct elements of
        the same shared array object which have affinity to the same
    @@ -305,6 +305,10 @@
     \item S2 and P2 shall point to the same object.
     \item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t) upc\_addrfield(S1))} shall
        evaluate to the same value as ((P2 - P1) * sizeof(T)).
    +\xadded[id=DB]{106}{
    +\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.%
    +\truefootnote{This implies there is no padding inserted between blocks of shared array elements with affinity to a thread.}
    +}
     \end{itemize}
    
     \np Two compatible pointers-to-shared which point to the same
    

    Reported by danbonachea on 2013-02-26 23:10:57 - Status changed: PendingApproval

  36. Former user Account Deleted
    > No. The third item constrains the allowable behavior of the upc_addrfield function,
    > which otherwise just returns an "implementation-defined value". The library function
    > semantics cross-reference to this section as part of its definition.
    
    Woah, I completely missed that.  In that case, I think we need to think about this
    a bit more.  Consider the following code:
    
    shared int *S1;
    shared [] int *S2;
    
    S2 = S1;
    
    if ( upc_addrfield(S1) == upc_addrfield(S2) ) {
        printf("Match\n");
    }
    
    With the proposed change (as well as UPC 1.2 apparently), it is undefined whether or
    not anything is printed.  We don't require anywhere that pointers-to-shared with different
    types that point to the same object produce the same result when passed to upc_addrfield(),
    merely that such pointers shall compare equal.  Since I don't think this is intended,
    we probably need a stronger statement somewhere to make upc_addrfield() useful.
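
    To make the gap concrete, here is a sketch in plain C of a purely hypothetical
    implementation model. addrfield_model and difference_constraint_holds are
    invented names, and the per-type bias is an assumption made only for the sake
    of argument. No known implementation is claimed to behave this way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for upc_addrfield(): the byte offset of the element
 * within the owning thread's slice, plus a bias that the implementation is
 * imagined to pick per pointer-to-shared type. */
size_t addrfield_model(size_t local_index, size_t elem_size, size_t type_bias)
{
    return type_bias + local_index * elem_size;
}

/* The one constraint discussed here: the difference of two results must equal
 * (P2 - P1) * sizeof(T). Returns 1 if the model satisfies it, 0 otherwise. */
int difference_constraint_holds(size_t idx1, size_t idx2,
                                size_t elem_size, size_t type_bias)
{
    ptrdiff_t lhs = (ptrdiff_t)addrfield_model(idx2, elem_size, type_bias)
                  - (ptrdiff_t)addrfield_model(idx1, elem_size, type_bias);
    ptrdiff_t rhs = ((ptrdiff_t)idx2 - (ptrdiff_t)idx1) * (ptrdiff_t)elem_size;
    return lhs == rhs;
}
```

    Any bias cancels out of the difference, so the constraint is met for every
    pointer type, yet two pointers of different types to the same element may
    report different absolute values. That is why the "Match" test above cannot
    be relied on from the difference constraint alone.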
    

    Reported by sdvormwa@cray.com on 2013-02-26 23:47:33

  37. Former user Account Deleted
    > Since I don't think this is intended, we probably need a stronger statement somewhere
    to make upc_addrfield() useful.
    
    The specification and behavior of upc_addrfield() is a new issue, which I've entered
    as issue 107. Please continue discussion of that topic there.
    
    The current issue and proposed fix are completely orthogonal to the semantic guarantees
    of upc_addrfield().
    

    Reported by danbonachea on 2013-02-27 11:07:53

  38. Former user Account Deleted
    > The current issue and proposed fix are completely orthogonal to the semantic guarantees
    of upc_addrfield().
    
    True, but placing such semantic guarantees on upc_addrfield() makes writing tests for
    the current issue much easier.
    

    Reported by sdvormwa@cray.com on 2013-02-27 14:02:42

  39. Former user Account Deleted
    > > Can we delete the third item in the list if we make this change?  I believe the
    new expression makes it redundant.
    >
    > No. The third item constrains the allowable behavior of the upc_addrfield function,
    which otherwise just returns an "implementation-defined value". The library function
    semantics cross-reference to this section as part of its definition.
    
    > The current issue and proposed fix are completely orthogonal to the semantic guarantees
    of upc_addrfield().
    
    Since the only constraint on the result of upc_addrfield() is that the difference
    between its results when applied to two pointers-to-shared with particular properties
    be equal to the difference between two pointers-to-local pointing to the same objects
    scaled by the size of the type, it seems likely that the reason that constraint exists
    was to attempt to address this very issue.  Therefore, I think we should remove that
    constraint as part of the change for this issue, and leave the deprecation of upc_addrfield()
    and discussion of a possible replacement for issue 107.
    

    Reported by sdvormwa@cray.com on 2013-02-27 18:13:12

  40. Former user Account Deleted
    > Can we delete the third item in the list if we make this change?  I believe the new
    expression makes it redundant.
    
    Note also that, by changing the type of S1 and S2 as proposed, any existing UPC 1.2
    programs that relied on this constraint would have to be changed, as the constraint
    itself is subtly different due to the inherited type change, even though the wording
    remains the same (see issue 107).
    

    Reported by sdvormwa@cray.com on 2013-02-27 19:01:19

  41. Former user Account Deleted
    > I think we should remove that constraint as part of the change for this issue, and
    
    > leave the deprecation of upc_addrfield() and discussion of a possible replacement
    
    > for issue 107.
    
    This PendingApproval issue (106) is concerned with clarifying an ambiguity concerning
    the contiguity of shared array elements. That goal has nothing to do with the upc_addrfield()
    library function, aside from textual proximity in the spec. The fact that the resolution
    of this issue might render one application of the library function obsolete does not
    automatically imply that a constraint used to define library behavior should be removed.
    
    I agree we should consider *eventually* removing the constraint as part of the resolution
    to issue 107, if we decide to deprecate the function. However I believe relaxing the
    semantic definition prior to the deprecation of upc_addrfield would be premature. For
    better or worse, it does currently constrain the implementation-defined behavior of
    that function, and I don't think we should be tweaking the semantics of a function
    we are considering throwing away. 
    
    Issue 107 is concerned with modifying the library function semantics. Please take further
    discussion of this topic there.
    

    Reported by danbonachea on 2013-02-27 19:02:00

  42. Former user Account Deleted
    > This PendingApproval issue (106) is concerned with clarifying an ambiguity concerning
    the contiguity of shared array elements. That goal has nothing to do with the upc_addrfield()
    library function, aside from textual proximity in the spec. The fact that the resolution
    of this issue might render one application of the library function obsolete does not
    automatically imply that a constraint used to define library behavior should be removed.
    
    But by changing the type of S1 and S2, we are already implicitly removing the existing
    constraint, and adding a new similar one.  It is unlikely that one would even notice
    this unless one looks very closely at the differences between the 1.2 and 1.3 specs
    and works through the semantics.  That seems much more dangerous to me.
    

    Reported by sdvormwa@cray.com on 2013-02-27 19:07:04

  43. Former user Account Deleted
    > But by changing the type of S1 and S2, we are already implicitly removing the existing
    constraint, and adding a new similar one. 
    
    As I argued at length in issue 107, comment 4:
      http://code.google.com/p/upc-specification/issues/detail?id=107#c4
    there is no change to the language-level constraint on the behavior of the library
    function.
    

    Reported by danbonachea on 2013-02-27 19:19:42

  44. Former user Account Deleted
    > As I argued at length in issue 107, comment 4:
    >  http://code.google.com/p/upc-specification/issues/detail?id=107#c4
    > there is no change to the language-level constraint on the behavior of the library
    function.
    
    And as I argued at length in the very next comment (http://code.google.com/p/upc-specification/issues/detail?id=107#c5),
    that is simply not true.
    

    Reported by sdvormwa@cray.com on 2013-02-27 19:22:34

  45. Former user Account Deleted
    > there is no change to the language-level constraint on the behavior of the library
    function.
    
    We clearly disagree on this point, but I think we're wasting time arguing about a semantic
    quibble nobody has ever even noticed, let alone relied upon.
    
    Can we at least agree this distinction has no effect on the behavior of the library
    function in any current real implementation, and therefore on real users?
    

    Reported by danbonachea on 2013-02-27 19:26:20

  46. Former user Account Deleted
    To get really precise, UPC 1.2 6.4.2 6 places constraints on the result of upc_addrfield()
    when passed a generic pointer-to-shared value that is the result of an implicit conversion
    from a pointer-to-shared whose referenced type has block size 1.  With the changes
    as proposed, the constraint now applies to the result of upc_addrfield() when passed
    a generic pointer-to-shared value that is the result of an implicit conversion from
    a pointer-to-shared whose referenced type has indefinite block size.  Because the UPC
    specification does not require that these values be the same (they must merely compare
    equal), by using the proposed changes we have ever so subtly changed the constraint.
    
    I will agree that, to my knowledge, no existing UPC implementation, and thus no existing
    users, would be affected by this.  However, I strongly believe that we should not implicitly
    change an existing constraint in the spec to fix an "unrelated" issue.  Either we should
    modify the proposal so the UPC 1.2 semantics are preserved, explicitly remove the constraint
    or clearly call out that the constraint has changed.
    

    Reported by sdvormwa@cray.com on 2013-02-27 20:01:05

  47. Former user Account Deleted
    > Either we should modify the proposal so the UPC 1.2 semantics are preserved, 
    > explicitly remove the constraint or clearly call out that the constraint has changed.
    
    In one comment you're militant about a ridiculously subtle change to a constraint with
    no realistic impact on any implementation or user, and in the next you want to remove
    the constraint entirely, resulting in a significant semantic relaxation. Please choose
    a side.
    
    The change is already clearly annotated, as with every other semantic change in the
    1.3 working draft. The depth of the annotation is proportional to its expected impact
    (ie vanishingly small).
    

    Reported by danbonachea on 2013-02-27 20:09:42

  48. Former user Account Deleted
    If I am understanding things correctly (and please correct me gently if not), then
    in comment #46 Steven has described how we have proposed text that would remove one
    constraint (however subtle/implicit it may be) on the implementation of upc_addrfield()
    and replace it with a *different* constraint (equally subtle/implicit).
    
    Under other circumstances that might be an alarming thing to do.  HOWEVER, for this
    particular case the two constraints (and this is where I might not have followed) are
    ENTIRELY COMPATIBLE.  Not only are they compatible, but we have every reason to believe
    that every existing implementation satisfies both simultaneously.
    
    Would folks be more or less happy with a proposal that introduced a P3 and S3 so that
    the example could provide BOTH cyclic and indefinite examples and thus ADD the new
    constraint on upc_addrfield() without removing the original one?
    

    Reported by phhargrove@lbl.gov on 2013-02-27 20:37:05

  49. Former user Account Deleted
    > introduced a P3 and S3 so that the example could provide BOTH cyclic and indefinite
    
    > examples and thus ADD the new constraint on upc_addrfield() without removing the
    
    > original one?
    
    I don't think it makes sense to add verbiage to 6.4.2 whose only purpose is to clarify
    the behavior of a library function defined in section 7. The section is already sufficiently
    subtle without the introduction of what is essentially irrelevant noise. The original
    constraint should probably have appeared in section 7 in the first place.
    
    If committee members (other than Steve) are convinced that we must preserve this effectively
    meaningless semantic distinction that everyone agrees has no impact on real implementations
    or users, then I think it makes more sense to MOVE the old constraint (as originally
    written) into section 7.2.3.4 and remove it from 6.4.2. If issue 107 resolves to deprecate,
    strengthen or remove upc_addrfield in some future revision of the spec, any such change
    would remain local to 7.2.3.4.
    

    Reported by danbonachea on 2013-02-27 21:04:57

  50. Former user Account Deleted
    > In one comment you're militant about a ridiculously subtle change to a constraint
    with no realistic impact on any implementation or user, and in the next you want to
    remove the constraint entirely, resulting in a significant semantic relaxation. Please
    choose a side.
    
    My concern here is that we are changing the semantics of an unrelated constraint with
    the proposed changes without explicitly calling out that the new semantics are different
    than the old.  If the change is intentional, we should add language to make it clear
    that it is intentional.  If it is not intentional, then we need to modify the proposal
    so that the existing behavior is preserved.  I believe these two options apply for
    ANY proposed change that would have the effect of silently changing the semantics of
    unrelated parts of the language.  However, because this specific constraint is effectively
    useless, I think that a simpler third alternative would be to explicitly remove it.
     Any of these solutions would be acceptable to me.
    
    > Under other circumstances that might be an alarming thing to do.  HOWEVER, for this
    particular case the two constraints (and this is where I might not have followed) are
    ENTIRELY COMPATIBLE.  Not only are they compatible, but we have every reason to believe
    that every existing implementation satisfies both simultaneously.
    
    No, they are not compatible.  We do however believe that every existing implementation
    satisfies both already.
    

    Reported by sdvormwa@cray.com on 2013-02-27 21:28:32

  51. Former user Account Deleted
    Steve wrote:
    > No, they are not compatible.  We do however believe that every existing
    > implementation satisfies both already.
    
    My intended meaning for "compatible" was "it is possible to satisfy both".
    So, under that definition they ARE compatible.
    
    Steve,
    What was your interpretation of "compatible" under which the 2 constraints are NOT
    compatible?  (honest lack of understanding on my part - not being sarcastic).
    

    Reported by phhargrove@lbl.gov on 2013-02-27 21:46:32

  52. Former user Account Deleted
    > What was your interpretation of "compatible" under which the 2 constraints are NOT
    compatible? 
    
    I thought you meant that there was no user-detectable difference between the two.
    

    Reported by sdvormwa@cray.com on 2013-02-27 21:49:06

  53. Former user Account Deleted
    > My concern here is that we are changing the semantics of an unrelated constraint 
    > with the proposed changes without explicitly calling out that the new semantics are
    
    > different than the old.  If the change is intentional, we should add language to
    
    > make it clear that it is intentional.
    
    The change annotation already automatically includes a hyperlink directly to this very
    page, where the issue is discussed ad nauseam in the comments above. Any users or implementers
    who really care (and I strongly believe that's the singleton set: { Steve }) can click
    the hyperlink to read all about it right here. What more do we need?
    

    Reported by danbonachea on 2013-02-27 21:57:37

  54. Former user Account Deleted
    > The change annotation already automatically includes a hyperlink directly to this
    
    > very page, where the issue is discussed
    
    I should also note this is the standard procedure we've been following for EVERY spec
    change. Rationale and motivation is NOT placed in the document. Every change includes
    only the actual textual wording change and an issue number with a hyperlink where interested
    parties can read the details. This applies even for changes with very large semantic
    impact and complicated implications - the spec only contains the normative text, the
    issue database contains the rationale and story behind the change.
    

    Reported by danbonachea on 2013-02-27 22:07:07

  55. Former user Account Deleted
    > The change annotation already automatically includes a hyperlink directly to this
    very page, where the issue is discussed ad nauseam in the comments above.
    
    But there's no change annotation on the constraint we're referring to and most people
    aren't going to recognize at first glance that the type change in the previous paragraph
    actually has a semantic effect on the constraint.  I certainly didn't recognize that
    until you pointed out that there is only that single constraint on the result of upc_addrfield(),
    and then spent a half an hour thinking through the implications of that revelation.
     So how is someone that is wondering why upc_addrfield() doesn't do what it used to
    going to know to reference an apparently unrelated issue?  Well, we make it clear that
    it is not in fact unrelated.
    
    Here, a proposed modification:
    
    Index: lang/upc-language.tex
    ===================================================================
    --- lang/upc-language.tex       (revision 205)
    +++ lang/upc-language.tex       (working copy)
    @@ -289,12 +289,12 @@
    
     \begin{verbatim}
         T *P1, *P2;
    -    shared T *S1, *S2;
    +    shared [] T *S1, *S2;
    
         P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */
         P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */
     \end{verbatim}
    -
    +\xchangenote[id=DB]{106}{Declaration of S1/S1 pointers changed to indefinite blocksize}
    
     \np For all S1 and S2 that point to two distinct elements of
        the same shared array object which have affinity to the same
    @@ -305,6 +305,11 @@
     \item S2 and P2 shall point to the same object.
     \item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t)
    upc\_addrfield(S1))} shall
        evaluate to the same value as ((P2 - P1) * sizeof(T)).
    +\xchangenote[id=SV]{106}{The semantics of the constraint on upc\_addrfield() are slightly
    changed.}
    +\xadded[id=DB]{106}{
    +\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.%
    +\truefootnote{This implies there is no padding inserted between blocks of shared array
    elements with affinity to a thread.}
    +}
     \end{itemize}
    
     \np Two compatible pointers-to-shared which point to the same
    

    Reported by sdvormwa@cray.com on 2013-02-27 22:12:21

  56. Former user Account Deleted
    I don't want readers wasting their time trying to fathom this triviality that will never
    have any observable effect on anyone. I'm appalled that we've already wasted so much
    of our time on it.
    
    Updated PendingApproval proposal:
    
    --- upc-language.tex    (revision 204)
    +++ upc-language.tex    (working copy)
    @@ -289,12 +289,13 @@
    
     \begin{verbatim}
         T *P1, *P2;
    -    shared T *S1, *S2;  
    +    shared [] T *S1, *S2;  
    
         P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */
         P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */
     \end{verbatim}
    -    
    +\xchangenote[id=DB]{106}{Declaration of S1/S1 changed to indefinite blocksize to accomodate
    new constraint. 
    +This change also subtly modifies the constraint on {\tt upc\_addrfield} in a way that
    has no impact on current implementations.}
    
     \np For all S1 and S2 that point to two distinct elements of
        the same shared array object which have affinity to the same
    @@ -305,6 +306,10 @@
     \item S2 and P2 shall point to the same object.
     \item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t)
    upc\_addrfield(S1))} shall
        evaluate to the same value as ((P2 - P1) * sizeof(T)).
    +\xadded[id=DB]{106}{
    +\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.%
    +\truefootnote{This implies there is no padding inserted between blocks of shared array
    elements with affinity to a thread.}
    +}
     \end{itemize}
    
     \np Two compatible pointers-to-shared which point to the same
    

    Reported by danbonachea on 2013-02-28 09:11:03

  57. Former user Account Deleted
    > I don't want readers wasting their time trying to fathom this triviality that will
    never have any observable effect on anyone. I'm appalled that we've already wasted
    so much of our time on it.
    
    I now have to revoke my support for this proposal.  I am quite disappointed to hear
    that the author of the proposed changes considers discussion of the implications of
    his changes a waste of time.  This issue was pushed into 1.3 at the last minute
    with less than 2 days of discussion.  It SHOULD have been a 1.4 issue, given where
    we are in 1.3 and the fact that this is a clarification of behavior that is required
    for an implementation to get the correct semantics on many examples already in the
    spec and further, that we know of no existing implementation that doesn't already have
    this behavior.  It is good to clarify this property, but it is NOT ACCEPTABLE to rush
    changes into the spec at the last minute for a trivial clarification that affects no
    current implementations, and then summarily dismiss concerns that the proposed change
    has unintended consequences.
    

    Reported by sdvormwa@cray.com on 2013-02-28 15:38:12 - Status changed: Started

  58. Former user Account Deleted
    We had consensus on the call that generating a clarification for 1.3 is worthwhile.
    We have consensus that the proposed change effects the intended clarification.
    
    I'm frustrated arguing about a 0.001% semantic side-effect to a function that is already
    99.999% implementation-defined, and likely to receive massive semantic changes in a
    future revision. We already have universal agreement that said concern has no possibility
    of impact on real implementations and thus real users. Nevertheless, the issue has
    been noted in the change note upon Steve's insistence.
    
    It's not the issue, proposal or even the addendum that is a waste of time, it is the
    continued adversarial bickering on this triviality, which I now consider fully resolved.
    The discussion between myself and Steve on this matter has reached a point of hostility
    that I feel no further progress can be made. Steve if you still disagree, then seek
    out some impartial third party to back your argument.
    

    Reported by danbonachea on 2013-02-28 16:10:04 - Status changed: PendingApproval

  59. Former user Account Deleted
    +\xchangenote[id=DB]{106}{Declaration of S1/S1 changed to indefinite blocksize to accomodate
    new constraint. 
    +This change also subtly modifies the constraint on {\tt upc\_addrfield} in a way that
    has no impact on current implementations.}
    
    I believe you intend "S1 and S2" instead of S1/S1.
    
    Given that Issue 107 seems to be headed toward strengthening upc_addrfield instead
    of deprecating it, I agree with Steve that this issue should be moved to UPC 1.4. 
    If upc_addrfield were being deprecated and we were thus subtly changing a function
    that we no longer cared about, I'd be fine with the language in Comment #56 with the
    S1/S1 corrected. However, because it looks like we're keeping upc_addrfield and are
    likely to change it yet again in UPC 1.4, I believe it is best to wait on this change
    and make all of the upc_addrfield related changes at once.
    
    That said, I likely do not satisfy Dan's desire for an "impartial third party" requested
    in Comment #58 because I work with Steve, so I will wait to see what others say.
    

    Reported by johnson.troy.a on 2013-02-28 17:38:14

  60. Former user Account Deleted
    We're already going out of our way to clarify something that is already implied by other
    parts of the spec and which is the behavior of all existing UPC implementations.  When
    it looked like upc_addrfield() was going to be deprecated in 1.4 (see issue 107), it
    seemed to me that simply making a change note that the semantics of upc_addrfield()
    are slightly different in 1.3 was an acceptable compromise, because it would be going
    away anyway.  That seems to no longer be the case.
    
    In short, if we are going to change the semantics of upc_addrfield(), then we should
    do it right and make the function useful for something (see Paul's really good text
    in comment 17 in issue 107 (http://code.google.com/p/upc-specification/issues/detail?id=107#c17)
    regarding this).  If we're not going to fix it, then we should preserve the existing
    semantics.
    

    Reported by sdvormwa@cray.com on 2013-02-28 17:39:27

  61. Former user Account Deleted
    Below is the ALTERNATE proposal that I referred to in comment #49, which some people
    may find more acceptable. It moves the upc_addrfield constraint, completely unchanged,
    to the relevant library section (where it probably belonged in the first place). I
    believe this proposal preserves the semantic constraint on upc_addrfield in a way that
    is completely identical to 1.2 in every respect. This change has the benefit of removing
    a long cross-reference to important information concerning the library behavior, at
    the cost of a few duplicated lines of declarations. If issue 107 resolves to deprecate,
    strengthen or remove the upc_addrfield library function in some future revision of the
    spec, any such change would likely modify these lines, but the change should remain
    local to 7.2.3.4.
    
    I would be satisfied with applying either this proposal or the one in comment #56 (with
    Troy's clarification to the annotation). I think either one resolves the original ambiguity
    which spawned this issue, which we agreed should be clarified in 1.3 if at all possible.
    I think it would be a shame to allow such a significant ambiguity to persist as a
    "known bug" in 1.3, when we all agree on how shared array contiguity needs to behave.
    
    --- upc-language.tex    (revision 204)
    +++ upc-language.tex    (working copy)
    @@ -289,12 +289,12 @@
    
     \begin{verbatim}
         T *P1, *P2;
    -    shared T *S1, *S2;  
    +    shared [] T *S1, *S2;  
    
    -    P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */ 
    -    P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */ 
    +    P1 = (T*) S1;  /* allowed if upc_threadof(S1) == MYTHREAD */ 
    +    P2 = (T*) S2;  /* allowed if upc_threadof(S2) == MYTHREAD */ 
     \end{verbatim}
    -    
    +\xchangenote[id=DB]{106}{Declaration of S1 and S2 changed to indefinite blocksize
    to accommodate new constraint.}
    
     \np For all S1 and S2 that point to two distinct elements of
        the same shared array object which have affinity to the same
    @@ -303,9 +303,12 @@
     \begin{itemize}
     \item S1 and P1 shall point to the same object.
     \item S2 and P2 shall point to the same object.
    -\item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t)
    upc\_addrfield(S1))} shall 
    -   evaluate to the same value as ((P2 - P1) * sizeof(T)).  
    +\xadded[id=DB]{106}{
    +\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.%
    +\truefootnote{This implies there is no padding inserted between blocks of shared array
    elements with affinity to a thread.}
    +}
     \end{itemize}
    +\xchangenote[id=DB]{106}{Constraint on {\tt upc\_addrfield} moved to Section~\ref{upc_addrfield}.}
    
     \np Two compatible pointers-to-shared which point to the same
         object (i.e. having the same address and thread components) shall
    
    --- upc-lib-core.tex    (revision 204)
    +++ upc-lib-core.tex    (working copy)
    @@ -302,11 +302,26 @@
    
     \np The {\tt upc\_addrfield} function returns an
        implementation-defined value reflecting the ``local address'' of the
    -   object pointed to by the pointer-to-shared argument.\footnote{%
    -   This function is used in defining the semantics of pointer-to-shared
    -   arithmetic in Section \ref{pointer-arithmetic}}
    +   object pointed to by the pointer-to-shared argument.
    
    -   
    +\np Given the following declarations:
    +
    +\begin{verbatim}
    +    T *P1, *P2;  
    +    shared T *S1, *S2;  
    +
    +    P1 = (T*) S1;  /* allowed if upc_threadof(S1) == MYTHREAD */ 
    +    P2 = (T*) S2;  /* allowed if upc_threadof(S2) == MYTHREAD */ 
    +\end{verbatim}
    +
    +   For all S1 and S2 that point to two distinct elements of
    +   the same shared array object which have affinity to the same
    +   thread, the expression:\\
    +    {\tt ((ptrdiff\_t) upc\_addrfield(S2) - (ptrdiff\_t)upc\_addrfield(S1))} \\
    +   shall evaluate to the same value as: {\tt ((P2 - P1) * sizeof(T))}.
    +
    +\xchangenote[id=DB]{106}{Paragraph moved from 6.4.2 and cross-reference footnote removed.}
    +
     \paragraph{The {\tt upc\_affinitysize} function}
    
     {\bf Synopsis}
    

    Reported by danbonachea on 2013-02-28 18:17:54

  62. Former user Account Deleted
    Dan asked me off-list to "chime in" on this.
    Just as Troy's co-worker relation to Steve disqualifies him as "impartial 3rd party"
    (comment #59), so does my co-worker relation to Dan.  I am stating that clearly so
    nobody thinks we are "pulling a fast one".
    
    First off, I am somewhat disheartened by the strength of the disagreement between Dan
    and Steve at this point, and don't feel that either one of them is 100% correct.  While
    I cannot (yet?) offer alternative text to resolve the original ambiguity, I am unsatisfied
    with the current proposal.  Here is a summary of my point-of-view.  I am labeling the
    points to make responding to them simpler.
    
    PHH1)  The proposal (idea, not the diff) in comment #61 to duplicate the semantics
    of upc_addrfield() to the library document has my support.  HOWEVER, that idea is independent
    of what changes are made to clarify that shared array elems are contiguous.  The proposed
    diff in that comment makes changes to 6.4.2 that I don't fully agree with.
    
    PHH2)  I agree (as I *think* we all do now) that the change from cyclic to indefinite
    for S1 and S2 does provide the desired constraint on the layout of array elements,
    but one may need to read the issue tracker to understand that.  To put that in other words:
    I have no objection to the technical soundness of Dan's proposal for the purpose of
    resolving issue 106.
    
    PHH3)  I asked for the addition of a footnote because I felt that Dan's proposed changes
    failed to make the desired constraint on layout clear to most readers.  That is, to
    me, a small strike against Dan's proposal - though not on technical grounds.
    
    PHH4)  Dan wrote in comment #53
    > The change annotation already automatically includes a hyperlink directly to
    > this very page, where the issue is discussed ad nauseam in the comments above.
    > Any users or implementers who really care [...] can click the hyperlink to
    > read all about it right here. What more do we need?
    HOWEVER, once we reach final spec that isn't true. Our cover text says:
    > Change annotations in the specification body are for reviewer convenience only
    > and are not normative, nor will they appear in the final draft.
    
    PHH5)  Personally, I agree with Dan that the very small indirect change to the semantics
    of upc_addrfield() isn't worth a big fuss.  HOWEVER, the spec process is about building
    a consensus, and currently the "nearly silent" change to upc_addrfield()'s semantics
    is an obstacle to reaching that consensus.  So, since a "big fuss" does exist, it
    is our responsibility as members of this working group to work it through to a resolution.
    
    PHH6)  While the point was made that this issue is naturally tied to how one defines
    pointer arithmetic, following that direction has led us to the current proposed change.
     This is a textually very small change, but has strong opposition.  So, I think it
    wise to consider what other options are available.  Maybe there is a better solution,
    or maybe the current proposed change is the "lesser of N evils" and will gain support
    when compared to one or more alternatives.
    
    PHH7)  Since I suggest we need to look at alternatives, I feel obligated to attempt
    to provide at least one:  What if we pick up again from Steve's suggestion in comment
    #3 to augment 6.5.2.1, and work out a wording that doesn't use undefined terminology.
     Since the footnote of Dan's current proposal got an OK from Dan, we could start from
    that.
    
    PHH8)  For what it is worth: a upc_addrfield() strengthened to satisfy my requirements
    given in issue 107 would, I believe, match all currently known implementations and
    would be constrained by both the original version of 6.4.2 4-5 and the proposed indefinite
    version.
    

    Reported by phhargrove@lbl.gov on 2013-02-28 19:31:29

  63. Former user Account Deleted
    The proposal in comment 61 is acceptable, though I'd prefer to just outright fix upc_addrfield()
    at the same time, since it is so closely related to this issue.  As another alternative
    to consider, could we promote issue 107 to 1.3, use your original proposal for 106
    and fix the semantics of upc_addrfield()?  I believe my proposal in comment 29 (http://code.google.com/p/upc-specification/issues/detail?id=107#c29)
    addresses all of our concerns.
    

    Reported by sdvormwa@cray.com on 2013-02-28 19:32:05

  64. Former user Account Deleted
    It looks like my comment #62 and Steve's comment #63 "crossed in the ether".
    
    If Steve is happy with the contents of #61, then perhaps I wasted a lot of time typing
    and polishing my text for comment #62  :-)
    
    I will be examining Steve's proposal in issue 107 momentarily.
    

    Reported by phhargrove@lbl.gov on 2013-02-28 19:40:41

  65. Former user Account Deleted
    Responding to Paul's non-technical point:
    
    >PHH4) Our cover text says:
    > Change annotations in the specification body are for reviewer convenience only
    > and are not normative, nor will they appear in the final draft.
    
    Once the spec ratification process is complete, we will generate and distribute a 1.3
    document that is the "official" language definition that contains only the normative
    text. However, I believe we decided last year that we would additionally distribute
    a version of the document with change bars and annotations intact (and possibly also
    a full Latex diff). The former will serve as the official normative definition of the
    revised language, while the latter will be provided for reference purposes to implementers
    and users during the transition to 1.3 compliance.
    

    Reported by danbonachea on 2013-02-28 20:14:04

  66. Former user Account Deleted
    Dan wrote:
    > Responding to Paul's non-technical point:
    > 
    > PHH4) Our cover text says:
    > Change annotations in the specification body are for reviewer convenience only
    > and are not normative, nor will they appear in the final draft.
    > 
    > Once the spec ratification process is complete, we will generate and distribute
    > a 1.3 document that is the "official" language definition that contains only the
    > normative text. However, I believe we decided last year that we would additionally
    > distribute a version of the document with change bars and annotations intact (and
    > possibly also a full Latex diff). The former will serve as the official normative
    > definition of the revised language, while the latter will be provided for reference
    > purposes to implementers and users during the transition to 1.3 compliance.
    
    
    Dan,
    
    Thanks for cluing me in on this point - your comment #53 makes more sense to me now.
    My absence from recent conference calls has left me ignorant of some things like this.
    
    -Paul
    

    Reported by phhargrove@lbl.gov on 2013-02-28 20:25:40

  67. Former user Account Deleted
    I believe that if we accept the changes in comment #61, then we should additionally
    remove upc_addrfield from the list of forward references at the end of 6.4.2.  Right?
    

    Reported by phhargrove@lbl.gov on 2013-02-28 20:41:37

  68. Former user Account Deleted
    Updated proposal below. We seem to be approaching consensus on this language.
    
    It is the same proposal from comment #61, with the following modifications:
    
    * Removed the forward reference pointed out by Paul in comment #67
    * Added a comment clarifying that T is not a shared type in both copies of the declarations
    * Augmented the change note to indicate that comments have been clarified
    
    The last two subsume a similar change to the same lines in the issue 3 proposal, to
    prevent a merge collision. If for some reason the issue 3 change is rejected (seems
    unlikely), then the comment will have to be re-phrased using current definitions.
    
    --- upc-language.tex    (revision 204)
    +++ upc-language.tex    (working copy)
    @@ -288,13 +288,13 @@
        constructs:
    
     \begin{verbatim}
    -    T *P1, *P2;  
    -    shared T *S1, *S2;  
    +    T *P1, *P2;    /* T is not a shared type */
    +    shared [] T *S1, *S2;  
    
    -    P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */ 
    -    P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */ 
    +    P1 = (T*) S1;  /* allowed if upc_threadof(S1) == MYTHREAD */ 
    +    P2 = (T*) S2;  /* allowed if upc_threadof(S2) == MYTHREAD */ 
     \end{verbatim}
    -    
    +\xchangenote[id=DB]{106}{Declaration of S1 and S2 changed to indefinite blocksize
    to accommodate new constraint. Comments clarified.}
    
     \np For all S1 and S2 that point to two distinct elements of
        the same shared array object which have affinity to the same
    @@ -303,9 +303,12 @@
     \begin{itemize}
     \item S1 and P1 shall point to the same object.
     \item S2 and P2 shall point to the same object.
    -\item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t)
    upc\_addrfield(S1))} shall 
    -   evaluate to the same value as ((P2 - P1) * sizeof(T)).  
    +\xadded[id=DB]{106}{
    +\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.%
    +\truefootnote{This implies there is no padding inserted between blocks of shared array
    elements with affinity to a thread.}
    +}
     \end{itemize}
    +\xchangenote[id=DB]{106}{Constraint on {\tt upc\_addrfield} moved to Section~\ref{upc_addrfield}.}
    
     \np Two compatible pointers-to-shared which point to the same
         object (i.e. having the same address and thread components) shall
    
    --- upc-lib-core.tex    (revision 204)
    +++ upc-lib-core.tex    (working copy)
    @@ -302,11 +302,26 @@
    
     \np The {\tt upc\_addrfield} function returns an
        implementation-defined value reflecting the ``local address'' of the
    -   object pointed to by the pointer-to-shared argument.\footnote{%
    -   This function is used in defining the semantics of pointer-to-shared
    -   arithmetic in Section \ref{pointer-arithmetic}}
    +   object pointed to by the pointer-to-shared argument.
    
    -   
    +\np Given the following declarations:
    +
    +\begin{verbatim}
    +    T *P1, *P2;    /* T is not a shared type */
    +    shared T *S1, *S2;  
    +
    +    P1 = (T*) S1;  /* allowed if upc_threadof(S1) == MYTHREAD */ 
    +    P2 = (T*) S2;  /* allowed if upc_threadof(S2) == MYTHREAD */ 
    +\end{verbatim}
    +
    +   For all S1 and S2 that point to two distinct elements of
    +   the same shared array object which have affinity to the same
    +   thread, the expression:\\
    +    {\tt ((ptrdiff\_t) upc\_addrfield(S2) - (ptrdiff\_t)upc\_addrfield(S1))} \\
    +   shall evaluate to the same value as: {\tt ((P2 - P1) * sizeof(T))}.
    +
    +\xchangenote[id=DB]{106}{Paragraph moved from 6.4.2 and cross-reference footnote removed.}
    +
     \paragraph{The {\tt upc\_affinitysize} function}
    
     {\bf Synopsis}
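    
    As a sanity check on what the new constraint is meant to guarantee, here is a minimal
    sketch in plain C (not UPC) that models the expected layout. The helper names
    model_threadof and model_local_index are hypothetical and are not part of any UPC
    API. The second helper ASSUMES that blocks with affinity to one thread are stored
    back to back, which is exactly the property this issue seeks to clarify.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not a UPC API: the thread with affinity to element i
   of "shared [block] T A[...]", using the round-robin rule quoted from the
   1.2 spec: floor(i / block) mod THREADS. */
static size_t model_threadof(size_t i, size_t block, size_t threads)
{
    assert(block > 0 && threads > 0);
    return (i / block) % threads;
}

/* Hypothetical model: position of element i within the local slice of the
   thread that owns it, ASSUMING no padding between that thread's blocks.
   Each full round of (block * threads) elements contributes one block to
   every thread; i % block is the offset inside the current block. */
static size_t model_local_index(size_t i, size_t block, size_t threads)
{
    assert(block > 0 && threads > 0);
    return (i / (block * threads)) * block + i % block;
}
```

    For the original example (shared [1] int a[4] with 2 threads), this model places a[0]
    and a[2] on thread 0 at local indices 0 and 1, i.e. adjacent, which is what
    P1 + (S2 - S1) == P2 is intended to require.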
    

    Reported by danbonachea on 2013-03-01 13:30:00 - Labels added: Consensus-High - Labels removed: Consensus-Low

  69. Former user Account Deleted
    Unfortunately, a few of the semantic issues brought up in issue 107 apply here, and
    we should modify the wording to take them into account.
    
    1. Is pointing to one element past the end of a shared array object valid (as it is
    for local objects by ISO/IEC 9899 6.5.6 8-9)?  If so, we should be sure that we get
    the expected behavior for those as well.  Note that this is a much larger change, as
    a lot of the spec assumes that any valid non-null pointer-to-shared points to an object.
    
    2. Given a multi-dimensional shared array object, an object with ultimate element type
    (see issue 3) of the array object and contained within it is not an element of the
    array object (see http://code.google.com/p/upc-specification/issues/detail?id=107#c64).
    Thus the wording "that point to two distinct elements of the same shared array object"
    would not cover such cases.
    

    Reported by sdvormwa@cray.com on 2013-03-01 16:01:54

  70. Former user Account Deleted
    > Is pointing to one element past the end of a shared array object valid (as it is for
    local objects by ISO/IEC 9899 6.5.6 8-9)?
    
    This questionable "feature" of C99 is one that UPC does not currently specify as valid
    for definitely-blocked shared arrays, and I suspect current implementations differ
    in their behavior. Unlike in C, blocked pointer arithmetic in UPC is not a simple linear
    relationship, so "one past" the last element in a shared array is a non-trivial concept
    to express. Specifically, the location of "one past" would often depend on the blocking
    factor of the PTS, and the affinity and phase of such a pointer would be questionable
    as well. I consider it a *feature* that UPC leaves indexing past the end of a shared
    array unspecified, and therefore undefined behavior. Changing that would be a significant
    behavioral modification with a non-trivial impact on some of the trickiest code in
    our implementations.
    
    Even if you don't agree with my reasoning above, THIS issue (106) deals ONLY with clarifying
    the placement of actual shared array ELEMENTS in memory and clarifying they are placed
    contiguously; a clarification that has no behavioral or implementation impact, and
    reflects the common understanding of all UPC implementers and users since the language
    inception. Adding additional flexibility to PTS arithmetic clearly falls outside the
    scope of this effort and is orthogonal to it, despite the fact that it might eventually
    modify the same section. Please open a NEW issue if you wish to pursue that matter
    (or any other issue not directly related to this clarification).
    
    > an object with ultimate element type (see issue 3) of the array object and contained
    within it is not an element of the array object 
    
    This case is already prevented by the type declarations of S1 and S2. They both have
    the same referent type, and thus they both must already point to elements at the same
    "level" of the multi-D shared array.
    

    Reported by danbonachea on 2013-03-01 16:37:34

  71. Former user Account Deleted
    > the last element in a shared array is a non-trivial concept to express. Specifically,
    the location of "one past" would often depend on the blocking factor of the PTS, and
    the affinity and phase of such a pointer would be questionable as well.
    
    No it isn't.  This is trivial to express.  The existing equations in 6.4.2 3 define
    the exact behavior of upc_threadof() and upc_phaseof().  My proposal in comment 13
    suffices to define the behavior of upc_addrfield(), and can be trivially tweaked to
    define the local address as well.  Since you can't do pointer-to-shared arithmetic
    on generic pointers-to-shared, nor on pointers-to-shared whose referenced type is incomplete,
    we don't need to worry about what "one past" means in those cases, and it is well-defined
    for all others.
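    
    A plain C sketch (not UPC) of that computation follows. It ASSUMES the equations in
    6.4.2 3 read phase' = (phase + i) % B and thread' = (thread + (phase + i) / B) % THREADS.
    The struct and function names are hypothetical, and only non-negative increments
    are modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the thread and phase components of a
   pointer-to-shared; not a UPC type. */
struct pts_model {
    size_t thread;
    size_t phase;
};

/* Advance p by i elements (i >= 0) for blocking factor "block" on
   "threads" threads, following the assumed 6.4.2 3 equations. */
static struct pts_model model_advance(struct pts_model p, size_t i,
                                      size_t block, size_t threads)
{
    struct pts_model r;
    assert(block > 0 && threads > 0);
    assert(p.phase < block && p.thread < threads);
    r.phase  = (p.phase + i) % block;
    r.thread = (p.thread + (p.phase + i) / block) % threads;
    return r;
}
```

    Starting from thread 0, phase 0 of a 4-element array on 2 threads, advancing by 4
    ("one past" the last element) yields thread 0, phase 0 for block size 1, but thread 1,
    phase 1 for block size 3. The result depends on the blocking factor, yet it is mechanical
    to compute.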
    
    > This case is already prevented by the type declarations of S1 and S2. They both have
    the same referent type, and thus they both must already point to elements at the same
    "level" of the multi-D shared array.
    
    I think you missed my point.  Given
    
    shared [] char A[2][2];
    
    shared [] char *si1 = &A[0][0];
    shared [] char *si2 = &A[1][0];
    int *pi1 = (int *)si1;
    int *pi2 = (int *)si2;
    
    shared [] char (*sa1) = &A[0];
    shared [] char (*sa2) = &A[1];
    int (*pa1)[2] = (char *)sa1;
    int (*pa2)[2] = (char *)sa2;
    
    Using your logic from http://code.google.com/p/upc-specification/issues/detail?id=107#c64,
    because the shared array object that A[0][0] is an element of is A[0], and the shared
    array object that A[1][0] is an element of is A[1], and these are not the same shared
    array object, the new constraint DOES NOT apply to the expression (pi1 + (si2 - si1)
    == pi2), but DOES apply to (pa1 + (sa2 - sa1) == pa2).  I think we want the constraint
    to apply to the former as well.
    

    Reported by sdvormwa@cray.com on 2013-03-01 17:50:09

  72. Former user Account Deleted
    Sorry that should be
    
    shared [] char A[2][2];
    
    shared [] char *si1 = &A[0][0];
    shared [] char *si2 = &A[1][0];
    int *pi1 = (char *)si1;
    int *pi2 = (char *)si2;
    
    shared [] char (*sa1)[2] = &A[0];
    shared [] char (*sa2)[2] = &A[1];
    int (*pa1)[2] = (char (*)[2])sa1;
    int (*pa2)[2] = (char (*)[2])sa2;
    
    I got interrupted while changing the types and forgot to finish when I came back. ;)
    

    Reported by sdvormwa@cray.com on 2013-03-01 18:07:45

  73. Former user Account Deleted
    And I still missed four 'int' -> 'char' conversions. =(
    
    char *pi1 = (char *)si1;
    char *pi2 = (char *)si2;
    ...
    char (*pa1)[2] = (char (*)[2])sa1;
    char (*pa2)[2] = (char (*)[2])sa2;
    

    Reported by sdvormwa@cray.com on 2013-03-01 18:09:10

  74. Former user Account Deleted
    Steve, since you seem unwilling to take unrelated issues to new threads, I've created
    an issue for you to discuss your latest completely unrelated comment on this issue:
    
    http://code.google.com/p/upc-specification/issues/detail?id=109
    
    Please take that discussion there and let's keep this one on topic please.
    
    Your multi-D example makes no sense to me. Please reformulate it in a way that directly
    applies to the declarations and variable names in 6.4.2-5 that are the topic of this
    issue.
    

    Reported by danbonachea on 2013-03-01 18:29:18

  75. Former user Account Deleted
    > This questionable "feature" of C99 is one that UPC does not currently specify as valid
    for definitely-blocked shared arrays
    
    UPC DOES currently permit it for indefinitely blocked shared arrays however, which,
    due to how you chose to make the change (via an example with pointers whose referenced
    type are INDEFINITELY BLOCKED), are the only ones that matter here.  To quote 6.4.2
    2:
    
    If the shared array is declared with indefinite block size, the result of the pointer-to-shared
    arithmetic is identical to that described for normal C pointers in [ISO/IEC00 Sec.
    6.5.6], except that the thread of the new pointer shall be the same as that of the
    original pointer and the phase component is defined to always be zero.
    
    Oddly enough, I believe this statement already provides not only the constraint that
    you are attempting to explicitly add, but a much stronger one.
    

    Reported by sdvormwa@cray.com on 2013-03-01 18:49:17

  76. Former user Account Deleted
    > Oddly enough, I believe this statement already provides not only the constraint that
    you are attempting to explicitly add, but a much stronger one.
    
    I believe contiguity of indefinitely blocked array elements was never in question (due
    in part to that very text). The clarification of this issue was primarily motivated
    for definitely blocked arrays, and is benignly redundant for indefinitely blocked arrays.
    
    > due to how you chose to make the change (via an example with pointers whose referenced
    type are INDEFINITELY BLOCKED), are the only ones that matter here.
    
    Once again you are confusing pointers and arrays. We already hashed out this exact
    argument in comment 16-17, but since you brought it up again for some reason, I'll
    restate the salient point. The proposal for this issue clarifies the layout of ARRAYS
    OF ANY BLOCKING FACTOR, and merely uses POINTERS OF A PARTICULAR BLOCKING FACTOR as
    a notational convenience to express the necessary constraint (because non-generic PTS
    must have SOME blocking factor, and that particular choice allowed the most concise
    expression). 
    
    Anticipating your next response, the unmodified setup text in 6.4.2 which is the precondition
    to the clarification constraint says:
    
      For all S1 and S2 that point to two distinct ELEMENTS of the same shared
      array object which have affinity to the same thread.
    
    The fact that S1 and S2 *could* be pointed at unallocated space is completely irrelevant,
    because the precondition explicitly states that they are NOT. When this precondition
    is violated, the logical implication is vacuously asserted and the constraint is irrelevant.
    

    Reported by danbonachea on 2013-03-01 19:06:51

  77. Former user Account Deleted
    > Your multi-D example makes no sense to me. Please reformulate it in a way that directly
    applies to the declarations and variable names in 6.4.2-5 that are the topic of this
    issue.
    
    Ok, I'll try to be a bit more clear.
    
    shared [2] char A[2*THREADS][2];                // Declare a multi-dimensional shared array object
    
    shared [] char (*S1)[2] = &A[MYTHREAD];         // Points to the first element of A on the local thread
    shared [] char (*S2)[2] = &A[THREADS+MYTHREAD]; // Points to the second element of A on the local thread
    
    char (*P1)[2] = (char (*)[2]) S1;
    char (*P2)[2] = (char (*)[2]) S2;
    
    if ( P1 + (S2 - S1) == P2 ) {
        // Guaranteed by the new constraint
    }
    
    shared [] char *S3 = &A[MYTHREAD][0];         // Points to the first element of *S1 on the local thread
    shared [] char *S4 = &A[THREADS+MYTHREAD][0]; // Points to the first element of *S2 on the local thread
    
    char *P3 = (char *)S3;
    char *P4 = (char *)S4;
    
    if ( P3 + (S4 - S3) == P4 ) {
        // Unspecified because neither S3 nor S4 point to elements of the same object.
        // However, programmers using multidimensional shared arrays are more likely to use this form.
    }
    

    Reported by sdvormwa@cray.com on 2013-03-01 19:22:23

  78. Former user Account Deleted
    > Once again you are confusing pointers and arrays. We already hashed out this exact
    argument in comment 16-17, but since you brought it up again for some reason, I'll
    restate the salient point.
    
    And you are ignoring C99 constraints on accessing objects.  ISO/IEC 9899 6.5 7 (emphasis
    mine):
    
    An object shall have its stored value accessed ONLY by an lvalue expression that has
    one of the following types:
    
    -- a type compatible with the effective type of the object
    -- a qualified version of a type compatible with the effective type of the object
    -- a type that is the signed or unsigned type corresponding to the effective type of
    the object.
    -- a type that is the signed or unsigned type corresponding to a qualified version
    of the effective type of the object.
    -- an aggregate or union type that includes one of the aforementioned types among its
    members (including, recursively, a member of a subaggregate or contained union), or
    -- a character type
    
    Since we defined that the blocking factor is part of the type compatibility, accessing
    elements of a definitely blocked shared array via a pointer-to-shared whose referenced
    type is indefinitely blocked is undefined (unless the pointer-to-shared's referenced
    type is a character type).
    

    Reported by sdvormwa@cray.com on 2013-03-01 19:33:21

  79. Former user Account Deleted
    > An object shall have its stored value accessed ONLY by an lvalue expression that has
    one of the following types:
    
    Irrelevant to this issue. The equations in 6.4.2 do not ACCESS any heap objects whatsoever.
    
    > Since we defined that the blocking factor is part of the type compatibility, accessing
    elements of a 
    > definitely blocked shared array via a pointer-to-shared whose referenced type is
    indefinitely blocked 
    > is undefined (unless the pointer-to-shared's referenced type is a character type).
    
    This is irrelevant to the current issue, but I believe this assertion to be false and
    represents a misunderstanding of type compatibility. If you don't agree please open
    a NEW issue to discuss that separate topic.
    

    Reported by danbonachea on 2013-03-01 19:40:05

  80. Former user Account Deleted
    Code from Steve's comment (ignoring missing casts in S3/S4 initializers):
    -----------------------------
    shared [2] char A[2*THREADS][2];                // Declare a multi-dimensional shared array object
    ...
    shared [] char *S3 = &A[MYTHREAD][0];         // Points to the first element of *S1 on the local thread
    shared [] char *S4 = &A[THREADS+MYTHREAD][0]; // Points to the first element of *S2 on the local thread
    
    char *P3 = (char *)S3;
    char *P4 = (char *)S4;
    
    if ( P3 + (S4 - S3) == P4 ) {
        // Unspecified because neither S3 nor S4 point to elements of the same object.
        // However, programmers using multidimensional shared arrays are more likely to use this form.
    }
    -----------------------------
    OK I understand your nitpick now, but I respectfully disagree. The intended meaning
    in this case is that S3 and S4 both indeed "point to two distinct elements of the same
    shared array object", namely they point to elements of the enclosing multidimensional
    shared array object A. This seems relatively obvious to me, but I'm open to a footnote
    clarification if you really feel that's necessary and have text to propose. The main
    purpose of the text in question is to ensure both pointers reference objects that are
    entirely contained within ANY single, enclosing shared object, regardless of referent
    type.
    
    C99 is actually surprisingly silent on the exact usage of the term "element", especially
    as applied to multi-D arrays. In the example above, A[MYTHREAD] is clearly an element
    of A, and A[MYTHREAD][0] is clearly an element of A[MYTHREAD], but C99 does not explicitly
    state whether or not this terminology is transitive, ie if these statements also imply
    that A[MYTHREAD][0] is ALSO an element of A (the underlying assumption I've made).
    I believe the latter is common usage in the community, but the only actual mention
    I can find in C99 is from 6.5.2.1 which defines indexing into multi-d array objects:
    
      Successive subscript operators designate an element of a multidimensional array object.
    
    To me this implies that the final element accessed by a sequence of [][][] operators
    is also "an element of a multidimensional array object".
    

    Reported by danbonachea on 2013-03-01 20:05:43

  81. Former user Account Deleted
    What I'm concerned about is users missing the subtlety of the following code due to
    the confusion over the term "element" when applied to multi-dimensional arrays:
    
    shared [B] T A[2*THREADS];
    
    shared [] T *S1 = (shared [] T *)&A[0];
    shared [] T *S2 = (shared [] T *)&A[1];
    
    if ( upc_threadof( S1 ) == upc_threadof( S2 ) ) {
       T *P1 = (T *)S1;
       T *P2 = (T *)S2;
    
       if ( P1 + (S1 - S2) == P2 ) {
           // Required by new constraint?
       }
    }
    

    Reported by sdvormwa@cray.com on 2013-03-01 20:52:03

  82. Former user Account Deleted
    Thanks to Troy for proofreading this for me:
    
    shared [B] T A[2*THREADS];
    
    shared [] T *S1 = (shared [] T *)&A[0];
    shared [] T *S2 = (shared [] T *)&A[1];
    
    if ( (upc_threadof( S1 ) == MYTHREAD) &&
         (upc_threadof( S1 ) == upc_threadof( S2 )) ) {
    
       T *P1 = (T *)S1;
       T *P2 = (T *)S2;
    
       if ( P1 + (S2 - S1) == P2 ) {
           // Required by new constraint?
       }
    }
    

    Reported by sdvormwa@cray.com on 2013-03-01 21:25:16

  83. Former user Account Deleted
    My views on 106 and 107, when taken individually, match what Dan has stated in his email
    requesting feedback:
    + 106 is ready (IMHO) for inclusion in 1.3
    + 107 is too problematic for inclusion in 1.3
    
    However, I do agree with Steve that it would be better to resolve both in the same
    spec revision.
    
    With that in mind, I think that the clarification provided by issue 106 just formally
    codifies something we "already knew".
    The likelihood that between 1.3 and 1.4 somebody will implement a UPC compiler/runtime
    that doesn't provide the expected contiguous layout is precisely ZERO because a significant
    number of existing codes (incl benchmarks, tutorials, etc.) would fail.
    THEREFORE, if there is not a consensus on 106 soon, then I am OK with deferring it
    until 1.4.
    
    -Paul
    

    Reported by phhargrove@lbl.gov on 2013-03-01 21:32:56

  84. Former user Account Deleted
    Consider the following "specialization" of the example from comment 82.  This one should
    be fairly clear, and work the way everyone expects:
    
    #define B 2
    typedef int T;
    
    shared [B] T A[2*THREADS];
    
    shared [] T *S1 = (shared [] T *)&A[0];
    shared [] T *S2 = (shared [] T *)&A[1];
    
    if ( (upc_threadof( S1 ) == MYTHREAD) &&
         (upc_threadof( S1 ) == upc_threadof( S2 )) ) {
    
       T *P1 = (T *)S1;
       T *P2 = (T *)S2;
    
       if ( P1 + (S2 - S1) == P2 ) {
           // Required by new constraint?
       }
    }
    
    Clearly, &A[0] and &A[1] point to elements of the same shared array object A.  S1 and
    S2 point to the former and latter respectively, but with a different referenced type.
    I think we can all agree they also point to elements of the same shared array object.
    Moving into the condition, casting S1 and S2 to pointers-to-local P1 and P2 is valid,
    as the thread with affinity to all the bytes making up the objects pointed to by S1
    and S2 has already been verified to be the local thread.  Since S1 and S2 point to
    elements of the same shared array, the expression (P1 + (S2 - S1) == P2) is defined
    to be 1 by the new constraint.  This is exactly the behavior we want and intend.
    
    Now, consider a different "specialization" of the same example code.  This is where
    I think things get confusing for users, and where I question whether we need to word
    the changes differently.
    
    #define B 3
    typedef char T[3][2];
    
    shared [B] T A[2*THREADS];
    
    shared [] T *S1 = (shared [] T *)&A[0];
    shared [] T *S2 = (shared [] T *)&A[1];
    
    if ( (upc_threadof( S1 ) == MYTHREAD) &&
         (upc_threadof( S1 ) == upc_threadof( S2 )) ) {
    
       T *P1 = (T *)S1;
       T *P2 = (T *)S2;
    
       if ( P1 + (S2 - S1) == P2 ) {
           // Required by new constraint?
       }
    }
    
    Once again, clearly &A[0] and &A[1] point to elements of the same shared array object
    A.  When run with 2 UPC threads, both will have affinity to thread 0:
    
        T0         T1
    ---------- ----------
    A[0][0][0] A[0][1][1]
    A[0][0][1] A[0][2][0]
    A[0][1][0] A[0][2][1]
    ---------- ----------
    A[1][0][0] A[1][1][1]
    A[1][0][1] A[1][2][0]
    A[1][1][0] A[1][2][1]
    ---------- ----------
    A[2][0][0] A[2][1][1]
    A[2][0][1] A[2][2][0]
    A[2][1][0] A[2][2][1]
    ---------- ----------
    A[3][0][0] A[3][1][1]
    A[3][0][1] A[3][2][0]
    A[3][1][0] A[3][2][1]
    ---------- ----------
    
    But now things get tricky.  All the bytes making up the objects that S1 and S2 point
    to also make up the shared array object A, and they were initialized by pointers that
    clearly point to elements of the same array object A.  Do they also point to elements
    of the same array object?  Ponder that while we continue on into the conditional.
    
    The cast of S1 and S2 to pointers-to-local here is clearly legal, since the bytes pointed
    to by S1 and S2 all have affinity to the same thread due to the referenced type having
    indefinite block size, and that thread has been verified to be the local thread.  Now
    we come to our new constraint.  If we consider that S1 and S2 point to elements of
    the same shared array object, then the new constraint REQUIRES that the expression
    (P1 + (S2 - S1) == P2) evaluate to 1.  However, the expression (S2 - S1) has an undefined
    value, because there is no integer X that we can add to S1 to produce S2!  Additionally
    (and for similar reasons), there is no integer X we could add to P1 to produce P2!
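    
    The arithmetic behind that claim can be sketched in plain C (not UPC). The function
    names are hypothetical, sizes are counted in ultimate elements (chars in this example),
    and the model ASSUMES a thread's blocks are packed with no padding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: offset, within the owning thread's local slice, of
   ultimate element number c of a "shared [block]" array, assuming the
   thread's blocks are stored back to back. */
static size_t model_local_offset(size_t c, size_t block, size_t threads)
{
    assert(block > 0 && threads > 0);
    return (c / (block * threads)) * block + c % block;
}

/* Returns 1 if the local distance from the start of A[0] to the start of
   A[1] is a whole number of T objects, 0 otherwise. elems_per_T is the
   number of ultimate elements in one T. Precondition: A[1] starts on
   thread 0, the same thread as A[0]. */
static int model_distance_is_whole_T(size_t elems_per_T, size_t block,
                                     size_t threads)
{
    size_t d;
    assert(elems_per_T > 0);
    assert((elems_per_T / block) % threads == 0);
    d = model_local_offset(elems_per_T, block, threads)
      - model_local_offset(0, block, threads);
    return d % elems_per_T == 0;
}
```

    With T = char[3][2] (6 ultimate elements), block size 3 and 2 threads, A[1] starts
    3 chars after A[0] in thread 0's local slice, which is not a multiple of sizeof(T) == 6,
    so model_distance_is_whole_T(6, 3, 2) returns 0. For the earlier specialization
    (T = int, block size 2), model_distance_is_whole_T(1, 2, 2) returns 1.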
    
    Since it seems clear that this constraint cannot apply in this case, is it possible
    that S1 and S2 don't point to elements of the same shared array object?  If they don't,
    then the proposed wording is still valid, though potentially confusing when used with
    pointers to array types.  We've already established that the bytes they point to are
    included inside the memory region of the shared array object A, so that is not a valid
    test.  The mere fact that the expression (S2 - S1) is undefined would seem insufficient
    due to the language in 6.4.2 6.  Consider the following example:
    
    shared [3] int B[THREADS][2];
    shared [3] int (*S3)[2] = &B[1];
    shared [3] int (*S4)[2] = (shared [3] int (*)[2])upc_resetphase( S3 );
    
    Do S3 and S4 point to the same object?  Do both point to elements of the shared array
    B?  Why or why not?  Would the same (or a similar) argument apply to S1 and S2, and
    the shared array A?  Our terminology here is very confusing, at least to me.
    
    > C99 is actually surprisingly silent on the exact usage of the term "element", especially
    as applied to multi-D arrays. In the example above, A[MYTHREAD] is clearly an element
    of A, and A[MYTHREAD][0] is clearly an element of A[MYTHREAD], but C99 does not explicitly
    state whether or not this terminology is transitive, ie if these statements also imply
    that A[MYTHREAD][0] is ALSO an element of A (the underlying assumption I've made).
    I believe the latter is common usage in the community, but the only actual mention
    I can find in C99 is from 6.5.2.1 which defines indexing into multi-d array objects:
    >
    >  Successive subscript operators designate an element of a multidimensional array
    object.
    >
    > To me this implies that the final element accessed by a sequence of [][][] operators
    is also "an element of a multidimensional array object".
    
    Actually, I think C is pretty clear, though it'd be nice if the clarification were
    part of some constraints or definitions rather than part of an example.  Read a bit
    further down to 6.5.2.1 4:
    
      EXAMPLE Consider the array object defined by the declaration
    
        int x[3][5];
    
      Here x is a 3x5 array of ints; more precisely, x is an array of three element objects,
    each of which is an array of five ints.
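    
    The same reading can be checked in plain C with sizeof. This is only an illustrative
    sketch; the helper function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

static int x[3][5];

/* Number of element objects of x itself; each one is an int[5]. */
static size_t x_element_count(void)
{
    return sizeof x / sizeof x[0];
}

/* Number of ints in each element object of x. */
static size_t x_row_length(void)
{
    return sizeof x[0] / sizeof x[0][0];
}

/* Byte offset of x[1][0] from the start of x, measured through character
   pointers into the enclosing object x. */
static size_t x_second_row_offset(void)
{
    return (size_t)((const char *)&x[1][0] - (const char *)x);
}
```

    Here assert(x_element_count() == 3) and assert(x_row_length() == 5) both hold, and
    x_second_row_offset() equals 5 * sizeof(int): x has three elements, and the ints are
    elements of those elements.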
    

    Reported by sdvormwa@cray.com on 2013-03-03 16:51:20

  85. Former user Account Deleted
    Below is an updated change proposal that tweaks the "setup text" to accommodate Steve's
    objection concerning multi-dimensional arrays. This version side-steps the problem
    entirely by defining the contiguity constraint solely in terms of "ultimate elements"
    of the shared array. I believe this still guarantees the contiguity constraint we need
    to clarify the original issue, and by induction also enforces the required constraint
    for the case of multi-dimensional arrays.
    
    As before, the constraint for upc_addrfield() still remains completely unchanged from
    1.2, and is merely moved to the library section where it belongs.
    
    --- upc-language.tex    (revision 204)
    +++ upc-language.tex    (working copy)
    @@ -288,24 +288,33 @@
        constructs:
    
     \begin{verbatim}
    -    T *P1, *P2;  
    -    shared T *S1, *S2;  
    +    T *P1, *P2;    /* T is not a shared type */
    +    shared [] T *S1, *S2;  
    
    -    P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */ 
    -    P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */ 
    +    P1 = (T*) S1;  /* allowed if upc_threadof(S1) == MYTHREAD */ 
    +    P2 = (T*) S2;  /* allowed if upc_threadof(S2) == MYTHREAD */ 
     \end{verbatim}
    -    
    +\xchangenote[id=DB]{106}{Declaration of S1 and S2 changed to indefinite blocksize
    to accommodate new constraint. Comments clarified.}
    
    -\np For all S1 and S2 that point to two distinct elements of
    -   the same shared array object which have affinity to the same
    -   thread:
    +\np For all S1 and S2 that point to two distinct 
    +\xreplaced[id=DB]{106}{
    +   objects with affinity to the same thread, 
    +   where both are subobjects contained in the same shared array whose
    +   ultimate element type is a qualified version of {\tt T}: 
    +}{
    +   elements of the same shared array object 
    +   which have affinity to the same thread:
    +}
    
     \begin{itemize}
     \item S1 and P1 shall point to the same object.
     \item S2 and P2 shall point to the same object.
    -\item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t)
    upc\_addrfield(S1))} shall 
    -   evaluate to the same value as ((P2 - P1) * sizeof(T)).  
    +\xadded[id=DB]{106}{
    +\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.%
    +\truefootnote{This implies there is no padding inserted between blocks of shared array
    elements with affinity to a thread.}
    +}
     \end{itemize}
    +\xchangenote[id=DB]{106}{Constraint on {\tt upc\_addrfield} moved to Section~\ref{upc_addrfield}.}
    
     \np Two compatible pointers-to-shared which point to the same
         object (i.e. having the same address and thread components) shall
    
    --- upc-lib-core.tex    (revision 204)
    +++ upc-lib-core.tex    (working copy)
    @@ -302,11 +302,26 @@
    
     \np The {\tt upc\_addrfield} function returns an
        implementation-defined value reflecting the ``local address'' of the
    -   object pointed to by the pointer-to-shared argument.\footnote{%
    -   This function is used in defining the semantics of pointer-to-shared
    -   arithmetic in Section \ref{pointer-arithmetic}}
    +   object pointed to by the pointer-to-shared argument.
    
    -   
    +\np Given the following declarations:
    +
    +\begin{verbatim}
    +    T *P1, *P2;    /* T is not a shared type */
    +    shared T *S1, *S2;  
    +
    +    P1 = (T*) S1;  /* allowed if upc_threadof(S1) == MYTHREAD */ 
    +    P2 = (T*) S2;  /* allowed if upc_threadof(S2) == MYTHREAD */ 
    +\end{verbatim}
    +
    +   For all S1 and S2 that point to two distinct elements of
    +   the same shared array object which have affinity to the same
    +   thread, the expression:\\
    +    {\tt ((ptrdiff\_t) upc\_addrfield(S2) - (ptrdiff\_t)upc\_addrfield(S1))} \\
    +   shall evaluate to the same value as: {\tt ((P2 - P1) * sizeof(T))}.
    +
    +\xchangenote[id=DB]{106}{Paragraph moved from 6.4.2 and cross-reference footnote removed.}
    +
     \paragraph{The {\tt upc\_affinitysize} function}
    
     {\bf Synopsis}
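    
    For readers who want to see the intended layout concretely, here is a rough plain-C
    model (not UPC, and not spec text). It assumes each thread packs the blocks with
    affinity to it contiguously, which is the property the proposal makes explicit. All
    function names are hypothetical. In this model the difference of two local indices
    plays the role of (S2 - S1) for indefinitely blocked pointers, and model_addr_diff
    plays the role of the upc_addrfield difference.

```c
#include <stddef.h>

/* Plain-C model of:  shared [B] T A[N];  over `threads` threads.
   Assumption: each thread's blocks are stored with no padding between them. */

/* Thread with affinity to element i: floor(i / B) mod THREADS. */
size_t model_threadof(size_t i, size_t B, size_t threads)
{
    return (i / B) % threads;
}

/* Index of element i within the local storage of its owning thread. */
size_t model_local_index(size_t i, size_t B, size_t threads)
{
    size_t block = i / B;                 /* global block number            */
    size_t local_block = block / threads; /* earlier blocks on that thread  */
    size_t phase = i % B;                 /* position inside the block      */
    return local_block * B + phase;
}

/* Byte distance between two elements that have the same affinity. */
ptrdiff_t model_addr_diff(size_t i1, size_t i2, size_t B,
                          size_t threads, size_t elem_size)
{
    ptrdiff_t d = (ptrdiff_t)model_local_index(i2, B, threads)
                - (ptrdiff_t)model_local_index(i1, B, threads);
    return d * (ptrdiff_t)elem_size;
}
```

    For the case that opened this issue (shared [1] int a[4] with 2 threads), a[0] and
    a[2] both map to thread 0 with local indices 0 and 1, so the modeled address
    difference is exactly sizeof(int).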
    

    Reported by danbonachea on 2013-03-15 11:23:29

  86. Former user Account Deleted
    I like the suggested re-formulation in Dan's Comment #85.
    
    A couple of editorial suggestions.
    
    1. In the example, where the comment states: "T is not a shared type", I recommend
    that it be written as "T is not a shared qualified type", or "T is not a UPC shared
    qualified type".  I recommend a similar improvement for other pending proposals where
    the phrase "shared type" is used.  The reason that I believe that this is an improvement
    is that "shared type" is rather generic sounding and might be used in contexts that
    are not UPC-related.
    
    2. In the added text "two distinct elements of the same shared array object", I don't
    know if my suggestion made above would also apply, so will offer my suggestion as a
    question: Would re-stating this as "two distinct elements of the same shared  qualified
    array object" improve the precision of the statement?  BTW, in some of the documentation
    that we/Intrepid write, we will often say "UPC shared type" and so on to help disambiguate,
    but that usage is likely a departure from the style of the current UPC specification.
    
    3. In the replacement text, would the phrase "subobjects" be clear as two words "sub
    objects" or a hyphenated word "sub-objects"?
    

    Reported by gary.funck on 2013-03-15 16:14:55

  87. Former user Account Deleted
    > I like the suggested re-formulation in Dan's Comment #85.
    >
    
    Me too.  It looks like it narrows things down enough to not run afoul of any more nasty
    corner cases.
    
    > A couple of editorial suggestions.
    >
    > 1. In the example, where the comment states: "T is not a shared type", I recommend
    that it be written as "T is not a shared qualified type", or "T is not a UPC shared
    qualified type".  I recommend a similar improvement for other pending proposals where
    the phrase "shared type" is used.  The reason that I believe that this is an improvement
    is that "shared type" is rather generic sounding and might be used in contexts that
    are not UPC-related.
    
    Shared type is (with the change for issue 3) explicitly defined in section 3 (Terms,
    definitions, and symbols), so I think we're ok here.  Note that "ultimate element type"
    and a number of other things included in this change come from the change for issue
    3.  I believe Dan mentioned somewhere that he did this to alleviate merge problems.
    
    > 2. In the added text "two distinct elements of the same shared array object", I don't
    know if my suggestion made above would also apply, so will offer my suggestion as a
    question: Would re-stating this as "two distinct elements of the same shared  qualified
    array object" improve the precision of the statement?  BTW, in some of the documentation
    that we/Intrepid write, we will often say "UPC shared type" and so on to help disambiguate,
    but that usage is likely a departure from the style of the current UPC specification.
    
    No, it is not possible for an array object to be shared qualified--only non-array objects
    may be shared qualified.  See issue 3 for details.
    

    Reported by sdvormwa@cray.com on 2013-03-15 16:33:45

  88. Former user Account Deleted
    > A couple of editorial suggestions.
    
    I should have clarified that the text proposed in comment #85 is heavily reliant upon
    terms added to the definitions section by the issue 3 proposal, specifically "shared
    type", "shared array" and "ultimate element type". If for some reason we revert the
    issue #3 changes, this would need to be reworded, but I think it works well if both
    proposals are taken together.
    
    > In the replacement text, would the phrase "subobjects" be clear as two words "sub
    objects" or a hyphenated word "sub-objects"?
    
    In preparing this proposal, I spent several hours studying the C99 spec to find the
    best possible wording for exactly that concept. While not explicitly defined, C99 uses
    the term "subobject" (with that spelling) in several places to mean what we need. For
    example, C99 6.7.8:
    
      Each brace-enclosed initializer list has an associated current object. When no
      designations are present, subobjects of the current object are initialized in order
      according to the type of the current object: array elements in increasing subscript
      order, ...
    
    So I believe this is correct C99 usage.
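    
    A tiny plain-C sketch of that usage, for illustration only (the names grid and
    grid_at are made up): each inner brace-enclosed list initializes one subobject of
    grid, and within it the ints are filled in increasing subscript order.

```c
/* No designators: the first inner list initializes the subobject grid[0],
   the second initializes grid[1], each in increasing subscript order. */
static const int grid[2][3] = { { 1, 2, 3 }, { 4, 5, 6 } };

/* Read back one int from the subobject grid[i]. */
int grid_at(int i, int j)
{
    return grid[i][j];
}
```

    Here grid_at(0, 2) is 3 and grid_at(1, 0) is 4.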
    

    Reported by danbonachea on 2013-03-15 17:38:09

  89. Former user Account Deleted
    In the 3/15/13 telecon, we reached consensus that this issue should be addressed
    in spec 1.3.
    The updated proposal from comment 85 has been mailed to the list.
    

    Reported by danbonachea on 2013-03-16 01:13:27

  90. Former user Account Deleted
    Comment #85 proposal committed as SVN r213
    

    Reported by danbonachea on 2013-04-30 18:47:08 - Status changed: Fixed

  91. Former user Account Deleted
    Ratified in the 5/22 telecon.
    

    Reported by danbonachea on 2013-08-03 03:55:37 - Status changed: Ratified
