Clarify contiguity of local slice of a shared array

Former user Account Deleted

Some relevant discussion from email replies below:

Paul
----
I did not find an explicit "the blocks assigned to a thread must be contiguous", but
the following is what the current 1.2 spec says on p.25:

    Elements of shared arrays are distributed in a round robin fashion, by chunks
    of block-size elements, such that the i-th element has affinity with thread
    (floor (i/block size) mod THREADS).

I believe the "contiguousness" is implied by that text.
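
A minimal sketch of the quoted affinity formula in plain C. The function name is hypothetical, and the parameter `threads` stands in for the UPC constant THREADS so that the code compiles without a UPC compiler:

```c
#include <assert.h>
#include <stddef.h>

/* Thread with affinity to element i of an array declared
 * `shared [block_size] T A[...]`, following the quoted text:
 * floor(i / block_size) mod THREADS.
 * Assumption: `threads` models THREADS; this is plain C, not UPC. */
static size_t affinity_of(size_t i, size_t block_size, size_t threads)
{
    assert(block_size > 0 && threads > 0);
    return (i / block_size) % threads;
}
```

The formula only says which thread owns element i. It does not say where the element sits among the other elements owned by that thread, which is the gap under discussion.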

The text of paragraphs 4 and 5 on p18 also seems to imply that the elements must be
contiguous, but upc_addrfield is so under-defined that I'll stop short of claiming that there
is an unambiguous requirement here.

In addition, there are numerous examples and tutorials outside of the spec that "privatize"
a shared array with either of the following constructs:
    shared [1] int A[...]
    ...
    int *my_private_P = (int *)&A[MYTHREAD];
    shared [] int *my_shared_P = &A[MYTHREAD];
Neither would be useful beyond the first element if the elements with affinity to a
given thread were not contiguous.
In fact, no casts between PTS of different blocksizes would make sense unless the elements
are contiguous.
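
As a sketch of what those privatization idioms rely on, the following plain C function computes where element i would land inside its owning thread's local slice, assuming the contiguous layout. The name `local_index` and the parameter `threads` (standing in for THREADS) are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Index of global element i within the local slice of its owning thread,
 * ASSUMING blocks with the same affinity are laid out back to back.
 * Assumption: `threads` models THREADS; this is plain C, not UPC. */
static size_t local_index(size_t i, size_t block_size, size_t threads)
{
    assert(block_size > 0 && threads > 0);
    size_t row   = i / (block_size * threads); /* complete rounds before i  */
    size_t phase = i % block_size;             /* offset inside i's block   */
    return row * block_size + phase;
}
```

With `shared [1] int A[...]` and four threads, the elements owned by thread 1 are A[1], A[5], A[9], and under this assumption they appear at local indices 0, 1, 2, which is what makes `my_private_P[k]` usable beyond k == 0.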

I think we all "know" the layout is contiguous, but I agree with Yili that no clear
statement of this fact is evident in the (1.2) spec.


Steve
-----
I think this is one of those things we inherit from C. See ISO/IEC 9899 6.2.5 20:

"An array type describes a contiguously allocated nonempty set of objects with a particular
member object type..."

Since we don't say otherwise (other than to describe how the elements are distributed
amongst threads), this applies to shared arrays as well, and thus they must be contiguous.

ISO/IEC 9899 6.2.5 20 states that array elements (objects) must be contiguous in memory.
 UPC 1.2 6.5.2.1 3 states that elements are distributed round-robin to threads, but
does not EXPLICITLY permit the chunks with affinity to the same thread to not be contiguous
in local memory.  Therefore, since C requires it and UPC does not explicitly call out
that the C requirement does not apply, the C requirement must apply and thus they must
be contiguous.

Spelling this out explicitly would be nice (probably as another formula in 6.4.2 3),
but I do believe that this behavior is indeed mandated by the current spec.

Troy
----
I agree with Paul that Paragraph 5 on Page 18 implies it, but loosely and indirectly.
 As applied to Yili's example, &a[0] and &a[2] would be the S1 and S2 -- they point
to the same shared array object and have affinity to the same thread.  Paragraph 5
goes on to say that P1 (the local pointer cast of S1) and S1 point to the same object,
and P2 and S2 point to the same object.  Transitively, P1 and P2 must point to the
same object.

What is that object?  P1 and P2 are normal, local C pointers so normally one would
look to the C standard to answer this question, but the C standard cannot answer this
question because the only two reasonable answers that I can think of are (1) they point
to the same shared array object (but the C standard doesn't cover shared objects) or
(2) they point to a non-shared array object that is logically a slice of the original
shared array object (but the C standard doesn't cover hypothetical arrays that aren't
actually declared explicitly in a program). 

The UPC spec really should say something here.  If we say that the local slice is or
behaves like a normal C array, then and only then do I agree with Steve that we inherit
the contiguous property from ISO/IEC 9899 6.2.5 20.  As it stands, I don't think shared
arrays have this property as applied directly from the C standard; rather they are
the antithesis of it because A[0] and A[1] may be contiguous in memory or A[0] and
A[1] may be across the machine room from each other in different cabinets.  But I do
think we want to extend this property to the local portions of shared arrays.

Bill
----
First I think Paul is correct that this does indicate the need for an explicit statement
of some sort.

In my view "contiguity" has to do with pointer arithmetic and casting.  If things are
contiguous you can count on pointer arithmetic to work between them in the usual way.
 If they are not, you cannot.  Note that ISO uses contiguity to distinguish arrays
and multiple calls to malloc.

As long as one stays in pointer-to-shared land, I think everything in the draft spec
works fine with respect to this.  But when you cast to pointer-to-local, it does seem
we should add something related to 6.4.3, footnote 13 in the 1.3 draft to say that
the "are local accesses and behaves accordingly" includes pointer arithmetic to access
all portions of the object which have affinity to that thread.

I think others here are more qualified than I to suggest the language, but it is definitely
the desired (and relied upon) effect.

Kathy
-----
I always agree with removing ambiguity from the spec, and that distributing across
regions a previously-specified contiguous object raises such an ambiguity.

Reported by danbonachea on 2013-02-25 13:23:35

2013-02-25T13:23:35+00:00

Former user Account Deleted

I think we have consensus on the intended behavior, and I suspect all current implementations
already provide the contiguity property under discussion. I agree that inserting a
clarification sentence somewhere is probably in order to "make it official", although
we don't have any actual proposed language yet.

Adding this to the 1.3 milestone initially, although we'll need to draft some language
immediately for this change to make it into 1.3 (we're way past the "new issues" deadline
for 1.3).

Reported by danbonachea on 2013-02-25 13:33:00 - Labels added: Milestone-Spec-1.3

2013-02-25T13:33:00+00:00

Former user Account Deleted

I propose adding the following to the end of 6.5.2.1 5:

The local portion of a shared array shall be contiguous in a thread's memory.  All
of the chunks distributed to the same thread shall appear consecutively in that thread's
memory, with no space between chunks.

Reported by sdvormwa@cray.com on 2013-02-25 18:47:31

2013-02-25T18:47:31+00:00

Former user Account Deleted

I think Steve's proposed text is clear.  I assume the contiguity property holds true
for dynamically allocated arrays by upc_all_alloc or upc_global_alloc, right?

Reported by yzheng@lbl.gov on 2013-02-25 19:04:09

2013-02-25T19:04:09+00:00

Former user Account Deleted

> The local portion of a shared array shall be contiguous in a thread's memory.  All
> of the chunks distributed to the same thread shall appear consecutively in that
> thread's memory, with no space between chunks.

It's a good first cut, but uses some terms and concepts not defined by the specification
("thread's memory"). Can we re-word it to discuss elements/chunks "with affinity to"
a given thread?

We also need to be careful not to prohibit internal padding in elements which may be
required for alignment.

Reported by danbonachea on 2013-02-25 20:02:37

2013-02-25T20:02:37+00:00

Former user Account Deleted

How about

Elements of data storage with affinity to a thread shall be contiguous in that thread's
local address space.  All of the chunks distributed to the same thread shall appear
consecutively, with no additional padding beyond the requirements of the ultimate element
type [see issue 3] of the array.

Reported by sdvormwa@cray.com on 2013-02-25 20:52:34

2013-02-25T20:52:34+00:00

Former user Account Deleted

Since the padding (if any) between elements is already in the C definition of an array:

"The elements of a shared array with affinity to any given thread shall appear in that
thread's address space consecutively, as a single array object."

One could add "with elements in increasing order" if we want to be pedantic.

Reported by phhargrove@lbl.gov on 2013-02-25 21:04:13

2013-02-25T21:04:13+00:00

Former user Account Deleted

We can't use elements, because an element of a shared array might itself be a shared
array (for instance, the first element of an array declared 'shared int A[2][THREADS]'),
and thus may not completely reside on a single thread.

Reported by sdvormwa@cray.com on 2013-02-25 21:12:32

2013-02-25T21:12:32+00:00

Former user Account Deleted

"address space" is another term not defined by the spec, and the type of implementation
detail we should attempt to avoid.

If possible, it might be nice to "fix" this within 6.4.2-6, which is already very closely
related. Consider this proposal (1 line changed, 1 line added):

   T *P1, *P2;
-  shared    T *S1, *S2;
+  shared [] T *S1, *S2;

   P1 = (T*) S1; /* allowed if *S1 has affinity only to MYTHREAD */  
   P2 = (T*) S2; /* allowed if *S2 has affinity only to MYTHREAD */

  For all S1 and S2 that point to two distinct elements of the same shared
  array object which have affinity to the same thread:

  * S1 and P1 shall point to the same object.
  * S2 and P2 shall point to the same object.
  * The expression (((ptrdiff_t) upc_addrfield (S2) - (ptrdiff_t) upc_addrfield(S1))
    shall evaluate to the same value as ((P2 - P1) * sizeof(T)).
+ * The expression (S2 - S1) shall evaluate to the same value as (P2 - P1)

The new constraint enforces that indefinitely-blocked PTS arithmetic (i.e. indexing with
affinity to a single thread) increments in exactly the same way as a pointer-to-local,
and I believe has the side-effect of disallowing the discontiguous layout we wish to
prohibit.

Reported by danbonachea on 2013-02-26 12:59:14

2013-02-26T12:59:14+00:00

Former user Account Deleted

Amendment to the proposal in comment #9, change the new constraint line to read:

+ * The expression P1 + (S2 - S1) == P2 shall evaluate to 1.

(This equation is equivalent in a correct implementation, but this form 
additionally disallows a perverse implementation from passing the test via integer
round-off.)

Reported by danbonachea on 2013-02-26 14:28:18

2013-02-26T14:28:18+00:00

Former user Account Deleted

Going back to the original comments, I thought we agreed that the current spec wording
required contiguity of elements within a block.  The ambiguous case was contiguity
across blocks with affinity to the same thread.  The proposed language in comments
9 and 10 does not address the case of contiguity across blocks because there's no such
concept for indefinitely blocked objects.  If we're going to clarify this with an example,
I think we must use a definitely blocked shared array to do so.

Reported by sdvormwa@cray.com on 2013-02-26 14:45:19

2013-02-26T14:45:19+00:00

Former user Account Deleted

> The ambiguous case was contiguity across blocks with affinity to the same thread.

Agreed.

> The proposed language in comments 9 and 10 does not address the case of contiguity
> across blocks because there's no such concept for indefinitely blocked objects.

I believe it does. The key is in the setup phrase:

  For ALL S1 and S2 that point to two distinct elements of the same shared
  array object which have affinity to the same thread.

This implies the equations must hold for ALL pairwise combinations of distinct elements
with the same affinity that exist in every shared array object (with any blocking factor).
In particular, it must hold for every pair of elements (e1, e2) with affinity to the
same thread, INCLUDING those pairs where e1 and e2 were part of different blocks in
the original array allocation. The use of indefinitely-blocked (instead of cyclically-blocked)
pointers S1 and S2 to construct the constraints is just a convenience - the equations
still govern the placement of elements in memory for all shared arrays, including those
allocated with a definite blocking factor.

I invite you to construct an example where the local blocks are discontiguous that
still satisfies the equations for all pairwise combinations of elements with local
affinity.

Reported by danbonachea on 2013-02-26 15:37:14

2013-02-26T15:37:14+00:00

Former user Account Deleted

However, blocksize is part of the type compatibility, so casting a pointer to a definitely
typed object to an indefinitely typed one is "fishy".  Instead of putting things here,
I propose we modify 6.4.2 3 by explicitly defining upc_addrfield's value for the case
of the result having affinity to the same thread as the initial pointer [see proposed
text of issue 3 for definition of 'elem_delta']:

  Additionally, if upc_threadof(p) == upc_threadof(p1), the following equation must hold:

  ptrdiff_t block_delta = (((upc_phaseof(p) + elem_delta) div B) div THREADS);
  ptrdiff_t local_elem_offset = (block_delta * B) - upc_phaseof(p) + upc_phaseof(p1);

  upc_addrfield(p1) == upc_addrfield(p) + local_elem_offset * upc_elemsizeof(*p)
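
A sketch that checks the proposed formula against a contiguous local layout, in plain C. All function names are hypothetical; `B` is the block size, `T` stands in for THREADS, and pointers are modelled by the global index they refer to. The check is restricted to elem_delta >= 0 so that C's truncating division agrees with `div`:

```c
#include <assert.h>

/* Phase of global index i within its block of B elements. */
static long phase_at(long i, long B) { return i % B; }

/* Thread with affinity to global index i: floor(i / B) mod T. */
static long thread_at(long i, long B, long T) { return (i / B) % T; }

/* Position of global index i within its thread's local slice,
 * ASSUMING the contiguous layout this issue wants to mandate. */
static long slice_index(long i, long B, long T)
{
    return (i / (B * T)) * B + i % B;
}

/* local_elem_offset from the proposal, for p at global index i and
 * p1 = p + elem_delta, both with affinity to the same thread. */
static long local_elem_offset(long i, long elem_delta, long B, long T)
{
    long j = i + elem_delta;
    assert(i >= 0 && elem_delta >= 0 && B > 0 && T > 0);
    assert(thread_at(i, B, T) == thread_at(j, B, T));
    long block_delta = ((phase_at(i, B) + elem_delta) / B) / T;
    return block_delta * B - phase_at(i, B) + phase_at(j, B);
}

/* Returns 1 if the formula equals the contiguous-layout distance for
 * every same-thread pair (i, j) with 0 <= i <= j < n, otherwise 0. */
static int formula_matches_contiguous(long n, long B, long T)
{
    for (long i = 0; i < n; i++) {
        for (long j = i; j < n; j++) {
            if (thread_at(i, B, T) != thread_at(j, B, T))
                continue;
            if (local_elem_offset(i, j - i, B, T)
                != slice_index(j, B, T) - slice_index(i, B, T))
                return 0;
        }
    }
    return 1;
}
```

For example, with B = 10 and T = 4, moving from global index 3 to 43 crosses one full round of blocks, and the formula gives a local offset of 10 elements, exactly one block further along the same thread's slice.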

Reported by sdvormwa@cray.com on 2013-02-26 15:54:09

2013-02-26T15:54:09+00:00

Former user Account Deleted

> casting a pointer to a definitely typed object to an indefinitely typed one is "fishy".

There is no such cast. The only cast in the equations is from a PTS with local affinity
to a PTL, which is perfectly kosher. 

The setup text requires the equations to hold for every S1 and S2 pointing to distinct
elements of the shared array with local affinity. It does not prescribe how those pointers
are constructed, because it is irrelevant. All that matters is that it covers every
pair of pointer values referencing local elements.

Reported by danbonachea on 2013-02-26 16:03:06

2013-02-26T16:03:06+00:00

Former user Account Deleted

Also, the new text is no "fishier" than the old text, which used a pair of cyclically-blocked
pointers to reference every pairwise set of distinct local elements. I just changed
cyclic to indefinite to make the equation cleaner.

Reported by danbonachea on 2013-02-26 16:04:24

2013-02-26T16:04:24+00:00

Former user Account Deleted

I'm sorry, I wasn't clear about the fishiness I was referring to.  The new text as written
looks good to me from a technical standpoint, albeit redundant (see below).  The part
that I find fishy is using the new text to answer the question "For a definitely blocked
shared array object, are the blocks with affinity to a thread contiguous?"  Because
the new text only addresses indefinitely blocked shared objects, I find its use to
answer that question fishy.

Moreover, because of the following text in 6.4.2 2 in the existing spec, it is completely
redundant:

"If the shared array is declared with indefinite block size, the result of the pointer-to-shared
arithmetic is identical to that described for normal C pointers in [ISO/IEC00 Sec.
6.5.6], except that the thread of the new pointer shall be the same as that of the
original pointer and the phase component is defined to always be zero."

As I said in comment 11, if we need to disambiguate this in the spec (which, as noted
in my email quoted in comment 1, I don't believe is strictly necessary), then we must
do so with definitely blocked arrays.  It is not sufficient to restate something about
indefinitely blocked arrays that is already in the spec.

Reported by sdvormwa@cray.com on 2013-02-26 16:40:25

2013-02-26T16:40:25+00:00

Former user Account Deleted

> Because the new text only addresses indefinitely blocked shared objects, I find its
> use to answer that question fishy.

I'm sorry but this is completely false. The equations in 6.4.2-6 apply to ALL SHARED
ARRAYS. They apply to arrays allocated with an indefinite, cyclic or definite blocking
factor. The equations use two POINTERS with a particular blocksize (because every non-generic
PTS must have a blocksize), but the constraints apply to ALL shared arrays, regardless
of allocation layout.

Please point out exactly the text from either the old or new text that you believe
says the equations apply to only a subset of all shared arrays.

Reported by danbonachea on 2013-02-26 16:50:42

2013-02-26T16:50:42+00:00

Former user Account Deleted

What am I missing here?

With four threads:

shared int Arr[3*THREADS];
shared int *S1 = &Arr[0];
shared int *S2 = &Arr[THREADS]; // = &Arr[4]
int *P1 = (int *)S1;
int *P2 = (int *)S2;

S2 - S1 = THREADS; // = 4
P2 - P1 = 1;

The proposed constraint:

+ * The expression P1 + (S2 - S1) == P2 shall evaluate to 1.

But P2 = P1 + 1, and P1 + 4 != P1 + 1.  The constraint does enforce indefinite block
size arithmetic, but it does not hold across blocks.  Is it expected to?

Reported by brian.wibecan on 2013-02-26 17:10:15

2013-02-26T17:10:15+00:00

Former user Account Deleted

Yes, I agree they apply to all shared arrays.  I agree that it technically says what
we want.  That's why I said "fishy" and not "incorrect".

My concern is that, given the declarations

shared [] T *S1, *S2;

it is natural to assume that they point at indefinitely blocked shared arrays, because
that is what their referenced type is.  Because of 6.5.1.1 12, the ONLY way they could
point at (part of) a definitely blocked shared array is with an EXPLICIT cast, which
IS NOT PRESENT in the proposed text.

I think my proposal in comment 13 says what we want to say in a manner that is less
"fishy" and more direct--importantly, it doesn't require any hidden EXPLICIT casts
to disambiguate the case that we want to disambiguate.

Reported by sdvormwa@cray.com on 2013-02-26 17:10:17

2013-02-26T17:10:17+00:00

Former user Account Deleted

"What am I missing here?"

S1 and S2 should be 'shared [] int *' (and note, you need an explicit cast), not 'shared
int *'.

Reported by sdvormwa@cray.com on 2013-02-26 17:12:12

2013-02-26T17:12:12+00:00

Former user Account Deleted

"it is natural to assume that they point at indefinitely blocked shared arrays,"

I disagree that is a "natural assumption", but we can also add an amplification phrase
like "The following property applies to all shared arrays".

"the ONLY way they could point at (part of) a definitely blocked shared array is with
an EXPLICIT cast,"

This is also false. Simple concrete example:

shared [] int *S1 = upc_all_alloc(2*THREADS, 100*sizeof(int));
shared [] int *S2 = S1 + 5;

there is no cast here. 

The text in 6.4.2 is describing a mathematical property that must hold true for all
elements in EVERY shared array with the same affinity. It does not prescribe an algorithm
to construct the pointers to actually perform this check, because it's unnecessary
to do so. The "old" text uses cyclic pointers to state the property, my proposed "new"
text uses indefinite pointers to state the strengthened property. This in no way changes
the fact these properties must be preserved for ALL shared arrays. If that's not sufficiently
clear in the old text then we need to amplify this point, and that's orthogonal to
this issue.

Your proposed equations in comment #13 look correct to my casual inspection, but it
is also significantly "denser", and in my opinion sacrifices clarity as a result. My
change is minimalistic and I believe easier to understand.

Brian said:
> What am I missing here?

When S1 and S2 are properly declared as indefinitely blocked, the expression (S2-S1)
in your example evaluates to 1, exactly matching (P2 - P1). (assuming a compliant implementation)
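
The arithmetic behind Brian's example can be sketched in plain C by modelling each pointer as the global index it refers to. The function names are hypothetical, `threads` stands in for THREADS, and the local distance assumes the contiguous layout under discussion:

```c
#include <assert.h>

/* Difference of two cyclic (block size 1) pointers-to-shared into the same
 * cyclic array, modelled as the difference of global element indices. */
static long cyclic_ptr_diff(long i, long j)
{
    return j - i;
}

/* Difference of the corresponding pointers-to-local, ASSUMING the elements
 * with affinity to one thread are contiguous. Both indices must have
 * affinity to the same thread for the casts to be allowed. */
static long local_ptr_diff(long i, long j, long threads)
{
    assert(threads > 0 && i % threads == j % threads);
    return j / threads - i / threads;
}
```

With four threads, Arr[0] and Arr[4] are 4 apart as seen through cyclic pointers but 1 apart as seen through pointers-to-local, so the constraint can only hold when S2 - S1 is computed with indefinitely blocked pointers.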

Reported by danbonachea on 2013-02-26 17:32:34

2013-02-26T17:32:34+00:00

Former user Account Deleted

Here's a concrete example of applying the OLD declarations to a definitely-blocked array,
based on Brian's example:

shared [10] int Arr[20*THREADS];  // assume THREADS == 4
shared int *S1 = (shared int *)&Arr[0];
shared int *S2 = (shared int *)&Arr[10*THREADS]; 
int *P1 = (int *)S1;
int *P2 = (int *)S2;

Note the "old" text ALSO requires a cast in order to "check" the property on any statically-allocated
array which is not cyclic. The only time a cast is not required is when the array in
question happens to be cyclic. A similar cast is required under the proposed text.
This is nothing new.

Reported by danbonachea on 2013-02-26 17:42:26

2013-02-26T17:42:26+00:00

Former user Account Deleted

Ah, you're right.  I thought C99 required an explicit cast if the referenced types are
not compatible, but I see that's not the case.  Forget about that. ;)

That said, I think we still run into potential issues with alignment.  If an implementation
defines that the alignment of 'shared [] int' is different than the alignment of 'shared
[B] int' for any positive B, then the conversion results in undefined behavior.  My
proposed text in comment 13 still works and allows this, while the proposed text in
comments 9-10 does not.

Reported by sdvormwa@cray.com on 2013-02-26 18:00:10

2013-02-26T18:00:10+00:00

Former user Account Deleted

> If an implementation defines that the alignment of 'shared [] int' is different than
> the alignment of 'shared [B] int' for any positive B, then the conversion results in
> undefined behavior.  My proposed text in comment 13 still works and allows this,
> while the proposed text in comments 9-10 does not.

Let's back up and make sure we agree on the goals of this clarification. I think there
are two goals:
1) Clarify that users can create a pointer-to-local to the slice of an array with affinity
to one thread, and access it as a contiguous local array.
I would also add:
2) Clarify that users can construct different "views" of their array using PTS's with
different blocksizes and access the elements using the most natural indexing arithmetic
for the current piece of code. This practice is already commonplace in deployed UPC
codes, and I believe we must allow it. 

Requirement #2 has also long been explicitly encoded (to some extent) in the UPC bulk
transfer libraries, collectives and now nb transfers, many of which contain text like:
   The upc_memcpy function treats the dst and src pointers as if they had type:
   shared [] char[n]
This implies the elements in the "local slice" of any shared array that can be passed to
these functions must be valid to access using indefinite blocking.
The shared array dynamic allocation functions also rely upon this "reblocking" property:

   The upc_global_alloc allocates shared space compatible with the declaration:
   shared [nbytes] char[nblocks * nbytes].
Here [nbytes] is a typeless blocking factor, whose numerical value will differ from
the typed blocking factor the user uses to access the array (for any type with
sizeof() > 1).

For all these reasons, I believe an implementation that uses different element alignment
constraints based on the blocksize used in the allocation is simply an invalid implementation
that must be prohibited. 

My proposal directly enforces requirement #2 for "reblocking" any array to indefinite
(because the equations would be false if the implementation did not comply for that
case). By transitivity, it also enforces it for reblocking to any block size.

Reported by danbonachea on 2013-02-26 18:19:18

2013-02-26T18:19:18+00:00

Former user Account Deleted

Actually, I suppose that [comment 23] is exactly what we're trying to prevent, so never
mind. ;)

I still think that using a pointer whose referenced type is indefinitely blocked to
clarify something about definitely typed shared objects is a bit fishy.  However, if
the general consensus is that it works, I'll go with it.

Reported by sdvormwa@cray.com on 2013-02-26 18:22:29

2013-02-26T18:22:29+00:00

Former user Account Deleted

s/definitely typed/definitely blocked/

Reported by sdvormwa@cray.com on 2013-02-26 18:29:24

2013-02-26T18:29:24+00:00

Former user Account Deleted

For completeness, below is a LaTeX diff of my proposed resolution for this issue, relative
to the current working draft.
I invite comment from other members of the committee.

--- upc-language.tex    (revision 204)
+++ upc-language.tex    (working copy)
@@ -289,12 +289,12 @@

 \begin{verbatim}
     T *P1, *P2;
-    shared T *S1, *S2;  
+    shared [] T *S1, *S2;  

     P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */
     P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */
 \end{verbatim}
-    
+\xchangenote[id=DB]{106}{Declaration of S1/S2 pointers changed to indefinite blocksize}

 \np For all S1 and S2 that point to two distinct elements of
    the same shared array object which have affinity to the same
@@ -305,6 +305,9 @@
 \item S2 and P2 shall point to the same object.
 \item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t)
upc\_addrfield(S1))} shall
    evaluate to the same value as ((P2 - P1) * sizeof(T)).
+\xadded[id=DB]{106}{
+\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.
+}
 \end{itemize}

 \np Two compatible pointers-to-shared which point to the same

Reported by danbonachea on 2013-02-26 20:20:56

2013-02-26T20:20:56+00:00

Former user Account Deleted

Can we delete the third item in the list if we make this change?  I believe the new
expression makes it redundant.

Reported by sdvormwa@cray.com on 2013-02-26 20:37:34

2013-02-26T20:37:34+00:00

Former user Account Deleted

That is, remove

 \item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t)
upc\_addrfield(S1))} shall
    evaluate to the same value as ((P2 - P1) * sizeof(T)).

Reported by sdvormwa@cray.com on 2013-02-26 20:38:00

2013-02-26T20:38:00+00:00

Former user Account Deleted

I think we've begun to converge on a "spec speak" version of "blocks with same affinity
are contiguous in local memory".  In other words: we've required implementations to
do something we "just knew" they must.

If a user were to ask herself the same question today that Yili asked a couple days
ago, do you think she'd find the answer in the spec as clarified by the proposed change?
 I doubt it.  So, would it be acceptable to add a footnote with the "plain English"
conclusion?  I normally argue against such things, but this is a case where I think
the conclusion is sufficiently non-obvious that my feelings on this are "neutral".
 So I am ASKING: do others think such a footnote is appropriate?

Reported by phhargrove@lbl.gov on 2013-02-26 20:48:33

2013-02-26T20:48:33+00:00

Former user Account Deleted

Paul: Yes, I think a footnote would be helpful here.  (Then again, I'm always pro explanatory
footnote.  :) )

Reported by johnson.troy.a on 2013-02-26 21:00:12

2013-02-26T21:00:12+00:00

Former user Account Deleted

> Can we delete the third item in the list if we make this change?  I believe the new
> expression makes it redundant.

No. The third item constrains the allowable behavior of the upc_addrfield function,
which otherwise just returns an "implementation-defined value". The library function
semantics cross-reference to this section as part of its definition.

> would it be acceptable to add a footnote with the "plain English" conclusion?  I
> normally argue against such things, but this is a case where I think the conclusion
> is sufficiently non-obvious that my feelings on this are "neutral".  So am ASKING:
> do others think such a footnote is appropriate?

I would not be against inserting such a clarification footnote, PROVIDED it can be
stated in a way that doesn't resort to undefined terms or operational implementation
details (e.g. "thread's address space", "thread's memory"), or imply any requirement
stronger than the one we're trying to impose.

Perhaps just a very high-level clue like this?:
\footnote{This implies there is no padding inserted between shared array elements with
affinity to a thread}

Reported by danbonachea on 2013-02-26 21:05:57

2013-02-26T21:05:57+00:00

Former user Account Deleted

Dan,

I fear "no padding" could be misunderstood to mean that the padding NORMALLY present
between elements in arrays of structs might be omitted.  So, would the following work
for you (with the emphasis NOT intended for inclusion in the spec):

\footnote{This implies there is no padding inserted between BLOCKS OF shared array
elements with affinity to a thread}

Reported by phhargrove@lbl.gov on 2013-02-26 21:13:21

2013-02-26T21:13:21+00:00

Former user Account Deleted

> \footnote{This implies there is no padding inserted between BLOCKS OF shared array
> elements with affinity to a thread}

Sounds reasonable to me.

Reported by danbonachea on 2013-02-26 21:31:56

2013-02-26T21:31:56+00:00

Former user Account Deleted

Official proposal mailed 2/26/13:

--- upc-language.tex    (revision 204)
+++ upc-language.tex    (working copy)
@@ -289,12 +289,12 @@

 \begin{verbatim}
     T *P1, *P2;
-    shared T *S1, *S2;  
+    shared [] T *S1, *S2;  

     P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */
     P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */
 \end{verbatim}
-    
+\xchangenote[id=DB]{106}{Declaration of S1/S2 pointers changed to indefinite blocksize}

 \np For all S1 and S2 that point to two distinct elements of
    the same shared array object which have affinity to the same
@@ -305,6 +305,10 @@
 \item S2 and P2 shall point to the same object.
 \item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t)
upc\_addrfield(S1))} shall
    evaluate to the same value as ((P2 - P1) * sizeof(T)).
+\xadded[id=DB]{106}{
+\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.%
+\truefootnote{This implies there is no padding inserted between blocks of shared array
elements with affinity to a thread.}
+}
 \end{itemize}

 \np Two compatible pointers-to-shared which point to the same

Reported by danbonachea on 2013-02-26 23:10:57 - Status changed: PendingApproval

2013-02-26T23:10:57+00:00

Former user Account Deleted

> No. The third item constrains the allowable behavior of the upc_addrfield function,
> which otherwise just returns an "implementation-defined value". The library function
> semantics cross-reference to this section as part of its definition.

Woah, I completely missed that.  In that case, I think we need to think about this
a bit more.  Consider the following code:

shared int *S1;
shared [] int *S2;

S2 = S1;

if ( upc_addrfield(S1) == upc_addrfield(S2) ) {
    printf("Match\n");
}

With the proposed change (as well as UPC 1.2 apparently), it is undefined whether or
not anything is printed.  We don't require anywhere that pointers-to-shared with different
types that point to the same object produce the same result when passed to upc_addrfield(),
merely that such pointers shall compare equal.  Since I don't think this is intended,
we probably need a stronger statement somewhere to make upc_addrfield() useful.

Reported by sdvormwa@cray.com on 2013-02-26 23:47:33

2013-02-26T23:47:33+00:00

Former user Account Deleted

> Since I don't think this is intended, we probably need a stronger statement somewhere
> to make upc_addrfield() useful.

The specification and behavior of upc_addrfield() is a new issue, which I've entered
as issue 107. Please continue discussion of that topic there.

The current issue and proposed fix are completely orthogonal to the semantic guarantees
of upc_addrfield().

Reported by danbonachea on 2013-02-27 11:07:53

2013-02-27T11:07:53+00:00

Former user Account Deleted

> The current issue and proposed fix are completely orthogonal to the semantic guarantees
> of upc_addrfield().

True, but placing such semantic guarantees on upc_addrfield() makes writing tests for
the current issue much easier.

Reported by sdvormwa@cray.com on 2013-02-27 14:02:42

2013-02-27T14:02:42+00:00

Former user Account Deleted

> > Can we delete the third item in the list if we make this change?  I believe the
> > new expression makes it redundant.
>
> No. The third item constrains the allowable behavior of the upc_addrfield function,
> which otherwise just returns an "implementation-defined value". The library function
> semantics cross-reference to this section as part of its definition.

> The current issue and proposed fix are completely orthogonal to the semantic guarantees
> of upc_addrfield().

Since the only constraint on the result of upc_addrfield() is that the difference
between its results when applied to two pointers-to-shared with particular properties
be equal to the difference between two pointers-to-local pointing to the same objects
scaled by the size of the type, it seems likely that the reason that constraint exists
was to attempt to address this very issue.  Therefore, I think we should remove that
constraint as part of the change for this issue, and leave the deprecation of upc_addrfield()
and discussion of a possible replacement for issue 107.

Reported by sdvormwa@cray.com on 2013-02-27 18:13:12

2013-02-27T18:13:12+00:00

Former user Account Deleted

> Can we delete the third item in the list if we make this change?  I believe the new
expression makes it redundant.

Note also that, by changing the type of S1 and S2 as proposed, any existing UPC 1.2
programs that relied on this constraint would have to be changed, as the constraint
itself is subtly different due to the inherited type change, even though the wording
remains the same (see issue 107).

Reported by sdvormwa@cray.com on 2013-02-27 19:01:19

2013-02-27T19:01:19+00:00

Former user Account Deleted

> I think we should remove that constraint as part of the change for this issue, and
> leave the deprecation of upc_addrfield() and discussion of a possible replacement
> for issue 107.

This PendingApproval issue (106) is concerned with clarifying an ambiguity concerning
the contiguity of shared array elements. That goal has nothing to do with the upc_addrfield()
library function, aside from textual proximity in the spec. The fact that the resolution
of this issue might render one application of the library function obsolete does not
automatically imply that a constraint used to define library behavior should be removed.

I agree we should consider *eventually* removing the constraint as part of the resolution
to issue 107, if we decide to deprecate the function. However I believe relaxing the
semantic definition prior to the deprecation of upc_addrfield would be premature. For
better or worse, it does currently constrain the implementation-defined behavior of
that function, and I don't think we should be tweaking the semantics of a function
we are considering throwing away. 

Issue 107 is concerned with modifying the library function semantics. Please take further
discussion of this topic there.

Reported by danbonachea on 2013-02-27 19:02:00

2013-02-27T19:02:00+00:00

Former user Account Deleted

> This PendingApproval issue (106) is concerned with clarifying an ambiguity concerning
the contiguity of shared array elements. That goal has nothing to do with the upc_addrfield()
library function, aside from textual proximity in the spec. The fact that the resolution
of this issue might render one application of the library function obsolete does not
automatically imply that a constraint used to define library behavior should be removed.

But by changing the type of S1 and S2, we are already implicitly removing the existing
constraint, and adding a new similar one.  It is unlikely that one would even notice
this unless one looks very closely at the differences between the 1.2 and 1.3 specs
and works through the semantics.  That seems much more dangerous to me.

Reported by sdvormwa@cray.com on 2013-02-27 19:07:04

2013-02-27T19:07:04+00:00

Former user Account Deleted

> But by changing the type of S1 and S2, we are already implicitly removing the existing
constraint, and adding a new similar one. 

As I argued at length in issue 107, comment 4:
  http://code.google.com/p/upc-specification/issues/detail?id=107#c4
there is no change to the language-level constraint on the behavior of the library
function.

Reported by danbonachea on 2013-02-27 19:19:42

2013-02-27T19:19:42+00:00

Former user Account Deleted

> As I argued at length in issue 107, comment 4:
>  http://code.google.com/p/upc-specification/issues/detail?id=107#c4
> there is no change to the language-level constraint on the behavior of the library
function.

And as I argued at length in the very next comment (http://code.google.com/p/upc-specification/issues/detail?id=107#c5),
that is simply not true.

Reported by sdvormwa@cray.com on 2013-02-27 19:22:34

2013-02-27T19:22:34+00:00

Former user Account Deleted

> there is no change to the language-level constraint on the behavior of the library
function.

We clearly disagree on this point, but I think we're wasting time arguing about a semantic
quibble nobody has ever even noticed, let alone relied upon.

Can we at least agree this distinction has no effect on the behavior of the library
function in any current real implementation, and therefore on real users?

Reported by danbonachea on 2013-02-27 19:26:20

2013-02-27T19:26:20+00:00

Former user Account Deleted

To get really precise, UPC 1.2 6.4.2 6 places constraints on the result of upc_addrfield()
when passed a generic pointer-to-shared value that is the result of an implicit conversion
from a pointer-to-shared whose referenced type has block size 1.  With the changes
as proposed, the constraint now applies to the result of upc_addrfield() when passed
a generic pointer-to-shared value that is the result of an implicit conversion from
a pointer-to-shared whose referenced type has indefinite block size.  Because the UPC
specification does not require that these values be the same (they must merely compare
equal), by using the proposed changes we have ever so subtly changed the constraint.
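
The difference between the two pointer flavors can be sketched in plain C. This is only a model under stated assumptions: `elem_ref`, `cyclic_diff` and `indefinite_diff` are hypothetical names, not UPC library calls, and the arithmetic reflects one reading of the 6.4.2 pointer-arithmetic rules (cyclic pointers step across threads, indefinite pointers step within one thread's elements).

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for a reference to one element of
 * "shared [1] T A[N]": the element lives on `thread`, at position
 * `local_index` within that thread's local slice.
 * This is NOT a real pointer-to-shared representation. */
typedef struct {
    size_t thread;
    size_t local_index;
} elem_ref;

/* Models S2 - S1 for cyclic pointers (shared T *): the difference of
 * array subscripts, where subscript = local_index * threads + thread. */
ptrdiff_t cyclic_diff(elem_ref s1, elem_ref s2, size_t threads) {
    ptrdiff_t i1 = (ptrdiff_t)(s1.local_index * threads + s1.thread);
    ptrdiff_t i2 = (ptrdiff_t)(s2.local_index * threads + s2.thread);
    return i2 - i1;
}

/* Models S2 - S1 for indefinite pointers (shared [] T *): only
 * elements with the same affinity are counted, so both references
 * must be on the same thread. */
ptrdiff_t indefinite_diff(elem_ref s1, elem_ref s2) {
    assert(s1.thread == s2.thread);
    return (ptrdiff_t)s2.local_index - (ptrdiff_t)s1.local_index;
}
```

With 4 threads, two elements on thread 1 that are three local slots apart differ by 12 under the cyclic model and by 3 under the indefinite model. Only the latter matches the pointer-to-local difference, which is why the retyping of S1 and S2 matters for any constraint phrased in terms of `S2 - S1`.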

I will agree that, to my knowledge, no existing UPC implementation, and thus no existing
users, would be affected by this.  However, I strongly believe that we should not implicitly
change an existing constraint in the spec to fix an "unrelated" issue.  Either we should
modify the proposal so the UPC 1.2 semantics are preserved, explicitly remove the constraint
or clearly call out that the constraint has changed.

Reported by sdvormwa@cray.com on 2013-02-27 20:01:05

2013-02-27T20:01:05+00:00

Former user Account Deleted

> Either we should modify the proposal so the UPC 1.2 semantics are preserved, 
> explicitly remove the constraint or clearly call out that the constraint has changed.

In one comment you're militant about a ridiculously subtle change to a constraint with
no realistic impact on any implementation or user, and in the next you want to remove
the constraint entirely, resulting in a significant semantic relaxation. Please choose
a side.

The change is already clearly annotated, as with every other semantic change in the
1.3 working draft. The depth of the annotation is proportional to its expected impact
(ie vanishingly small).

Reported by danbonachea on 2013-02-27 20:09:42

2013-02-27T20:09:42+00:00

Former user Account Deleted

If I am understanding things correctly (and please correct me gently if not), then
in comment #46 Steven has described how we have proposed text that would remove one
constraint (however subtle/implicit it may be) on the implementation of upc_addrfield()
and replace it with a *different* constraint (equally subtle/implicit).

Under other circumstances that might be an alarming thing to do.  HOWEVER, for this
particular case the two constraints (and this is where I might not have followed) are
ENTIRELY COMPATIBLE.  Not only are they compatible, but we have every reason to believe
that every existing implementation satisfies both simultaneously.

Would folks be more or less happy with a proposal that introduced a P3 and S3 so that
the example could provide BOTH cyclic and indefinite examples and thus ADD the new
constraint on upc_addrfield() without removing the original one?

Reported by phhargrove@lbl.gov on 2013-02-27 20:37:05

2013-02-27T20:37:05+00:00

Former user Account Deleted

> introduced a P3 and S3 so that the example could provide BOTH cyclic and indefinite
> examples and thus ADD the new constraint on upc_addrfield() without removing the
> original one?

I don't think it makes sense to add verbiage to 6.4.2 whose only purpose is to clarify
the behavior of a library function defined in section 7. The section is already sufficiently
subtle without the introduction of what is essentially irrelevant noise. The original
constraint should probably have appeared in section 7 in the first place.

If committee members (other than Steve) are convinced that we must preserve this effectively
meaningless semantic distinction that everyone agrees has no impact on real implementations
or users, then I think it makes more sense to MOVE the old constraint (as originally
written) into section 7.2.3.4 and remove it from 6.4.2. If issue 107 resolves to deprecate,
strengthen or remove upc_addrfield in some future revision of the spec, any such change
would remain local to 7.2.3.4.

Reported by danbonachea on 2013-02-27 21:04:57

2013-02-27T21:04:57+00:00

Former user Account Deleted

> In one comment you're militant about a ridiculously subtle change to a constraint
with no realistic impact on any implementation or user, and in the next you want to
remove the constraint entirely, resulting in a significant semantic relaxation. Please
choose a side.

My concern here is that we are changing the semantics of an unrelated constraint with
the proposed changes without explicitly calling out that the new semantics are different
than the old.  If the change is intentional, we should add language to make it clear
that it is intentional.  If it is not intentional, then we need to modify the proposal
so that the existing behavior is preserved.  I believe these two options apply for
ANY proposed change that would have the effect of silently changing the semantics of
unrelated parts of the language.  However, because this specific constraint is effectively
useless, I think that a simpler third alternative would be to explicitly remove it.
 Any of these solutions would be acceptable to me.

> Under other circumstances that might be an alarming thing to do.  HOWEVER, for this
particular case the two constraints (and this is where I might not have followed) are
ENTIRELY COMPATIBLE.  Not only are they compatible, but we have every reason to believe
that every existing implementation satisfies both simultaneously.

No, they are not compatible.  We do however believe that every existing implementation
satisfies both already.

Reported by sdvormwa@cray.com on 2013-02-27 21:28:32

2013-02-27T21:28:32+00:00

Former user Account Deleted

Steve wrote:
> No, they are not compatible.  We do however believe that every existing
> implementation satisfies both already.

My intended meaning for "compatible" was "it is possible to satisfy both".
So, under that definition they ARE compatible.

Steve,
What was your interpretation of "compatible" under which the 2 constraints are NOT
compatible?  (honest lack of understanding on my part - not being sarcastic).

Reported by phhargrove@lbl.gov on 2013-02-27 21:46:32

2013-02-27T21:46:32+00:00

Former user Account Deleted

> What was your interpretation of "compatible" under which the 2 constraints are NOT
compatible? 

I thought you meant that there was no user-detectable difference between the two.

Reported by sdvormwa@cray.com on 2013-02-27 21:49:06

2013-02-27T21:49:06+00:00

Former user Account Deleted

> My concern here is that we are changing the semantics of an unrelated constraint 
> with the proposed changes without explicitly calling out that the new semantics are
> different than the old.  If the change is intentional, we should add language to
> make it clear that it is intentional.

The change annotation already automatically includes a hyperlink directly to this very
page, where the issue is discussed ad nauseam in the comments above. Any users or implementers
who really care (and I strongly believe that's the singleton set: { Steve }) can click
the hyperlink to read all about it right here. What more do we need?

Reported by danbonachea on 2013-02-27 21:57:37

2013-02-27T21:57:37+00:00

Former user Account Deleted

> The change annotation already automatically includes a hyperlink directly to this
> very page, where the issue is discussed

I should also note this is the standard procedure we've been following for EVERY spec
change. Rationale and motivation are NOT placed in the document. Every change includes
only the actual textual wording change and an issue number with a hyperlink where interested
parties can read the details. This applies even for changes with very large semantic
impact and complicated implications - the spec only contains the normative text, the
issue database contains the rationale and story behind the change.

Reported by danbonachea on 2013-02-27 22:07:07

2013-02-27T22:07:07+00:00

Former user Account Deleted

> The change annotation already automatically includes a hyperlink directly to this
very page, where the issue is discussed ad nauseam in the comments above.

But there's no change annotation on the constraint we're referring to and most people
aren't going to recognize at first glance that the type change in the previous paragraph
actually has a semantic effect on the constraint.  I certainly didn't recognize that
until you pointed out that there is only that single constraint on the result of upc_addrfield(),
and then spent a half an hour thinking through the implications of that revelation.
 So how is someone that is wondering why upc_addrfield() doesn't do what it used to
going to know to reference an apparently unrelated issue?  Well, we make it clear that
it is not in fact unrelated.

Here is a proposed modification:

Index: lang/upc-language.tex
===================================================================
--- lang/upc-language.tex       (revision 205)
+++ lang/upc-language.tex       (working copy)
@@ -289,12 +289,12 @@

 \begin{verbatim}
     T *P1, *P2;
-    shared T *S1, *S2;
+    shared [] T *S1, *S2;

     P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */
     P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */
 \end{verbatim}
-
+\xchangenote[id=DB]{106}{Declaration of S1/S1 pointers changed to indefinite blocksize}

 \np For all S1 and S2 that point to two distinct elements of
    the same shared array object which have affinity to the same
@@ -305,6 +305,11 @@
 \item S2 and P2 shall point to the same object.
 \item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t)
upc\_addrfield(S1))} shall
    evaluate to the same value as ((P2 - P1) * sizeof(T)).
+\xchangenote[id=SV]{106}{The semantics of the constraint on upc\_addrfield() are slightly
changed.}
+\xadded[id=DB]{106}{
+\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.%
+\truefootnote{This implies there is no padding inserted between blocks of shared array
elements with affinity to a thread.}
+}
 \end{itemize}

 \np Two compatible pointers-to-shared which point to the same

Reported by sdvormwa@cray.com on 2013-02-27 22:12:21

2013-02-27T22:12:21+00:00

Former user Account Deleted

I don't want readers wasting their time trying to fathom this triviality that will never
have any observable effect on anyone. I'm appalled that we've already wasted so much
of our time on it.

Updated PendingApproval proposal:

--- upc-language.tex    (revision 204)
+++ upc-language.tex    (working copy)
@@ -289,12 +289,13 @@

 \begin{verbatim}
     T *P1, *P2;
-    shared T *S1, *S2;  
+    shared [] T *S1, *S2;  

     P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */
     P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */
 \end{verbatim}
-    
+\xchangenote[id=DB]{106}{Declaration of S1/S1 changed to indefinite blocksize to accomodate
new constraint. 
+This change also subtly modifies the constraint on {\tt upc\_addrfield} in a way that
has no impact on current implementations.}

 \np For all S1 and S2 that point to two distinct elements of
    the same shared array object which have affinity to the same
@@ -305,6 +306,10 @@
 \item S2 and P2 shall point to the same object.
 \item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t)
upc\_addrfield(S1))} shall
    evaluate to the same value as ((P2 - P1) * sizeof(T)).
+\xadded[id=DB]{106}{
+\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.%
+\truefootnote{This implies there is no padding inserted between blocks of shared array
elements with affinity to a thread.}
+}
 \end{itemize}

 \np Two compatible pointers-to-shared which point to the same
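
To illustrate what the added item rules out, here is a minimal C model of a blocked shared array. It is a sketch, not part of the proposal: `layout_t`, `elem_rank`, `elem_local_offset` and `constraint_holds` are hypothetical names, and `pad` models a layout that inserts unused elements after each block in a thread's local storage. The model treats `S2 - S1` (indefinite blocksize) as the count of same-affinity elements between the two, and `P2 - P1` as the distance in local storage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout description for "shared [B] T A[N]". */
typedef struct {
    size_t block_size; /* B */
    size_t threads;    /* THREADS */
    size_t pad;        /* padding elements after each local block; 0 = contiguous */
} layout_t;

/* Thread with affinity to element i: floor(i / B) mod THREADS. */
size_t elem_thread(const layout_t *l, size_t i) {
    return (i / l->block_size) % l->threads;
}

/* Number of same-thread elements preceding element i in array order.
 * Differences of ranks model S2 - S1 for shared [] T * pointers. */
size_t elem_rank(const layout_t *l, size_t i) {
    size_t block = i / l->block_size;
    return (block / l->threads) * l->block_size + i % l->block_size;
}

/* Position of element i in its thread's local storage, in elements.
 * Differences of offsets model P2 - P1 for T * pointers. */
size_t elem_local_offset(const layout_t *l, size_t i) {
    size_t block = i / l->block_size;
    return (block / l->threads) * (l->block_size + l->pad) + i % l->block_size;
}

/* Returns 1 if P1 + (S2 - S1) == P2 holds for every pair of distinct
 * same-thread elements among the first n, and 0 otherwise. */
int constraint_holds(const layout_t *l, size_t n) {
    assert(l->block_size > 0 && l->threads > 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (i == j || elem_thread(l, i) != elem_thread(l, j))
                continue;
            ptrdiff_t s_diff = (ptrdiff_t)elem_rank(l, j) - (ptrdiff_t)elem_rank(l, i);
            ptrdiff_t p_diff = (ptrdiff_t)elem_local_offset(l, j)
                             - (ptrdiff_t)elem_local_offset(l, i);
            if (s_diff != p_diff)
                return 0;
        }
    }
    return 1;
}
```

In this model the check passes for `pad == 0` and fails as soon as a thread holds more than one block and `pad > 0`, which is the "no padding between blocks" reading given in the footnote.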

Reported by danbonachea on 2013-02-28 09:11:03

2013-02-28T09:11:03+00:00

Former user Account Deleted

> I don't want readers wasting their time trying to fathom this triviality that will
never have any observable effect on anyone. I'm appalled that we've already wasted
so much of our time on it.

I now have to revoke my support for this proposal.  I am quite disappointed to hear
that the author of the proposed changes considers discussion of the implications of
his changes to be a waste of time.  This issue was pushed into 1.3 at the last minute
with less than 2 days of discussion.  It SHOULD have been a 1.4 issue, given where
we are in 1.3 and the fact that this is a clarification of behavior that is required
for an implementation to get the correct semantics on many examples already in the
spec and further, that we know of no existing implementation that doesn't already have
this behavior.  It is good to clarify this property, but it is NOT ACCEPTABLE to rush
changes into the spec at the last minute for a trivial clarification that affects no
current implementations, and then summarily dismiss concerns that the proposed change
has unintended consequences.

Reported by sdvormwa@cray.com on 2013-02-28 15:38:12 - Status changed: Started

2013-02-28T15:38:12+00:00

Former user Account Deleted

We had consensus on the call that generating a clarification for 1.3 is worthwhile.
We have consensus that the proposed change effects the intended clarification.

I'm frustrated arguing about a 0.001% semantic side-effect to a function that is already
99.999% implementation-defined, and likely to receive massive semantic changes in a
future revision. We already have universal agreement that said concern has no possibility
of impact on real implementations and thus real users. Nevertheless, the issue has
been noted in the change note upon Steve's insistence.

It's not the issue, proposal or even the addendum that is a waste of time, it is the
continued adversarial bickering on this triviality, which I now consider fully resolved.
The discussion between myself and Steve on this matter has reached a point of hostility
that I feel no further progress can be made. Steve if you still disagree, then seek
out some impartial third party to back your argument.

Reported by danbonachea on 2013-02-28 16:10:04 - Status changed: PendingApproval

2013-02-28T16:10:04+00:00

Former user Account Deleted

+\xchangenote[id=DB]{106}{Declaration of S1/S1 changed to indefinite blocksize to accomodate
new constraint. 
+This change also subtly modifies the constraint on {\tt upc\_addrfield} in a way that
has no impact on current implementations.}

I believe you intend "S1 and S2" instead of S1/S1.

Given that Issue 107 seems to be headed toward strengthening upc_addrfield instead
of deprecating it, I agree with Steve that this issue should be moved to UPC 1.4. 
If upc_addrfield were being deprecated and we were thus subtly changing a function
that we no longer cared about, I'd be fine with the language in Comment #56 with the
S1/S1 corrected. However, because it looks like we're keeping upc_addrfield and are
likely to change it yet again in UPC 1.4, I believe it is best to wait on this change
and make all of the upc_addrfield related changes at once.

That said, I likely do not satisfy Dan's desire for an "impartial third party" requested
in Comment #58 because I work with Steve, so I will wait to see what others say.

Reported by johnson.troy.a on 2013-02-28 17:38:14

2013-02-28T17:38:14+00:00

Former user Account Deleted

We're already going out of our way to clarify something that is already implied by other
parts of the spec and which is the behavior of all existing UPC implementations.  When
it looked like upc_addrfield() was going to be deprecated in 1.4 (see issue 107), it
seemed to me that simply making a change note that the semantics of upc_addrfield()
are slightly different in 1.3 was an acceptable compromise, because it would be going
away anyway.  That seems to no longer be the case.

In short, if we are going to change the semantics of upc_addrfield(), then we should
do it right and make the function useful for something (see Paul's really good text
in comment 17 in issue 107 (http://code.google.com/p/upc-specification/issues/detail?id=107#c17)
regarding this).  If we're not going to fix it, then we should preserve the existing
semantics.

Reported by sdvormwa@cray.com on 2013-02-28 17:39:27

2013-02-28T17:39:27+00:00

Former user Account Deleted

Below is the ALTERNATE proposal that I referred to in comment #49, which some people
may find more acceptable. It moves the upc_addrfield constraint, completely unchanged,
to the relevant library section (where it probably belonged in the first place). I
believe this proposal preserves the semantic constraint on upc_addrfield in a way that
is completely identical to 1.2 in every respect. This change has the benefit of removing
a long cross-reference to important information concerning the library behavior, at
the cost of a few duplicated lines of declarations. If issue 107 resolves to deprecate,
strengthen or remove the upc_addrfield library function in some future revision of the
spec, any such change would likely modify these lines, but the change should remain
local to 7.2.3.4.

I would be satisfied with applying either this proposal or the one in comment #56 (with
Troy's clarification to the annotation). I think either one resolves the original ambiguity
which spawned this issue, which we agreed should be clarified in 1.3 if at all possible.
I think it would be a shame to allow such a significant ambiguity to persist as a
"known bug" in 1.3, when we all agree on how shared array contiguity needs to behave.

--- upc-language.tex    (revision 204)
+++ upc-language.tex    (working copy)
@@ -289,12 +289,12 @@

 \begin{verbatim}
     T *P1, *P2;
-    shared T *S1, *S2;  
+    shared [] T *S1, *S2;  

-    P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */ 
-    P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */ 
+    P1 = (T*) S1;  /* allowed if upc_threadof(S1) == MYTHREAD */ 
+    P2 = (T*) S2;  /* allowed if upc_threadof(S2) == MYTHREAD */ 
 \end{verbatim}
-    
+\xchangenote[id=DB]{106}{Declaration of S1 and S1 changed to indefinite blocksize
to accommodate new constraint.}

 \np For all S1 and S2 that point to two distinct elements of
    the same shared array object which have affinity to the same
@@ -303,9 +303,12 @@
 \begin{itemize}
 \item S1 and P1 shall point to the same object.
 \item S2 and P2 shall point to the same object.
-\item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t)
upc\_addrfield(S1))} shall 
-   evaluate to the same value as ((P2 - P1) * sizeof(T)).  
+\xadded[id=DB]{106}{
+\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.%
+\truefootnote{This implies there is no padding inserted between blocks of shared array
elements with affinity to a thread.}
+}
 \end{itemize}
+\xchangenote[id=DB]{106}{Constraint on {\tt upc\_addrfield} moved to Section~\ref{upc_addrfield}.}

 \np Two compatible pointers-to-shared which point to the same
     object (i.e. having the same address and thread components) shall

--- upc-lib-core.tex    (revision 204)
+++ upc-lib-core.tex    (working copy)
@@ -302,11 +302,26 @@

 \np The {\tt upc\_addrfield} function returns an
    implementation-defined value reflecting the ``local address'' of the
-   object pointed to by the pointer-to-shared argument.\footnote{%
-   This function is used in defining the semantics of pointer-to-shared
-   arithmetic in Section \ref{pointer-arithmetic}}
+   object pointed to by the pointer-to-shared argument.

-   
+\np Given the following declarations:
+
+\begin{verbatim}
+    T *P1, *P2;  
+    shared T *S1, *S2;  
+
+    P1 = (T*) S1;  /* allowed if upc_threadof(S1) == MYTHREAD */ 
+    P2 = (T*) S2;  /* allowed if upc_threadof(S2) == MYTHREAD */ 
+\end{verbatim}
+
+   For all S1 and S2 that point to two distinct elements of
+   the same shared array object which have affinity to the same
+   thread, the expression:\\
+    {\tt ((ptrdiff\_t) upc\_addrfield(S2) - (ptrdiff\_t)upc\_addrfield(S1))} \\
+   shall evaluate to the same value as: {\tt ((P2 - P1) * sizeof(T))}.
+
+\xchangenote[id=DB]{106}{Paragraph moved from 6.4.2 and cross-reference footnote removed.}
+
 \paragraph{The {\tt upc\_affinitysize} function}

 {\bf Synopsis}
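
The relocated paragraph constrains only differences of `upc_addrfield` results, not their absolute values. A small C sketch of that property follows. `model_addrfield` is a hypothetical stand-in (it is not the UPC library function) that returns a byte offset from a segment base plus an arbitrary implementation-chosen `bias`; the element type `double` is also an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>

typedef double T; /* any non-shared element type; double is arbitrary */

/* Hypothetical stand-in for upc_addrfield(): byte offset of a local
 * element from the start of the thread's segment, shifted by a bias
 * that a real implementation would be free to choose. */
size_t model_addrfield(const T *segment_base, const T *local_elem, size_t bias) {
    return bias + (size_t)((const char *)local_elem - (const char *)segment_base);
}

/* Returns 1 if, for every pair of distinct elements of the segment,
 * (ptrdiff_t)addrfield(S2) - (ptrdiff_t)addrfield(S1)
 * equals (P2 - P1) * sizeof(T), and 0 otherwise. */
int addrfield_constraint_holds(const T *segment, size_t n, size_t bias) {
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (i == j)
                continue;
            const T *p1 = &segment[i];
            const T *p2 = &segment[j];
            ptrdiff_t lhs = (ptrdiff_t)model_addrfield(segment, p2, bias)
                          - (ptrdiff_t)model_addrfield(segment, p1, bias);
            ptrdiff_t rhs = (p2 - p1) * (ptrdiff_t)sizeof(T);
            if (lhs != rhs)
                return 0;
        }
    }
    return 1;
}
```

The check passes for any `bias` in this model, which matches the observation in the thread that the function's result is otherwise implementation-defined.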

Reported by danbonachea on 2013-02-28 18:17:54

2013-02-28T18:17:54+00:00

Former user Account Deleted

Dan asked me off-list to "chime in" on this.
Just as Troy's co-worker relation to Steve disqualifies him as "impartial 3rd party"
(comment #59), so does my co-worker relation to Dan.  I am stating that clearly so
nobody thinks we are "pulling a fast one".

First off, I am somewhat disheartened by the strength of the disagreement between Dan
and Steve at this point, and don't feel that either one of them is 100% correct.  While
I cannot (yet?) offer alternative text to resolve the original ambiguity, I am unsatisfied
with the current proposal.  Here is a summary of my point-of-view.  I am labeling the
points to make responding to them simpler.

PHH1)  The proposal (idea, not the diff) in comment #61 to duplicate the semantics
of upc_addrfield() to the library document has my support.  HOWEVER, that idea is independent
of what changes are made to clarify that shared array elems are contiguous.  The proposed
diff in that comment makes changes to 6.4.2 that I don't fully agree with.

PHH2)  I agree (as I *think* we all do now) that the change from cyclic to indefinite
for S1 and S2 does provide the desired constraint on the layout of array elements,
but one may need to read the issue tracker to understand that.  To put that in other words:
I have no objection to the technical soundness of Dan's proposal for the purpose of
resolving issue 106.

PHH3)  I asked for the addition of a footnote because I felt that Dan's proposed changes
failed to make the desired constraint on layout clear to most readers.  That is, to
me, a small strike against Dan's proposal - though not on technical grounds.

PHH4)  Dan wrote in comment #53
> The change annotation already automatically includes a hyperlink directly to
> this very page, where the issue is discussed ad nauseam in the comments above.
> Any users or implementers who really care [...] can click the hyperlink to
> read all about it right here. What more do we need?
HOWEVER, once we reach final spec that isn't true. Our cover text says:
> Change annotations in the specification body are for reviewer convenience only
> and are not normative, nor will they appear in the final draft.

PHH5)  Personally, I agree with Dan that the very small indirect change to the semantics
of upc_addrfield() isn't worth a big fuss.  HOWEVER, the spec process is about building
a consensus, and currently the "nearly silent" change to upc_addrfield()'s semantics
is an obstacle to reaching that consensus.  So, since a "big fuss" does exist, it
is our responsibility as members of this working group to work it through to a resolution.

PHH6)  While the point was made that this issue is naturally tied to how one defines
pointer arithmetic, following that direction has led us to the current proposed change.
 This is a textually very small change, but has strong opposition.  So, I think it
wise to consider what other options are available.  Maybe there is a better solution,
or maybe the current proposed change is the "lesser of N evils" and will gain support
when compared to one or more alternatives.

PHH7)  Since I suggest we need to look at alternatives, I feel obligated to attempt
to provide at least one:  What if we pick up again from Steve's suggestion in comment
#3 to augment 6.5.2.1, and work out a wording that doesn't use undefined terminology.
 Since the footnote of Dan's current proposal got an OK from Dan, we could start from
that.

PHH8)  For what it is worth: a upc_addrfield() strengthened to satisfy my requirements
given in issue 107 would, I believe, match all currently known implementations and
would be constrained by both the original version of 6.4.2 4-5 and the proposed indefinite
version.
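
For anyone drafting 6.5.2.1 wording along the lines of PHH7, the layout property can be stated as a pair of index mappings. The C sketch below is an illustration only: `affinity_of` and `global_index` are hypothetical helper names, and `global_index` assumes exactly the property under discussion, namely that each thread's blocks are stored back to back with no padding.

```c
#include <assert.h>
#include <stddef.h>

/* Thread with affinity to element i of "shared [B] T A[N]":
 * floor(i / B) mod THREADS. */
size_t affinity_of(size_t i, size_t block_size, size_t threads) {
    assert(block_size > 0 && threads > 0);
    return (i / block_size) % threads;
}

/* Global subscript of the k-th element (k = 0, 1, ...) of thread t's
 * local slice, ASSUMING the slice is contiguous: local block k / B is
 * global block (k / B) * THREADS + t, and k % B is the phase. */
size_t global_index(size_t t, size_t k, size_t block_size, size_t threads) {
    assert(block_size > 0 && t < threads);
    return ((k / block_size) * threads + t) * block_size + k % block_size;
}
```

With block size 1 and 4 threads, thread 2's local elements 0, 1 and 3 map to subscripts 2, 6 and 14, which is the correspondence that the common "privatize" idiom (casting `&A[MYTHREAD]` to a pointer-to-local and indexing it) relies on.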

Reported by phhargrove@lbl.gov on 2013-02-28 19:31:29

2013-02-28T19:31:29+00:00

Former user Account Deleted

The proposal in comment 61 is acceptable, though I'd prefer to just outright fix upc_addrfield()
at the same time, since it is so closely related to this issue.  As another alternative
to consider, could we promote issue 107 to 1.3, use your original proposal for 106
and fix the semantics of upc_addrfield()?  I believe my proposal in comment 29 (http://code.google.com/p/upc-specification/issues/detail?id=107#c29)
addresses all of our concerns.

Reported by sdvormwa@cray.com on 2013-02-28 19:32:05

2013-02-28T19:32:05+00:00

Former user Account Deleted

It looks like my comment #62 and Steve's comment #63 "crossed in the ether".

If Steve is happy with the contents of #61, then perhaps I wasted a lot of time typing
and polishing my text for comment #62  :-)

I will be examining Steve's proposal in issue 107 momentarily.

Reported by phhargrove@lbl.gov on 2013-02-28 19:40:41

2013-02-28T19:40:41+00:00

Former user Account Deleted

Responding to Paul's non-technical point:

>PHH4) Our cover text says:
> Change annotations in the specification body are for reviewer convenience only
> and are not normative, nor will they appear in the final draft.

Once the spec ratification process is complete, we will generate and distribute a 1.3
document that is the "official" language definition that contains only the normative
text. However, I believe we decided last year that we would additionally distribute
a version of the document with change bars and annotations intact (and possibly also
a full Latex diff). The former will serve as the official normative definition of the
revised language, while the latter will be provided for reference purposes to implementers
and users during the transition to 1.3 compliance.

Reported by danbonachea on 2013-02-28 20:14:04

2013-02-28T20:14:04+00:00

Former user Account Deleted

Dan wrote:
> Responding to Paul's non-technical point:
> 
> PHH4) Our cover text says:
> Change annotations in the specification body are for reviewer convenience only
> and are not normative, nor will they appear in the final draft.
> 
> Once the spec ratification process is complete, we will generate and distribute
> a 1.3 document that is the "official" language definition that contains only the
> normative text. However, I believe we decided last year that we would additionally
> distribute a version of the document with change bars and annotations intact (and
> possibly also a full Latex diff). The former will serve as the official normative
> definition of the revised language, while the latter will be provided for reference
> purposes to implementers and users during the transition to 1.3 compliance.


Dan,

Thanks for cluing me in on this point - your comment #53 makes more sense to me now.
My absence from recent conference calls has left me ignorant of some things like this.

-Paul

Reported by phhargrove@lbl.gov on 2013-02-28 20:25:40

2013-02-28T20:25:40+00:00

Former user Account Deleted

I believe that if we accept the changes in comment #61, then we should additionally
remove upc_addrfield from the list of forward references at the end of 6.4.2.  Right?

Reported by phhargrove@lbl.gov on 2013-02-28 20:41:37

2013-02-28T20:41:37+00:00

Former user Account Deleted

Updated proposal below. We seem to be approaching consensus on this language.

It is the same proposal from comment #61, with the following modifications:

* Removed the forward reference pointed out by Paul in comment #67
* Add a comment clarifying that T is not a shared type in both copies of the declarations
* Augmented the change note to indicate that comments have been clarified

The last two subsume a similar change to the same lines in the issue 3 proposal, to
prevent a merge collision. If for some reason the issue 3 change is rejected (seems
unlikely), then the comment will have to be re-phrased using current definitions.

--- upc-language.tex    (revision 204)
+++ upc-language.tex    (working copy)
@@ -288,13 +288,13 @@
    constructs:

 \begin{verbatim}
-    T *P1, *P2;  
-    shared T *S1, *S2;  
+    T *P1, *P2;    /* T is not a shared type */
+    shared [] T *S1, *S2;  

-    P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */ 
-    P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */ 
+    P1 = (T*) S1;  /* allowed if upc_threadof(S1) == MYTHREAD */ 
+    P2 = (T*) S2;  /* allowed if upc_threadof(S2) == MYTHREAD */ 
 \end{verbatim}
-    
+\xchangenote[id=DB]{106}{Declaration of S1 and S2 changed to indefinite blocksize
to accommodate new constraint. Comments clarified.}

 \np For all S1 and S2 that point to two distinct elements of
    the same shared array object which have affinity to the same
@@ -303,9 +303,12 @@
 \begin{itemize}
 \item S1 and P1 shall point to the same object.
 \item S2 and P2 shall point to the same object.
-\item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t)
upc\_addrfield(S1))} shall 
-   evaluate to the same value as ((P2 - P1) * sizeof(T)).  
+\xadded[id=DB]{106}{
+\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.%
+\truefootnote{This implies there is no padding inserted between blocks of shared array
elements with affinity to a thread.}
+}
 \end{itemize}
+\xchangenote[id=DB]{106}{Constraint on {\tt upc\_addrfield} moved to Section~\ref{upc_addrfield}.}

 \np Two compatible pointers-to-shared which point to the same
     object (i.e. having the same address and thread components) shall

--- upc-lib-core.tex    (revision 204)
+++ upc-lib-core.tex    (working copy)
@@ -302,11 +302,26 @@

 \np The {\tt upc\_addrfield} function returns an
    implementation-defined value reflecting the ``local address'' of the
-   object pointed to by the pointer-to-shared argument.\footnote{%
-   This function is used in defining the semantics of pointer-to-shared
-   arithmetic in Section \ref{pointer-arithmetic}}
+   object pointed to by the pointer-to-shared argument.

-   
+\np Given the following declarations:
+
+\begin{verbatim}
+    T *P1, *P2;    /* T is not a shared type */
+    shared T *S1, *S2;  
+
+    P1 = (T*) S1;  /* allowed if upc_threadof(S1) == MYTHREAD */ 
+    P2 = (T*) S2;  /* allowed if upc_threadof(S2) == MYTHREAD */ 
+\end{verbatim}
+
+   For all S1 and S2 that point to two distinct elements of
+   the same shared array object which have affinity to the same
+   thread, the expression:\\
+    {\tt ((ptrdiff\_t) upc\_addrfield(S2) - (ptrdiff\_t)upc\_addrfield(S1))} \\
+   shall evaluate to the same value as: {\tt ((P2 - P1) * sizeof(T))}.
+
+\xchangenote[id=DB]{106}{Paragraph moved from 6.4.2 and cross-reference footnote removed.}
+
 \paragraph{The {\tt upc\_affinitysize} function}

 {\bf Synopsis}

Reported by danbonachea on 2013-03-01 13:30:00 - Labels added: Consensus-High - Labels removed: Consensus-Low

2013-03-01T13:30:00+00:00

Former user Account Deleted

Unfortunately, a few of the semantic issues brought up in issue 107 apply here, and
we should modify the wording to take them into account.

1. Is pointing to one element past the end of a shared array object valid (as it is
for local objects by ISO/IEC 9899 6.5.6 8-9)?  If so, we should be sure that we get
the expected behavior for those as well.  Note that this is a much larger change, as
a lot of the spec assumes that any valid non-null pointer-to-shared points to an object.

2. Given a multi-dimensional shared array object, an object with ultimate element type
(see issue 3) of the array object and contained within it is not an element of the
array object (see http://code.google.com/p/upc-specification/issues/detail?id=107#c64).
 Thus the wording "that point to two distinct elements of same shared array object"
would not cover such cases.

Reported by sdvormwa@cray.com on 2013-03-01 16:01:54

2013-03-01T16:01:54+00:00

Former user Account Deleted

> Is pointing to one element past the end of a shared array object valid (as it is for
local objects by ISO/IEC 9899 6.5.6 8-9)?

This questionable "feature" of C99 is one that UPC does not currently specify as valid
for definitely-blocked shared arrays, and I suspect current implementations differ
in their behavior. Unlike in C, blocked pointer arithmetic in UPC is not a simple linear
relationship, so "one past" the last element in a shared array is a non-trivial concept
to express. Specifically, the location of "one past" would often depend on the blocking
factor of the PTS, and the affinity and phase of such a pointer would be questionable
as well. I consider it a *feature* that UPC leaves indexing past the end of a shared
array unspecified, and therefore undefined behavior. Changing that would be a significant
behavioral modification with a non-trivial impact on some of the trickiest code in
our implementations.

Even if you don't agree with my reasoning above, THIS issue (106) deals ONLY with clarifying
the placement of actual shared array ELEMENTS in memory and clarifying they are placed
contiguously; a clarification that has no behavioral or implementation impact, and
reflects the common understanding of all UPC implementers and users since the language
inception. Adding additional flexibility to PTS arithmetic clearly falls outside the
scope of this effort and is orthogonal to it, despite the fact that it might eventually
modify the same section. Please open a NEW issue if you wish to pursue that matter
(or any other issue not directly related to this clarification).

> an object with ultimate element type (see issue 3) of the array object and contained
within it is not an element of the array object 

This case is already prevented by the type declarations of S1 and S2. They both have
the same referent type, and thus they both must already point to elements at the same
"level" of the multi-D shared array.

Reported by danbonachea on 2013-03-01 16:37:34

2013-03-01T16:37:34+00:00

Former user Account Deleted

> the last element in a shared array is a non-trivial concept to express. Specifically,
the location of "one past" would often depend on the blocking factor of the PTS, and
the affinity and phase of such a pointer would be questionable as well.

No it isn't.  This is trivial to express.  The existing equations in 6.4.2 3 define
the exact behavior of upc_threadof() and upc_phaseof().  My proposal in comment 13
suffices to define the behavior of upc_addrfield(), and can be trivially tweaked to
define the local address as well.  Since you can't do pointer-to-shared arithmetic
on generic pointers-to-shared, nor on pointers-to-shared whose referenced type is incomplete,
we don't need to worry about what "one past" means in those cases, and it is well-defined
for all others.
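
To make the disagreement concrete, here is a minimal sketch in plain C (not UPC) of the round-robin rule quoted from the 1.2 spec earlier in this thread: element i has affinity to thread floor(i / block size) mod THREADS. The function names are hypothetical, and applying the rule at an index equal to the array length ("one past") is only an illustration of what the formula yields, not a claim about what the spec permits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the UPC 1.2 distribution rule:
 * element i has affinity to thread floor(i / block_size) mod THREADS. */
static size_t model_threadof(size_t i, size_t block_size, size_t threads)
{
    assert(block_size > 0 && threads > 0);
    return (i / block_size) % threads;
}

/* Phase is the position of element i within its block. */
static size_t model_phaseof(size_t i, size_t block_size)
{
    assert(block_size > 0);
    return i % block_size;
}
```

For a 10-element array on 4 threads, index 10 maps to thread 2, phase 0 with block size 1, to thread 3, phase 1 with block size 3, and to thread 2, phase 2 with block size 4. The formula gives a definite answer for each blocking factor, and the answer does change with the blocking factor.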

> This case is already prevented by the type declarations of S1 and S2. They both have
the same referent type, and thus they both must already point to elements at the same
"level" of the multi-D shared array.

I think you missed my point.  Given

shared [] char A[2][2];

shared [] char *si1 = &A[0][0];
shared [] char *si2 = &A[1][0];
int *pi1 = (int *)si1;
int *pi2 = (int *)si2;

shared [] char (*sa1) = &A[0];
shared [] char (*sa2) = &A[1];
int (*pa1)[2] = (char *)sa1;
int (*pa2)[2] = (char *)sa2;

Using your logic from http://code.google.com/p/upc-specification/issues/detail?id=107#c64,
because the shared array object that A[0][0] is an element of is A[0], and the shared
array object that A[1][0] is an element of is A[1], and these are not the same shared
array object, the new constraint DOES NOT apply to the expression (pi1 + (si2 - si1)
== pi2), but DOES apply to (pa1 + (sa2 - sa1) == pa2).  I think we want the constraint
to apply to the former as well.

Reported by sdvormwa@cray.com on 2013-03-01 17:50:09

2013-03-01T17:50:09+00:00

Former user Account Deleted

Sorry that should be

shared [] char A[2][2];

shared [] char *si1 = &A[0][0];
shared [] char *si2 = &A[1][0];
int *pi1 = (char *)si1;
int *pi2 = (char *)si2;

shared [] char (*sa1)[2] = &A[0];
shared [] char (*sa2)[2] = &A[1];
int (*pa1)[2] = (char (*)[2])sa1;
int (*pa2)[2] = (char (*)[2])sa2;

I got interrupted while changing the types and forgot to finish when I came back. ;)

Reported by sdvormwa@cray.com on 2013-03-01 18:07:45

2013-03-01T18:07:45+00:00

Former user Account Deleted

And I still missed four 'int' -> 'char' conversions. =(

char *pi1 = (char *)si1;
char *pi2 = (char *)si2;
...
char (*pa1)[2] = (char (*)[2])sa1;
char (*pa2)[2] = (char (*)[2])sa2;

Reported by sdvormwa@cray.com on 2013-03-01 18:09:10

2013-03-01T18:09:10+00:00

Former user Account Deleted

Steve, since you seem unwilling to take unrelated issues to new threads, I've created
an issue for you to discuss your latest completely unrelated comment on this issue:

http://code.google.com/p/upc-specification/issues/detail?id=109

Please take that discussion there and let's keep this one on topic.

Your multi-D example makes no sense to me. Please reformulate it in a way that directly
applies to the declarations and variable names in 6.4.2-5 that are the topic of this
issue.

Reported by danbonachea on 2013-03-01 18:29:18

2013-03-01T18:29:18+00:00

Former user Account Deleted

> This questionable "feature" of C99 is one that UPC does not currently specify as valid
for definitely-blocked shared arrays

UPC DOES currently permit it for indefinitely blocked shared arrays however, which,
due to how you chose to make the change (via an example with pointers whose referenced
type are INDEFINITELY BLOCKED), are the only ones that matter here.  To quote 6.4.2
2:

If the shared array is declared with indefinite block size, the result of the pointer-to-shared
arithmetic is identical to that described for normal C pointers in [ISO/IEC00 Sec.
6.5.6], except that the thread of the new pointer shall be the same as that of the
original pointer and the phase component is defined to always be zero.

Oddly enough, I believe this statement already provides not only the constraint that
you are attempting to explicitly add, but a much stronger one.

Reported by sdvormwa@cray.com on 2013-03-01 18:49:17

2013-03-01T18:49:17+00:00

Former user Account Deleted

> Oddly enough, I believe this statement already provides not only the constraint that
you are attempting to explicitly add, but a much stronger one.

I believe contiguity of indefinitely blocked array elements was never in question (due
in part to that very text). The clarification of this issue was primarily motivated
for definitely blocked arrays, and is benignly redundant for indefinitely blocked arrays.

> due to how you chose to make the change (via an example with pointers whose referenced
type are INDEFINITELY BLOCKED), are the only ones that matter here.

Once again you are confusing pointers and arrays. We already hashed out this exact
argument in comment 16-17, but since you brought it up again for some reason, I'll
restate the salient point. The proposal for this issue clarifies the layout of ARRAYS
OF ANY BLOCKING FACTOR, and merely uses POINTERS OF A PARTICULAR BLOCKING FACTOR as
a notational convenience to express the necessary constraint (because non-generic PTS
must have SOME blocking factor, and that particular choice allowed the most concise
expression). 

Anticipating your next response, the unmodified setup text in 6.4.2 which is the precondition
to the clarification constraint says:

  For all S1 and S2 that point to two distinct ELEMENTS of the same shared
  array object which have affinity to the same thread.

The fact that S1 and S2 *could* be pointed at unallocated space is completely irrelevant,
because the precondition explicitly states that they are NOT. When this precondition
is violated, the logical implication is vacuously asserted and the constraint is irrelevant.

Reported by danbonachea on 2013-03-01 19:06:51

2013-03-01T19:06:51+00:00

Former user Account Deleted

> Your multi-D example makes no sense to me. Please reformulate it in a way that directly
applies to the declarations and variable names in 6.4.2-5 that are the topic of this
issue.

Ok, I'll try to be a bit more clear.

shared [2] char A[2*THREADS][2];                // Declare a multi-dimensional shared array object

shared [] char (*S1)[2] = &A[MYTHREAD];         // Points to the first element of A on the local thread
shared [] char (*S2)[2] = &A[THREADS+MYTHREAD]; // Points to the second element of A on the local thread

char (*P1)[2] = (char (*)[2]) S1;
char (*P2)[2] = (char (*)[2]) S2;

if ( P1 + (S2 - S1) == P2 ) {
    // Guaranteed by the new constraint
}

shared [] char *S3 = &A[MYTHREAD][0];         // Points to the first element of *S1 on the local thread
shared [] char *S4 = &A[THREADS+MYTHREAD][0]; // Points to the first element of *S2 on the local thread

char *P3 = (char *)S3;
char *P4 = (char *)S4;

if ( P3 + (S4 - S3) == P4 ) {
    // Unspecified because neither S3 nor S4 point to elements of the same object.
    // However, programmers using multidimensional shared arrays are more likely to use this form.
}
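
The arithmetic behind the S3/S4 case can be sketched in plain C (not UPC). This is a hypothetical model: it applies the 1.2 round-robin rule to the flattened chars of A and ASSUMES each thread's blocks are packed contiguously with no padding, which is exactly the property this issue wants to make explicit. All function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Flat index, counted in chars, of A[row][col] for
 * shared [2] char A[2*THREADS][2]. */
static size_t flat_index(size_t row, size_t col)
{
    return 2 * row + col;
}

/* Owning thread of flat element i under the round-robin rule. */
static size_t model_threadof(size_t i, size_t block, size_t threads)
{
    assert(block > 0 && threads > 0);
    return (i / block) % threads;
}

/* Local char offset of flat element i on its owning thread, ASSUMING
 * that thread's blocks are laid out back to back with no padding. */
static size_t model_local_offset(size_t i, size_t block, size_t threads)
{
    assert(block > 0 && threads > 0);
    return (i / (block * threads)) * block + (i % block);
}
```

With THREADS = 4 and MYTHREAD = 1, A[1][0] and A[5][0] both land on thread 1, at local offsets 0 and 2. In this model the two pointers are two chars apart, which is the distance the comparison P3 + (S4 - S3) == P4 relies on.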

Reported by sdvormwa@cray.com on 2013-03-01 19:22:23

2013-03-01T19:22:23+00:00

Former user Account Deleted

> Once again you are confusing pointers and arrays. We already hashed out this exact
argument in comment 16-17, but since you brought it up again for some reason, I'll
restate the salient point.

And you are ignoring C99 constraints on accessing objects.  ISO/IEC 9899 6.5 7 (emphasis
mine):

An object shall have its stored value accessed ONLY by an lvalue expression that has
one of the following types:

-- a type compatible with the effective type of the object
-- a qualified version of a type compatible with the effective type of the object
-- a type that is the signed or unsigned type corresponding to the effective type of
the object.
-- a type that is the signed or unsigned type corresponding to a qualified version
of the effective type of the object.
-- an aggregate or union type that includes one of the aforementioned types among its
members (including, recursively, a member of a subaggregate or contained union), or
-- a character type

Since we defined that the blocking factor is part of the type compatibility, accessing
elements of a definitely blocked shared array via a pointer-to-shared whose referenced
type is indefinitely blocked is undefined (unless the pointer-to-shared's referenced
type is a character type).

Reported by sdvormwa@cray.com on 2013-03-01 19:33:21

2013-03-01T19:33:21+00:00

Former user Account Deleted

> An object shall have its stored value accessed ONLY by an lvalue expression that has
one of the following types:

Irrelevant to this issue. The equations in 6.4.2 do not ACCESS any heap objects whatsoever.

> Since we defined that the blocking factor is part of the type compatibility, accessing
elements of a 
> definitely blocked shared array via a pointer-to-shared whose referenced type is
indefinitely blocked 
> is undefined (unless the pointer-to-shared's referenced type is a character type).

This is irrelevant to the current issue, but I believe this assertion to be false and
represents a misunderstanding of type compatibility. If you don't agree please open
a NEW issue to discuss that separate topic.

Reported by danbonachea on 2013-03-01 19:40:05

2013-03-01T19:40:05+00:00

Former user Account Deleted

Code from Steve's comment (ignoring missing casts in S3/S4 initializers):
-----------------------------
shared [2] char A[2*THREADS][2];                // Declare a multi-dimensional shared array object
...
shared [] char *S3 = &A[MYTHREAD][0];         // Points to the first element of *S1 on the local thread
shared [] char *S4 = &A[THREADS+MYTHREAD][0]; // Points to the first element of *S2 on the local thread

char *P3 = (char *)S3;
char *P4 = (char *)S4;

if ( P3 + (S4 - S3) == P4 ) {
    // Unspecified because neither S3 nor S4 point to elements of the same object.
    // However, programmers using multidimensional shared arrays are more likely to use this form.
}
-----------------------------
OK I understand your nitpick now, but I respectfully disagree. The intended meaning
in this case is that S3 and S4 both indeed "point to two distinct elements of the same
shared array object", namely they point to elements of the enclosing multidimensional
shared array object A. This seems relatively obvious to me, but I'm open to a footnote
clarification if you really feel that's necessary and have text to propose. The main
purpose of the text in question is to ensure both pointers reference objects that are
entirely contained within ANY single, enclosing shared object, regardless of referent
type.

C99 is actually surprisingly silent on the exact usage of the term "element", especially
as applied to multi-D arrays. In the example above, A[MYTHREAD] is clearly an element
of A, and A[MYTHREAD][0] is clearly an element of A[MYTHREAD], but C99 does not explicitly
state whether or not this terminology is transitive, ie if these statements also imply
that A[MYTHREAD][0] is ALSO an element of A (the underlying assumption I've made).
I believe the latter is common usage in the community, but the only actual mention
I can find in C99 is from 6.5.2.1 which defines indexing into multi-d array objects:

  Successive subscript operators designate an element of a multidimensional array object.

To me this implies that the final element accessed by a sequence of [][][] operators
is also "an element of a multidimensional array object".

Reported by danbonachea on 2013-03-01 20:05:43

2013-03-01T20:05:43+00:00

Former user Account Deleted

What I'm concerned about is users missing the subtlety of the following code due to
the confusion over the term "element" when applied to multi-dimensional arrays:

shared [B] T A[2*THREADS];

shared [] T *S1 = (shared [] T *)&A[0];
shared [] T *S2 = (shared [] T *)&A[1];

if ( upc_threadof( S1 ) == upc_threadof( S2 ) ) {
   T *P1 = (T *)S1;
   T *P2 = (T *)S2;

   if ( P1 + (S1 - S2) == P2 ) {
       // Required by new constraint?
   }
}

Reported by sdvormwa@cray.com on 2013-03-01 20:52:03

2013-03-01T20:52:03+00:00

Former user Account Deleted

Thanks to Troy for proofreading this for me:

shared [B] T A[2*THREADS];

shared [] T *S1 = (shared [] T *)&A[0];
shared [] T *S2 = (shared [] T *)&A[1];

if ( (upc_threadof( S1 ) == MYTHREAD) &&
     (upc_threadof( S1 ) == upc_threadof( S2 )) ) {

   T *P1 = (T *)S1;
   T *P2 = (T *)S2;

   if ( P1 + (S2 - S1) == P2 ) {
       // Required by new constraint?
   }
}

Reported by sdvormwa@cray.com on 2013-03-01 21:25:16

2013-03-01T21:25:16+00:00

Former user Account Deleted

My views on 106 and 107, when taken individually, match what Dan has stated in his email
requesting feedback:
+ 106 is ready (IMHO) for inclusion in 1.3
+ 107 is too problematic for inclusion in 1.3

However, I do agree with Steve that it would be better to resolve both in the same
spec revision.

With that in mind, I think that the clarification provided by issue 106 just formally
codifies something we "already knew".
The likelihood that between 1.3 and 1.4 somebody will implement a UPC compiler/runtime
that doesn't provide the expected contiguous layout is precisely ZERO because a significant
number of existing codes (incl benchmarks, tutorials, etc.) would fail.
THEREFORE, if there is not a consensus on 106 soon, then I am OK with deferring it
until 1.4.

-Paul

Reported by phhargrove@lbl.gov on 2013-03-01 21:32:56

2013-03-01T21:32:56+00:00

Former user Account Deleted

Consider the following "specialization" of the example from comment 82.  This one should
be fairly clear, and work the way everyone expects:

#define B 2
typedef int T;

shared [B] T A[2*THREADS];

shared [] T *S1 = (shared [] T *)&A[0];
shared [] T *S2 = (shared [] T *)&A[1];

if ( (upc_threadof( S1 ) == MYTHREAD) &&
     (upc_threadof( S1 ) == upc_threadof( S2 )) ) {

   T *P1 = (T *)S1;
   T *P2 = (T *)S2;

   if ( P1 + (S2 - S1) == P2 ) {
       // Required by new constraint?
   }
}

Clearly, &A[0] and &A[1] point to elements of the same shared array object A.  S1 and
S2 point to the former and latter respectively, but with a different referenced type.
I think we can all agree they also point to elements of the same shared array object.
Moving into the condition, casting S1 and S2 to pointers-to-local P1 and P2 is valid,
as the affinity of all the bytes making up the objects pointed to by S1 and S2 has
already been verified to be the local thread.  Since S1 and S2 point to elements of
the same shared array, the expression (P1 + (S2 - S1) == P2) is defined to be 1 by the
new constraint.  This is exactly the behavior we want and intend.

Now, consider a different "specialization" of the same example code.  This is where
I think things get confusing for users, and where I question whether we need to word
the changes differently.

#define B 3
typedef char T[3][2];

shared [B] T A[2*THREADS];

shared [] T *S1 = (shared [] T *)&A[0];
shared [] T *S2 = (shared [] T *)&A[1];

if ( (upc_threadof( S1 ) == MYTHREAD) &&
     (upc_threadof( S1 ) == upc_threadof( S2 )) ) {

   T *P1 = (T *)S1;
   T *P2 = (T *)S2;

   if ( P1 + (S2 - S1) == P2 ) {
       // Required by new constraint?
   }
}

Once again, clearly &A[0] and &A[1] point to elements of the same shared array object
A.  When run with 2 UPC threads, both will have affinity to thread 0:

    T0         T1
---------- ----------
A[0][0][0] A[0][1][1]
A[0][0][1] A[0][2][0]
A[0][1][0] A[0][2][1]
---------- ----------
A[1][0][0] A[1][1][1]
A[1][0][1] A[1][2][0]
A[1][1][0] A[1][2][1]
---------- ----------
A[2][0][0] A[2][1][1]
A[2][0][1] A[2][2][0]
A[2][1][0] A[2][2][1]
---------- ----------
A[3][0][0] A[3][1][1]
A[3][0][1] A[3][2][0]
A[3][1][0] A[3][2][1]
---------- ----------

But now things get tricky.  All the bytes making up the objects that S1 and S2 point
to also make up the shared array object A, and they were initialized by pointers that
clearly point to elements of the same array object A.  Do they also point to elements
of the same array object?  Ponder that while we continue on into the conditional.

The cast of S1 and S2 to pointers-to-local here is clearly legal, since the bytes pointed
to by S1 and S2 all have affinity to the same thread due to the referenced type having
indefinite block size, and that thread has been verified to be the local thread.  Now
we come to our new constraint.  If we consider that S1 and S2 point to elements of
the same shared array object, then the new constraint REQUIRES that the expression
(P1 + (S2 - S1) == P2) evaluate to 1.  However, the expression (S2 - S1) has an undefined
value, because there is no integer X that we can add to S1 to produce S2!  Additionally
(and for similar reasons), there is no integer X we could add to P1 to produce P2!
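
The numbers behind this claim can be checked with a small plain C model (not UPC). It is a sketch under stated assumptions: 2 threads, block size 3, T being char[3][2] so sizeof(T) is 6, the 1.2 round-robin rule applied to the flattened chars, and each thread's blocks packed contiguously. The names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed parameters of the example: shared [3] T A[2*THREADS] with
 * T = char[3][2], run on 2 threads. */
enum { BLOCK = 3, MODEL_THREADS = 2, SIZEOF_T = 6 };

/* Flat char index at which element A[k] begins. */
static size_t first_char_of(size_t k)
{
    return k * SIZEOF_T;
}

/* Owning thread of flat char i under the round-robin rule. */
static size_t model_threadof(size_t i)
{
    return (i / BLOCK) % MODEL_THREADS;
}

/* Local char offset of flat char i, ASSUMING contiguous local blocks. */
static size_t model_local_offset(size_t i)
{
    return (i / (BLOCK * MODEL_THREADS)) * BLOCK + (i % BLOCK);
}
```

In this model A[0] and A[1] both begin on thread 0, at local offsets 0 and 3. The 3-byte distance is not a multiple of sizeof(T), so no whole number of T-sized steps leads from one to the other, which matches the table above.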

Since it seems clear that this constraint cannot apply in this case, is it possible
that S1 and S2 don't point to elements of the same shared array object?  If they don't,
then the proposed wording is still valid, though potentially confusing when used with
pointers to array types.  We've already established that the bytes they point to are
included inside the memory region of the shared array object A, so that is not a valid
test.  The mere fact that the expression (S2 - S1) is undefined would seem insufficient
due to the language in 6.4.2 6.  Consider the following example 

shared [3] int B[THREADS][2];
shared [3] int (*S3)[2] = &B[1];
shared [3] int (*S4)[2] = (shared [3] int (*)[2])upc_resetphase( S3 );

Do S3 and S4 point to the same object?  Do both point to elements of the shared array
B?  Why or why not?  Would the same (or a similar) argument apply to S1 and S2, and
the shared array A?  Our terminology here is very confusing, at least to me.

> C99 is actually surprisingly silent on the exact usage of the term "element", especially
as applied to multi-D arrays. In the example above, A[MYTHREAD] is clearly an element
of A, and A[MYTHREAD][0] is clearly an element of A[MYTHREAD], but C99 does not explicitly
state whether or not this terminology is transitive, ie if these statements also imply
that A[MYTHREAD][0] is ALSO an element of A (the underlying assumption I've made).
I believe the latter is common usage in the community, but the only actual mention
I can find in C99 is from 6.5.2.1 which defines indexing into multi-d array objects:
>
>  Successive subscript operators designate an element of a multidimensional array
object.
>
> To me this implies that the final element accessed by a sequence of [][][] operators
is also "an element of a multidimensional array object".

Actually, I think C is pretty clear, though it'd be nice if the clarification were
part of some constraints or definitions rather than part of an example.  Read a bit
further down to 6.5.2.1 4:

  EXAMPLE Consider the array object defined by the declaration

    int x[3][5];

  Here x is a 3x5 array of ints; more precisely, x is an array of three element objects,
each of which is an array of five ints.
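
The "array of three element objects" reading can be observed directly in ordinary C. The following is a minimal sketch with hypothetical helper names; it only restates what sizeof reports for the declaration in the C99 example.

```c
#include <assert.h>
#include <stddef.h>

static int x[3][5];

/* Number of elements of x itself: each element is an int[5]. */
static size_t outer_count(void)
{
    return sizeof x / sizeof x[0];
}

/* Number of elements of one element of x: each is an int. */
static size_t inner_count(void)
{
    return sizeof x[0] / sizeof x[0][0];
}

/* Byte distance between consecutive elements of x. */
static size_t element_stride_bytes(void)
{
    return (size_t)((char *)&x[1] - (char *)&x[0]);
}
```

Here outer_count() is 3 and inner_count() is 5, and consecutive elements of x are exactly sizeof(int[5]) bytes apart, so the int objects are elements of x[0], x[1], and x[2] rather than direct elements of x.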

Reported by sdvormwa@cray.com on 2013-03-03 16:51:20

2013-03-03T16:51:20+00:00

Former user Account Deleted

Below is an updated change proposal that tweaks the "setup text" to accommodate Steve's
objection concerning multi-dimensional arrays. This version side-steps the problem
entirely by defining the contiguity constraint solely in terms of "ultimate elements"
of the shared array. I believe this still guarantees the contiguity constraint we need
to clarify the original issue, and by induction also enforces the required constraint
for the case of multi-dimensional arrays.

As before, the constraint for upc_addrfield() still remains completely unchanged from
1.2, and is merely moved to the library section where it belongs.

--- upc-language.tex    (revision 204)
+++ upc-language.tex    (working copy)
@@ -288,24 +288,33 @@
    constructs:

 \begin{verbatim}
-    T *P1, *P2;  
-    shared T *S1, *S2;  
+    T *P1, *P2;    /* T is not a shared type */
+    shared [] T *S1, *S2;  

-    P1 = (T*) S1;  /* allowed if S1 has affinity to MYTHREAD */ 
-    P2 = (T*) S2;  /* allowed if S2 has affinity to MYTHREAD */ 
+    P1 = (T*) S1;  /* allowed if upc_threadof(S1) == MYTHREAD */ 
+    P2 = (T*) S2;  /* allowed if upc_threadof(S2) == MYTHREAD */ 
 \end{verbatim}
-    
+\xchangenote[id=DB]{106}{Declaration of S1 and S2 changed to indefinite blocksize
to accommodate new constraint. Comments clarified.}

-\np For all S1 and S2 that point to two distinct elements of
-   the same shared array object which have affinity to the same
-   thread:
+\np For all S1 and S2 that point to two distinct 
+\xreplaced[id=DB]{106}{
+   objects with affinity to the same thread, 
+   where both are subobjects contained in the same shared array whose
+   ultimate element type is a qualified version of {\tt T}: 
+}{
+   elements of the same shared array object 
+   which have affinity to the same thread:
+}

 \begin{itemize}
 \item S1 and P1 shall point to the same object.
 \item S2 and P2 shall point to the same object.
-\item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt (ptrdiff\_t)
upc\_addrfield(S1))} shall 
-   evaluate to the same value as ((P2 - P1) * sizeof(T)).  
+\xadded[id=DB]{106}{
+\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.%
+\truefootnote{This implies there is no padding inserted between blocks of shared array
elements with affinity to a thread.}
+}
 \end{itemize}
+\xchangenote[id=DB]{106}{Constraint on {\tt upc\_addrfield} moved to Section~\ref{upc_addrfield}.}

 \np Two compatible pointers-to-shared which point to the same
     object (i.e. having the same address and thread components) shall

--- upc-lib-core.tex    (revision 204)
+++ upc-lib-core.tex    (working copy)
@@ -302,11 +302,26 @@

 \np The {\tt upc\_addrfield} function returns an
    implementation-defined value reflecting the ``local address'' of the
-   object pointed to by the pointer-to-shared argument.\footnote{%
-   This function is used in defining the semantics of pointer-to-shared
-   arithmetic in Section \ref{pointer-arithmetic}}
+   object pointed to by the pointer-to-shared argument.

-   
+\np Given the following declarations:
+
+\begin{verbatim}
+    T *P1, *P2;    /* T is not a shared type */
+    shared T *S1, *S2;  
+
+    P1 = (T*) S1;  /* allowed if upc_threadof(S1) == MYTHREAD */ 
+    P2 = (T*) S2;  /* allowed if upc_threadof(S2) == MYTHREAD */ 
+\end{verbatim}
+
+   For all S1 and S2 that point to two distinct elements of
+   the same shared array object which have affinity to the same
+   thread, the expression:\\
+    {\tt ((ptrdiff\_t) upc\_addrfield(S2) - (ptrdiff\_t)upc\_addrfield(S1))} \\
+   shall evaluate to the same value as: {\tt ((P2 - P1) * sizeof(T))}.
+
+\xchangenote[id=DB]{106}{Paragraph moved from 6.4.2 and cross-reference footnote removed.}
+
 \paragraph{The {\tt upc\_affinitysize} function}

 {\bf Synopsis}

Reported by danbonachea on 2013-03-15 11:23:29

2013-03-15T11:23:29+00:00

Former user Account Deleted

I like the suggested re-formulation in Dan's Comment #85.

A couple of editorial suggestions.

1. In the example, where the comment states: "T is not a shared type", I recommend
that it be written as "T is not a shared qualified type", or "T is not a UPC shared
qualified type".  I recommend a similar improvement for other pending proposals where
the phrase "shared type" is used.  The reason that I believe that this is an improvement
is that "shared type" is rather generic sounding and might be used in contexts that
are not UPC-related.

2. In the added text "two distinct elements of the same shared array object", I don't
know if my suggestion made above would also apply, so will offer my suggestion as a
question: Would re-stating this as "two distinct elements of the same shared  qualified
array object" improve the precision of the statement?  BTW, in some of the documentation
that we/Intrepid write, we will often say "UPC shared type" and so on to help disambiguate,
but that usage is likely a departure from the style of the current UPC specification.

3. In the replacement text, would the phrase "subobjects" be clear as two words "sub
objects" or a hyphenated word "sub-objects"?

Reported by gary.funck on 2013-03-15 16:14:55

2013-03-15T16:14:55+00:00

Former user Account Deleted

> I like the suggested re-formulation in Dan's Comment #85.
>

Me too.  It looks like it narrows things down enough to not run afoul of any more nasty
corner cases.

> A couple of editorial suggestions.
>
> 1. In the example, where the comment states: "T is not a shared type", I recommend
that it be written as "T is not a shared qualified type", or "T is not a UPC shared
qualified type".  I recommend a similar improvement for other pending proposals where
the phrase "shared type" is used.  The reason that I believe that this is an improvement
is that "shared type" is rather generic sounding and might be used in contexts that
are not UPC-related.

Shared type is (with the change for issue 3) explicitly defined in section 3 (Terms,
definitions, and symbols), so I think we're ok here.  Note that "ultimate element type"
and a number of other things included in this change come from the change for issue
3.  I believe Dan mentioned somewhere that he did this to alleviate merge problems.

> 2. In the added text "two distinct elements of the same shared array object", I don't
> know if my suggestion made above would also apply, so will offer my suggestion as a
> question: Would re-stating this as "two distinct elements of the same shared qualified
> array object" improve the precision of the statement?  BTW, in some of the documentation
> that we/Intrepid write, we will often say "UPC shared type" and so on to help disambiguate,
> but that usage is likely a departure from the style of the current UPC specification.

No, it is not possible for an array object to be shared qualified--only non-array objects
may be shared qualified.  See issue 3 for details.

Reported by sdvormwa@cray.com on 2013-03-15 16:33:45

2013-03-15T16:33:45+00:00

Former user Account Deleted

> A couple of editorial suggestions.

I should have clarified that the text proposed in comment #85 is heavily reliant upon
terms added to the definitions section by the issue 3 proposal, specifically "shared
type", "shared array" and "ultimate element type". If for some reason we revert the
issue #3 changes, this would need to be reworded, but I think it works well if both
proposals are taken together.

> In the replacement text, would the phrase "subobjects" be clear as two words "sub
> objects" or a hyphenated word "sub-objects"?

In preparing this proposal, I spent several hours studying the C99 spec to find the
best possible wording for exactly that concept. While not explicitly defined, C99 uses
the term "subobject" (with that spelling) in several places to mean what we need. E.g.,
C99 6.7.8:

  Each brace-enclosed initializer list has an associated current object. When no
  designations are present, subobjects of the current object are initialized in order
  according to the type of the current object: array elements in increasing subscript
  order, ...

So I believe this is correct C99 usage.
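
As a minimal sketch of that usage in plain C99 (not part of the proposal text; the
struct and function names here are hypothetical and chosen only for illustration), the
array elements and struct members below are the subobjects that get initialized in order:

```c
/* Hypothetical type, used only to illustrate the C99 term "subobject". */
struct point { int x; int y; };

/* Returns a number whose decimal digits record which initializer value
   landed in which subobject. */
int subobject_order(void) {
    /* pts has two array-element subobjects, and each element has two
       member subobjects. With no designators, they are initialized in
       increasing subscript order, then member declaration order:
       pts[0].x, pts[0].y, pts[1].x, pts[1].y. */
    struct point pts[2] = { { 1, 2 }, { 3, 4 } };
    return pts[0].x + 10 * pts[0].y + 100 * pts[1].x + 1000 * pts[1].y;
}
```

Here each inner brace-enclosed list has one element of pts as its current object, and
the x and y members are that object's subobjects.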

Reported by danbonachea on 2013-03-15 17:38:09

2013-03-15T17:38:09+00:00

Former user Account Deleted

In the 3/15/13 telecon, we reached consensus that this issue should be addressed
in spec 1.3.
The updated proposal from comment 85 has been mailed to the list.

Reported by danbonachea on 2013-03-16 01:13:27

2013-03-16T01:13:27+00:00

Former user Account Deleted

Comment #85 proposal committed as SVN r213

Reported by danbonachea on 2013-04-30 18:47:08 - Status changed: Fixed

2013-04-30T18:47:08+00:00

Former user Account Deleted

Ratified in the 5/22 telecon.

Reported by danbonachea on 2013-08-03 03:55:37 - Status changed: Ratified

2013-08-03T03:55:37+00:00

Comments (91)