clarification: overlapping memory copies undefined by presence of "restrict" keyword

Issue #50 new
Former user created an issue

Originally reported on Google Code with ID 50

I user recently asked:
   Can src and dest be overlapped in upc_all_broadcast?

Their [corrected] example:
   shared int a[THREADS];
   ...
   upc_all_broadcast(a, &a[1], sizeof(int), flags);

I couldn't find anything in the spec clearly allowing or disallowing this.  There is
an obvious analogous question for (at least) scatter and gather.

I have no specific proposal, but really see only 2 alternatives:

1) State that this is OK (for certain collectives, TBD)
2) State that this has undefined behavior.

Regardless of what we decide on "in place", we should probably include language to
make other cases of "overlapping" src and dest have undefined behavior.

Reported by phhargrove@lbl.gov on 2012-06-11 20:19:46

Comments (14)

  1. Former user Account Deleted

    ``` The src and dest parameters of upc_all_broadcast are declared restrict. Scatter and gather are the same way. ```

    Reported by `johnson.troy.a` on 2012-06-11 20:23:27

  2. Former user Account Deleted

    ``` Thank you, Troy, I had missed that.

    I am closing the issue as WontFix to indicate that no spec change is required. ```

    Reported by `phhargrove@lbl.gov` on 2012-06-11 20:30:29 - Status changed: `WontFix`

  3. Former user Account Deleted

    ``` Perhaps some additional language in the spec. would be worthwhile, given that the use of restrict only implicitly states the constraint.

    ```

    Reported by `gary.funck` on 2012-06-11 20:42:41

  4. Former user Account Deleted

    ``` There IS precedent (show below) from C99 for including both "restrict" and language explicitly making overlap undefined (though I'd be willing to bet the language predates the existence of "restrict"). I've "reopened" the issue so we can discuss Gary's suggestion.

    7.21.2.1 The memcpy function

    Synopsis

    1 #include <string.h> void *memcpy(void * restrict s1, const void * restrict s2, size_t n);

    Description

    2 The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined. ```

    Reported by `phhargrove@lbl.gov` on 2012-06-11 20:49:40 - Status changed: `Accepted`

  5. Former user Account Deleted

    ``` Update summary to reflect nature of current proposal ```

    Reported by `phhargrove@lbl.gov` on 2012-06-14 23:11:06

  6. Former user Account Deleted

    Reported by `gary.funck` on 2012-07-03 15:10:26 - Labels added: Type-Clarification - Labels removed: Type-Defect

  7. Former user Account Deleted

    ``` Set default Consensus to "Low". ```

    Reported by `gary.funck` on 2012-08-19 23:26:19 - Labels added: Consensus-Low

  8. Former user Account Deleted

    ``` The status of this item is that "restrict" implicitly prohibits "in place" collective operations, but a clarification is possible to EXplicitly prohibit this practice.

    Changing status to New to get this back "on the radar" for discussion. ```

    Reported by `phhargrove@lbl.gov` on 2012-09-26 18:45:34 - Status changed: `New`

  9. Former user Account Deleted

    ``` Notes from the 2012.09.26 spec wg call:

    + C99 provides a clarification for the memory copying routines which were examined while on the call (at least memcpy, strcpy, strncpy), even though they have the restrict keyword. So, we can/should do so as well.

    + We'll clarify in the Description using the same language used by C99: "If copying takes place between objects that overlap, the behavior is undefined."

    + We should apply this to BOTH the collectives AND to the upc_mem{copy,get,put} functions. [hence the updated Summary]

    + The idea of putting the clarification once per section was deemed less useful than putting it with each function to which it applies (users looking up the function won't read the section intro).

    + The plan for the async memcpy interfaces is to define their semantics in terms of the blocking ones - and thus the clarification WON'T be repeated there.

    Moving to Consensus-High ```

    Reported by `phhargrove@lbl.gov` on 2012-09-26 21:13:36 - Labels added: Consensus-High - Labels removed: Consensus-Low

  10. Former user Account Deleted

    ``` There's plenty of use-cases for in-place collectives to save memory, but perhaps this is not important to UPC codes. Is it worth considering the addition of (some) collectives that would be amenable to in-place? I think the reductions are the most important usage and might not be necessary to define them for data-movement-only collectives (e.g. scatter).

    Unfortunately, because UPC is C, one cannot overload a collective symbol, but if we could use symbol overloading, it would be trivial to define the in-place functions by eliminate one of src or dst from the call syntax. ```

    Reported by `jeff.science` on 2012-10-05 11:45:32

  11. Former user Account Deleted

    ``` "There's plenty of use-cases for in-place collectives to save memory, but perhaps this is not important to UPC codes. Is it worth considering the addition of (some) collectives that would be amenable to in-place? "

    I believe in-place operation is one of the features being considered for the next version of the collectives library (for 1.4). This issue is concerned with clarifying the current/previous specification which already includes "restrict" to prohibit overlap. ```

    Reported by `danbonachea` on 2012-10-05 17:58:06

  12. Former user Account Deleted

    ``` Is there already an issue number for in-place collective or should I create one? ```

    Reported by `jeff.science` on 2012-10-06 02:47:31

  13. Former user Account Deleted
    Proposal mailed 11/16/13:
    
    --- lang/upc-lib-core.tex       (revision 199)
    +++ lang/upc-lib-core.tex       (working copy)
    @@ -617,6 +617,10 @@
        shared array object with this type (the {\tt src} array) to another shared
        array object with this type (the {\tt dst} array).
    
    +\xadded[id=DB]{50}{
    +\np If copying takes place between objects that overlap, the behavior is undefined.
    +}
    +
     \paragraph{The {\tt upc\_memget} function}
    
     {\bf Synopsis}
    @@ -649,6 +653,10 @@
         char[n]
     \end{verbatim}
    
    +\xadded[id=DB]{50}{
    +\np If copying takes place between objects that overlap, the behavior is undefined.
    +}
    +
     \paragraph{The {\tt upc\_memput} function}
    
     {\bf Synopsis}
    @@ -681,6 +689,10 @@
         shared [] char[n]
     \end{verbatim}
    
    +\xadded[id=DB]{50}{
    +\np If copying takes place between objects that overlap, the behavior is undefined.
    +}
    +
     \paragraph{The {\tt upc\_memset} function}
    
     {\bf Synopsis}
    --- lib/req/upc-lib-collectives.tex     (revision 199)
    +++ lib/req/upc-lib-collectives.tex     (working copy)
    @@ -82,6 +82,10 @@
    
     \np The {\tt dst} pointer is treated as if it has phase 0.
    
    +\xadded[id=DB]{50}{
    +\np If copying takes place between objects that overlap, the behavior is undefined.
    +}
    +
     \np EXAMPLE 1 shows {\tt upc\_all\_broadcast}
     \begin{verbatim}
       #include <upc_collective.h>
    @@ -167,6 +171,10 @@
     the block of {\tt nbytes} bytes
     pointed to by {\tt dst} that has affinity to thread $i$.
    
    +\xadded[id=DB]{50}{
    +\np If copying takes place between objects that overlap, the behavior is undefined.
    +}
    +
     \np EXAMPLE 1 {\tt upc\_all\_scatter} for the {dynamic THREADS} translation
     environment.
     %This example corresponds to Figure \ref{fig2}.
    @@ -253,6 +261,10 @@
     pointed to by {\tt src} that has affinity to thread $i$
     to the $i$th block of {\tt nbytes} bytes pointed to by {\tt dst}.
    
    +\xadded[id=DB]{50}{
    +\np If copying takes place between objects that overlap, the behavior is undefined.
    +}
    +
     \np EXAMPLE 1 {\tt upc\_all\_gather} for the {\em static THREADS}
     translation environment.
    
    @@ -334,6 +346,10 @@
     $i$th block of {\tt nbytes} bytes pointed to by {\tt dst} that
     has affinity to each thread.
    
    +\xadded[id=DB]{50}{
    +\np If copying takes place between objects that overlap, the behavior is undefined.
    +}
    +
     \np EXAMPLE 1 {\tt upc\_all\_gather\_all} for the {\em static THREADS}
     translation environment.
    
    @@ -415,6 +431,10 @@
     the $j$th block of {\tt nbytes} bytes that has affinity to thread $i$
     pointed to by {\tt dst}.
    
    +\xadded[id=DB]{50}{
    +\np If copying takes place between objects that overlap, the behavior is undefined.
    +}
    +
     \np EXAMPLE 1 {\tt upc\_all\_exchange} for the {\em static THREADS}
     translation environment.
    @@ -503,6 +523,11 @@
     to the block of {\tt nbytes} bytes
     that has affinity to thread {\tt perm}[$i$] pointed to by {\tt dst}.
    
    +\xadded[id=DB]{50}{
    +\np If any of the elements referenced by {\tt dst} overlap any of the 
    +    elements referenced by {\tt src} or {\tt perm}, the behavior is undefined.
    +}
    +
     \np EXAMPLE 1 {\tt upc\_all\_permute}.
     \begin{verbatim}
       #include <upc_collective.h>
    @@ -668,6 +699,10 @@
         shared TYPE *
     \end{verbatim}
    
    +\xadded[id=DB]{50}{
    +\np If any of the elements referenced by {\tt src} and {\tt dst} overlap, the behavior
    is undefined.
    +}
    +
     \np EXAMPLE 1 {\tt upc\_all\_reduce} of type {\tt long UPC\_ADD}.
     \begin{verbatim}
       #include <upc_collective.h>
    

    Reported by danbonachea on 2013-01-17 03:52:44 - Status changed: PendingApproval

  14. Former user Account Deleted
    Ratified at the 2/11/13 telecon and committed as SVN 204
    

    Reported by danbonachea on 2013-02-11 22:56:52 - Status changed: Ratified

  15. Log in to comment