upc_forall semantics: upc affinity expression and side-effects

Issue #47 new
Former user created an issue

Originally reported on Google Code with ID 47

There are several issues with the upc_forall specification.

1) Nested upc_forall semantics. 
===============================

The current specification calls for affinity statementes to be ignored, based on whether
the upc_forall loop currently being executed is "outermost" or not. Since upc_forall
loop nesting is completely dynamic, driven by control flow and function calls, there
is no real fool proof compiler analysis that can tell whether a forall loop's affinity
statement should be ignored or not.

One proposed solution in the original issue is to make the behavior of nested upc_forall()
undefined.

2) Side effects.

Is the specification clear on the semantics of side-effects in the affinity expression
of upc_forall?  For example,
upc_forall(i=0, k = 0; i< 8; i++ ; k++ ) {}

See: https://upc-bugs.lbl.gov/bugzilla/show_bug.cgi?id=2956

3) Ease of compilation

Can upc_forall() be defined in a way that full analysis of the loop body (and everything
it calls) is not necessary to make guarantees about the lack of side effects that might
affect the variables mentioned in the loop initialization and conditions? This would
potentially make it easier to generate efficient code without full analysis, and without
much loss in utility.

One example I can think of is combining issues 1) and 2), where the side effect of
a potentially inner forall loop's affinity cannot be predicted.

Reported by ga10502 on 2012-05-23 13:33:28

Comments (10)

  1. Former user Account Deleted

    ``` Part 1) Nested upc_forall

    I dislike the "undefined" option if only because this means that library writers cannot use upc_forall() unless they can be certain their code is not called from a upc_forall. In the current situation that is NOT true, as one might think, but the library writer must be VERY careful how they work with the current specification.

    Perhaps a way to query the dynamic nesting at runtime? That would let library writers know (regardless of current semantic or the "undefined" option) that they cannot use upc_forall() "normally" (current semantic) or cannot use it at all (the "undefined" semantic)

    Part 2) Side effects.

    Perhaps somebody has an opinion on this. I've not heard one yet. According to the user filing the BUPC bug report referenced in comment #0, the Cray and GUPC compilers showed they were evaluating the expression WITH side-effects. So, we "fixed" BUPC to match. [Except that when we try to optimize the affinity test away we can loose the side-effect again!]

    3) Ease of compilation

    I have no strong opinions, but am mildly in favor of the idea of outlawing side-effects in the affinity expression - they are just weird.

    ```

    Reported by `phhargrove@lbl.gov` on 2012-06-01 04:03:41 - Labels added: Spec-1.3

  2. Former user Account Deleted

    Reported by `phhargrove@lbl.gov` on 2012-06-01 06:06:40 - Labels added: Milestone-Spec-1.3 - Labels removed: Spec-1.3

  3. Former user Account Deleted

    ``` Set default Consensus to "Low". ```

    Reported by `gary.funck` on 2012-08-19 23:26:19 - Labels added: Consensus-Low

  4. Former user Account Deleted

    ``` Change Status to New: Requires review. ```

    Reported by `gary.funck` on 2012-08-19 23:37:41 - Status changed: `New`

  5. Former user Account Deleted

    ``` Decided during phone call on 9/21 to separate issues. The upc_forall nesting issue has now become issue 88 - please look it up there.

    The current issue is henceforth dealing only with the problem of side effects in the upc_forall affinity statements. The current proposal is to amend the specification to say the following:

    • UPC semantics do not guarantee actual execution of the affinity statement in every iteration.*

    As a consequence -

    • the behavior of what are traditionally called "side effects" is undefined in a upc_forall loop, since the affinity expression may not execute in every iteration (or any iteration!)

    Example: for (... k=0 ... ; ... ; ... ; k++) <------ k may evaluate to 0 in every iteration.

    *

    It also occurred to me that we should consider another situation - that is, when the loop body changes a variable that is used in the affinity expression.

    Example: shared int k=0; for (... ; ... ; ... ; k ) { ... k++; }

    This is obviously just as nonsensical as the original problem, and I propose we treat it in a similar manner.

    ```

    Reported by `ga10502` on 2012-09-21 21:30:45 - Labels added: Consensus-High - Labels removed: Consensus-Low

  6. Former user Account Deleted

    ``` George wrote:

    It also occurred to me that we should consider another situation - that is, when

    the loop body changes

    a variable that is used in the affinity expression.

    Example: shared int k=0; for (... ; ... ; ... ; k ) { ... k++; }

    This is obviously just as nonsensical as the original problem, and I propose we treat

    it in a similar manner.

    I agree that this is nonsensical and that we should treat it similarly.

    I think, off the top of my head, that if written carefully this can be covered by the same language that states the affinity expression is not guaranteed to be evaluated on every (or any) iteration. ```

    Reported by `phhargrove@lbl.gov` on 2012-09-21 21:40:17

  7. Former user Account Deleted

    ``` Upon further study, I've changed my mind on this one.

    I think we need to step back a moment and carefully re-read the 1.2 spec for upc_forall semantics (assuming a non-nested loop):

    ------------------------------------------ p5: upc_forall is a collective operation in which, for each execution of the loop body, the controlling expression and affinity expression are single-valued.[27] [27]Note that single-valued implies that all thread agree on the total number of iterations, their sequence, and which threads execute each iteration. ... p11: Every thread evaluates the first three clauses of a upc_forall statement in accordance with the semantics of the corresponding clauses for the for statement, as defined in [ISO/IEC00 Sec. 6.8.5.3]. Every thread evaluates the fourth clause of every iteration. ------------------------------------------

    Taken together, these paragraphs already provide some very strong requirements which are relevant to the current discussion.

    For starters, to clarify comment 5, UPC 1.2 semantics (last sentence of p11) currently DO guarantee actual execution of the affinity statement in every iteration. Consequently, any optimizations performed must still provide the appearance that this occurred. In the (very?) common case where the affinity expression has no side-effects, the app cannot tell the difference and you're free to go nuts with loop optimizations. However in the rare case where the affinity expression contains side-effects, 1.2 requires those to be preserved - so if a compiler drops those on the ground I believe that currently constitutes a non-compliance.

    Next, considering George's example:

    Example: shared int k=0; upc_forall (... ; ... ; ... ; k ) { ... k++; }

    This is already erroneous because it violates the requirement in p5 that the affinity expression must be single-valued for each execution of the loop body (it will be violated after the first loop body executes). This is true even if k was a local variable, because the loop body is not executed by every thread on every iteration. Nothing further needs to be said for this to be considered undefined behavior.

    So let us return to the original example, where k is a local variable that is only updated in the affinity expression:

    Example: upc_forall (int k=0 ... ; ... ; ... ; k++) { ... }

    I agree this is a "weird" thing to write, but given p5 and p11 above, it should have completely deterministic behavior - it satisfies the single-valued requirements of the affinity expression, and all threads should agree upon which thread executes each loop body. The optimizer does not need to worry about side-effects upon k within the body of the loop, because as demonstrated above the single-valued requirement ensures that would imply undefined behavior.

    Upon further consideration I don't see a strong motivation why this case should be specifically prohibited. I understand the optimizer will need to "prove" the affinity expression is side-effect free (or at least understand the linear progression) before applying loop optimizations - however this seems like a straightforward serial analysis that should provide accurate results for all the common cases of interest (ie no side effects). This analysis certainly seems no more difficult than the similar analysis which must already be performed to handle the loop induction variable. In the rare case of a perverse or obfuscated affinity expression, the optimizer must fall back to generating conservative code, as it already would if the test and increment clauses on the loop induction variable defeat analysis.

    Unless someone can present a stronger argument for a spec change, I think the semantics should be left as-is. If people feel the current semantics should be clarified, we could expand example 2 to include a "foo" for the affinity clause to demonstrate behavior.. ```

    Reported by `danbonachea` on 2012-09-22 23:11:50 - Labels added: Consensus-Low - Labels removed: Consensus-High

  8. Former user Account Deleted

    ``` Agreed on the second test case, and agreed on the main lines of Dan's reasoning.

    However, to argue that the standard - as is - misses the point IMHO. As the writers of the standard we have an opportunity to guide people towards certain ways of writing code - we should *discourage* weird things like expressions with side effects in the affinity expression.

    I would still like to see some text in the standard to discourage affinity expressions with side effects.

    ```

    Reported by `ga10502` on 2012-09-26 19:24:28

  9. Former user Account Deleted
    At the 11/29 telecon it was decided this should be closed with NoChange.
    

    Reported by danbonachea on 2013-01-17 00:51:01 - Status changed: NoChange

  10. Log in to comment