Define UPC implementation limits and required minimum values for same

Issue #21 new
Former user created an issue

Originally reported on Google Code with ID 21

This proposal is responsive to specification issue 6.

6. In order to improve interoperability, should suggested minimums for max. block size,
number of threads, and range of virtaddr be provided?

The C99 specification includes an Annex E ("Implementation Limits")
which defines various limits, for example:

"The contents of the header <limits.h> are given below, in alphabetical order. The
minimum magnitudes shown shall be replaced by implementation-defined magnitudes with
the same sign. The values shall all be constant expressions suitable for use in #if
preprocessing directives."

This proposal recommends that various UPC implementation limits be defined via pre-defined
macros and that the lower bound for each of those values be specified.  Suggested names and
values follow.

#define UPC_PHASE_MAX 1048575
#define UPC_THREAD_MAX 4095
#define UPC_VIRTADDR_MAX 4294967295

Arguably, it may be useful to also express these size values in base 2, as the number
of bits required to encode fields containing the relevant value.  Additional pre-defined
names should be considered for inclusion.
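
One way to read the base-2 suggestion is as a set of bit-width macros from which the maxima follow. The sketch below is illustrative only: `UPC_THREAD_BITS`, `UPC_PHASE_BITS`, `UPC_ADDRFIELD_BITS`, and `UPC_FIELD_MAX` are hypothetical names that do not appear in the proposal, and the widths are simply those implied by the three values listed above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit-width macros; these names are NOT part of the proposal. */
#define UPC_THREAD_BITS    12
#define UPC_PHASE_BITS     20
#define UPC_ADDRFIELD_BITS 32

/* Largest value representable in an unsigned field of `bits` bits (1..64). */
#define UPC_FIELD_MAX(bits) (UINT64_MAX >> (64 - (bits)))

/* Confirms that the proposed maxima are exactly the all-ones values
 * of fields with the widths assumed above. */
static void check_limits_match_bit_widths(void)
{
    assert(UPC_FIELD_MAX(UPC_THREAD_BITS)    == 4095u);       /* UPC_THREAD_MAX   */
    assert(UPC_FIELD_MAX(UPC_PHASE_BITS)     == 1048575u);    /* UPC_PHASE_MAX    */
    assert(UPC_FIELD_MAX(UPC_ADDRFIELD_BITS) == 4294967295u); /* UPC_VIRTADDR_MAX */
}
```

Whether the specification would define the widths, the maxima, or both is left open by the proposal; the sketch only shows that either form can be derived from the other.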

The names will be defined in the core library section of the specification.  Although
by analogy these values might be defined in a #include file of the name upc_limits.h,
defining them in upc.h is also a likely possibility.

Note that UPC_PHASE_MAX can likely also be expressed as UPC_MAX_BLOCKSIZE, and that
UPC_MAX_BLOCKSIZE should also be listed in this newly defined "Implementation Limits"
annex (appendix).

Are there other relevant and useful UPC-specific implementation limits?

Reported by gary.funck on 2012-05-21 20:32:11

Comments (19)

  1. Former user Account Deleted

    ``` I am in favor of defining an appropriate set of pre-defined macros to convey the implementation's limits and to specify minimum values to be allowed in a conforming implementation. However, I am not convinced this proposal has (yet) the right macros or values.

    I would favor putting them in upc_limits.h, AND requiring that this header be valid for inclusion by a C99 compiler. The objective here being to support separate compilation of code that must interoperate. The "legacy" macro UPC_MAX_BLOCK_SIZE could be defined in both upc.h and upc_limits.h with an appropriate "#ifndef".

    I do think UPC_PHASE_MAX is redundant unless somebody can offer a valid scenario in which it could differ from UPC_MAX_BLOCK_SIZE (note spelling has THREE underscores).

    I would propose that "VIRTADDR" become "ADDRFIELD" if it is intended to reflect the range of values from upc_addrfield(). If that is NOT the intent, then please provide the intended definition.

    On the subject of VIRTADDR/ADDRFIELD I assume the intent is to convey the limits of the PTS representation, right? The fact that I might have far less memory in the node(s) wouldn't be known at compile time.

    I am inclined to take issue with 4095 as a lower bound on the value of UPC_THREAD_MAX, in part because the default in Berkeley UPC is currently 1023. However, even if BUPC raises its default to comply, I can see a single-node (no network) implementation using an 8-bit field for the thread number in a PTS. So, I would say 255 is a reasonable limit in some implementations.

    Are there other relevant and useful UPC-specific implementation limits?

    While I don't know of any implementation that has such a limit, I could imagine a limit on the number of UPC locks that could be created.

    The UPC I/O extensions define a upc_off_t but seem to be lacking a UPC_OFF_MAX. HOWEVER, the status of I/O as optional would seem to place that outside the scope of this issue. I'll add another issue for that.

    ```

    Reported by `phhargrove@lbl.gov` on 2012-05-22 03:02:18
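
A minimal sketch of the `#ifndef` arrangement suggested in this comment, written as plain C99 so that it could be included without a UPC compiler. The header name is the one discussed in the thread, but the include-guard name, the fallback value 1048576, and the helper functions are assumptions made for illustration, not values taken from any implementation.

```c
/* upc_limits.h (hypothetical contents, for illustration only) */
#ifndef UPC_LIMITS_H
#define UPC_LIMITS_H

/* A UPC compiler may already predefine this macro; supply a fallback
 * only when it is absent, e.g. when a plain C compiler is used.
 * The value 1048576 is a placeholder, not a proposed minimum. */
#ifndef UPC_MAX_BLOCK_SIZE
#define UPC_MAX_BLOCK_SIZE 1048576
#endif

/* The limit is usable in #if directives, as C99 requires of <limits.h>. */
#if UPC_MAX_BLOCK_SIZE < 1
#error "UPC_MAX_BLOCK_SIZE must be at least 1"
#endif

#endif /* UPC_LIMITS_H */

/* A plain C consumer of the header contents (hypothetical helpers). */
#include <assert.h>

static unsigned long max_block_size(void)
{
    return UPC_MAX_BLOCK_SIZE;
}

static void check_max_block_size(void)
{
    assert(max_block_size() >= 1);
}
```

The same guard placed in upc.h would let both headers be included in either order without a redefinition warning.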

  2. Former user Account Deleted

    ``` Re: Paul's points in the previous reply.

    I agree that this proposal does not yet name the right macros, nor necessarily their values.

    Agreed on upc_limits.h, and the need to define the macros and other content in a way that the header file can be safely included by a "C" program.

    Regarding inclusion by a "C" program, at present GUPC defines UPC_MAX_BLOCK_SIZE as a compiler pre-defined macro. The regular "C" compiler will not have this capability. Thus, for this particular value, upc_limits.h will need to be built and customized based upon the target configuration. It can use the #ifndef test noted in the reply.

    UPC_PHASE_MAX is potentially not completely redundant, in that its value is likely (UPC_MAX_BLOCK_SIZE - 1). In UPC, the block size can vary between 0 and UPC_MAX_BLOCK_SIZE, but the phase varies between 0 and UPC_MAX_BLOCK_SIZE - 1.

    The reason for the proposed naming of UPC_SOMETHING_MAX is that many of the other values in limits.h use the SOMETHING_MAX naming scheme.

    Agreed that UPC_ADDRFIELD_MAX is a better name than UPC_VIRTADDR_MAX. Also, the intent was to define the full range of the address field in a PTS representation.

    The reasoning behind the 12/20/32 split, is that 4096 threads, 1 million elements in a block, and 32 bits in the address field fit conveniently in 64 bits, which is likely a natural "packed PTS" size on a 32-bit target.

    Agreed that UPC locks may need to be limited in some contemplated implementations, or more accurately the number of locks that are currently held by a given thread.

    Regarding UPC_OFF_MAX (separate proposal), might UPC_IO_OFFSET_MAX or something to that effect be more descriptive?

    ```

    Reported by `gary.funck` on 2012-05-22 14:55:31
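
The 12/20/32 split described in this comment can be sketched as a 64-bit packed pointer-to-shared. The type name, the function names, and the field order (thread in the high bits, then phase, then addrfield) are assumptions for illustration; the comment only gives the field widths, and real implementations may lay the fields out differently.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths from the 12/20/32 split; the names are hypothetical. */
enum { THREAD_BITS = 12, PHASE_BITS = 20, ADDR_BITS = 32 };

typedef uint64_t packed_pts_t; /* hypothetical type name */

/* Assumed layout, high bits to low bits: thread | phase | addrfield. */
static packed_pts_t pts_pack(uint32_t thread, uint32_t phase, uint32_t addr)
{
    assert(thread < (1u << THREAD_BITS)); /* at most 4095    */
    assert(phase  < (1u << PHASE_BITS));  /* at most 1048575 */
    return ((uint64_t)thread << (PHASE_BITS + ADDR_BITS))
         | ((uint64_t)phase  << ADDR_BITS)
         | (uint64_t)addr;
}

static uint32_t pts_thread(packed_pts_t p)
{
    return (uint32_t)(p >> (PHASE_BITS + ADDR_BITS));
}

static uint32_t pts_phase(packed_pts_t p)
{
    return (uint32_t)((p >> ADDR_BITS) & ((1u << PHASE_BITS) - 1u));
}

static uint32_t pts_addr(packed_pts_t p)
{
    return (uint32_t)(p & 0xFFFFFFFFu);
}
```

Packing the three proposed maxima (4095, 1048575, 4294967295) sets every bit of the 64-bit word, which is the sense in which the split fits "conveniently" in 64 bits.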

  3. Former user Account Deleted

    ``` Gary wrote:

    "Agreed that UPC locks may need to be limited in some contemplated implementations, or more accurately the number of locks that are currently held by a given thread."

    The issue of the number HELD is actually not the one I had in mind, though it is equally valid. I was thinking of the number allocated-but-not-yet-freed, and was motivated by the fact that the opaque nature of upc_lock_t allows them to be used to encode a "handle" such as an index into a table. Said table might have finite size.

    Gary also wrote:

    "Regarding UPC_OFF_MAX (separate proposal), might UPC_IO_OFFSET_MAX or something to that effect be more descriptive?"

    They called the type upc_off_t, not upc_io_offset_t. So, I figured I'd follow the principle of least surprise.

    ```

    Reported by `phhargrove@lbl.gov` on 2012-06-01 03:52:05 - Labels added: Spec-1.3

  4. Former user Account Deleted

    Reported by `phhargrove@lbl.gov` on 2012-06-01 06:08:04 - Labels added: Milestone-Spec-1.3 - Labels removed: Spec-1.3

  5. Former user Account Deleted

    ``` Set default Consensus to "Low". ```

    Reported by `gary.funck` on 2012-08-19 23:26:19 - Labels added: Consensus-Low

  6. Former user Account Deleted

    ``` Change Status to New: Requires review. ```

    Reported by `gary.funck` on 2012-08-19 23:37:41 - Status changed: `New`

  7. Former user Account Deleted

    ``` I'll handle this one.

    Some thoughts on the topic:

    For 1.3, I agree with:
    - the SOMETHING_MAX format for consistency;
    - keeping ADDRFIELD consistent with addrfield;
    - UPC_OFF_MAX is more consistent with upc_off_t;
    - upc_limits.h;
    - suggested values in the specification.

    Gary's observation about some of the constants being concocted by the compiler is a good one. HP UPC did at one time have a compile-time option to change the pointer size, affecting block size and number of threads. I can see the value in having these limits available to C compilers, though, so perhaps providing some means of capturing these values, perhaps by defining a macro to allow a C compiler to pick up the correct set, should be required, without specifying what that mechanism should be exactly.

    I don't understand the need for constraints on the definitions; perhaps somebody else could elaborate on that.

    The following items I think are worth discussing individually and tagging for a future version of the specification:

    - moving UPC_MAX_something to UPC_something_MAX;
    - changing the name of component/function/constant addrfield to something else;
    - changing upc_off_t and UPC_OFF_MAX to upc_io_off_t and UPC_IO_OFF_MAX. ```

    Reported by `brian.wibecan` on 2012-09-17 16:28:24 - Status changed: `Accepted`

  8. Former user Account Deleted

    ``` Leaving status as "New", pending discussion. ```

    Reported by `gary.funck` on 2012-09-17 17:27:19 - Status changed: `New`

  9. Former user Account Deleted

    ```

    "The reasoning behind the 12/20/32 split, is that 4096 threads, 1 million elements in a block, and 32 bits in the address field fit conveniently in 64 bits, which is likely a natural "packed PTS" size on a 32-bit target."

    This is fine as an implementation strategy, but should NOT be used to define the "minimal max" values that we require for spec compliance. The specified "minimal max" value should be the SMALLEST value that we believe is meaningful and usable. Implementations can and will define their max limits to be LARGER than the minimal requirement, to match their implementation decisions.

    As Paul mentions, on a pure SMP implementation of UPC (e.g., targeting your laptop) a max thread value of 255 is sufficient, and even overkill. I think the smallest meaningful max thread limit is "1" - some compilers may support a pure sequential compilation mode where there is only one thread. Similarly, the ADDRFIELD max should not assume a 64-bit implementation with 32 bits reserved for shared address. On a 32-bit implementation with a 500MB/thread shared heap, you can get by with a 29-bit ADDRFIELD (max set to 536870911). I think the smallest meaningful ADDRFIELD is probably equal to UPC_PHASE_MAX, which you are currently proposing to be 1048575.

    As a comparison, C99 defines FLT_MAX_10_EXP, DBL_MAX_10_EXP, LDBL_MAX_10_EXP to all be the same value (37) corresponding to a single-precision 32-bit IEEE 754 representation. The vast majority of implementations support 64 or even 128 bit FP representations and set these defines higher, but the minimal requirement allows compliance for C compilers targeting hardware that only provides single-precision FP instructions. ```

    Reported by `danbonachea` on 2012-09-17 17:33:07
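
The C99 analogy in this comment can be checked from `<float.h>`. This is a minimal sketch: the macro `C99_MIN_MAX_10_EXP` and the function name are hypothetical, while the value 37 is the minimum stated in the comment and required by C99.

```c
#include <assert.h>
#include <float.h>

/* The "minimal max" C99 requires for all three macros (hypothetical name). */
#define C99_MIN_MAX_10_EXP 37

/* These integer limits are usable in #if, so a violation of the
 * required minimum can be rejected at preprocessing time. */
#if FLT_MAX_10_EXP < C99_MIN_MAX_10_EXP
#error "FLT_MAX_10_EXP is below the C99 minimum"
#endif

/* An implementation is free to define larger values, and wider types
 * must cover at least the range of narrower ones. */
static void check_float_limits(void)
{
    assert(FLT_MAX_10_EXP  >= C99_MIN_MAX_10_EXP);
    assert(DBL_MAX_10_EXP  >= FLT_MAX_10_EXP);
    assert(LDBL_MAX_10_EXP >= DBL_MAX_10_EXP);
}
```

A required minimum for a UPC limit would work the same way: the specification states the floor, and each implementation's header reports its actual, possibly larger, value.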

  10. Former user Account Deleted

    ``` Do we need to define "minimal max" values at all? I'm not convinced the specification shouldn't be silent on that point. ```

    Reported by `brian.wibecan` on 2012-09-17 18:52:18

  11. Former user Account Deleted

    ``` "Do we need to define "minimal max" values at all? I'm not convinced the specification shouldn't be silent on that point."

    Without a minimal value on UPC_MAX_BLOCK_SIZE it's impossible to portably write a strictly-conforming portable UPC program that uses blocksize > 1 (some would argue that feature should be removed entirely, but until such time the point remains).

    The others seem lower priority. So I guess one possible resolution would be to simply state a minimum required UPC_MAX_BLOCK_SIZE value in the existing definition of that symbol.

    ```

    Reported by `danbonachea` on 2012-09-17 19:15:08

  12. Former user Account Deleted

    ``` "impossible to portably write a strictly-conforming portable UPC program that uses blocksize > 1"

    Ok "impossible" is too strong a word. I mean a program with hard-coded integer constant blocksizes like "10" or "100" that aren't expressions computed from UPC_MAX_BLOCK_SIZE, and that doesn't use a bunch of preprocessor checks against UPC_MAX_BLOCK_SIZE. ```

    Reported by `danbonachea` on 2012-09-17 19:21:15
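
For illustration, the kind of preprocessor check referred to here might look like the sketch below. It is written as plain C so that it compiles without a UPC compiler: the fallback definition of `UPC_MAX_BLOCK_SIZE` (64, an arbitrary stand-in), `DESIRED_BLOCK_SIZE`, `BLOCK_SIZE`, and the check function are all hypothetical, and the UPC declaration appears only in a comment.

```c
#include <assert.h>

/* Stand-in so the sketch compiles as plain C. A UPC compiler provides
 * UPC_MAX_BLOCK_SIZE itself; the value 64 here is arbitrary. */
#ifndef UPC_MAX_BLOCK_SIZE
#define UPC_MAX_BLOCK_SIZE 64
#endif

/* The block size the program would like to hard-code (hypothetical). */
#define DESIRED_BLOCK_SIZE 100

/* Without a guaranteed minimum, portable code has to fall back. */
#if UPC_MAX_BLOCK_SIZE >= DESIRED_BLOCK_SIZE
#define BLOCK_SIZE DESIRED_BLOCK_SIZE
#else
#define BLOCK_SIZE UPC_MAX_BLOCK_SIZE
#endif

/* In UPC source the result would be used as:
 *     shared [BLOCK_SIZE] int x[BLOCK_SIZE * THREADS];
 */

static void check_block_size(void)
{
    assert(BLOCK_SIZE >= 1);
    assert(BLOCK_SIZE <= DESIRED_BLOCK_SIZE);
    assert(BLOCK_SIZE <= UPC_MAX_BLOCK_SIZE);
}
```

With a specified minimum of at least 100 for `UPC_MAX_BLOCK_SIZE`, the whole conditional could be dropped and the constant written directly.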

  13. Former user Account Deleted

    ``` I agree w/ the sentiment that Brian and Dan have expressed that minimal max values are less important than providing the constants and header at all.

    I propose that in the interest of time we split this issue into:
    + "Define UPC implementation limits and header" for UPC 1.3
    + "Define required minimum values for UPC implementation limits" for UPC 1.4

    If others feel as Dan does about the need to define the minimal permitted UPC_MAX_BLOCK_SIZE for a conforming implementation, then I think his suggestion to define it with the corresponding text is sufficient for 1.3. ```

    Reported by `phhargrove@lbl.gov` on 2012-09-17 19:31:07

  14. Former user Account Deleted

    ``` Consolidating the suggested limits:

    UPC_ADDRFIELD_MAX
    UPC_PHASE_MAX (= UPC_BLOCK_SIZE_MAX - 1)
    UPC_THREAD_MAX
    UPC_BLOCK_SIZE_MAX = UPC_MAX_BLOCK_SIZE (define here for consistency)
    UPC_LOCK_MAX (= maximum locks allocated; perhaps UPC_LOCK_ALLOC_MAX?)
    UPC_LOCK_HELD_MAX

    Any others?

    If there is no defined limit for locks held/allocated, what's the desired definition?

    Let UPC_OFF_MAX be an issue for the I/O library and not part of this list.

    Will create a separate issue for minimum allowed value for the maximum block size. ```

    Reported by `brian.wibecan` on 2012-09-20 22:31:25

  15. Former user Account Deleted

    ``` For the case of "unlimited" (numbers of locks, for instance) I propose ZERO. That avoids the need to specify the types, as would be required if max-for-type values from limits.h were used. Additionally, it lets one make all the constants unsigned for maximum range (which use of -1 for "unlimited" would not allow). The use of an UNsigned value is particularly important for ADDRFIELD on a 32-bit platform if more than 2GB of shared heap is to be supported.

    LOCKS_MAX can be tricky. What if an implementation supports N-per-caller when using upc_global_lock_alloc(), and some value M-per-job when using upc_all_lock_alloc()? What if those are "competitive" rather than "additive"?

    Also, there is still a significant unresolved issue to discuss: Use of the header from C. Given that some UPC compilers allow the shared-pointer representation to be modified at compile time (with appropriate command-line arguments, for instance) it is unclear to me how one creates a header with ACCURATE values of UPC_{ADDRFIELD,PHASE,THREAD,BLOCK_SIZE}_MAX that can be used by a C compiler which doesn't grok the UPC-specific command line args. One approach is to use the most conservative values when !defined(UPC), but if that means UPC_THREAD_MAX=1 (for example) then it becomes useless. ```

    Reported by `phhargrove@lbl.gov` on 2012-09-20 22:56:31
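
A minimal sketch of the zero-means-unlimited convention proposed in this comment. The macro values and the helper names are assumptions for illustration; no implementation is known to define them this way.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical definitions: 0 encodes "no fixed limit". */
#define UPC_LOCK_MAX      0u
/* An unsigned constant covers a full 32-bit addrfield, which a
 * signed 32-bit value (maximum 2147483647) could not. */
#define UPC_ADDRFIELD_MAX 4294967295u

/* True if `count` is allowed under `limit`, where 0 means unlimited. */
static bool within_limit(unsigned long count, unsigned long limit)
{
    return limit == 0 || count <= limit;
}

static void check_limit_convention(void)
{
    assert(within_limit(1000000ul, UPC_LOCK_MAX)); /* unlimited */
    assert(within_limit(8ul, 8ul));                /* at the limit */
    assert(!within_limit(9ul, 8ul));               /* over the limit */
    assert(UPC_ADDRFIELD_MAX > 2147483647u);       /* exceeds signed range */
}
```

The cost of the convention is that every comparison against a limit needs the extra test for zero, which is why the helper wraps it.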

  16. Former user Account Deleted

    ``` If we reduce the contents of this hypothetical upc_limits.h file to values that likely have no practical use (such as defining the maximum number of locks to be 0) I wonder if this file is helpful in meeting the stated goal "to improve interoperability"?

    I'm not sure that I correctly understand the line of reasoning in comment 15 however. I may have missed the main point.

    Regarding matching up the contents of upc_limits.h with what the compiler actually supports ... there are two aspects to this proposal. One is to describe minimal supported values in Appendix E of the specification. The vendor is then free to increase those values in the copy of upc_limits.h that is provided.

    I don't have an answer on how the contents of upc_limits.h can be constructed such that (for example) values provided on the UPC compiler's command line (or that are derived from the command line arguments) can be made available to a regular "C" program. Perhaps that "C" file has to be compiled with the "UPC" compiler.

    ```

    Reported by `gary.funck` on 2012-09-21 04:10:12

  17. Former user Account Deleted

    ``` In the conference call, I think the following were established:
    - We are not going to worry about C compilers having easy access to a list of limit constants for the 1.3 specification.
    - There seemed to be little clear motivation for providing maximum values for addrfield or thread.
    - No need is seen for upc_limits.h at this time. If limit constants are to be provided, they will be guaranteed to be defined if the program includes upc.h, although some or all of them may actually be defined by the compiler.
    - Some implementations have clearly specified limits on locks held, but there was little support for providing a constant in the specification.

    The primary concern, which apparently motivated this suggestion originally, is that there be a minimum value defined for UPC_MAX_BLOCK_SIZE. Once we decide on one, this minimum value can be specified in a sentence in the description of UPC_MAX_BLOCK_SIZE, without the need for a new appendix.

    I suggest a value of 2. This allows minimizing the number of bits representing phase, while still permitting block cyclic array distribution. ```

    Reported by `brian.wibecan` on 2012-09-21 21:37:06

  18. Former user Account Deleted

    ``` I concur on the summary of the call discussion.

    "I suggest a value of 2. This allows minimizing the number of bits representing phase, while still permitting block cyclic array distribution."

    I think choosing a value that low (2) is mostly indistinguishable from the current spec which provides no minimum - defeating the point of adding a minimum at all.

    I was thinking something on the order of 8 bits (ie minimum UPC_MAX_BLOCK_SIZE of 256), so that programmers can portably write declarations like: shared [100] int x[100*THREADS]; without any special #ifdef checks. Ensuring this works portably is mostly a usability feature, and is so basic that it affects many training materials.

    I understand the use case where a 32-bit implementation may want to provide a "special mode" that discards all the phase bits in favor of maximizing the thread and addrfield components. By adding this minimum we would be stating that such a "special mode" is not strictly compliant, which would help encourage implementations to NOT make that their default (which seems like a Good Thing, based on the principle of least surprise). Implementations can of course still provide that "special mode" for the benefit of specialized codes, with the caveat that enabling that mode technically makes the implementation non-compliant; which seems "right" to me, since for better or worse definite block sizes are a big part of the current language spec, and throwing away all the phase bits and setting UPC_MAX_BLOCK_SIZE=1 violates the spirit of the current language spec. In short, given a choice I would rather make the "special mode" non-compliant, rather than make user code containing small, simple blocksizes non-compliant.

    ```

    Reported by `danbonachea` on 2012-09-22 21:49:14

  19. Former user Account Deleted
    deferred to 1.4 at the 11/29 telecon
    

    Reported by danbonachea on 2012-11-29 19:22:30 - Labels added: Milestone-Spec-1.4 - Labels removed: Milestone-Spec-1.3
