Adjust upc_type_t values/requirements for types with the same representation

Issue #115 new
Former user created an issue

Originally reported on Google Code with ID 115

Comment from Steve, moved from issue 10:

Is there a reason for the requirement that the predefined values of upc_type_t be "distinct
positive values less than 65536"?  It seems that the exact-width types should be mapped
onto the appropriate integer type for the platform.  For instance, on a system using
the amd64 ABI, we'd have

UPC_INT8   == UPC_CHAR
UPC_UINT8  == UPC_UCHAR
UPC_INT16  == UPC_SHORT
UPC_UINT16 == UPC_USHORT
UPC_INT32  == UPC_INT
UPC_UINT32 == UPC_UINT
UPC_INT64  == UPC_LONG
UPC_UINT64 == UPC_ULONG
etc...

Do we envision a use-case where these types would actually need to be treated differently
within a single implementation?  If not, why bother requiring them to be unique?  If
so, why only these and not the other standard integer types?
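
Whether an exact-width type really is the same type as a standard integer type on a given platform can be checked at compile time. The following is a minimal C11 sketch, not part of any UPC header; the `std_kind` enum and the `int32_kind`/`int64_kind` helpers are hypothetical names, and it assumes the platform provides `int32_t` and `int64_t` at all.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical labels for the standard type an exact-width typedef names. */
enum std_kind { KIND_OTHER, KIND_INT, KIND_LONG, KIND_LLONG };

/* Exact-width types have no padding bits, so size in bits equals width. */
static_assert(sizeof(int32_t) * CHAR_BIT == 32, "int32_t must be exactly 32 bits");
static_assert(sizeof(int64_t) * CHAR_BIT == 64, "int64_t must be exactly 64 bits");

/* _Generic selects on the exact type, not merely on the width. */
static enum std_kind int32_kind(void) {
    return _Generic((int32_t)0, int: KIND_INT, long: KIND_LONG,
                    default: KIND_OTHER);
}

static enum std_kind int64_kind(void) {
    return _Generic((int64_t)0, long: KIND_LONG, long long: KIND_LLONG,
                    default: KIND_OTHER);
}
```

The answer differs between platforms with the same type widths (for example, `int64_t` may name `long` on one and `long long` on another), so the mapping listed above is specific to one ABI.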

On a related note, the table in 7.3.2 paragraph 2 giving the types corresponding to
the macros needs to be updated to explicitly use 'signed char' for 'UPC_CHAR' (and
probably add signed to the other signed integer types as well), as 'char' alone is
not guaranteed to be signed.  Note that the table in 7.4.3.1 paragraph 2 explicitly
uses 'signed'.

Reported by danbonachea on 2013-08-03 04:50:48

Comments (8)

  1. Former user Account Deleted
    Note there are 2 issues here.  The first, with the upc_type_t values that explicitly
    must be different despite not having any obvious (to me) semantic differences, isn't
    a show-stopper to me.  I just don't understand the reason behind it.  I'm not sure
    it needs to be changed for 1.3.
    
    The second (signed char versus char) is much more important, so I've split it off into
    its own issue (issue 116).
    

    Reported by sdvormwa@cray.com on 2013-08-06 16:39:14

  2. Former user Account Deleted
    From C99 7.18.1.1:
    The typedef name intN_t designates a signed integer type with width N, no padding
    bits, and a two's complement representation. Thus, int8_t denotes a signed integer
    type with a width of exactly 8 bits.
    ...
    These types are optional.
    
    We are accustomed to dealing with systems that provide all eight of these fixed-width
    types in stdint.h, but this is not required by C99 and I don't see a strong argument
    to require them for UPC. upc_types.h provides macro symbols for *naming* those types,
    and in order to be UPC compliant the implementation of upc_types.h must always provide
    all of those symbols, even if they name types which may not be provided in stdint.h
    for the current platform. By requiring them to all be distinct we ensure the library
    receiving the value can unambiguously identify the exact type requested by the user.
    This allows the library implementation to throw an appropriate fatal error if the combination
    of library and platform doesn't support a particular fixed-width type, rather than
    silently using a different type and possibly computing incorrect results. Alternatively,
    some libraries may be able to correctly emulate the behavior of the requested fixed-width
    integer type, even when the type in question is missing from stdint.h.
    
    In addition to fixed width, the fixed-width types also impose the stronger requirements
    of "no padding bits and two's complement representation". This has implications on
    the required behavior of integer overflow and bitwise operations. So for example an
    AMO library written for a hypothetical hardware that performs its AMO integer computations
    in 64-bit registers might provide a valid implementation for UPC_SHORT by promoting/truncating
    on memory read/write, but this version would not meet the stronger requirements of
    UPC_INT16 in all cases. 
    
    It's important to keep in mind that the implementation of upc_types.h (the definition
    of the symbol values) is part of the UPC standard library and thus defined by the UPC
    compiler (which is often target-specific or uses target specific information). The
    library implementation which accepts these symbol values may be completely separate
    code, written by a third party and possibly in a platform-independent and compiler-independent
    manner.
    
    In any case, it costs us nothing to provide the additional specificity and potentially
    alleviates the issues described above, so in my view the values should remain guaranteed
    distinct.
    

    Reported by danbonachea on 2013-08-17 23:59:20
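
The error-reporting argument in comment 2 can be sketched in plain C. Everything here is hypothetical: the `DEMO_*` constants merely stand in for the upc_type_t macros (the real values come from upc_types.h), and `demo_elem_size` is an invented library entry point. The sketch assumes a library that reports an unsupported type by returning 0 so that the caller can raise a fatal error.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for upc_type_t macros: distinct positive values. */
enum {
    DEMO_SHORT = 1, DEMO_INT = 2, DEMO_LONG = 3,
    DEMO_INT16 = 4, DEMO_INT32 = 5, DEMO_INT64 = 6
};

/* Return the element size for a type flag, or 0 when this library and
   platform cannot provide exactly the requested type. Because every flag
   is distinct, the library always knows which type the user asked for. */
size_t demo_elem_size(int type) {
    switch (type) {
    case DEMO_SHORT: return sizeof(short);
    case DEMO_INT:   return sizeof(int);
    case DEMO_LONG:  return sizeof(long);
#ifdef INT16_MAX   /* defined by stdint.h only if int16_t is provided */
    case DEMO_INT16: return sizeof(int16_t);
#endif
#ifdef INT32_MAX
    case DEMO_INT32: return sizeof(int32_t);
#endif
#ifdef INT64_MAX
    case DEMO_INT64: return sizeof(int64_t);
#endif
    default:         return 0;  /* unknown or unsupported: caller reports an error */
    }
}
```

If DEMO_INT16 were allowed to equal DEMO_SHORT, the `switch` could not tell the two requests apart, and a platform lacking `int16_t` would silently get `short` semantics instead of an error.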

  3. Former user Account Deleted
    "These types are optional."
    
    Then why are we bothering with mandatory UPC type definitions for them and ignoring
    other mandatory C99 integer types?  What do we do about intmax_t, ptrdiff_t, size_t,
    uintmax_t and wchar_t?
    
    "We are accustomed to dealing with systems that provide all eight of these fixed-width
    types in stdint.h, but this is not required by C99 and I don't see a strong argument
    to require them for UPC..."
    
    Read my first comment:  "It seems that the exact-width types should be mapped onto
    the appropriate integer type for the platform.".  Thus, if the platform doesn't have
    a basic integer type that matches the required semantics, then I agree they should
    be distinct.  But on platforms where basic integer types have the required semantics,
    such as amd64, I don't see it as being necessary.
    
    "It's important to keep in mind that the implementation of upc_types.h (the definition
    of the symbol values) is part of the UPC standard library and thus defined by the UPC
    compiler (which is often target-specific or uses target specific information). The
    library implementation which accepts these symbol values may be completely separate
    code, written by a third party and possibly in a platform-independent and compiler-independent
    manner."
    
    While this is true, for a platform like amd64 where the basic types provide the required
    semantics, the only benefit I can see for third-party implementations is that they
    can generate slightly better error messages in some circumstances.
    
    "In any case, it costs us nothing to provide the additional specificity and potentially
    alleviates the issues described above, so in my view the values should remain guaranteed
    distinct."
    
    If I have a shared size_t variable, how do I do an atomic on it?  Strictly speaking,
    I can't with the current spec, because there is no upc_type_t corresponding to it.
     But, knowing how big a size_t is on my platform--let's say 64-bit for argument's sake--I'll
    just use the corresponding integer type--UPC_UINT64.  Well, now my partner is writing
    another function that needs to atomically update the variable too.  He, knowing how
    big a size_t is on our platform, ends up using the corresponding integer type--UPC_ULONG.
     We have lots of fun tracking down errors due to the atomics not being coherent across
    types...
    
    Is this a programmer error?  YES!  However, I'd argue that making this work as expected
    for users ("but they're the same type underneath!") by permitting the implementation
    to use the same values for these types far outweighs any (yet to be provided...) advantage
    to requiring they be distinct.
    

    Reported by sdvormwa@cray.com on 2013-08-18 17:03:05

  4. Former user Account Deleted
    We're going to have similar issues without something corresponding to 'void *'.
    

    Reported by sdvormwa@cray.com on 2013-08-18 17:05:06

  5. Former user Account Deleted
    > Then why are we bothering with mandatory UPC type definitions for them and ignoring
    > other mandatory C99 integer types?  What do we do about intmax_t, ptrdiff_t, size_t,
    > uintmax_t and wchar_t?
    
    > We're going to have similar issues without something corresponding to 'void *'.
    
    We had strong requests from "VIP" users for atomics on the 32/64 fixed-width types.
    To my knowledge there's been no such user request for the other types you mentioned,
    but if that's of interest to real users in some library we envision, then we could
    consider adding it to upc_types.h (otherwise we're just adding flags with no client).
    The types supported by the AMO library were restricted to those with a strong user motivation
    (to alleviate implementation burden), and notably does NOT even include everything
    currently in upc_types.h (the unsupported integer and fixed-width types were added
    to upc_types.h just for symmetry with the supported ones, but could also be removed
    without breaking any current clients).
    
    > But on platforms where basic integer types have the required semantics, such as amd64,
    > I don't see it as being necessary.
    
    You are assuming all the computations performed by any library that is a client of
    this interface are done with that particular type and on the host processor. The atomics
    example I gave in comment #2 gives a hypothetical situation where a performance gain
    may be possible through integer promotion by exploiting the weaker overflow and representation
    semantics of a standard C type relative to the corresponding fixed-width type. A second
    example is a system with a "smart" NIC, where a collectives library might perform some
    reduction operations on an auxiliary processor. Just because "int" happens to have
    the same width as "int32_t" on the host processor and C99 compiler does not imply they
    are semantically identical for all possible client libraries.
    
    > If I have a shared size_t variable, how do I do an atomic on it?  Strictly speaking,
    > I can't with the current spec, because there is no upc_type_t corresponding to it.
    
    I'd argue the correct and portable way to write that would be to declare the actual
    variable as one of the types supported by the AMO library (uint64_t) and use the corresponding
    UPC_UINT64 flag variable. Any other "games" the programmer may play with noticing the
    types are the same size and "lying" about the declared type is already clearly undefined
    behavior and should be strongly discouraged. There are more semantics to an integer
    type than just its width - type alignment for one is a very real issue for AMOs on
    some platforms. 
    
    So in summary I could be convinced to add more C99 types to upc_types.h if we envision
    their use in future libraries, because that's essentially "free" (i.e., negligible implementation
    and documentation burden). However I'm opposed to weakening the semantics on the allowable
    flag values. Allowing two distinct type flags to expand to the same value in upc_types.h
    just creates ambiguity in the type naming functionality (which after all is the sole
    purpose of this header file), is potentially harmful to correctness, and provides no
    apparent benefit to client libraries or well-defined programs. Client library implementations
    are of course free to perform such type "aliasing" *internally* when they determine
    that type semantics are preserved for the operations in question, but the library-independent
    header file should not make that assumption.
    

    Reported by danbonachea on 2013-08-19 00:32:54
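
The internal "aliasing" that comment 5 leaves to client libraries might look like the following C11 sketch. The `LIB_*` constants and `lib_canonical` are hypothetical names, not part of upc_types.h or any real library, and the sketch assumes a library that performs all of its computation on the host in the named C type, which is exactly the assumption comment 5 says the shared header must not make.

```c
#include <stdint.h>

/* Hypothetical public flags: always distinct, as the spec requires. */
enum { LIB_INT = 1, LIB_LONG = 2, LIB_INT32 = 3, LIB_INT64 = 4 };

/* 1 if T and U are the same type on this platform, else 0 (C11 _Generic). */
#define LIB_SAME_TYPE(T, U) _Generic((T)0, U: 1, default: 0)

/* Map a public flag to the flag the library uses internally. Two flags are
   merged only when they name the very same C type on this platform, so the
   library needs just one code path for both. */
int lib_canonical(int type) {
    switch (type) {
    case LIB_INT32:
        return LIB_SAME_TYPE(int32_t, int) ? LIB_INT : LIB_INT32;
    case LIB_INT64:
        return LIB_SAME_TYPE(int64_t, long) ? LIB_LONG : LIB_INT64;
    default:
        return type;  /* already canonical */
    }
}
```

The public values stay distinct, so a different library (for example, one that offloads reductions to a NIC) remains free to keep the two flags on separate paths.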

  6. Former user Account Deleted
    This is one of the few unresolved technical issues still slated for the 1.3 milestone,
    which has not been discussed in many months. We need to decide ASAP (hopefully by tomorrow's
    call) what to do with this issue.
    
    I'm still of the opinion I expressed in comment #5 that it should remain "as-is" and
    will move to close this issue with NoChange. Nobody seems to be making a correctness
    or ambiguity argument against the current language, whereas I've argued the proposed
    change can introduce semantic ambiguity in some limited cases while providing no substantial
    benefit.
    
    Steve, since you won't be on the call please indicate whether you agree or disagree,
    and if the latter, whether you consider this a "show-stopper" or if it can be removed
    from the milestone and tabled for discussion in a future spec revision.
    

    Reported by danbonachea on 2013-11-15 00:42:27

  7. Former user Account Deleted
    > I'm still of the opinion I expressed in comment #5 that it should remain "as-is" and
    > will move to close this issue with NoChange. Nobody seems to be making a correctness
    > or ambiguity argument against the current language, whereas I've argued the proposed
    > change can introduce semantic ambiguity in some limited cases while providing no substantial
    > benefit.
    
    That sounds fine to me.  This was a minor issue that struck me as odd when I updated
    our implementation in CCE 8.2.  I don't think it should hold up 1.3.
    

    Reported by sdvormwa@cray.com on 2013-11-15 04:11:05
