Adjust upc_type_t values/requirements for types with the same representation
Issue #115
new
Originally reported on Google Code with ID 115
Comment from Steve, moved from issue 10:
Is there a reason for the requirement that the predefined values of upc_type_t be "distinct
positive values less than 65536"? It seems that the exact-width types should be mapped
onto the appropriate integer type for the platform. For instance, on a system using
the amd64 abi, we'd have
UPC_INT8 == UPC_CHAR
UPC_UINT8 == UPC_UCHAR
UPC_INT16 == UPC_SHORT
UPC_UINT16 == UPC_USHORT
UPC_INT32 == UPC_INT
UPC_UINT32 == UPC_UINT
UPC_INT64 == UPC_LONG
UPC_UINT64 == UPC_ULONG
etc...
Do we envision a use-case where these types would actually need to be treated differently
within a single implementation? If not, why bother requiring them to be unique? If
so, why only these and not the other standard integer types?
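Whether the mapping listed above holds can be checked per platform. The following is a minimal sketch in plain C (not UPC), assuming a hosted C99 compiler that provides all eight exact-width types. `SAME_SIZE_AND_SIGN` and `count_matching_pairs` are hypothetical names introduced here for illustration. The 8-bit pair is compared against `signed char`, and matching size and signedness is only a necessary condition for identical representation (it says nothing about padding bits or alignment).

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when types A and B have the same size and the same signedness.
   (A)-1 < (A)1 is true for signed types and false for unsigned ones. */
#define SAME_SIZE_AND_SIGN(A, B) \
    (sizeof(A) == sizeof(B) && (((A)-1 < (A)1) == ((B)-1 < (B)1)))

/* Hypothetical helper: counts how many of the eight exact-width/basic
   type pairs from the list above match on the current platform. */
int count_matching_pairs(void)
{
    int n = 0;
    n += SAME_SIZE_AND_SIGN(int8_t, signed char);
    n += SAME_SIZE_AND_SIGN(uint8_t, unsigned char);
    n += SAME_SIZE_AND_SIGN(int16_t, short);
    n += SAME_SIZE_AND_SIGN(uint16_t, unsigned short);
    n += SAME_SIZE_AND_SIGN(int32_t, int);
    n += SAME_SIZE_AND_SIGN(uint32_t, unsigned int);
    n += SAME_SIZE_AND_SIGN(int64_t, long);
    n += SAME_SIZE_AND_SIGN(uint64_t, unsigned long);
    assert(n >= 0 && n <= 8);
    return n;
}
```

On an LP64 platform such as amd64 Linux all eight pairs would be expected to match; on an LLP64 platform, where `long` is 32 bits wide, the two `long` pairs would not.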
On a related note, the table in 7.3.2 paragraph 2 giving the types corresponding to
the macros needs to be updated to explicitly use 'signed char' for 'UPC_CHAR' (and
probably add 'signed' to the other signed integer types as well), as 'char' alone is
not guaranteed to be signed. Note that the table in 7.4.3.1 paragraph 2 explicitly
uses 'signed'.
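The signedness point can be seen directly from `<limits.h>`. This is a minimal sketch in plain C; the two function names are hypothetical and exist only for illustration.

```c
#include <limits.h>

/* Returns 1 if plain 'char' is a signed type on this implementation,
   0 if it is unsigned. C leaves this choice to the implementation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* 'signed char' is signed on every conforming implementation. */
int signed_char_is_signed(void)
{
    return SCHAR_MIN < 0;
}
```

A table entry that says only `char` therefore names a type whose range differs between implementations, while `signed char` does not have that ambiguity.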
Reported by danbonachea
on 2013-08-03 04:50:48
Comments (8)
-
Account Deleted -
Account Deleted From C99 7.18.1.1:
"The typedef name intN_t designates a signed integer type with width N, no padding bits, and a two’s complement representation. Thus, int8_t denotes a signed integer type with a width of exactly 8 bits. ... These types are optional."
We are accustomed to dealing with systems that provide all eight of these fixed-width types in stdint.h, but this is not required by C99 and I don't see a strong argument to require them for UPC. upc_types.h provides macro symbols for *naming* those types, and in order to be UPC compliant the implementation of upc_types.h must always provide all of those symbols, even if they name types which may not be provided in stdint.h for the current platform.
By requiring them to all be distinct we ensure the library receiving the value can unambiguously identify the exact type requested by the user. This allows the library implementation to throw an appropriate fatal error if the combination of library and platform doesn't support a particular fixed-width type, rather than silently using a different type and possibly computing incorrect results. Alternatively, some libraries may be able to correctly emulate the behavior of the requested fixed-width integer type, even when the type in question is missing from stdint.h.
In addition to fixed width, the fixed-width types also impose the stronger requirements of "no padding bits and two's complement representation". This has implications on the required behavior of integer overflow and bitwise operations. So for example an AMO library written for a hypothetical hardware that performs its AMO integer computations in 64-bit registers might provide a valid implementation for UPC_SHORT by promoting/truncating on memory read/write, but this version would not meet the stronger requirements of UPC_INT16 in all cases.
It's important to keep in mind that the implementation of upc_types.h (the definition of the symbol values) is part of the UPC standard library and thus defined by the UPC compiler (which is often target-specific or uses target-specific information). The library implementation which accepts these symbol values may be completely separate code, written by a third party and possibly in a platform-independent and compiler-independent manner. In any case, it costs us nothing to provide the additional specificity and potentially alleviates the issues described above, so in my view the values should remain guaranteed distinct.
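The argument for distinct values can be sketched as follows. This is plain C with hypothetical names throughout: `tag_t` stands in for upc_type_t, and `lib_check_type` stands in for a third-party library that, as an assumption for this example, supports the basic `short` type but cannot honor exact-width 16-bit semantics.

```c
/* Hypothetical type tags standing in for upc_type_t values.
   All values are distinct, as the current wording requires. */
typedef enum {
    TAG_SHORT = 1,
    TAG_INT16 = 2,
    TAG_INT32 = 3,
    TAG_INT64 = 4
} tag_t;

typedef enum { LIB_OK = 0, LIB_UNSUPPORTED = -1 } lib_status_t;

/* Hypothetical library entry check. Because every tag has its own
   value, the library can tell TAG_INT16 apart from TAG_SHORT and
   reject only the one it cannot implement faithfully. */
lib_status_t lib_check_type(tag_t t)
{
    switch (t) {
    case TAG_SHORT:
    case TAG_INT32:
    case TAG_INT64:
        return LIB_OK;
    case TAG_INT16:
        return LIB_UNSUPPORTED;
    }
    return LIB_UNSUPPORTED; /* unknown tag */
}
```

If `TAG_INT16` and `TAG_SHORT` were allowed to share a value, the two `case` labels would collide and the library would have no way to report that the exact-width request is unsupported.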
Reported by
danbonachea
on 2013-08-17 23:59:20 -
Account Deleted "These types are optional."
Then why are we bothering with mandatory UPC type definitions for them and ignoring other mandatory C99 integer types? What do we do about intmax_t, ptrdiff_t, size_t, uintmax_t and wchar_t?
"We are accustomed to dealing with systems that provide all eight of these fixed-width types in stdint.h, but this is not required by C99 and I don't see a strong argument to require them for UPC..."
Read my first comment: "It seems that the exact-width types should be mapped onto the appropriate integer type for the platform." Thus, if the platform doesn't have a basic integer type that matches the required semantics, then I agree they should be distinct. But on platforms where basic integer types have the required semantics, such as amd64, I don't see it as being necessary.
"It's important to keep in mind that the implementation of upc_types.h (the definition of the symbol values) is part of the UPC standard library and thus defined by the UPC compiler (which is often target-specific or uses target specific information). The library implementation which accepts these symbol values may be completely separate code, written by a third party and possibly in a platform-independent and compiler-independent manner."
While this is true, for a platform like amd64 where the basic types provide the required semantics, the only benefit I can see for third-party implementations is that they can generate slightly better error messages in some circumstances.
"In any case, it costs us nothing to provide the additional specificity and potentially alleviates the issues described above, so in my view the values should remain guaranteed distinct."
If I have a shared size_t variable, how do I do an atomic on it? Strictly speaking, I can't with the current spec, because there is no upc_type_t corresponding to it. But, knowing how big a size_t is on my platform--let's say 64-bit for argument's sake--I'll just use the corresponding integer type--UPC_UINT64. Well, now my partner is writing another function that needs to atomically update the variable too. He, knowing how big a size_t is on our platform, ends up using the corresponding integer type--UPC_ULONG. We have lots of fun tracking down errors due to the atomics not being coherent across types...
Is this a programmer error? YES! However, I'd argue that making this work as expected for users ("but they're the same type underneath!") by permitting the implementation to use the same values for these types far outweighs any (yet to be provided...) advantage to requiring they be distinct.
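Which basic type sits "underneath" `uint64_t` or `size_t` in the scenario above is itself platform-specific. The following is a minimal sketch in plain C11 (it relies on `_Generic`), with `BASIC_KIND` as a hypothetical macro introduced for illustration. It reports which of two candidate basic types an expression has.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classifier: 1 for unsigned long, 2 for unsigned long long,
   0 for anything else. The two candidates are distinct C types even on
   platforms where both are 64 bits wide. */
#define BASIC_KIND(x) _Generic((x), \
    unsigned long: 1,               \
    unsigned long long: 2,          \
    default: 0)

/* Which basic type uint64_t aliases on this platform (0 if neither). */
int uint64_kind(void) { return BASIC_KIND((uint64_t)0); }

/* Which basic type size_t aliases on this platform (0 if neither). */
int size_kind(void) { return BASIC_KIND((size_t)0); }
```

When `uint64_kind()` and `size_kind()` disagree, or either returns a value other than 1, the two programmers in the scenario would not both be naming the variable's actual type, regardless of how the tag values are assigned.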
Reported by
sdvormwa@cray.com
on 2013-08-18 17:03:05 -
Account Deleted We're going to have similar issues without something corresponding to 'void *'.
Reported by
sdvormwa@cray.com
on 2013-08-18 17:05:06 -
Account Deleted
> Then why are we bothering with mandatory UPC type definitions for them and ignoring other mandatory C99 integer types? What do we do about intmax_t, ptrdiff_t, size_t, uintmax_t and wchar_t?
> We're going to have similar issues without something corresponding to 'void *'.
We had strong requests from "VIP" users for atomics on the 32/64 fixed-width types. To my knowledge there's been no such user request for the other types you mentioned, but if that's of interest to real users in some library we envision, then we could consider adding it to upc_types.h (otherwise we're just adding flags with no client). The types supported by the AMO library were restricted to those with a strong user motivation (to alleviate implementation burden), and notably do NOT even include everything currently in upc_types.h (the unsupported integer and fixed-width types were added to upc_types.h just for symmetry with the supported ones, but could also be removed without breaking any current clients).
> But on platforms where basic integer types have the required semantics, such as amd64, I don't see it as being necessary.
You are assuming all the computations performed by any library that is a client of this interface are done with that particular type and on the host processor. The atomics example I gave in comment #2 gives a hypothetical situation where a performance gain may be possible through integer promotion by exploiting the weaker overflow and representation semantics of a standard C type relative to the corresponding fixed-width type. A second example is a system with a "smart" NIC, where a collectives library might perform some reduction operations on an auxiliary processor. Just because "int" happens to have the same width as "int32_t" on the host processor and C99 compiler does not imply they are semantically identical for all possible client libraries.
> If I have a shared size_t variable, how do I do an atomic on it? Strictly speaking, I can't with the current spec, because there is no upc_type_t corresponding to it.
I'd argue the correct and portable way to write that would be to declare the actual variable as one of the types supported by the AMO library (uint64_t) and use the corresponding UPC_UINT64 flag variable. Any other "games" the programmer may play with noticing the types are the same size and "lying" about the declared type is already clearly undefined behavior and should be strongly discouraged. There are more semantics to an integer type than just its width - type alignment for one is a very real issue for AMOs on some platforms.
So in summary I could be convinced to add more C99 types to upc_types.h if we envision their use in future libraries, because that's essentially "free" (i.e. negligible implementation and documentation burden). However I'm opposed to weakening the semantics on the allowable flag values. Allowing two distinct type flags to expand to the same value in upc_types.h just creates ambiguity in the type naming functionality (which after all is the sole purpose of this header file), is potentially harmful to correctness, and provides no apparent benefit to client libraries or well-defined programs. Client library implementations are of course free to perform such type "aliasing" *internally* when they determine that type semantics are preserved for the operations in question, but the library-independent header file should not make that assumption.
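The overflow point in this comment can be made concrete. This is a minimal sketch in plain C of one possible reading of the wide-register example, not a description of any real AMO implementation; both function names are hypothetical. The first function gives the 16-bit two's complement wraparound that an exact-width type implies, and the second shows the intermediate value held by hardware that computes in 64-bit registers.

```c
#include <stdint.h>

/* Exact-width behavior: add two int16_t values with 16-bit two's
   complement wraparound, using only well-defined arithmetic. */
int16_t add_int16_wrapping(int16_t a, int16_t b)
{
    /* Unsigned arithmetic wraps modulo 65536 by definition. */
    uint16_t sum = (uint16_t)((uint16_t)a + (uint16_t)b);

    /* Map the upper half of the unsigned range back to negative values
       without relying on an implementation-defined conversion. */
    if (sum > INT16_MAX)
        return (int16_t)((int32_t)sum - 65536);
    return (int16_t)sum;
}

/* Wide-register behavior: the intermediate result does not wrap and can
   lie outside the 16-bit range until it is truncated on write-back. */
int64_t add_in_wide_register(int16_t a, int16_t b)
{
    return (int64_t)a + (int64_t)b;
}
```

For the inputs `INT16_MAX` and `1`, the first function yields `INT16_MIN` while the second yields 32768, so any operation that inspects the value before truncation (a comparison or a fetched result, for instance) could observe a difference between the two.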
Reported by
danbonachea
on 2013-08-19 00:32:54 -
Account Deleted This is one of the few unresolved technical issues still slated for the 1.3 milestone, which has not been discussed in many months. We need to decide ASAP (hopefully by tomorrow's call) what to do with this issue. I'm still of the opinion I expressed in comment #5 that it should remain "as-is" and will move to close this issue with NoChange. Nobody seems to be making a correctness or ambiguity argument against the current language, whereas I've argued the proposed change can introduce semantic ambiguity in some limited cases while providing no substantial benefit. Steve, since you won't be on the call please indicate whether you agree or disagree, and if the latter, whether you consider this a "show-stopper" or if it can be removed from the milestone and tabled for discussion in a future spec revision.
Reported by
danbonachea
on 2013-11-15 00:42:27 -
Account Deleted
> I'm still of the opinion I expressed in comment #5 that it should remain "as-is" and will move to close this issue with NoChange. Nobody seems to be making a correctness or ambiguity argument against the current language, whereas I've argued the proposed change can introduce semantic ambiguity in some limited cases while providing no substantial benefit.
That sounds fine to me. This was a minor issue that struck me as odd when I updated our implementation in CCE 8.2. I don't think it should hold up 1.3.
Reported by
sdvormwa@cray.com
on 2013-11-15 04:11:05 -
Account Deleted Reported by
danbonachea
on 2013-11-15 16:37:31 - Status changed:NoChange
Reported by
sdvormwa@cray.com
on 2013-08-06 16:39:14