Deploy non-trivial Serialization for function pointers

Issue #582 wontfix
Dan Bonachea created an issue

In the 2023-01-31 Serialization meeting, we resolved to change the Serialization status and semantics of pointer-to-function types:

Current Semantics (through release 2022.9.0)

Pointer-to-function types are TriviallySerializable (by virtue of being TriviallyCopyable). However, due to ASLR and randomization of shared library load addresses, pointer-to-function values are generally NOT meaningfully portable across address spaces (i.e. a raw function pointer address constructed by one process cannot reliably be used by another to invoke the function).
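As a minimal illustration of the current semantics, the sketch below uses only standard C++ (no UPC++ calls). The names `add_one`, `fn_t` and `copy_bytes` are hypothetical. It shows that a pointer-to-function type is TriviallyCopyable, and that a byte-wise copy (which is what trivial serialization amounts to) round-trips correctly only because source and destination share one address space here.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical function, used only for illustration.
int add_one(int x) { return x + 1; }

using fn_t = int (*)(int);

// Pointer-to-function types are scalar types, hence TriviallyCopyable.
static_assert(std::is_trivially_copyable<fn_t>::value,
              "pointer-to-function types are TriviallyCopyable");

// Copy the raw bytes of a function pointer into a buffer and back out.
// Within a single process the result is the same pointer. If the buffer
// were instead shipped to another process, the same bytes could name an
// unrelated address there (ASLR, differing library load addresses).
fn_t copy_bytes(fn_t f) {
  unsigned char buf[sizeof(fn_t)];
  std::memcpy(buf, &f, sizeof f);
  fn_t out;
  std::memcpy(&out, buf, sizeof out);
  return out;
}
```

Calling `copy_bytes(&add_one)(41)` in the same process invokes `add_one` as expected; no such guarantee exists once the bytes cross a process boundary.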

Proposed Semantics

Pointer-to-function types remain Serializable, but become non-TriviallySerializable. The UPC++ library defines the serialization for all such types, with an opaque implementation ensuring that serializing a valid pointer-to-function value on one process and subsequently deserializing it at another will result in a pointer-to-function value referencing "the same function" in the memory space of the target process. As a result, valid pointer-to-function values can be meaningfully transmitted across address spaces via RPC and reliably used for function invocation at another process.

The pointer-to-function translation described above was already being applied to pointer-to-function arguments passed as the func callable argument in rpc(), rpc_ff() and as_rpc(). This work extends the scope of that mechanism to include Serialization of all pointer-to-function values, regardless of where they appear in RPC arguments.
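The issue describes the translation as opaque, so the following is only a sketch of one possible technique, not the UPC++ implementation: encode a function pointer as an offset from an anchor function in the same executable image, and decode it by adding the offset to the receiver's own anchor address. The names `anchor`, `encode` and `decode` are hypothetical. The sketch assumes every process runs the same executable, that the target function lives in the same image as the anchor, and that the platform allows converting function pointers to and from `std::uintptr_t`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical anchor: any function linked into the same image as the
// functions whose pointers will be transmitted.
void anchor() {}

// Hypothetical function whose pointer is to be sent.
int square(int x) { return x * x; }

using fn_t = int (*)(int);

// Sender side: express the pointer as an offset from the local anchor.
// Unsigned arithmetic wraps, so this is well defined for targets that
// sit below the anchor as well.
std::uintptr_t encode(fn_t f) {
  return reinterpret_cast<std::uintptr_t>(f) -
         reinterpret_cast<std::uintptr_t>(&anchor);
}

// Receiver side: rebuild the pointer relative to the receiver's anchor.
// If the image was loaded at a different base address, the anchor moves
// by the same amount as the target, so the offset stays valid.
fn_t decode(std::uintptr_t offset) {
  return reinterpret_cast<fn_t>(
      reinterpret_cast<std::uintptr_t>(&anchor) + offset);
}
```

In a single process `decode(encode(&square))` simply returns `&square`; the interesting case is when `encode` and `decode` run in different processes with different load addresses. Functions residing in a separately loaded shared library would need their own anchor under this scheme.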

Breaking changes:

  1. Pointer-to-function types are no longer TriviallySerializable.
  2. As a consequence of 1, user types containing non-static pointer-to-function fields may cease to be TriviallySerializable (and lacking serialization declarations, possibly also cease to be Serializable). Such types may need to deploy serialization declarations such as UPCXX_SERIALIZED_FIELDS(...) to restore Serializable.
  3. As a consequence of 1 and 2, objects having or containing a pointer-to-function type may no longer be communicated using RMA (rput*(), rget*()) or non-experimental data collectives (broadcast(), reduce_{one,all}).

Preserving legacy use cases

There are (currently hypothetical?) obscure yet valid use cases where it might make sense to transmit the raw bits of a pointer-to-function value (via trivial serialization), without using those transmitted values for function invocation on a different process (which would fail in general). Several potential workarounds exist to preserve such use cases, such as reinterpreting the raw bits into a TriviallySerializable type of sufficient size (e.g. uintptr_t), or embedding the pointer-to-function object in a struct S and specializing is_trivially_serializable<S>::value = true.
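The first workaround might look roughly like the sketch below, written in standard C++ only. The names `on_event`, `handler_t`, `to_bits` and `from_bits` are hypothetical, and the sketch assumes a platform where `std::uintptr_t` is at least as large as a function pointer. The integer can then travel as an ordinary TriviallySerializable value, and the process that created it can turn it back into a callable pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical handler, used only for illustration.
void on_event() {}

using handler_t = void (*)();

// Assumption: the integer type is large enough to hold the pointer bits.
static_assert(sizeof(std::uintptr_t) >= sizeof(handler_t),
              "uintptr_t must be able to hold a function pointer");
static_assert(std::is_trivially_copyable<std::uintptr_t>::value,
              "the carrier type is byte-copyable");

// Store the raw pointer bits in an integer.
std::uintptr_t to_bits(handler_t h) {
  std::uintptr_t bits = 0;
  std::memcpy(&bits, &h, sizeof h);
  return bits;
}

// Recover the pointer. Only meaningful in the address space that
// produced the bits; elsewhere the value is just an opaque token.
handler_t from_bits(std::uintptr_t bits) {
  handler_t h;
  std::memcpy(&h, &bits, sizeof h);
  return h;
}
```

A typical use would be a round trip: the originating process sends `to_bits(&on_event)` as a token and later receives it back, at which point `from_bits` yields the original pointer.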

Work assignments

@Colin MacLean will implement this as a stand-alone Impl PR.
@Amir Kamil will pursue a corresponding Spec PR.

Comments (1)

  1. Dan Bonachea reporter

    Unfortunately the design proposed here does not work as expected/hoped. In particular, I was incorrect in listing Breaking Change number 2: under the design outlined above, TriviallyCopyable structs containing a non-static function pointer field would continue to be TriviallySerializable (in the absence of explicit serialization declarations). Sending such a struct would continue to trivially serialize the function pointer field in a non-meaningful way, despite the changes proposed here. I find this unfortunate wrong-by-default behavior sufficiently non-intuitive and surprising that I hesitate to inflict this design upon unsuspecting users.

    Due to this (IMHO, fatal) design flaw, I no longer support deploying this design approach. We should explore alternate designs for meeting our use case (in a new proposal issue).
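The flaw described in this comment can be modelled in standard C++. The trait `model_trivially_serializable` and the struct `Task` below are hypothetical stand-ins, not the UPC++ trait or any real user type. The sketch assumes, as the issue text states, that a type without serialization declarations is trivially serializable whenever it is TriviallyCopyable.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical model of the default rule: TriviallyCopyable types are
// trivially serializable unless something says otherwise.
template <class T>
struct model_trivially_serializable : std::is_trivially_copyable<T> {};

// Model of the proposed change: pointer-to-function types opt out.
template <class R, class... Args>
struct model_trivially_serializable<R (*)(Args...)> : std::false_type {};

// Hypothetical user type with a non-static function pointer field and
// no serialization declarations.
struct Task {
  int id;
  void (*callback)(int);
};

// The bare function pointer is no longer trivially serializable...
static_assert(!model_trivially_serializable<void (*)(int)>::value,
              "function pointers opt out in this model");

// ...but the enclosing struct still is, because the default rule looks
// only at whether the struct as a whole is TriviallyCopyable.
static_assert(model_trivially_serializable<Task>::value,
              "struct with a function pointer field is still byte-copied");
```

In this model, sending a `Task` would still copy the `callback` field byte for byte, which is the wrong-by-default behavior the comment objects to.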
