Consider splitting atomic.o

Issue #515 new
Dan Bonachea created an issue

As discussed in PR 392, libupcxx contains function bodies implementing the GASNet calls that back atomic_domain AMO operations for all six supported basic types: (signed int, unsigned int, float) x (32-bit, 64-bit). That PR introduces an additional 21x factor of code bloat in these instantiations, to cover each of the supported atomic operations.

All of this object code currently lands in atomic.o, so an application that uses any atomic_domain operation will (in the absence of LTO) probably end up linking all of it into the executable, where most of it is dead code at runtime. Measurements on platforms of interest show this corresponds to roughly 17 KB to 34 KB of object code expansion in the final linked executable at standard optimization levels.

We could potentially partition these 126 (post-PR) function body instantiations into six or more separate object files, allowing normal static linking to discard the object code for any combinations the current application never references.
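A minimal sketch of what such a partition could look like, assuming one object file per basic type. Everything here is hypothetical: `amo_backend`, the file names in the comments, and the plain non-atomic arithmetic standing in for the real GASNet calls are illustrative only and do not reflect the actual libupcxx sources. The per-file pieces are shown in one listing for brevity.

```cpp
#include <cstdint>

// ---- amo_backend.hpp (hypothetical shared header) ----
template <typename T>
struct amo_backend {
  static T fetch_add(T &target, T operand);
  static T load(const T &target);
  // ... one entry point per supported atomic operation
};

// Out-of-class (non-inline) bodies. Plain arithmetic is a stand-in
// for the network AMO calls; it is NOT atomic.
template <typename T>
T amo_backend<T>::fetch_add(T &target, T operand) {
  T old = target;
  target = old + operand;
  return old;
}

template <typename T>
T amo_backend<T>::load(const T &target) {
  return target;
}

// Explicit instantiation declarations: client code that includes the
// header refers to the library's copies instead of instantiating its own.
extern template struct amo_backend<std::int32_t>;
extern template struct amo_backend<std::uint32_t>;
extern template struct amo_backend<float>;
extern template struct amo_backend<std::int64_t>;
extern template struct amo_backend<std::uint64_t>;
extern template struct amo_backend<double>;

// Explicit instantiation definitions. In the split build each line
// would be the sole content of its own (hypothetical) source file,
// producing one archive member per basic type.
template struct amo_backend<std::int32_t>;   // amo_i32.cpp -> amo_i32.o
template struct amo_backend<std::uint32_t>;  // amo_u32.cpp -> amo_u32.o
template struct amo_backend<float>;          // amo_f32.cpp -> amo_f32.o
template struct amo_backend<std::int64_t>;   // amo_i64.cpp -> amo_i64.o
template struct amo_backend<std::uint64_t>;  // amo_u64.cpp -> amo_u64.o
template struct amo_backend<double>;         // amo_f64.cpp -> amo_f64.o
```

With the instantiations in separate members of a static library, the linker only pulls in the members that resolve a symbol the application actually references, so a program using atomics on a single type would link roughly one sixth of this code. Splitting further (for example per operation group) would follow the same pattern at the cost of more source files to maintain.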

However, the object code in question amounts to less than 3% of the overall executable size for trivial microbenchmarks on platforms of interest, and would be expected to be a much smaller fraction of the object code for a real application. So currently I don't think it's worth the hassle in build system maintenance (and slightly increased libupcxx build times) to deploy this idea.
