- changed title to invalid inputs to std::aligned_storage lead to pgi/18.10 assertion failures
- marked as bug
invalid inputs to std::aligned_storage, std::aligned_alloc and posix_memalign are non-portable
I am working on determining the "floor" for PGI support.
In testing pgi/18.10 on Dirac I am seeing the assertion failures below.
@john bachan can you have a look at the corresponding source and let me know if you think these indicate a genuine problem with the compiler (vs the possibility the code is making in improper assumption)?
{rpc_barrier,view}-{seq,par}, view-par,
With 8 ranks on 2 nodes yielded the following:
*** FATAL ERROR (proc 4):
//////////////////////////////////////////////////
UPC++ assertion failure:
rank=4
file=/home/data2/upcnightly/dirac/EX-dirac-ibv-pgi_old/work/dbg/upcxx/.nobs/art/6e4f8906d31acfdeaea319f8183a046b816c2b96/upcxx/serialization.hpp:385
Failed condition: detail::is_aligned(initial_buf, serialization_align_max)
To have UPC++ freeze during these errors so you can attach a debugger, rerun the program with GASNET_FREEZE_ON_ERROR=1 in the environment.
//////////////////////////////////////////////////
view-histogram{1,2}-{seq,par}
With 8 ranks on 2 nodes yielded the following which is the same file/line, except the installed copy of the header
*** FATAL ERROR (proc 0):
//////////////////////////////////////////////////
UPC++ assertion failure:
rank=0
file=/home/data2/upcnightly/dirac/EX-dirac-ibv-pgi_old/work/dbg/upcxx-inst-gasnet_seq/include/upcxx/serialization.hpp:385
Failed condition: detail::is_aligned(initial_buf, serialization_align_max)
To have UPC++ freeze during these errors so you can attach a debugger, rerun the program with GASNET_FREEZE_ON_ERROR=1 in the environment.
//////////////////////////////////////////////////
Comments (5)
-
-
reporter FWIW, no such assertions are seen with the same compiler version when running on PPC64.
At the moment I am expecting to documenting different floor versions (18.10 on ppc64, 19.1 on x86_64) do to this issue. -
reporter - attached bt.txt
Here is a backtrace for one of the failures of
view-histogram2
, in case it helps. -
- changed title to invalid inputs to std::aligned_storage, std::aligned_alloc and posix_memalign are non-portable
Copied from pull request #110:
The
detail::alloc_aligned
function in src/utility.hpp appears to be relying upon non-portable behavior.C++17
std::aligned_alloc
is only guaranteed to work for "a valid alignment supported by the implementation", which is only guaranteed to include alignof() for some type (and in particular less thanalignof(max_align_t)
). It additionally requires that "size shall be an integral multiple of alignment". Otherwise undefined behavior (in particular, it is not required to return null if these preconditions are violated).posix_memalign
has a different set of preconditions - in particular it requires "The value of alignment shall be a multiple ofsizeof (void *)
, that is also a power of two.", which means for 64-bit it's only guaranteed to work for align==8*2^n for some n>=0. This function at least is specced to returnEINVAL
if it fails due to invalid inputs, but in that case the code in our function silently returns (potentially non-null) garbage in opt mode that is likely to lead to errors which may or may not SEGV (could just be silent data corruption). FWIW the same occurs for an out-of-memory failure. Both of these conditions could potentially be intermittent based on system state, and might even happen to only fail in opt mode (in which case the debug-only assertion doesn't help).I'd like to see some more careful handling of these preconditions to prevent silent misbehavior - the
std::aligned_alloc
branch should assert all its preconditions, and theposix_memalign
branch should check for an error return even in opt mode, or alternatively initializep
to NULL to ensure a SEGV upon use of the returned pointer.Note that both functions also have some squirrely corner-cases for
size==0
, so we should probably additionally assertsize > 0
, unless that's a case we want to try and handle. -
- changed status to resolved
pull request #110 at 0cff3e1 resolves this issue:
- preconditions for
posix_memalign
are now correctly respected - Use of C++17
std::aligned_alloc
is removed - Extended alignments no longer use
std::aligned_storage
and now usedetail::xaligned_storage
instead
- Log in to comment
The following code in looks like it relies upon undefined behavior, according to C++11:
Quoting from the C++11 spec for
std::aligned_storage<std::size_t Len, std::size_t Align = default-alignment>
:This means if the value 64 exceeds the fundamental alignment of any compiler-supported type, the behavior is undefined. (FWIW, similar UB applies to the
alignas(N)
specifier for unsupported alignment values of N, which I recall we previously encountered). The undefined behavior may very well exhibit as the compiler providing a smaller alignment than requested, which would directly lead to this exact assertion failure.The fact this error only happens to manifest so far on one particular PGI version is not a behavior I think we can rely upon. Gnu has some language extensions for alignment, which may be why it faithfully implements values greater than what's guaranteed by C++, although even the Gnu manual cautions that in some cases system limitations will prevent large
attribute(aligned)
values from being honored.IMO the hard-coded value 64 cannot be relied upon, and should either be relaxed to
alignof(std::max_align_t)
(which C++11 provides), or made conditional on compiler versions known to implement that extended alignment (and possibly use the Gnu attribute explicitly). Alternatively, this code could be re-designed with manual alignment arithmetic, eg if we think cache-line alignment is important (in which case 64 is also the wrong value for PPC).