- changed status to wontfix
Multiple test failures on Titan
Issue #171
wontfix
Multiple tests (the non-GEX ones) are failing on OLCF's Titan.
This is independent of the PrgEnv-gnu vs. PrgEnv-intel environment module.
This is independent of the gcc/5.3.0, 6.3.0 or 7.3.0 environment module.
This is not reproducible (so far) on Edison, Cori or Theta.
However, a similar (but distinct) failure has been seen on Theta (but only with gcc/5.3.0).
The stack trace of a failing lpc_barrier with PrgEnv-gnu and gcc/5.3.0:
Core was generated by `./lpc_barrier-par'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000000000 in ?? ()
(gdb) where
#0 0x0000000000000000 in ?? ()
#1 0x000000000042cda3 in __gthread_join (__value_ptr=0x0, __threadid=<optimized out>)
    at /b/tmp/peint/build-cray-gcc-20180126.202153.829775000/cray-gcc/BUILD/snos_objdir/x86_64-suse-linux/libstdc++-v3/include/x86_64-suse-linux/bits/gthr-default.h:668
#2 std::thread::join (this=0x7b36e0)
    at ../../../../../cray-gcc-7.3.0-201801270210.d61239fc6000b/libstdc++-v3/src/c++11/thread.cc:136
#3 0x0000000000401d90 in main ()
    at /lustre/atlas2/csc296/scratch/hargrove/upcnightly-titan/EX-titan-gemini-gcc/runtime/work/dbg/upcxx/test/lpc_barrier.cpp:164
However, this fails in the same manner with the following non-UPC++ code:
#include <atomic>
#include <iostream>
#include <thread>
#include <sched.h>

const int thread_n = 8;

int main() {
  std::atomic<int> setup_bar{0};

  auto thread_fn = [&](int me) {
    setup_bar.fetch_add(1);
    while(setup_bar.load(std::memory_order_relaxed) != thread_n)
      sched_yield();
  };

  std::thread* threads[thread_n];
  for(int t=1; t < thread_n; t++)
    threads[t] = new std::thread{thread_fn, t};

  thread_fn(0);

  for(int t=1; t < thread_n; t++) {
    threads[t]->join();
    delete threads[t];
  }

  std::cout << "Done.\n";
  return 0;
}
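Since frame #2 of the trace is std::thread::join, a smaller check that drops the spin barrier and sched_yield might help show whether creating and joining threads is enough on its own to trigger the crash. The following is only a sketch: `spawn_and_join` is a hypothetical helper name (not part of the original reproducer or of UPC++), and it has not been verified to reproduce the failure on the affected systems.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper (not from the original report): spawn n threads,
// join every one of them, and return how many thread bodies actually ran.
int spawn_and_join(int n) {
  assert(n >= 0);
  std::atomic<int> ran{0};

  std::vector<std::thread> threads;
  threads.reserve(n);
  for (int t = 0; t < n; t++)
    threads.emplace_back([&ran] { ran.fetch_add(1); });

  // The reported segfault is inside std::thread::join, so this loop is
  // where a failure would be expected if join alone is sufficient.
  for (auto& th : threads)
    th.join();

  return ran.load();
}
```

Calling `spawn_and_join(1)` and `spawn_and_join(8)` from `main`, built with the same compiler and link flags as the reproducer above, would indicate whether the thread count or the barrier loop matters.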
Comments (2)
-
reporter -
Turns out this is not specific to Titan, or even gemini-conduit.
We are tracking the status of this external issue here: https://upc-bugs.lbl.gov/bugzilla/show_bug.cgi?id=3813
This is a bug in non-upcxx code, reproducible only on a non-supported (gemini-conduit) configuration.