- changed status to resolved
thread-safety bug in dist_array::ptr
Issue #366
resolved
There is a thread-safety bug in dist_array<T,BS,/*thread-safe=*/true>::ptr
.
This rarely results in returning bad global_ptr values, as in this example from our CI
The bug is on these lines in dist_array.hpp:
727 global_ptr<T> find(const intrank_t target) {
728 typename std::unordered_map<intrank_t,global_ptr<T>>::const_iterator elem;
729 if (thread_safe) {
730 std::lock_guard<std::recursive_mutex> lg(mut);
731 elem = list.find(target);
732 if (elem == list.end()) return global_ptr<T>(nullptr);
733 } else {
734 elem = list.find(target);
735 if (elem == list.end()) return global_ptr<T>(nullptr);
736 }
737 return elem->second;
738 }
739
Note that in the thread-safe case, find
grabs a mutex and performs a std::unordered_map::find
lookup, but the mutex is released before dereferencing the returned const_iterator
to retrieve the value. This creates a race with other threads that might concurrently update the unordered_map
and invalidate the const_iterator
, leading to returning garbage.
Comments (1)
-
reporter - Log in to comment
Fixed in extras commit ef0e1ba