ParallelSection should automatically create a critical section instead of throwing exceptions

Issue #22 wontfix
byzhang created an issue

First, here is a simplified code snippet:

void func() {
  blaze::DynamicVector<float, true> pv(100, 1);
  pv /= 100;
} 
#pragma omp parallel for schedule(static)
for (int i = 0; i < 10000; ++i) {
  func();
}

The above code snippet crashes with a core dump after the "Nested parallel sections detected" exception is raised. The exception is thrown by ParallelSection.

I'm wondering if there is a better approach to handling this kind of SMP assignment issue. func() might be executed serially (so no exception) or in parallel (via OpenMP or pthreads), so can BLAZE_PARALLEL_SECTION synchronize the operations instead of throwing an exception?

Or do you have any suggestions on how to handle it efficiently on the application side?

Comments (5)

  1. Klaus Iglberger

    Thank you very much for raising this issue. We highly appreciate this discussion. However, from our point of view this is documented behavior and therefore not a bug. Let me explain in detail.

    In Blaze we have the basic assumption that all available threads are used to speed up the computation of a single operation. Thus, as explained in the wiki, it is considered an error to execute a semantically parallel operation within a parallel region:

    blaze::DynamicVector<double> x, y;
    blaze::DynamicMatrix<double> A;
    
    #pragma omp parallel
    {
       y = A * x;  // Parallel operation within a parallel region -> not allowed in Blaze
    }
    

    The main reason is that an assignment contains more logic than the actual assignment, such as resizing the target operand, allocating temporaries, etc. These things must not be executed by all threads.

    The example that you provide is a good example of an operation that would not entail these extra steps. Thus, theoretically, it could work perfectly. Still, it contains an assignment:

    void func() {
       blaze::DynamicVector<float, true> pv(100, 1);
       pv = pv / 100;  // Same as pv /= 100;
    } 
    

    Thus semantically it is a violation of our basic assumption. The reason is that unfortunately we cannot differentiate between the following two cases:

    blaze::DynamicVector<float,true> pv;  // One vector, shared by all threads
    
    #pragma omp parallel
    {
       pv /= 100;  // Executed on a single vector -> Bad case!
    }
    
    #pragma omp parallel
    {
       blaze::DynamicVector<float,true> pv;  // Multiple vectors, one for each thread
       pv /= 100;  // Executed on multiple vectors -> Good case!
    }
    

    Whereas the first case would be in violation of our assumptions, the second case would work well if executed serially (as you suggested). However, since we cannot determine whether we are in a good or a bad case, we have to be defensive and consider both cases an error.
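
The defensive check described above can be pictured with a small sketch. It is not Blaze's actual ParallelSection implementation: `NestingGuard`, `g_sectionActive` and `nestedGuardThrows()` are hypothetical names. The code only illustrates a guard that knows nothing about which vector an operation touches, and therefore throws whenever a section is already active.

```cpp
#include <atomic>
#include <stdexcept>

// Hypothetical global flag; stands in for whatever state a library keeps
// to remember that a parallel section is already running.
static std::atomic<bool> g_sectionActive{false};

// Hypothetical RAII guard: the first guard claims the section,
// any guard created while it is alive throws.
class NestingGuard {
public:
    NestingGuard() {
        bool expected = false;
        if (!g_sectionActive.compare_exchange_strong(expected, true)) {
            throw std::runtime_error("Nested parallel sections detected");
        }
    }
    ~NestingGuard() { g_sectionActive.store(false); }
    NestingGuard(const NestingGuard&) = delete;
    NestingGuard& operator=(const NestingGuard&) = delete;
};

// Returns true if creating a second guard inside the first one throws.
bool nestedGuardThrows() {
    NestingGuard outer;
    try {
        NestingGuard inner;  // constructor throws, so no destructor runs for it
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

The guard sees only that a section is active, not whether the threads share one vector or each own a private one, which is why both the good and the bad case end in the same exception.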

    Luckily, Blaze provides a solution for the good case category. In order to make your example work, all that is necessary is to make it explicit to Blaze that the operation should be executed serially. By using the serial() function (see the wiki) you can explicitly enforce the serial execution of an operation:

    void func() {
       blaze::DynamicVector<float, true> pv(100, 1);
       pv = serial( pv / 100 );  // Explicit serial execution of the division
    } 
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < 10000; ++i) {
       func();
    }
    

    Note that this solution is not a general solution, but works well for the good case category since every thread is working on a different vector.

    Alternatively, if you never need to execute an operation in parallel, you can also completely deactivate the parallel execution in Blaze in the SMP.h config file. Then Blaze will never try to use any threads by itself and you can safely execute multiple operations in parallel.

    I hope this explanation is helpful and comprehensive enough to make the rationale for our implementation clear. Thanks again for raising this issue!

  2. byzhang reporter

    Hi Klaus, thank you so much for the detailed explanation. serial() works well if func() is executed within a parallel section. But when func() is executed within a single thread, can I still enable parallel execution of pv /= 100? Are there OpenMP or other threading functions that can detect whether func() is executed in parallel or serially?

    Thanks,

  3. Klaus Iglberger

    Conceptually it is only possible to run a single operation in parallel or several serial operations in parallel. In other words: Either Blaze uses all available threads for a single operation or it executes serially. Thus in your example there is unfortunately no way to run the division in parallel within a parallel section.

    I'm not aware of any OpenMP function that you can use to detect whether a function is executed in parallel. We would also very much like to have such a function.
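
One application-side workaround, sketched here under stated assumptions, is to let the calling code record that it has opened its own parallel loop, so that func() can pick the serial form inside the loop and the default form elsewhere. `RegionMarker`, `g_openRegions`, `insideApplicationRegion()`, `Mode` and `chooseMode()` are hypothetical names and not part of Blaze or OpenMP. The approach only works if the application creates the marker itself before every parallel loop it starts.

```cpp
#include <atomic>

// Hypothetical counter of parallel loops the application has opened itself.
static std::atomic<int> g_openRegions{0};

// Hypothetical RAII marker: create one right before an application-level
// parallel loop and let it go out of scope right after the loop.
class RegionMarker {
public:
    RegionMarker()  { g_openRegions.fetch_add(1); }
    ~RegionMarker() { g_openRegions.fetch_sub(1); }
    RegionMarker(const RegionMarker&) = delete;
    RegionMarker& operator=(const RegionMarker&) = delete;
};

// True while at least one marker is alive, in any thread.
bool insideApplicationRegion() { return g_openRegions.load() > 0; }

enum class Mode { Serial, LibraryParallel };

// func() would branch on this: Mode::Serial corresponds to
// pv = serial( pv / 100 ), Mode::LibraryParallel to pv /= 100.
Mode chooseMode() {
    return insideApplicationRegion() ? Mode::Serial : Mode::LibraryParallel;
}
```

In the original example, the marker would be declared in the scope that contains the `#pragma omp parallel for` loop, immediately before the pragma, so that every call to func() from the loop sees `Mode::Serial`.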

  4. byzhang reporter

    Hi Klaus, I tried BLAZE_SERIAL_SECTION within a #pragma omp parallel for schedule(static), but it failed with the following exception:

    terminate called after throwing an instance of 'std::runtime_error' what(): Nested serial sections detected

    Why is it an exception?

    Thanks,

  5. Klaus Iglberger

    As stated in the last paragraph of the BLAZE_SERIAL_SECTION documentation in the wiki, BLAZE_SERIAL_SECTION cannot be used within a parallel region. The serial section guarantees that no additional threads are spawned, but does not work when there are already multiple threads.

    Thanks for pointing this detail out explicitly. I realize that from a usage point of view this is unexpected. I will see what I can do to improve the situation.
