Assignment of ZeroMatrix to DynamicMatrix is extremely slow

Issue #230 resolved
Mikhail Katliar created an issue

Assignment of a ZeroMatrix to a DynamicMatrix is more than 400 times slower than uniform scalar assignment. A benchmark:

#include <blaze/Math.h>
#include <benchmark/benchmark.h>

#include <cstddef>

// Assigns a ZeroMatrix expression to an M x N DynamicMatrix on every iteration.
template <typename Real, std::size_t M, std::size_t N>
static void BM_DynamicMatrixZeroMatrixAssign(::benchmark::State& state)
{
    blaze::DynamicMatrix<Real> A(M, N);

    for (auto _ : state)
        ::benchmark::DoNotOptimize(A = blaze::ZeroMatrix<Real>(M, N));
}

// Baseline: assigns the scalar 0 to every element of the same matrix.
template <typename Real, std::size_t M, std::size_t N>
static void BM_DynamicMatrixZeroAssign(::benchmark::State& state)
{
    blaze::DynamicMatrix<Real> A(M, N);

    for (auto _ : state)
        ::benchmark::DoNotOptimize(A = Real {0});
}

// Both cases use a tiny 4 x 1 matrix of doubles.
BENCHMARK_TEMPLATE(BM_DynamicMatrixZeroMatrixAssign, double, 4, 1);
BENCHMARK_TEMPLATE(BM_DynamicMatrixZeroAssign, double, 4, 1);

Output:

2019-02-19 16:24:13
Running build/bin/tmpc_bench
Run on (12 X 4100 MHz CPU s)
CPU Caches:
  L1 Data 32K (x6)
  L1 Instruction 32K (x6)
  L2 Unified 256K (x6)
  L3 Unified 9216K (x1)
Load Average: 0.27, 0.62, 1.11
-----------------------------------------------------------------------------------------
Benchmark                                               Time             CPU   Iterations
-----------------------------------------------------------------------------------------
BM_DynamicMatrixZeroMatrixAssign<double, 4, 1>       2372 ns         2372 ns       317329
BM_DynamicMatrixZeroAssign<double, 4, 1>             4.99 ns         4.99 ns    118211356

Compiler: gcc-8.2.0, compiler flags: -O2 -g -DNDEBUG

Comments (5)

  1. Klaus Iglberger

    Hi Mikhail!

    Thanks a lot for pointing out this defect. You are correct: combining DynamicMatrix, ZeroMatrix, and OpenMP results in a significantly more expensive assignment for tiny matrices. We apologize for the inconvenience and will fix the problem as quickly as possible.

    Best regards,

    Klaus!

  2. Klaus Iglberger

    Commit b742cdd resolves the significant performance penalty when assigning a ZeroMatrix to a DynamicMatrix while using OpenMP. The fix is immediately available via cloning the Blaze repository and will be officially released in Blaze 3.5.

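For readers wondering why OpenMP can make a 4 x 1 assignment so expensive: starting a parallel region has a fixed cost that a handful of element writes cannot amortize. The following is a minimal sketch of one common way to avoid that cost, a size cutoff below which the loop runs on the calling thread. It is not the code from commit b742cdd and does not use Blaze; the names zero_fill, use_parallel, and kParallelThreshold, as well as the cutoff value, are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cutoff: below this many elements, skip the thread team.
// A real value would have to be tuned by measurement.
constexpr std::size_t kParallelThreshold = 4096;

// True when the element count is assumed large enough to amortize the
// cost of starting a team of threads.
constexpr bool use_parallel(std::size_t n)
{
    return n >= kParallelThreshold;
}

// Sets all n elements of data to zero.
void zero_fill(double* data, std::size_t n)
{
    assert(data != nullptr || n == 0);

    // When the `if` clause is false, OpenMP executes the region with a
    // team of one thread. Without -fopenmp the pragma is ignored and
    // this is an ordinary serial loop.
    #pragma omp parallel for if(use_parallel(n))
    for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(n); ++i)
        data[i] = 0.0;
}
```

With this sketch, calling zero_fill on a std::vector<double> of four elements stays on the calling thread, while a buffer of 10000 elements may be split across threads when the program is built with OpenMP enabled.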