Benchmark with Matlab/Octave and Python

Issue #94 wontfix
Erik created an issue

Hi,

I don't know if this is the right forum for this type of question; please let me know if it should go somewhere else.

Has anyone compared the performance of Blaze to Matlab/Octave and Python?

I started to do some testing myself. The first trial was dense matrix multiplication. The Octave implementation, see the attached file, runs faster than when I try to use Blaze, see the code below. Is this the expected result?

I'm looking for issues on my side. What are the plausible reasons, and what can I do better? Do I need a better BLAS implementation? I think I use OpenBLAS; would Intel MKL perhaps be better? Or could it be that I'm not using the right compile flags? I'm using:

"-Wall -Wshadow -Woverloaded-virtual -pedantic -O3 -mavx -mfma -fopenmp -std=c++14 -DBLAZE_USE_CPP_THREADS -DNDEBUG -DMTL_HAS_BLAS"

BR, Erik

---my_dmatdmatmult

double benchmark::dmatdmatmult( size_t N, size_t steps ) const
{
   std::chrono::high_resolution_clock::time_point t1, t2;

   blaze::DynamicMatrix<double,blaze::rowMajor> A( N, N ), B( N, N ), C( N, N );

   init( A );  // fill the operands with benchmark data
   init( B );

   C = A*B;  // warm-up multiplication, not included in the timing

   t1 = std::chrono::high_resolution_clock::now();
   for( size_t step=0UL; step<steps; ++step ) {
      C = A*B;
   }
   t2 = std::chrono::high_resolution_clock::now();

   const auto duration = std::chrono::duration_cast<std::chrono::milliseconds>( t2 - t1 ).count();

   return static_cast<double>( duration ) / steps;  // average milliseconds per multiplication
}
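For reference, a minimal driver for this function could look like the sketch below. The benchmark class and its init() member are assumed from the attachment; the matrix size N and the step count are placeholders, not the values actually used in the measurements.

#include <cstddef>
#include <iostream>

// Assumes the header declaring the benchmark class from the attachment is included.
int main()
{
   benchmark bench;  // hypothetical instance of the benchmark class shown above

   const std::size_t N     = 2000UL;  // matrix dimension (placeholder)
   const std::size_t steps = 10UL;    // number of timed multiplications (placeholder)

   std::cout << "dmatdmatmult: " << bench.dmatdmatmult( N, steps )
             << " ms per multiplication\n";

   return 0;
}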


Comments (1)

  1. Klaus Iglberger

    Hi Erik!

    Your benchmark for the matrix-matrix multiplication looks fine. The compilation flags are also OK, but you could drop the -DBLAZE_USE_CPP_THREADS flag since you also specify -fopenmp:

    -Wall -Wshadow -Woverloaded-virtual -pedantic -O3 -mavx -mfma -fopenmp -std=c++14 -DNDEBUG -DMTL_HAS_BLAS
    

    In order to really make use of OpenBLAS, please make sure to enable the use of BLAS libraries in the <blaze/config/BLAS.h> header. Both the BLAZE_BLAS_MODE and the BLAZE_USE_BLAS_MATRIX_MATRIX_MULTIPLICATION have to be set to 1.
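    For illustration, after enabling BLAS the two settings in that header should end up looking roughly like this sketch (the surrounding comments and defaults in the shipped header may differ):

    // In <blaze/config/BLAS.h>: switch on the BLAS mode and forward the dense
    // matrix-matrix multiplication to the linked BLAS library (here OpenBLAS).
    #define BLAZE_BLAS_MODE 1
    #define BLAZE_USE_BLAS_MATRIX_MATRIX_MULTIPLICATION 1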

    We don't consider this a bug in the library since this is either a configuration problem or a comparison of different BLAS implementations. Therefore we close the ticket as "Won't fix".

    Best regards,

    Klaus!
