Making Blaze faster on the other side of the spectrum (i.e. lower dimensions)

Issue #294 new
Matthias Moulin created an issue

It would be nice to have a separate benchmark and associated graphs focusing on the first four dimensions only, as these are frequently used in, for example, rendering and game development, and to additionally compare against DirectXMath (https://github.com/microsoft/DirectXMath) (which comes together with Visual Studio) and enoki (https://github.com/mitsuba-renderer/enoki). The latter is a library for exploiting SoA with wide vectorization (similar to ISPC but written in C++), so it has different goals than Blaze, but it seems astonishingly fast (it could use AVX-512) after some initial experiments. (Concretely, a colleague of mine only did a quick Google benchmark for matrix inversion, which I know is trickier and more involved than matrix-vector and matrix-matrix multiplications. Still, it would be nice to compare against enoki.)

Comments (4)

  1. Klaus Iglberger

    Hi Matthias!

    Thanks for creating this issue. In order to give an impression of the performance for tiny vectors and matrices, we are using a logarithmic scale for the x-axis (i.e. the dimension) for all graphs in our benchmark results. Do you feel that the level of detail for tiny vectors and matrices is insufficient?

    We will eventually create new performance graphs and potentially update the libraries we are comparing. Whereas we will not consider DirectXMath for our benchmarks, enoki might be an option. However, we don’t plan to update the graphs in the near future. We recommend that you perform these performance comparisons yourself, in particular because this will enable you to get accurate results for your specific application. In case you detect that Blaze performs badly in any of your benchmarks, we would appreciate it if you created a performance-related issue.

    Best regards,

    Klaus!

  2. Matthias Moulin reporter

    The problem with the current graphs is that it is difficult to see differences and reason about them, as most curves overlap for small sizes. So it would be nice to have some graphs focusing on that specific interval while stretching the linear y-axis.

  3. Klaus Iglberger

    Hi Matthias!

    With the latest push, we have updated the kernels for dense matrix/dense vector multiplications and dense matrix/dense matrix multiplications. As a result, you should experience a significant performance gain for small matrices. Although this doesn’t resolve the request for the special set of performance graphs, it is hopefully still in the spirit of this issue.

    Best regards,

    Klaus!
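
The first comment recommends running such comparisons yourself. As a minimal sketch of what a tiny-dimension timing loop could look like, the following C++ times a 4x4 matrix times 4-vector product with `std::chrono`. It is not taken from the Blaze benchmark suite: the names `Vec4`, `Mat4`, `multiply`, `BenchResult`, and `benchmark_mat_vec` are hypothetical, and the hand-written kernel is only a stand-in. To measure Blaze, one would presumably swap in `blaze::StaticMatrix<float,4UL,4UL>` and `blaze::StaticVector<float,4UL>`.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical stand-in types; replace with the library types under test.
using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>;  // row-major: A[row][col]

// Plain 4x4 matrix times 4-vector product (the kernel being timed).
inline Vec4 multiply(const Mat4& A, const Vec4& x) {
    Vec4 y{};  // zero-initialized accumulator
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            y[i] += A[i][j] * x[j];
    return y;
}

struct BenchResult {
    double ns_per_call;  // average wall-clock time per product, in nanoseconds
    float checksum;      // sum of the final vector, returned so the work is observable
};

// Times `iterations` dependent products y = A * x, feeding each result
// back in as the next input so the calls cannot be trivially dropped.
inline BenchResult benchmark_mat_vec(std::size_t iterations) {
    assert(iterations > 0);

    // Cyclic permutation matrix: keeps the values bounded across iterations.
    Mat4 A{};
    for (std::size_t i = 0; i < 4; ++i)
        A[i][(i + 1) % 4] = 1.0f;

    Vec4 x{1.0f, 2.0f, 3.0f, 4.0f};

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t n = 0; n < iterations; ++n)
        x = multiply(A, x);
    const auto stop = std::chrono::steady_clock::now();

    const double total_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return BenchResult{total_ns / static_cast<double>(iterations),
                       x[0] + x[1] + x[2] + x[3]};
}
```

Calling `benchmark_mat_vec` with a large iteration count for each of the sizes 1 to 4 and plotting `ns_per_call` on a linear y-axis would give the kind of low-dimension view requested in this issue. An optimizing compiler may still simplify a loop this small, so a dedicated harness such as Google Benchmark is likely the more reliable choice for real measurements.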
