Improve the performance of sparse matrix/dense matrix multiplication kernels

Issue #11 resolved
Klaus Iglberger created an issue

Description

The primary goal of the Blaze library is to provide maximum performance for all operations. However, the sparse matrix/dense matrix multiplication kernels currently do not perform as efficiently as expected. Therefore, the following kernels have been selected for a performance upgrade:

  • row-major sparse/row-major dense matrix multiplication (SMatDMatMultExpr)
  • row-major sparse/column-major dense matrix multiplication (SMatTDMatMultExpr)
  • column-major sparse/row-major dense matrix multiplication (TSMatDMatMultExpr)
  • column-major sparse/column-major dense matrix multiplication (TSMatTDMatMultExpr)

Tasks

  • optimize the performance of the SMatDMatMultExpr kernel
  • optimize the performance of the SMatTDMatMultExpr kernel
  • optimize the performance of the TSMatDMatMultExpr kernel
  • optimize the performance of the TSMatTDMatMultExpr kernel
  • update symmetric refactoring operations as required
  • guarantee correctness and robustness for all modified kernels

Comments (3)

  1. Klaus Iglberger reporter

    The performance of all sparse matrix/dense matrix kernels has been significantly improved. The following tables give an impression of the performance before and after the update. All results were computed on a single core of a 2.6 GHz Core i7 using double precision element types. All results are given in MFlops:

    Assignment to row-major matrices, filling degree 10%

    kernel          before (MFlops)  after (MFlops)
    smatdmatmult    4138.59          4138.59
    smattdmatmult   2038.16          2443.15
    tsmatdmatmult   1829.27          4079.71
    tsmattdmatmult  811.424          2405.76

    Assignment to row-major matrices, filling degree 40%

    kernel          before (MFlops)  after (MFlops)
    smatdmatmult    4151.3           4151.3
    smattdmatmult   2507.88          3583.93
    tsmatdmatmult   1761.86          4134.96
    tsmattdmatmult  1086.07          3510.59

    Assignment to column-major matrices, filling degree 10%

    kernel          before (MFlops)  after (MFlops)
    smatdmatmult    513.623          2674.4
    smattdmatmult   1528.91          2178.77
    tsmatdmatmult   1274.34          2682.82
    tsmattdmatmult  1068.57          2147.34

    Assignment to column-major matrices, filling degree 40%

    kernel          before (MFlops)  after (MFlops)
    smatdmatmult    531.354          2704.59
    smattdmatmult   2083.3           3497.45
    tsmatdmatmult   2349.02          2769.74
    tsmattdmatmult  1902.57          3440.72

    This improvement will be part of Blaze 2.5. Until then, given the massive performance improvement of some kernels, it is recommended to use the current master revision.
