Improve the performance of dense matrix/sparse matrix multiplication kernels

Issue #9 resolved
Klaus Iglberger created an issue

Description

The primary goal of the Blaze library is to provide maximum performance for all operations. Still, the dense matrix/sparse matrix multiplication kernels do not perform as efficiently as expected. Therefore the following kernels have been selected for a performance upgrade:

  • row-major dense/row-major sparse matrix multiplication (DMatSMatMultExpr)
  • row-major dense/column-major sparse matrix multiplication (DMatTSMatMultExpr)
  • column-major dense/row-major sparse matrix multiplication (TDMatSMatMultExpr)
  • column-major dense/column-major sparse matrix multiplication (TDMatTSMatMultExpr)
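For readers unfamiliar with these kernels, the first one in the list (row-major dense times row-major sparse) can be sketched as below. This is a minimal, unoptimized illustration and not the Blaze implementation: the `DenseMatrix` and `CsrMatrix` types and the `multiply` function are hypothetical helpers invented for this sketch, and the sparse operand is assumed to be stored in compressed sparse row (CSR) form.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CSR (compressed sparse row) container; not a Blaze type.
struct CsrMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> rowPtr;  // rows + 1 offsets into colIdx/values
    std::vector<std::size_t> colIdx;  // column index of each stored element
    std::vector<double> values;       // value of each stored element
};

// Hypothetical row-major dense matrix with contiguous storage.
struct DenseMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;

    DenseMatrix(std::size_t m, std::size_t n)
        : rows(m), cols(n), data(m * n, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// C = A * B with A dense (M x K) and B sparse in CSR form (K x N).
// Only the stored elements of B are visited, so the work is
// proportional to A.rows times the number of non-zeros in B.
DenseMatrix multiply(const DenseMatrix& A, const CsrMatrix& B) {
    DenseMatrix C(A.rows, B.cols);
    for (std::size_t i = 0; i < A.rows; ++i) {
        for (std::size_t k = 0; k < A.cols; ++k) {
            const double a = A(i, k);
            // Scatter a * (row k of B) into row i of C.
            for (std::size_t p = B.rowPtr[k]; p < B.rowPtr[k + 1]; ++p) {
                C(i, B.colIdx[p]) += a * B.values[p];
            }
        }
    }
    return C;
}
```

The loop order here walks both `A` and `C` row by row, which suits row-major storage. The other three kernels differ in the storage order of one or both operands, so a different traversal order is generally preferable for them.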

Tasks

  • optimize the performance of the DMatSMatMultExpr kernel
  • optimize the performance of the DMatTSMatMultExpr kernel
  • optimize the performance of the TDMatSMatMultExpr kernel
  • optimize the performance of the TDMatTSMatMultExpr kernel
  • update symmetric refactoring operations as required
  • guarantee correctness and robustness for all modified kernels

Comments (3)

  1. Klaus Iglberger reporter

    The performance of all dense matrix/sparse matrix kernels has been significantly improved. The following tables give an impression of the performance before and after the update. All results were measured on a single core of a 2.6 GHz Core i7 and are given in MFlops:

    Assignment to row-major matrices, filling degree 10%

    kernel            before      after
    dmatsmatmult     1315.56    2088.99
    dmattsmatmult    1473.77    2072.93
    tdmatsmatmult    1191.96    2618.87
    tdmattsmatmult   512.772    2595.33

    Assignment to row-major matrices, filling degree 40%

    kernel            before      after
    dmatsmatmult     2059.11    3348.41
    dmattsmatmult    1531.54    3306.33
    tdmatsmatmult    1803.36    2695.98
    tdmattsmatmult    534.62    2663.28

    Assignment to column-major matrices, filling degree 10%

    kernel            before      after
    dmatsmatmult     803.514    2351.86
    dmattsmatmult    1344.44    2393.34
    tdmatsmatmult    2304.64    4090.54
    tdmattsmatmult   4113.93    4113.93

    Assignment to column-major matrices, filling degree 40%

    kernel            before      after
    dmatsmatmult     733.848    3420.31
    dmattsmatmult    1528.73    3437.07
    tdmatsmatmult    2352.15    4167.31
    tdmattsmatmult   4181.47    4181.47

    This improvement will be part of Blaze 2.5. Until that release, using the current master revision is recommended, because some kernels have improved massively.
