Improve the performance of sparse matrix/dense matrix multiplication kernels
Description
The primary goal of the Blaze library is to provide maximum performance for all operations. However, the sparse matrix/dense matrix multiplication kernels currently do not perform as efficiently as expected. Therefore, the following kernels have been selected for a performance upgrade:
- row-major sparse/row-major dense matrix multiplication (SMatDMatMultExpr)
- row-major sparse/column-major dense matrix multiplication (SMatTDMatMultExpr)
- column-major sparse/row-major dense matrix multiplication (TSMatDMatMultExpr)
- column-major sparse/column-major dense matrix multiplication (TSMatTDMatMultExpr)
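To make the first case concrete, here is a minimal sketch of a row-major sparse/row-major dense product in plain C++. It is an illustration only, not the code behind SMatDMatMultExpr: the `CsrMatrix` struct and the `csrTimesDense` function are hypothetical names, and the sparse operand is assumed to be stored in compressed sparse row (CSR) form. In this storage combination each non-zero of A scales one contiguous row of B and adds it to one contiguous row of C.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR (compressed sparse row) container, for illustration only.
// This is NOT Blaze's CompressedMatrix.
struct CsrMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> rowPtr;  // rows + 1 offsets into colIdx/values
    std::vector<std::size_t> colIdx;  // column index of each non-zero
    std::vector<double> values;       // value of each non-zero
};

// Computes C = A * B, where A is sparse (CSR), B is a dense row-major
// matrix of size A.cols x n, and C is a dense row-major matrix of size
// A.rows x n. Returns C as a flat row-major vector.
std::vector<double> csrTimesDense(const CsrMatrix& A,
                                  const std::vector<double>& B,
                                  std::size_t n) {
    assert(A.rowPtr.size() == A.rows + 1);
    assert(B.size() == A.cols * n);

    std::vector<double> C(A.rows * n, 0.0);
    for (std::size_t i = 0; i < A.rows; ++i) {
        double* c = C.data() + i * n;  // row i of C
        for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
            const double a = A.values[k];
            const double* b = B.data() + A.colIdx[k] * n;  // row colIdx[k] of B
            // Contiguous update of one row of C with one scaled row of B.
            for (std::size_t j = 0; j < n; ++j) {
                c[j] += a * b[j];
            }
        }
    }
    return C;
}
```

For the other three storage combinations the same arithmetic applies, but at least one operand is traversed against its storage order, so a kernel has to reorder loops or convert an operand to keep memory access contiguous.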
Tasks
- optimize the performance of the SMatDMatMultExpr kernel
- optimize the performance of the SMatTDMatMultExpr kernel
- optimize the performance of the TSMatDMatMultExpr kernel
- optimize the performance of the TSMatTDMatMultExpr kernel
- update symmetric refactoring operations as required
- guarantee correctness and robustness for all modified kernels
Comments (3)
reporter - changed status to resolved
The performance of all sparse matrix/dense matrix kernels has been significantly improved. The following tables give an impression of the performance before and after the update. All results were measured on a single core of a 2.6 GHz Core i7 with double precision element types and are given in MFlops:
Assignment to row-major matrices, filling degree 10%

| kernel | before (MFlops) | after (MFlops) |
|---|---|---|
| smatdmatmult | 4138.59 | 4138.59 |
| smattdmatmult | 2038.16 | 2443.15 |
| tsmatdmatmult | 1829.27 | 4079.71 |
| tsmattdmatmult | 811.424 | 2405.76 |

Assignment to row-major matrices, filling degree 40%

| kernel | before (MFlops) | after (MFlops) |
|---|---|---|
| smatdmatmult | 4151.3 | 4151.3 |
| smattdmatmult | 2507.88 | 3583.93 |
| tsmatdmatmult | 1761.86 | 4134.96 |
| tsmattdmatmult | 1086.07 | 3510.59 |

Assignment to column-major matrices, filling degree 10%

| kernel | before (MFlops) | after (MFlops) |
|---|---|---|
| smatdmatmult | 513.623 | 2674.4 |
| smattdmatmult | 1528.91 | 2178.77 |
| tsmatdmatmult | 1274.34 | 2682.82 |
| tsmattdmatmult | 1068.57 | 2147.34 |

Assignment to column-major matrices, filling degree 40%

| kernel | before (MFlops) | after (MFlops) |
|---|---|---|
| smatdmatmult | 531.354 | 2704.59 |
| smattdmatmult | 2083.3 | 3497.45 |
| tsmatdmatmult | 2349.02 | 2769.74 |
| tsmattdmatmult | 1902.57 | 3440.72 |

This improvement will be part of Blaze 2.5. Still, due to the massive performance improvement of some kernels it is recommended to use the current master revision!
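To put a number on "significantly improved", the speedup of a kernel can be read off the tables as the ratio of the "after" to the "before" throughput. The helper below is a small sketch for doing that; `KernelResult`, `speedup` and `largestSpeedup` are hypothetical names introduced here, not part of Blaze. With the figures above, smatdmatmult assigned to a column-major matrix at 10% filling degree comes out at roughly 5.2x (2674.4 / 513.623), while smatdmatmult assigned to a row-major matrix is unchanged (1.0x).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One row of a benchmark table (hypothetical helper type).
struct KernelResult {
    std::string kernel;
    double beforeMFlops;
    double afterMFlops;
};

// Speedup factor: throughput after the update divided by throughput before.
double speedup(double beforeMFlops, double afterMFlops) {
    assert(beforeMFlops > 0.0);
    return afterMFlops / beforeMFlops;
}

// Returns the index of the row with the largest speedup.
// The table must not be empty.
std::size_t largestSpeedup(const std::vector<KernelResult>& table) {
    assert(!table.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < table.size(); ++i) {
        if (speedup(table[i].beforeMFlops, table[i].afterMFlops) >
            speedup(table[best].beforeMFlops, table[best].afterMFlops)) {
            best = i;
        }
    }
    return best;
}
```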