CarloWood / M4RI

Commits

Author Commit Message Date Builds
Clem...@gmail.com
some more stuff on the weird addmul
Clem...@gmail.com
Martin's patch: "more experimental permutation code, needs testing"
Clem...@gmail.com
* new matrix_addmul for arbitrary dimensions (still needs to be tested) * lqup in progress
Clem...@gmail.com
fixing trsm calls to addmul; further work on lqup
Clem...@gmail.com
* add permutation window * work in progress in lqup
Martin Albrecht
initial untested code for permutations
Clem...@gmail.com
work in progress in lqup
Martin Albrecht
merging Clement's patch, everything should work
Martin Albrecht
API CHANGE, dropping all _impl's. also improved MP Strassen slightly
Martin Albrecht
sane default value for Strassen cutoff
Martin Albrecht
M4/autoconf trickery
Martin Albrecht
2nd attempt at col_rotate, doesn't update permutation yet
Martin Albrecht
first version of col_rotate
Martin Albrecht
macros more robust by adding lots of brackets
Martin Albrecht
added a bunch of functions and CHANGED THE API!
Martin Albrecht
fixed dimensions of X0,X1,X2 in addmul_strassen
Martin Albrecht
added mzd_col_swap
Martin Albrecht
implemented memory efficient addmul
Martin Albrecht
fix typo in documentation
Martin Albrecht
fix printing for ncols%RADIX == 0
Martin Albrecht
work in progress: mzd_addmul_strassen
Martin Albrecht
adapted parameter k for top_reduce too
Martin Albrecht
slightly improved the k parameter for reduction, the M4RM k parameter can be adapted for the Core2 but not for the Opteron
Martin Albrecht
fix Gaussian reduction for full=FALSE, reported by Wael Said
Martin Albrecht
added documentation for lacking bounds checks; print matrices only up to ncols, not up to RADIX*width
Martin Albrecht
big check-in (sorry): - mzd_transpose much faster due to improved data locality - parity.h documented - mzd_reduce_m4ri uses 4 Gray code tables now - removed a couple of "unsigned" since MSVC doesn't like comparisons between signed and unsigned; having the sign bit is also nice for detecting overflows, and you can check i > 0
Martin Albrecht
don't reduce a row if it is already reduced, slight overhead for random matrices, huge gain for e.g. GB matrices
Martin Albrecht
4 Gray code tables seem to be good, need to test on Opteron. For large matrices we hit L2, so we might reconsider blocking. For a 19907 x 29323 matrix we are twice as slow as M4RM multiplication, which needs way more RAM.
Martin Albrecht
another attempt at speed improvements
Martin Albrecht
avoid potential memleak in shared library mode where the Gray codes are rebuilt several times.