CarloWood / M4RI

Commits

Author, Message
Martin Albrecht
more documentation for the Opteron vs. Core2Duo performance compromise
Martin Albrecht
adapt documentation: We use Strassen-Winograd not Strassen
Martin Albrecht
using XOR directly rather than calling mzd_combine gives a significant speed-up, so we do that for now. Need to check whether this is related to SSE2 and whether we can re-introduce it
Martin Albrecht
added support for Visual Studio 2008 Express
Martin Albrecht
remove unnecessary local variables, add explicit casts as picked up by MSVC
Martin Albrecht
fixed compilation under OSX (32-bit) and under OpenSolaris (32-bit)
Martin Albrecht
docstring updates and API unification
Martin Albrecht
declaring more parameters const
Martin Albrecht
some cosmetic changes to packedmatrix.c
Martin Albrecht
marking more parameters const
Martin Albrecht
slightly improved clearing of target matrix in _mzd_mul_m4rm_impl
Martin Albrecht
SAFECHAR = (1.3 * RADIX) is sufficient
Martin Albrecht
moved mzd_combine to packedmatrix.[c|h]; _mzd_add_impl uses mzd_combine
Martin Albrecht
removed dead test code, added strassen.h to m4ri.h
Martin Albrecht
implemented memory efficient strassen multiplication operation schedule
Martin Albrecht
Doxygen coverage 100%
Martin Albrecht
fix version-info
Martin Albrecht
misc cleanups
Martin Albrecht
a potentially more cache-friendly implementation, needs checking
Martin Albrecht
doxygen updates
Martin Albrecht
simplified combine, don't try to outsmart the compiler
Martin Albrecht
refactoring should be done
Martin Albrecht
continued refactoring (should be almost done) and fixed bug in naive multiplication
Martin Albrecht
fix build on PPC
Martin Albrecht
- added support for SSE2 if available (autodetection)
- implemented Strassen multiplication
- made API more C-ish (this is work in progress but most functions are done)
- added lots of documentation in Doxygen style
- added some tests to the test suite (still incomplete)
Martin Albrecht
Strassen multiplication seems to work now
Martin Albrecht
added support for SSE2 instructions (for now these need to be enabled by hand). The speed-up is hardly noticeable for realistic examples though. Also renamed a bunch of functions.
Martin Albrecht
Strassen seems to work if the matrix dimensions are exactly right
Martin Albrecht
- refactoring (renaming of functions, files)
- more documentation
- added topReduceM4RI function
- added first steps towards a test suite
Martin Albrecht
initial commit