Martin Albrecht  committed 025fae2

updated README and AUTHORS

  • Parent commits f284b84
  • Branches default

Files changed (2)

-main author:
+ * Tim Abbott: Debian-isation & advice on correct libtool versioning,
-Gregory Bard <>
+ * Martin Albrecht <>: release manager,
+   performance tuning (M4RM, M4RI, Strassen), initial M4RM
+   implementation, Strassen-Winograd implementation, parallelisation,
+   LQUP factorisation.
+ * Gregory Bard <>: initial author, M4RI algorithm and
+   initial implementation.
-Martin Albrecht <>
+ * Michael Brickenstein: PolyBoRi author, standard conformity
+   contributions for ANSI C, test data, discussion/suggestion of
+   performance improvements.
+ * Alexander Dreyer: PolyBoRi author, standard conformity
+   contributions for ANSI C.
+ * William Hart: many performance improvements for matrix
+   multiplication and in general.
+ * David Harvey: parallel parity function used in classical
+   multiplication.
+ * Clément Pernet: LQUP factorisation, triangular system solving.
-This library implements the "Method of the Four Russians" or Kronrad
-Method for matrix multiplication, reduction and inversion over GF(2).
-There is no official website right for now.
+M4RI is a library for fast arithmetic with dense matrices over F2. The
+name M4RI comes from the first implemented algorithm: the "Method of
+the Four Russians" inversion algorithm published by Gregory Bard. This
+algorithm in turn is named after the "Method of the Four Russians"
+multiplication algorithm which is probably better referred to as
+Kronrod's method. 
+M4RI is available at
+ * basic arithmetic with dense matrices over F2 (addition, equality
+   testing, stacking, augmenting, sub-matrices, randomisation),
+ * asymptotically fast O(n^log_2(7)) matrix multiplication via the "Method
+   of the Four Russians" (M4RM) & Strassen-Winograd algorithm,
+ * fast row echelon form computation and matrix inversion via the "Method
+   of the Four Russians" (M4RI, O(n^3/log(n))), 
+ * support for the x86/x86_64 SSE2 instruction set where available,
+ * preliminary support for parallelisation on shared memory systems via
+   OpenMP,
+ * and support for Linux and OS X (GCC), support for Solaris (Sun
+   Studio Express) and support for Windows (Visual Studio 2008 Express).
+OpenMP support for parallel multiplication and elimination is enabled
+with the
+  --with-openmp 
+configure switch. If GCC is used to compile the library it is advised
+to use at least GCC 4.3 since earlier versions have problems with
+OpenMP in shared libraries. OpenMP support was introduced in GCC
+4.2. Both MSVC and SunCC support OpenMP but we have no experience with
+these yet.
+Generally speaking, larger performance improvements can be expected on
+dual-core Opteron CPUs than on dual-core Core2Duo CPUs. This is
+because the latter has a shared L2 cache which is already almost fully
+utilised in the single-core implementation.
+Overall, the speed-up is considerable but sublinear. See
+for details.
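
The README above names the "Method of the Four Russians" multiplication (M4RM) without showing the idea. The following is a minimal Python sketch of the table-lookup technique, not the library's code: the function names, the default block size `k=4`, and the representation of each matrix row as a Python integer bitmask (bit j is column j) are all assumptions made for illustration. For each block of k rows of B it precomputes all 2^k XOR combinations of those rows, so each row of A costs one table lookup per block instead of up to k row additions.

```python
def mul_naive(A, B, m):
    """Reference product over GF(2).

    A is a list of rows with m columns, B is a list of m rows.
    Each row is an int bitmask; bit j of a row is the entry in column j.
    """
    C = []
    for a in A:
        acc = 0
        for j in range(m):
            if (a >> j) & 1:
                acc ^= B[j]  # addition over GF(2) is XOR
        C.append(acc)
    return C


def mul_four_russians(A, B, m, k=4):
    """Table-lookup product over GF(2) (hypothetical helper, block size k)."""
    C = [0] * len(A)
    for j0 in range(0, m, k):
        kk = min(k, m - j0)  # the last block may be shorter than k
        # table[x] = XOR of the rows B[j0 + t] for every bit t set in x
        table = [0] * (1 << kk)
        for x in range(1, 1 << kk):
            low = x & -x  # lowest set bit of x
            table[x] = table[x ^ low] ^ B[j0 + low.bit_length() - 1]
        mask = (1 << kk) - 1
        for i, a in enumerate(A):
            # bits j0 .. j0+kk-1 of row a select one precomputed combination
            C[i] ^= table[(a >> j0) & mask]
    return C
```

With `A = [0b01, 0b10, 0b11]` and `B = [0b101, 0b011]` (so `m = 2`), both functions return `[0b101, 0b011, 0b110]`: the third row of the product is the XOR of both rows of B.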