malb / M4RI (http://m4ri.sagemath.org/)

M4RI is a library for fast arithmetic with dense matrices over F2. It was started by Gregory Bard, is maintained by Martin Albrecht. Several people contributed to it. The name M4RI comes from the first implemented algorithm: The "Method of the Four Russians" inversion algorithm published by Gregory Bard. This algorithm in turn is named after the "Method of the Four Russians" multiplication algorithm which is probably better referred to as Kronrod's method. M4RI is used by the Sage mathematics software and the PolyBoRi library. M4RI is available under the General Public License Version 2 or later (GPLv2+).

Clone this repository (size: 730.9 KB): HTTPS / SSH
$ hg clone http://bitbucket.org/malb/m4ri/
commit 314: a2a70d533ae6
parent 313: 39b9a8b9767e
child 315: dfdd4bce3eac
renamed mzd_apply_p_right_tri to mzd_apply_p_right_trans_tri because this is what it does some some sparse-ish performance enhancements default
Martin Albrecht / malb
4 months ago

 NB: This is not the latest revision. For the latest view, go to tip.

View at rev
M4RI / src /
filename size last modified message
Doxyfile 55.2 KB 21 months ago patch bomb:
brilliantrussian.c 39.0 KB 12 months ago refactoreded packedmatrix to allow more than one malloc…
brilliantrussian.h 7.5 KB 12 months ago refactoreded packedmatrix to allow more than one malloc…
config.h.in 2.2 KB 4 months ago considerable protability improvement in configure.ac due to…
grayflex.c 2.5 KB 15 months ago better strategy for column swaps in mzd_pluq_mmpf (still…
grayflex.h 3.5 KB 17 months ago update/correct license statement in source files. M4RI was…
lqup.c 8.6 KB 4 months ago renamed mzd_apply_p_right_tri to…
lqup.h 4.5 KB 5 months ago fixed doxygen warnings
m4ri.h 1.7 KB 15 months ago changed API and updated docs: mzd_reduce_ mzd_echelonize_
misc.c 2.8 KB 12 months ago refactoreded packedmatrix to allow more than one malloc…
misc.h 11.8 KB 9 months ago use L2_CACHE_SIZE for PLUQ cutoff (experimental)
packedmatrix.c 32.6 KB 6 months ago improve performance of mzd_transpose using Hacker's Delight…
packedmatrix.h 22.7 KB 5 months ago fixed potential segmentaton fault in mzd_row_add_offset
parity.h 3.1 KB 17 months ago update/correct license statement in source files. M4RI was…
permutation.c 12.9 KB 4 months ago renamed mzd_apply_p_right_tri to…
permutation.h 5.0 KB 4 months ago renamed mzd_apply_p_right_tri to…
pluq_mmpf.c 16.1 KB 4 months ago renamed mzd_apply_p_right_tri to…
pluq_mmpf.h 2.7 KB 5 months ago fixed doxygen warnings
solve.c 4.8 KB 10 months ago added test code for mzd_kernel_left_pluq()
solve.h 4.6 KB 10 months ago added test code for mzd_kernel_left_pluq()
strassen.c 42.0 KB 9 months ago switch back to using threads if any additional thread is…
strassen.h 4.8 KB 5 months ago implemented timing experiment to calculate L1 and L2 cache…
trsm.c 22.1 KB 12 months ago refactoreded packedmatrix to allow more than one malloc…
trsm.h 4.7 KB 12 months ago refactoreded packedmatrix to allow more than one malloc…

README

INTRODUCTION
============

M4RI is a library for fast arithmetic with dense matrices over F2. The
name M4RI comes from the first implemented algorithm: The "Method of
the Four Russian"” inversion algorithm published by Gregory Bard. This
algorithm in turn is named after the "Method of the Four Russians"
multiplication algorithm which is probably better referred to as
Kronrod's method. 

M4RI is available at http://m4ri.sagemath.org

FEATURES
========

 * basic arithmetic with dense matrices over F2 (addition, equality
   testing, stacking, augmenting, sub-matrices, randomisation);

 * asymptotically fast O(n^log_2(7)) matrix multiplication via the "Method
   of the Four Russians" (M4RM) & Strassen-Winograd algorithm;

 * fast row echelon form computation and matrix inversion via the "Method
   of the Four Russians" (M4RI, O(n^3/log(n)));

 * asymptotically fast PLUQ factorisation; 

 * support for the x86/x86_64 SSE2 instruction set where available;

 * preliminary support for parallisation on shared memory systems via
   OpenMP;

 * and support for Linux and OS X (GCC), support for Solaris (Sun
   Studio Express) and support for Windows (Visual Studio 2008 Express).

OPENMP SUPPORT
==============
OpenMP support for parallel multiplication and elimination is enabled
with the

  --enable-openmp 

configure switch. If GCC is used to compile the library it is avised
to use at least GCC 4.3 since earlier versions have problems with
OpenMP in shared libraries. OpenMP support was introduced in GCC
4.2. Both MSVC and SunCC support OpenMP but we have no experience with
these yet.

Generally speaking better performance improvements can be expected on
dual-core Opteron CPUs than on dual-core Core2Duo CPUs. This is
because the later has a shared L2 cache which is already almost fully
utilised in the single-core implementation.

Overall, the speed-up is considerably but sublinear. See

  http://m4ri.sagemath.org/performance.html 

for details.