Wiki

Clone wiki

m4ri / Performance

Performance

This page contains some benchmarketing results for M4RI and other systems. If you think that any result presented here is wrong and/or misleading, please let us know so that we can fix it. All times are in seconds.

Matrix Multiplication

Multiplying two random square matrices over \(\mathbb{F}_2\) of the given dimension. Magma and M4RI both implement Strassen-Winograd on top of M4RM (cf. Further Reading). Our multicore code maxes out at 4 cores.

Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz (strombenzin@rhul)

n M4RI/1 core b09b6ef M4RI/2 cores b09b6ef M4RI/4 cores b09b6ef Magma V2.20-5
8192 0.33184 0.19273 0.10365 0.430
10000 0.62534 0.37489 0.21772 0.710
16384 2.37502 1.38256 0.73492 2.880
32768 16.73710 9.71417 5.12534 19.840
65536 121.48419 68.23255 35.74736 137.580
  • M4RI timings were produced as follows:

    $ export OMP_NUM_THREADS=1; for n in 8192 10000 16384 32768 65536; ./bench_multiplication -t 7200 $n $n $n 0 0
    $ export OMP_NUM_THREADS=2; for n in 8192 10000 16384 32768 65536; ./bench_multiplication -t 7200 $n $n $n 0 1
    $ export OMP_NUM_THREADS=4; for n in 8192 10000 16384 32768 65536; ./bench_multiplication -t 7200 $n $n $n 0 1
    
  • Magma timings were produced as follows:

    n:=8192; A:=RandomMatrix(GF(2), n, n); B:=RandomMatrix(GF(2), n, n); time C:=A*B;
    

Rank

Computing the rank of random square matrices over \(\mathbb{F}_2\). We are comparing M4RI, Magma and GF2Toolkit. The benchmarks below indicate that we still have some work to do.

Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz (strombenzin@rhul)

n M4RI/PLE b09b6ef M4RI/M4RI b09b6ef M4RI/MMPF b09b6ef Magma V2.20-5 GF2Toolkit
8192 0.15024 0.12097 0.13964 0.750 0.12268
10000 0.27492 0.21744 0.26816 1.300 --
16384 1.08527 1.04686 1.24549 3.080 0.86451
32768 7.88854 9.11358 14.85795 20.830 6.30347
65536 56.96278 69.32725 166.14155 111.790 46.49620
  • M4RI timings were produced as follows:

    $ for n in 8192 10000 16384 32768 65536; ./bench_rank -t 7200 $n $n ple 0
    $ for n in 8192 10000 16384 32768 65536; ./bench_rank -t 7200 $n $n m4ri 0
    $ for n in 8192 10000 16384 32768 65536; ./bench_rank -t 7200 $n $n mmpf 0
    
  • Magma timings were produced as follows:

    n:=2^16; A:=RandomMatrix(GF(2),n,n); time E:=EchelonForm(A);
    
  • GF2Toolkit timings were produced by patching testRankComputation.cc and using the best timing reported (typically for 128-bit wide words).

Old

Timings for previous versions of M4RI are archived here

Updated