Investigate timings for the Gao algorithm
I ran timing tests on the Gao algorithm. I picked a GRS code of length i, dimension i/2, over GF(1009), iterating on i between 100 and 1000 and increasing i by 100 at each iteration. At each step I measured timings for 10 successive decodings (on a random "codeword" with as many errors as half the minimum distance) and plotted the timing graph below, taking the median of the 10 timings measured at each step.
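The median-of-10 measurement protocol can be sketched in plain Python. This is an illustrative sketch, not the actual Sage benchmark: `decode_once` is a hypothetical stand-in for one decoding at code length i.

```python
import statistics
import timeit

def median_timing(fn, repeats=10):
    """Time `repeats` independent runs of fn() and return the median,
    which is less sensitive to outliers (GC pauses, etc.) than the mean."""
    times = [timeit.timeit(fn, number=1) for _ in range(repeats)]
    return statistics.median(times)

def run_experiment(decode_once, lengths=range(100, 1001, 100)):
    """One median timing per code length i, mirroring the plot's x-axis.
    `decode_once` is a hypothetical callable, not the real decoder."""
    return {i: median_timing(lambda: decode_once(i)) for i in lengths}
```

A quick usage example would be `run_experiment(lambda i: sum(range(i)))`, replacing the dummy workload with the real decoding call.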
Weak Popov form (wPf) is in red, partial xgcd in green.
It's not really regular, which seems a bit weird...
Comments (12)
-
reporter -
Wow, that's a surprisingly big difference. Are the timings again over an entire decoding, or only over the Euclidean-algorithm part? Also, did you compare with the built-in xgcd, to see how much could potentially be gained by optimising/cythonizing your partial xgcd implementation?
-
reporter Surprising indeed.
It's only over the Euclidean part.
The procedure is:
1. generate a random codeword
2. perform pre-computations (Lagrange interpolation, ...)
3. create the matrix for the wPf computation and apply weights
4. measure time: xgcd
5. measure time: wPf
Comparison with the built-in xgcd is on the way; I'll update this thread as soon as I have the results.
-
reporter Here it is. Green is my implementation, red is the built-in one.
I built the code the same way as before. The partial xgcd stops as soon as one of the remainders in the Euclidean sequence has degree below (n + k)/2. For each code length I picked 100 pairs of random polynomials of degree 1000 (coefficients in GF(1009)) and performed the two xgcds on these polynomials.
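For reference, the early-termination rule can be illustrated with a minimal pure-Python partial extended Euclidean algorithm over GF(p) (dense coefficient lists, lowest degree first). This is only a sketch of the technique, not the implementation under discussion; `dstop` plays the role of (n + k)/2, and only the cofactor of the second input is tracked, as in the decoder.

```python
def trim(a):
    """Drop trailing zero coefficients (lists are low-to-high degree)."""
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_sub(a, b, p):
    """a - b over GF(p)."""
    out = [0] * max(len(a), len(b))
    for i, c in enumerate(a):
        out[i] = c
    for i, c in enumerate(b):
        out[i] = (out[i] - c) % p
    return trim(out)

def poly_mul(a, b, p):
    """Schoolbook product of a and b over GF(p)."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def poly_divmod(a, b, p):
    """Quotient and remainder of a by b over GF(p), p prime, b nonzero."""
    a = trim(a[:])
    q = [0] * max(len(a) - len(b) + 1, 1)
    inv = pow(b[-1], p - 2, p)  # inverse of b's leading coefficient
    while a and len(a) >= len(b):
        shift = len(a) - len(b)
        c = a[-1] * inv % p
        q[shift] = c
        for i, bc in enumerate(b):
            a[shift + i] = (a[shift + i] - c * bc) % p
        trim(a)
    return trim(q), a

def partial_xgcd(r0, r1, dstop, p):
    """Extended Euclid on (r0, r1), stopping at the first remainder of
    degree < dstop.  Returns (r, t) with r = s*r0 + t*r1 for some
    untracked s: only the 'left' auxiliary polynomial is maintained."""
    t0, t1 = [], [1]
    while r1 and len(r1) - 1 >= dstop:
        q, rem = poly_divmod(r0, r1, p)
        r0, r1 = r1, rem
        t0, t1 = t1, poly_sub(t0, poly_mul(q, t1, p), p)
    return r1, t1
```

For example, over GF(7) with r0 = (x+1)(x+2) and r1 = (x+1)(x+3) and dstop = 2, one division step returns a scalar multiple of x + 1 together with its cofactor.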
-
Why is your code getting faster with longer polynomials?
I don't understand what you're doing: I guess you don't pick random polynomials of degree 1000, but of degree = code length? Be aware that the polynomials in decoding don't behave like random polynomials, so the speed profiles might be somewhat different for random polynomials. However, your test here is fine, since we're just trying to get an impression of the potential speed gains.
Btw., the built-in xgcd: does it call a low-level library xgcd? Note that this question might not have a unique answer, since different libraries are used for different fields (e.g. GF(2) is different from GF(2^n) is different from GF(p) for p > 2, etc.), and for some of these libraries a low-level xgcd might be employed, while for others not. -
reporter No, I was always picking polynomials of degree 1000. That wasn't smart; I redid the timings with degree = code length, plot below.
Regarding your question about the built-in xgcd: a different xgcd implementation is used depending on the nature of the input. One is written in rings/integer.pyx, and several others in rings/polynomial/ (one using FLINT, another using NTL, another calling the generic one for univariate polynomials). I'm not sure it depends on the nature of the coefficients; I'd tend to say it depends on the library used to create the polynomials. The generic one for univariate polynomials is in rings/ring.pyx and is called _xgcd_univariate_polynomial. -
Ok, good job. These timings are very encouraging: basically, it's not worth it (IMHO) to spend too much time chasing after that factor <2. Thus we can just be happy with your partial xgcd implementation :-)
What you wrote "it depends on the library used to create the polynomials" is exactly what I meant. It speaks of the efficiency of Sage/Python's layer to C/C++ that there is so little loss in speed between your Sage implementation and the library implementation.
-
Very nice work, David. How does the comparison go for larger instances (maybe unrealistic in coding theory), say length 10^5 or 10^6? I kind of disagree with Johan: if the two implementations behave similarly (say, within a factor < 2), then let's use the one that is already there, in the right place, and not duplicate code for some specific instance of a computation! This way we ensure that if someone shows up with a drastically faster xgcd in the future, we'll silently and smoothly benefit from the upgrade.
-
The problem is that the built-in xgcd does not support early termination, and it's a lot of work to add that to all the polynomial libraries that are used. And I don't know whether this particular variant of xgcd (early termination + computing only the "left" auxiliary polynomials) is generally useful. -
reporter And they're all written in Cython, which implies extra work.
Anyway, here are the timings Clément requested! I did only 10 instances of each xgcd per step; I assume you'll guess why. It's over GF(100003), and the code length increases by 10^4 at each step. -
Ah, ok: for large degrees over the prime field, the underlying polynomial library has a d*log(d) algorithm for the gcd. David could also quite easily implement that if we wanted :-)
-
reporter - changed status to resolved
I did my timings again, with a GRS code built as explained above. This time though, it's 100 decodings per iteration instead of 10 (still random codewords with as many errors as floor((n-k)/2)). Weak Popov form is still in red, partial xgcd in green; the y scale is in seconds. And this time, only the algorithms themselves were timed, not the whole decoding process.