Clone wiki

sage_coding_project / Encoders_and_Decoders

Encoding: the difference between a code and its encoding

Mathematically, a code is usually defined as a subset of some ambient space (usually F^n), and does therefore not specify an encoding: the mapping from the message space (usually F^k) into the code. Indeed, the encoding process is very much non-canonical: for linear codes, any matrix unimodular equivalent to a given generator matrix is also a generator matrix specifying a different encoding.

A model is therefore to have Encoder-objects (similar to Decoder objects, see below), which are constructed with the input of a given code. For linear codes, an Encoder object is quite simple and can usually just multiply the message vector with the (standard) generator matrix. This could be the EncoderStandard. Another broad encoder for linear codes would be EncoderSystematic. Contrariwise, for LDPC codes, for instance, an efficient encoder is a slightly more sophisticated algorithm.

To any Encoder for a linear code corresponds a generator matrix, and this is actually the more obvious place for such a function. If we give a generator_matrix for any linear code, we therefore have a design choice: does Code.generator_matrix return the generator_matrix of EncoderStandard called on this Code object, or does EncoderStandard reversely embody the encoder induced by Code.generator_matrix?


Decoding is intimately connected to Metrics and Channels.

Common decoding models

  • Decoding Hamming errors up to half-the-minimum distance
  • Decoding erasures
  • Decoding mixed errors/erasures
  • Soft-decoding
  • Local decoding

Multiple Decoding algorithms, testing and benchmarking

Even within a given metric (see below), a given code class can have multiple decoding algorithms, each requiring different sets of parameters and having different decoding capabilities. Furthermore, not all decoding algorithms are going to have the same signature: list decoders will return lists, soft-out will return soft codewords, LDPC-decoder might return something which is not a codeword but with most errors corrected, etc.

On the other hand, we would also like to have relatively general functions for testing and benchmarking decoders. Ideally, I (jsrn) would like to write something like:

sage: Algo = MyDecodingAlgo(C)
sage: test_decoding_algorithm(C, Algo)
All tests passed!
Decoding took on average 0.2141211 seconds

In the above example, Algo could be a DecodingAlgorithm object, or it could simply be a function, taking a received word in the ambient space of C and doing something to it (or failing).

I (jsrn) suggest that decoders should be implemented as objects, members of a class DecodingAlgorithm. This object can tell things about itself, such as decoding_radius and is_list_decoder, and possibly even its decoding metric (and channel). It also allows a natural place to put algorithm-specific parameters, namely in the constructor of the decoding algorithm. Lastly, and importantly, it supports pre-computation of certain values at construction time of the decoder, reducing the on-line decoding time.

On the other hand, people caring less about the specific decoding algorithm should have access to a decoder directly on the code object. This should be a sensible default, chosen depending on the code, etc. One could possibly provide an optional argument for choosing between the known decoders. The decoder object would then be created and disposed of after the call, or possibly cached inside the code object.