Encoding: the difference between a code and its encoding
Mathematically, a code is usually defined as a subset of some ambient space (usually F^n), and does therefore not specify an encoding: the mapping from the message space (usually F^k) into the code. Indeed, the encoding process is very much non-canonical: for linear codes, any matrix unimodular equivalent to a given generator matrix is also a generator matrix specifying a different encoding.
A model is therefore to have Encoder-objects (similar to Decoder objects, see below), which are constructed with the input of a given code. For linear codes, an Encoder object is quite simple and can usually just multiply the message vector with the (standard) generator matrix. This could be the EncoderStandard. Another broad encoder for linear codes would be EncoderSystematic. Contrariwise, for LDPC codes, for instance, an efficient encoder is a slightly more sophisticated algorithm.
To any Encoder for a linear code corresponds a generator matrix, and this is actually the more obvious place for such a function. If we give a generator_matrix for any linear code, we therefore have a design choice: does Code.generator_matrix return the generator_matrix of EncoderStandard called on this Code object, or does EncoderStandard reversely embody the encoder induced by Code.generator_matrix?
Decoding is intimately connected to Metrics and Channels.
Common decoding models
- Decoding Hamming errors up to half-the-minimum distance
- Decoding erasures
- Decoding mixed errors/erasures
- Local decoding
Multiple Decoding algorithms, testing and benchmarking
Even within a given metric (see below), a given code class can have multiple decoding algorithms, each requiring different sets of parameters and having different decoding capabilities. Furthermore, not all decoding algorithms are going to have the same signature: list decoders will return lists, soft-out will return soft codewords, LDPC-decoder might return something which is not a codeword but with most errors corrected, etc.
On the other hand, we would also like to have relatively general functions for testing and benchmarking decoders. Ideally, I (jsrn) would like to write something like:
sage: Algo = MyDecodingAlgo(C) sage: test_decoding_algorithm(C, Algo) All tests passed! Decoding took on average 0.2141211 seconds
In the above example,
Algo could be a
DecodingAlgorithm object, or it could
simply be a function, taking a received word in the ambient space of
doing something to it (or failing).
I (jsrn) suggest that decoders should be implemented as objects, members of a
DecodingAlgorithm. This object can tell things about itself, such as
is_list_decoder, and possibly even its decoding metric
(and channel). It also allows a natural place to put algorithm-specific
parameters, namely in the constructor of the decoding algorithm. Lastly, and
importantly, it supports pre-computation of certain values at construction time
of the decoder, reducing the on-line decoding time.
On the other hand, people caring less about the specific decoding algorithm should have access to a decoder directly on the code object. This should be a sensible default, chosen depending on the code, etc. One could possibly provide an optional argument for choosing between the known decoders. The decoder object would then be created and disposed of after the call, or possibly cached inside the code object.