Skip to content

Concepts

Ricardo Fonseca edited this page Nov 18, 2017 · 16 revisions

Here are the most relevant concepts related to RaptorQ that a user of OpenRQ needs to be aware of. For a brief introduction to RaptorQ, visit the Wiki page "What is RaptorQ?".


Source data

Source data is the generic term we give to the object to be encoded/decoded by RaptorQ. It can be, for example, a fixed-size sequence of bits, a file or a stream of bits.

Source blocks

Source data is first partitioned into contiguous portions, called source blocks, in order to allow efficient decoding with limited working memory.

There is a pair of RaptorQ encoder/decoder per source block and each source block can be processed independently. A sender may for example encode multiple source blocks in parallel if there is sufficient working memory to do so. A receiver with limited working memory may configure a maximum size for a source block, which in turn enforces the sender to partition the source data appropriately. The source data is reconstructed from the source blocks by a receiver as soon as every source block is fully decoded.

Symbols

A symbol is the most basic data unit with a fixed size used in RaptorQ.

It is categorized as a source symbol if it is a contiguous portion of a source block, or as a repair symbol if it is a generated piece of data containing necessary redundancy to assure error recovery with high probability. Both repair and source symbols are part of an encoder output, as a consequence of RaptorQ being a systematic code. We refer to these in general as encoding symbols, which are transmitted inside encoding packets.

An encoding packet contains one or more encoding symbols, as well as extra information that allows the receiver to identify the individual symbols. As the encoded symbols are received, they are being collected by the decoder. When sufficient symbols are available, the receiver starts the decoding process and obtains the source data with high probability (decoding failures are addressed in another section).

Diagram of RaptorQ

Communication using RaptorQ

FEC Parameters

When transferring encoded data to a receiver, the receiver needs to know the source data length since encoded data may be padded, which increases the amount of decodable data. The receiver also needs to know into how many source blocks should the data be partitioned, as well as the number of source symbols in each source block. All of this information must be delivered from the sender to the receiver before the receiver starts collecting encoding symbols. We call this information the FEC parameters.

The FEC parameters are:

  • source data length: the length of the source data, in number of bytes. Since encoded data may contain extra padding bytes, this value allows a decoder to infer the real size of the decoded data.
  • symbol size: the size of a symbol, in number of bytes. This value represents the size of an encoding symbol (source or repair), except the size of the last source symbol, which may be smaller.
  • number of source blocks: the number of source blocks into which the source data is partitioned. Each source block is encoded/decoded independently, and not every one is divided into the same number of source symbols.
  • interleaver length*: the number of sub-blocks per source block into which the source data is interleaved. This value influences the level of uniform interleaving used before encoding a source block. A value of 1 means no interleaving is used, and a higher value means more interleaving per source block.

*Currently, interleaving is disabled so the interleaver length can only be 1.

Symbol overhead

Imagine a source block being divided into K source symbols. Let N be the number of received encoding symbols (source or repair) that are available to a RaptorQ decoder.

Since RaptorQ is a systematic code, if all K source symbols are received then the decoding is immediate. When that is not the case, the decoder will try to fill in the gaps of the missing source symbols with the received repair symbols. Whichever the case, the decoder requires at least N = K encoding symbols in order to try recovering the source data.

However, K encoding symbols may not be sufficient for a successful decoding when some of those are repair symbols (RaptorQ is a probabilistic code). To increase the probability of successful decoding in this case, a decoder may be configured to start the decoding process only when it has received N > K encoding symbols. The higher N is, the higher the probability. We call the N - K symbols the symbol overhead.

Decoding failure

When decoding a source block with a given symbol overhead, there is a small probability for decoding failure. This probability decreases as the symbol overhead increases. However small this probability is, decoding failures must still be addressed appropriately. There are multiple strategies for handling decoding failures, each dependent on the properties of the application or communication channel. For example, possible strategies include:

  • Requesting missing source symbols. This strategy works well if there is a back channel from receivers to senders of data. Additionally, receivers may also indicate which repair symbols have been received, so that senders may transmit new ones.
  • Waiting for more encoding symbols. This strategy works well if there is no back channel (or if it would be expensive to have one) and the sender is continually transmitting encoding packets in a cyclic fashion.
  • Working with available data. Sometimes applications, such as video streaming ones, can simply tolerate data losses (video frames for example). In these situations, the receiver may use the available data composed of the decoded source symbols so far.