PARALLEL IMPLEMENTATION FOR VITERBI-BASED DETECTION
FIELD OF THE INVENTION
The invention deals with a Viterbi-based detection method for generating decisions corresponding to an input data sequence. The invention also deals with a detection device implementing such a detection method, and with a player and/or recorder for a recorded and/or recordable carrier comprising such a detection device.
The invention also deals with a receiver intended for receiving an input data sequence through a transmission channel, said receiver comprising such a detection device. The invention also deals with a transmission system comprising such a receiver.
Such a detection method may be advantageously used in many applications, in particular in magneto, magneto-optical disc systems, hard drives, satellite and mobile transmission systems.
BACKGROUND OF THE INVENTION
The Viterbi algorithm is known as the most efficient way of decoding convolutional codes. As explained in paragraph 7 of the book "Digital Baseband Transmission and Recording" by Jan W.M. Bergmans, published in 1996 by Kluwer Academic Publishers, the Viterbi algorithm consists in reformulating the detection of a data sequence as the search for the shortest path through a trellis of states describing the behaviour of the convolutional code used to generate the data sequence.
Each transition T(i,j) is defined by an initial state i of the trellis and a branch j connecting this initial state to a subsequent state of the trellis. Each branch is uniquely associated with a code word of the convolutional code.
When receiving an input word Z at stage k, the Viterbi decoder computes a branch metric BM(i,j,k) for each possible transition T(i,j). The branch metric BM(i,j,k) represents the distance between the input word Z and the code word associated with the transition T(i,j). The length of a path in the trellis is the sum of the branch metrics along that path. With a view to determining the shortest path through the trellis, a Viterbi detector keeps track, for each possible state i, of the surviving path P(i,k) that leads to that state, and of the associated path metric PM(i,k). Adding the branch metrics BM(i,j,k) to the path metrics PM(i,k-1) of the surviving paths extends those paths from stage k-1 to stage k. Of all the extended paths that lead to a given state i at stage k, the one with the smallest path metric survives while the others are discarded.
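The update described above can be illustrated with a minimal Python sketch of a single Add-Compare-Select step. The function name acs_step and the representation of the trellis as a dictionary keyed by (initial state, next state) are assumptions made for illustration; the source identifies a transition by its initial state and branch, and this sketch does not handle several branches between the same pair of states.

```python
def acs_step(prev_pm, branch_metrics):
    """One Add-Compare-Select step from stage k-1 to stage k.

    prev_pm: list of path metrics PM(i, k-1), indexed by state i.
    branch_metrics: dict mapping (i, next_state) to the branch metric of
        that transition at stage k (hypothetical layout, see lead-in).
    Returns (new_pm, pred): the path metrics at stage k and, for each
    state, the predecessor state on its surviving path.
    """
    n_states = len(prev_pm)
    new_pm = [float("inf")] * n_states
    pred = [None] * n_states
    for (i, nxt), bm in branch_metrics.items():
        candidate = prev_pm[i] + bm      # Add
        if candidate < new_pm[nxt]:      # Compare
            new_pm[nxt] = candidate      # Select
            pred[nxt] = i
    return new_pm, pred


# Two-state example with arbitrary illustrative metrics.
pm, pred = acs_step([0.0, 2.0],
                    {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 0.5, (1, 1): 0.0})
```

Because new_pm at stage k depends on the selection made at stage k-1, consecutive steps cannot be computed independently, which is the feedback loop discussed in the following paragraph.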
It can be seen from the above that the Viterbi algorithm comprises a data-dependent feedback loop that performs a so-called Add-Compare-Select operation. The speed of the Viterbi algorithm is intrinsically limited by this data-dependent feedback loop.
It is an object of the present invention to propose a Viterbi-based detection method and a detection device that overcome this limitation.
SUMMARY OF THE INVENTION
According to the invention, a detection method for generating decisions corresponding to an input data sequence comprises the steps of:
- dividing said input data sequence into overlapping blocks so as to provide a training region at the beginning of the blocks,
- distributing the blocks amongst Q parallel Viterbi detecting units so as to process Q consecutive blocks in parallel, thereby generating Q sets of decisions in parallel,
- discarding the decisions corresponding to said training regions,
- reordering the other decisions so as to generate at least one sequence of decisions.
In the detection method of the invention a training region is provided to each Viterbi detecting unit. This is achieved by dividing the input data sequence into overlapping blocks. In the regions of overlap the same data are available in at least two different Viterbi detecting units: one Viterbi detecting unit has them at the beginning of its block while the other one has them at the end of its block. The Viterbi detecting units use the overlap at the beginning of the block for training. The decisions coming from the training regions are discarded. The final decisions for these data are taken from the other Viterbi detecting unit, which has the same data at the end of its block.
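The four steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the names split_into_blocks, detect_parallel, payload_len and toy_detect are hypothetical, the detector is assumed to return one decision per input sample, the loop over a group of Q blocks runs sequentially where the invention uses Q units in parallel, and toy_detect is a stand-in with one sample of memory, not a Viterbi detector.

```python
def split_into_blocks(samples, payload_len, train_len):
    """Divide samples into overlapping blocks.

    Each block holds up to train_len training samples (the tail of the
    previous block) followed by payload_len new samples. Returns a list
    of (block, lead) pairs, where lead is the number of leading
    decisions to discard for that block.
    """
    blocks = []
    for start in range(0, len(samples), payload_len):
        lead = min(train_len, start)   # the first block has no predecessor
        blocks.append((samples[start - lead:start + payload_len], lead))
    return blocks


def detect_parallel(samples, detect, payload_len, train_len, q):
    """Process groups of q consecutive blocks, discard training decisions, reorder."""
    blocks = split_into_blocks(samples, payload_len, train_len)
    decisions = []
    for first in range(0, len(blocks), q):
        group = blocks[first:first + q]
        outputs = [detect(block) for block, _ in group]  # q units, conceptually parallel
        for out, (_, lead) in zip(outputs, group):
            decisions.extend(out[lead:])                 # drop the training region
    return decisions


def toy_detect(block):
    """Stand-in detector: 1 when current plus previous sample is positive."""
    out, prev = [], 0
    for x in block:
        out.append(1 if x + prev > 0 else 0)
        prev = x
    return out


samples = [1, -2, 3, -1, -1, 2, 2, -3, 1, 1, -2]
assert detect_parallel(samples, toy_detect, 3, 2, 2) == toy_detect(samples)
```

In this toy case the block-wise result matches the sequential one because the training region is at least as long as the detector's memory.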
Providing a training region makes it possible to avoid deterioration of the error rate while using parallel Viterbi detecting units.
The proposed parallel implementation of the Viterbi algorithm makes it possible to recover data at high speed from a carrier or a transmission system without requiring the use of expensive and power-consuming high-clock-rate digital hardware.
The proposed solution is particularly simple and flexible. It may be used with any type of Viterbi detecting units.
It is transparent to the rest of the system. Therefore it is easy to integrate in existing data flows within a chip. In particular, it is compatible with chips using parallel data processing. The embodiment claimed in claims 2 and 4 is directed to an integration into such chips: the detection device is designed to receive a number of parallel samples per clock cycle on its input and to generate a number of parallel decisions per clock cycle on its output.
BRIEF DESCRIPTION OF THE DRAWINGS
These and other aspects of the invention are further described with reference to the following drawings:
- figure 1 is a schematic diagram of a detection device according to the invention,
- figures 2 to 4 are illustrations of different implementations of the dividing step of a detection method according to the invention,
- figure 5 is a block diagram of the Viterbi detecting unit,
- figure 6 is a schematic diagram of a player according to the invention,
- figure 7 is a schematic diagram of a transmission system according to the invention.
DESCRIPTION OF PREFERRED EMBODIMENTS
Figure 1 gives an example of a detection device according to the invention. It comprises an input unit IN, Q parallel Viterbi detecting units V1, ..., VQ, and an output unit OUT. The input unit IN receives an input data sequence IS. In the embodiment described in Figure 1, the input data sequence IS is organized in P parallel sequences I1, ..., IP. This is not restrictive. The function of the input unit IN is to divide the input data sequence IS into overlapping blocks so as to provide a training region at the beginning of each block, and to distribute the blocks to the Q Viterbi detecting units so that Q consecutive blocks can be processed in parallel.
Examples of how to divide the input data sequence IS into overlapping blocks will be described later on with reference to Figures 2 to 4.
The Viterbi detecting units V1, ..., VQ output decisions from the blocks they receive. These decisions are forwarded to the output unit OUT.
The function of the output unit OUT is to discard the decisions corresponding to the training regions, and to reorder the other decisions so as to generate a sequence of decisions OS. In the embodiment described in Figure 1, the sequence of decisions OS is organized in P parallel sequences of decisions O1, ..., OP, which again is not restrictive.
The number Q of Viterbi detecting units is given by Q = KP + M, where:
- K is the number of clock cycles used by each Viterbi detecting unit to process a data symbol of the input data sequence IS,
- P is the number of parallel sequences in which the input data sequence IS is organized,
- M is the number of additional Viterbi detecting units needed to handle the overhead due to the training regions.
The values of K, P and M may vary depending on the requirements to be achieved. It is to be understood that when the input data sequence IS is organized into P parallel sequences, the input unit IN virtually reconstitutes the input data sequence IS in order to perform the division into overlapping blocks. In a similar way, the output unit OUT virtually constructs a single sequence of decisions from which the P parallel sequences of decisions are formed.
Figures 2 to 4 represent examples of divisions of the input data sequence into overlapping blocks. Four consecutive overlapping blocks K1, K2, K3 and K4 are represented.
By way of example, if three Viterbi detecting units V1, V2 and V3 are used, blocks K1 and K4 are sent to the Viterbi detecting unit V1, block K2 is sent to the Viterbi detecting unit V2 and block K3 is sent to the Viterbi detecting unit V3. In Figures 2 and 3 the same samples are available in two different Viterbi detecting units. In Figure 4 the same samples are available in two or three different Viterbi detecting units.
In Figure 2, each block comprises three regions: a first region in which it overlaps with the previous block, a second region without any overlap, and a third region in which it overlaps with the next block. In this case, the training region is the first region of each block. The number M of additional Viterbi detecting units needed here is lower than KP.
In Figure 3, each block comprises two regions: a first region in which it overlaps with the previous block and a second region in which it overlaps with the next block. In this case, the training region is also the first region of each block. The number M of additional Viterbi detecting units needed here is equal to KP.
In Figure 4, each block comprises five regions: a first region in which it overlaps with the two blocks that precede, a second region in which it overlaps with the previous block only, a third region in which it overlaps with the previous block and with the next block, a fourth region in which it overlaps with the next block only, and a fifth region in which it overlaps with the two blocks that follow. In this example, the training region corresponds to the first, second and third regions. The number M of additional Viterbi detecting units needed here is higher than KP and lower than 2KP.
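The three cases are consistent with a simple throughput estimate, given here as an assumption since the source states no formula for M: if every block of block_len samples contains train_len training samples whose decisions are discarded, the detecting units must process block_len samples for every block_len - train_len useful decisions, which suggests M of about KP x train_len / (block_len - train_len), rounded up. The function name detector_count and its parameters are hypothetical.

```python
def detector_count(k, p, block_len, train_len):
    """Estimate Q = KP + M under the throughput assumption in the lead-in.

    k: clock cycles a detecting unit needs per data symbol.
    p: number of parallel input sequences (samples per clock cycle).
    block_len: samples per block, training region included.
    train_len: training samples at the beginning of each block.
    Returns (q, m).
    """
    if not 0 <= train_len < block_len:
        raise ValueError("train_len must satisfy 0 <= train_len < block_len")
    base = k * p                          # units needed without any overlap
    useful = block_len - train_len        # decisions kept per block
    m = -(-base * train_len // useful)    # ceiling of base * train_len / useful
    return base + m, m


# Training shorter than half a block (as in Figure 2): M below KP.
q2, m2 = detector_count(k=1, p=4, block_len=500, train_len=100)
# Training equal to half a block (as in Figure 3): M equal to KP.
q3, m3 = detector_count(k=1, p=4, block_len=200, train_len=100)
# Training longer than half a block (as in Figure 4): M between KP and 2KP.
q4, m4 = detector_count(k=2, p=2, block_len=500, train_len=300)
```

The numeric values are illustrative only and are not taken from the figures.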
A basic diagram of a Viterbi detecting unit is represented in Figure 5. It comprises a branch metric calculation unit BMU, a path metric calculation unit PMU and a backtracking array BKU. The backtracking array is responsible for storing the surviving path and for taking the decisions at each stage. The training regions are used to initialise the path metric unit PMU and the backtracking array BKU. For storage applications, the size of the training region required to keep the error rate of the detection device unchanged compared with standard sequential Viterbi detection devices is small. Typically, it is in the order of 50 to 100 input samples: 30 to 50 input samples are used to initialise the backtracking array BKU while the remaining 20 to 50 input samples are used to initialise the path metric calculation unit PMU. Since only a small training period is required, the input data sequence IS can be divided into small blocks. Typically the size of the blocks is in the order of 250 to 500 samples for the existing optical storage systems.
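The roles of the three units can be illustrated with a minimal Python sketch of a two-state detector. The channel model is an assumption chosen for brevity and does not come from the source: each sample is taken to be z[n] = a[n] + a[n-1] plus noise, with bits a[n] in {0, 1}, so the trellis state is the previous bit. The function name viterbi_two_state is hypothetical, and the traceback is performed once at the end of the block rather than at each stage.

```python
def viterbi_two_state(z):
    """Detect bits a[n] from samples z[n] close to a[n] + a[n-1] (assumed channel)."""
    pm = [0.0, 0.0]        # path metrics; the starting state is unknown
    history = []           # backtracking array: predecessor of each state per stage
    for zk in z:
        new_pm, pred = [0.0, 0.0], [0, 0]
        for cur in (0, 1):
            # BMU: squared distance to the ideal sample of each transition,
            # added to the metric of the path it extends (PMU, "add").
            cands = [pm[prev] + (zk - (prev + cur)) ** 2 for prev in (0, 1)]
            # PMU: compare and select the survivor for state cur.
            pred[cur] = 0 if cands[0] <= cands[1] else 1
            new_pm[cur] = cands[pred[cur]]
        history.append(pred)
        pm = new_pm
    # BKU: trace the best path back from the best final state.
    state = 0 if pm[0] <= pm[1] else 1
    bits = []
    for pred in reversed(history):
        bits.append(state)
        state = pred[state]
    bits.reverse()
    return bits


# Noise-free samples for bits 1,0,1,1,0,0,1 preceded by a 0.
detected = viterbi_two_state([1, 1, 1, 2, 1, 0, 1])
```

In such a sketch the first decisions of a block are unreliable because the path metrics start from an arbitrary value, which is what a training region compensates for.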
Figure 6 gives a schematic diagram of an example of a player according to the invention. The player of Figure 6 is intended to play recorded media (for instance discs compliant with the standards CD, DVD, DVD+RW, Blu-ray...). It comprises:
- a reading unit RD intended to read data written on the recorded media DK,
- an equalizer EQ intended to filter the output of the reading unit RD,
- a detection device DN according to the invention,
- an error corrector COR intended to reduce the bit error rate,
- a decoder DEC,
- and a playback unit PL.
Figure 7 gives a schematic diagram of a transmission system according to the invention. The transmission system of Figure 7 comprises a transmitter TR, a transmission channel CX and a receiver RR. The reception chain is similar to the reproduction chain described with reference to Figure 6: it comprises an equalizer EQ', a detection device DN' according to the invention, an error corrector COR', and an application unit APPL.
It is to be noted that the detection method of the invention may be implemented either in hardware or in software on a number of digital signal processors running in parallel. When implemented in software, it requires neither shared memory nor high-bandwidth inter-processor connections, since each processor works only on its own block. This is an additional advantage of the invention.
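A software arrangement of this kind might look like the following Python sketch. It is an illustration only: worker threads from the standard concurrent.futures module stand in for the digital signal processors, and the names run_blocks_concurrently, detect and lead are hypothetical. Each worker receives only its own block and returns only its own decisions; no data are exchanged between workers while they run.

```python
from concurrent.futures import ThreadPoolExecutor


def run_blocks_concurrently(blocks, detect, q):
    """Run detect on each block using q workers, then discard and reorder.

    blocks: list of (block, lead) pairs, where lead is the number of
        training decisions to discard at the start of that block.
    detect: function returning one decision per sample of a block.
    """
    with ThreadPoolExecutor(max_workers=q) as pool:
        # map() returns results in the order of the submitted blocks,
        # so reordering reduces to concatenation.
        outputs = list(pool.map(detect, [block for block, _ in blocks]))
    decisions = []
    for out, (_, lead) in zip(outputs, blocks):
        decisions.extend(out[lead:])
    return decisions


# Stand-in detector for demonstration; not a Viterbi detector.
result = run_blocks_concurrently(
    [([1, 2, 3], 0), ([2, 3, 4, 5], 2)],
    lambda block: [10 * x for x in block],
    q=2,
)
```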
With respect to the described detection method, detection device, player/recorder, receiver and transmission system, modifications or improvements may be proposed without departing from the scope of the invention. The invention is thus not limited to the examples provided.
The word "comprising" does not exclude the presence of elements or steps other than those listed in the claims.