GB2343322A

GB2343322A - Matching basis functions

Info

Publication number: GB2343322A
Application number: GB9823957A
Authority: GB
Inventors: Colin Davies
Original assignee: NDS Ltd; Tandberg Television AS
Current assignee: Synamedia Ltd; Ericsson Television AS
Priority date: 1998-10-31
Filing date: 1998-10-31
Publication date: 2000-05-03
Also published as: GB9823957D0

Abstract

A set of basis functions is matched to a digital signal which may be a broadcast television signal including a succession of television frames. The basis functions are used for compression encoding by a process known as matching pursuits. The basis functions are separable along first and second dimensions and the process of compression encoding performs, in parallel, correlations between the digital signal and each basis function in the first dimension 31a, 31b, 31c, and uses the results of each correlation in the first dimension to perform, in parallel, correlations between the results and each basis function in the second dimension 32a, 32b, 32c.

Description

Digital Signal Processing

This invention relates to improvements in or relating to the processing of digital signals, for example broadcast digital signals.

Broadcasting of digital signals is now a well-established technology. Information in the digital domain can undergo a number of signal processing processes which essentially compress the amount of data that needs to be transmitted.

For example in digital television, it is possible to compress the signal such that less than 1% of the required information needs to be transmitted. This means that many digital television channels can be transmitted in the same bandwidth that would previously be used for only one analogue television channel.

There are a number of well-known techniques for compressing digital data; these include the MPEG II standard for digital compression and wavelet compression techniques. A problem that exists with this type of signal processing is to find a numerical representation of signals that is well suited for performing the processes of coding, noise removal, feature enhancement or pattern recognition. It is desirable to decompose the signals into basis elements that are locally adapted to the signal structures and frequency content. The classical methods identified above build signal representations with linear transforms. These transforms build signal representations that are not adapted to the local signal properties.

Matching pursuits is an alternative method of decomposing a signal into elementary structures that are adapted to the local signal properties. Matching pursuits has excellent performance when compared with other compression techniques. Matching pursuits performs signal analysis using an over-complete dictionary set of basis waveforms. With a complete set of basis waveforms it is possible to completely represent a given signal by a unique linear combination of the waveforms. With an over-complete basis set there may be many different combinations of the basis waveforms which can represent the same given signal. This means that the most appropriate representation can be chosen for different circumstances.

Reference may be had to our co-pending UK Patent Application Nos. 9802233.8, 9802235.3 and 9802238.7 for further information on the application of matching pursuits to decompose a video signal into elementary structures using basis functions that are separable with respect to orthogonal axes.

Two main problems occur in designing a real-time matching pursuits encoder to encode television signals. Firstly, the correlation performed by the encoder is very complex and must be repeated many times within an overall time constraint less than a frame period. Secondly, an entire frame has to be loaded into the encoder before the search to select the best basis functions can be made. This is due to the fact that the algorithm used in the encoder is a frame based algorithm. The vertical blanking interval between frames is too short to allow currently available encoders to perform real-time matching pursuits encoding.

The present invention aims to speed the operation of matching a set of basis functions to a digital signal.

According to one aspect of the invention, there is provided a method of matching a set of basis functions to a digital signal, the basis functions being separable along first and second dimensions, the method comprising the steps of performing, in parallel, correlations between the digital signal and each basis function in the first dimension, and using the results of each correlation in the first dimension to perform, in parallel, correlations between the said results and each basis function in the second dimension.

According to a further aspect of the invention, there is provided apparatus for matching a set of basis functions to a digital signal, the basis functions being separable along first and second dimensions, the apparatus comprising first correlation means to perform, in parallel, correlations between the digital signal and each basis function in the first dimension, and second correlation means for using the results of each correlation in the first dimension to perform, in parallel, correlations between the said results and each basis function in the second dimension.

The invention will now be described, by way of example, with reference to the accompanying drawings in which:

Figure 1 is a prior art block diagram illustrating signal decomposition using matching pursuits;

Figure 2 shows some example two-dimensional functions which are the products of separable one-dimensional basis functions in orthogonal x and y-axis directions;

Figure 3 is a block diagram of a known video signal encoder;

Figure 4 shows a parallel structure, according to the present invention, of a two-dimensional correlator for use in the encoder of Figure 3;

Figure 5 shows a parallel structure, according to the present invention, for a one-dimensional correlator for use in the encoder of Figure 3; and

Figure 6 shows an improved video signal encoder according to the present invention.

The iterative matching pursuits process of decomposing a signal with elementary structures of an over-complete dictionary basis set is shown in the block diagram of Figure 1. The original signal is entered in step 10. In step 11, the algorithm initialises by calculating and storing the correlation results between the signal and the dictionary set of basis waveforms. In step 12, the data for one or more of the best correlation results is stored in a structure codebook. In step 13, a decision is made whether to continue coding. For compression purposes this may be based on whether a given bit budget has been reached. If the decision is yes, then the selected correlation elements of step 12 are subtracted, in step 14, from the original signal entered at step 10 and the stored correlation results are updated. The process loops back to step 12. If the decision in step 13 is no, then the signal can be reconstructed in step 15 from the stored structure book. The stored matching pursuits structure book will be an efficiently coded representation of the original signal which can be stored or transmitted for decoding by a decoder.
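The loop of Figure 1 can be illustrated with a minimal one-dimensional sketch. It is not the encoder of this specification: it assumes unit-norm dictionary waveforms, uses a fixed count of selected elements (`max_atoms`, a hypothetical stand-in for the bit budget) and recomputes every correlation after each subtraction instead of updating the stored results incrementally. All function names are illustrative.

```python
def dot(a, b):
    """Correlation (inner product) of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, dictionary, max_atoms):
    """Greedy decomposition of `signal` over unit-norm dictionary waveforms."""
    residual = list(signal)            # step 10: the signal to be coded
    structure_book = []                # (waveform index, coefficient) pairs
    # Step 11: correlate the signal with every waveform in the dictionary.
    correlations = [dot(residual, atom) for atom in dictionary]
    while len(structure_book) < max_atoms:   # step 13: stand-in for a bit budget
        # Step 12: pick and store the best correlation result.
        best = max(range(len(dictionary)), key=lambda k: abs(correlations[k]))
        coeff = correlations[best]
        if coeff == 0:
            break                      # nothing left that the dictionary can match
        structure_book.append((best, coeff))
        # Step 14: subtract the selected element and update the correlations.
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
        correlations = [dot(residual, atom) for atom in dictionary]
    return structure_book, residual


def reconstruct(structure_book, dictionary, length):
    """Step 15: rebuild an approximation from the stored structure book."""
    out = [0.0] * length
    for index, coeff in structure_book:
        out = [o + coeff * a for o, a in zip(out, dictionary[index])]
    return out
```

At every iteration the reconstruction plus the residual equals the original signal, so the residual energy indicates how much of the signal remains uncoded.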

The analysis of multi-dimensional signals with multi-dimensional functions is a computationally intensive task. Consider a discrete two-dimensional signal, e.g. an image, and its correlation with a discrete two-dimensional function of width 20 and height 20 units. Calculating the correlation at each location in the discrete two-dimensional signal requires 20*20 = 400 multiplies and adds.

If the discrete multi-dimensional function is separable with respect to its co-ordinate axes, then the computation can be significantly reduced. Figure 2 shows some examples of two-dimensional functions which are the products of separable one-dimensional basis functions in the orthogonal x and y-axis directions.

With separable functions the correlation of the discrete two-dimensional signal and the discrete two-dimensional function can be broken into two stages. First the results of correlating the discrete two-dimensional signal with the one-dimensional function in the x-direction are calculated and stored. Then the correlation of these intermediate results and the one-dimensional function in the y-direction is calculated to produce the two-dimensional correlation result.

Calculating the correlation of a separable discrete two-dimensional function of width 20 units and height 20 units at each location in the discrete two-dimensional signal requires 20+20 = 40 multiplies and adds. This is a significant saving in computation over the conventional two-dimensional correlation.
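The saving can be checked numerically with a short sketch that performs the same correlation both ways and counts multiply-and-add operations. The function names are illustrative, and wrap-around indexing at the image edges is an assumption made here so that every location has a result; the specification does not say how edges are handled.

```python
def correlate_direct(image, fx, fy):
    """Direct 2-D correlation; returns (results, multiply-add count)."""
    h, w = len(image), len(image[0])
    # The 2-D function is the product of the two 1-D functions.
    kernel = [[wy * wx for wx in fx] for wy in fy]
    out = [[0] * w for _ in range(h)]
    ops = 0
    for r in range(h):
        for c in range(w):
            acc = 0
            for j, kernel_row in enumerate(kernel):
                for i, k in enumerate(kernel_row):
                    acc += image[(r + j) % h][(c + i) % w] * k
                    ops += 1
            out[r][c] = acc
    return out, ops


def correlate_separable(image, fx, fy):
    """Two-stage correlation; returns (results, multiply-add count)."""
    h, w = len(image), len(image[0])
    ops = 0
    # Stage 1: x-direction correlation, stored as intermediate results.
    inter = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0
            for i, wx in enumerate(fx):
                acc += image[r][(c + i) % w] * wx
                ops += 1
            inter[r][c] = acc
    # Stage 2: y-direction correlation of the intermediate results.
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0
            for j, wy in enumerate(fy):
                acc += inter[(r + j) % h][c] * wy
                ops += 1
            out[r][c] = acc
    return out, ops
```

For one-dimensional functions of length 20, dividing each operation count by the number of image locations gives 400 for the direct form and 40 for the separable form, and the two result arrays are identical.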

For the two-dimensional functions of Figure 2, the computation savings when using separable functions will be even greater since intermediate x-correlation results can be used in the calculations of multiple two-dimensional functions. The savings discussed for the two-dimensional example will be proportionally greater for functions in higher dimensions.
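The effect of sharing the intermediate x-correlation results can be estimated with a small counting sketch. The function name is hypothetical, and the formulas are an assumption of this sketch: every one-dimensional function in a given direction has the same length, and each x-direction result is computed once and then reused by every y-direction function.

```python
def multiply_adds_per_location(n_x, n_y, len_x, len_y):
    """Multiply-adds per image location for a dictionary of n_x * n_y functions.

    Returns counts for three strategies:
      direct             - each 2-D function correlated in full
      separable_unshared - two stages, repeated for each 2-D function
      separable_shared   - each x-direction result computed once and reused
    """
    direct = n_x * n_y * len_x * len_y
    separable_unshared = n_x * n_y * (len_x + len_y)
    separable_shared = n_x * len_x + n_x * n_y * len_y
    return direct, separable_unshared, separable_shared
```

Under these assumptions, a 3 x 3 set of functions of length 20 needs 3600 multiply-adds per location when correlated directly, 360 when each function is separated on its own, and 240 when the x-direction results are shared.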

Even though the computational complexity can be reduced by using separable basis functions, the fact remains that a correlation result must be calculated at every pixel location in the image for all basis functions. Assuming an image size of 720 x 576 = 414720 pixels and assuming a dictionary size of 15 x 15 = 225 separable basis functions, the total number of correlation results = 225 x 414720 = 93.3 million.
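The figure of 93.3 million follows directly from the image and dictionary sizes assumed above, as this one-function sketch confirms (the function name is illustrative only):

```python
def correlation_results_per_frame(width, height, n_x, n_y):
    """One correlation result per pixel location per two-dimensional basis function."""
    return width * height * n_x * n_y


total = correlation_results_per_frame(720, 576, 15, 15)
```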

As stated earlier, the time taken for the correlation iterations must be less than the frame period. This can be demonstrated by considering a simplified typical encoder as shown in Figure 3. In Figure 3, the input video signal at an input terminal 20 is made up from a series of still images that arrive at a constant frame rate.

In the case where the input signal is an interlaced European standard television signal, the frames arrive every 0.04 seconds. The frames are applied to a difference circuit 21 having an output connected to a compression engine 22. The output from the compression engine 22 is connected through a feedback loop including an inverse compression engine 23 to supply reference frames to a motion estimator/compensator 24. The motion estimator/compensator is also connected to receive each input frame from the input terminal 20. A motion compensated frame is supplied by the estimator/compensator 24 to the difference circuit 21 in response to each input frame and reference frame in a manner well known in the art. The difference circuit 21 generates a motion residual signal for application to the compression engine 22. The reference frame is obtained by compressing and then decompressing the previous frame and this implies that the compression and decompression algorithms must together have a maximum latency or delay of less than the input frame period. The entire encoding, including the correlation iterations, and the decoding must be completed in less than a frame period.
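The dependency that creates this latency constraint can be seen in a toy model of the coding loop of Figure 3. Everything in this sketch is a stand-in: a uniform quantiser with a hypothetical step size replaces the compression engine 22, its inverse replaces the inverse compression engine 23, and the motion estimator/compensator 24 is reduced to zero motion, so the prediction is simply the reference frame. Frames are flat lists of integer samples.

```python
STEP = 4  # hypothetical quantiser step standing in for the compression engine


def compress(residual):
    """Stand-in for compression engine 22."""
    return [round(v / STEP) for v in residual]


def decompress(codes):
    """Stand-in for inverse compression engine 23."""
    return [c * STEP for c in codes]


def encode_sequence(frames):
    reference = [0] * len(frames[0])     # no reference before the first frame
    stream = []
    for frame in frames:
        predicted = reference            # estimator/compensator 24, zero motion
        residual = [f - p for f, p in zip(frame, predicted)]   # difference circuit 21
        codes = compress(residual)
        stream.append(codes)
        # Feedback loop: the next reference is the decompressed current frame,
        # so it cannot be formed until compression and decompression finish.
        reference = [p + d for p, d in zip(predicted, decompress(codes))]
    return stream


def decode_sequence(stream):
    reference = [0] * len(stream[0])
    frames = []
    for codes in stream:
        reference = [p + d for p, d in zip(reference, decompress(codes))]
        frames.append(reference)
    return frames
```

Because the encoder predicts from the same decompressed frames that the decoder reconstructs, the reconstruction error in this sketch never exceeds half the quantiser step and does not accumulate from frame to frame.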

The matching pursuits algorithm searches an entire frame for the best basis function or functions to select. This means that a basis function or functions can only be selected after an entire frame has been loaded into the encoder through the input terminal 20. In the input video signal, there is a vertical blanking interval between successive frames but this delay, which is about 16% of the total frame period, has hitherto been insufficient to perform real-time matching pursuits encoding using conventional techniques.
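A rough time budget follows from the figures quoted above. The sketch below uses only values stated in the text (a 0.04 second frame period, a blanking interval of about 16% of it, and 93.3 million correlation results per frame); treating the blanking interval as the only time available for the correlations is a simplifying assumption of this sketch, and the function name is hypothetical.

```python
FRAME_PERIOD_S = 0.04            # one frame every 0.04 seconds
BLANKING_FRACTION = 0.16         # blanking interval is about 16% of the frame period
RESULTS_PER_FRAME = 93_312_000   # 225 basis functions x 414720 pixels


def required_rate(results, window_s):
    """Correlation results per second needed to finish within window_s seconds."""
    return results / window_s


blanking_s = FRAME_PERIOD_S * BLANKING_FRACTION
rate_over_frame = required_rate(RESULTS_PER_FRAME, FRAME_PERIOD_S)
rate_in_blanking = required_rate(RESULTS_PER_FRAME, blanking_s)
```

On these figures the blanking interval is about 6.4 ms. Spreading the work over a whole frame period would need roughly 2.3 thousand million results per second, and confining it to the blanking interval would need roughly 14.6 thousand million per second.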

In Figure 4, a parallel structure is shown for a correlator to be used in the compression engine 22 of Figure 3. An input terminal 30 receives the video input and applies it to a set of parallel x-direction correlators 31a, 31b and 31c. The correlators 31a, 31b and 31c correspond respectively to the basis functions at locations 1, 2 and 3 in the x-direction of Figure 2. The x-direction correlator 31b, which corresponds to location 2, has an output connected to a set of parallel y-direction correlators 32a, 32b and 32c.

The correlators 32a, 32b and 32c correspond respectively to the basis functions at locations 1, 2 and 3 in the y-direction of Figure 2.

In operation, the correlator 31b produces an output representing the correlation in the x-direction between the input video signal and the basis function at location 2. The output from the correlator 31b is a partial or intermediate result which is applied to the three correlators 32a, 32b and 32c. The correlator 32a produces an output which represents the correlation with the function at location 2 in the x-direction and the correlation with the function at location 1 in the y-direction. Similarly, the correlator 32b produces an output which represents the correlation with the function at location 2 in the x-direction and the correlation with the function at location 2 in the y-direction. Finally, the output from the correlator 32c represents the correlation with the function at location 2 in the x-direction and the correlation with the function at location 3 in the y-direction. For ease of reference, the outputs from the correlators 32a, 32b and 32c are indicated as outputs (2,1), (2,2) and (2,3).

The correlator 31a has an output which is connected to supply its partial result to three y-direction correlators (not shown) corresponding to the three y-direction correlators 32a, 32b and 32c connected to receive the partial result from the correlator 31b.

Similarly, the correlator 31c has an output connected to supply its partial result to three y-direction correlators (not shown) corresponding respectively to the three correlators 32a, 32b and 32c connected to receive the partial result from the correlator 31b. The y-direction correlators connected to the correlator 31a produce outputs which can conveniently be referred to as outputs (1,1), (1,2) and (1,3). The y-direction correlators connected to the correlator 31c produce outputs which can conveniently be referred to as outputs (3,1), (3,2) and (3,3). It will be apparent that the separable nature of the basis functions allows the structure in Figure 4 to produce all the x-direction correlations in parallel and subsequently to produce all the y-direction correlations in parallel. The number of x-direction correlators and the number of y-direction correlators is a matter of design choice and will reflect the number of basis functions with which correlation in each dimension is to be made.
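The data flow of Figure 4 can be mimicked in software as a two-stage bank of independent tasks. This is a sketch only: the names are hypothetical, a thread pool is used merely to express that the correlators within each stage do not depend on one another (it does not model the hardware parallelism), and results are computed only at positions where the function fits wholly inside the image.

```python
from concurrent.futures import ThreadPoolExecutor


def correlate_rows(image, f):
    """x-direction correlation of every line with the 1-D function f."""
    n = len(f)
    return [[sum(row[c + i] * f[i] for i in range(n))
             for c in range(len(row) - n + 1)] for row in image]


def correlate_cols(image, f):
    """y-direction correlation of every column with the 1-D function f."""
    n = len(f)
    h, w = len(image), len(image[0])
    return [[sum(image[r + j][c] * f[j] for j in range(n))
             for c in range(w)] for r in range(h - n + 1)]


def correlator_bank(image, x_functions, y_functions):
    """Return a dict mapping (x location, y location) to a correlation array."""
    with ThreadPoolExecutor() as pool:
        # First stage (31a, 31b, 31c): one x-direction correlator per function.
        partial = list(pool.map(lambda f: correlate_rows(image, f), x_functions))
        # Second stage (32a, 32b, 32c for each partial result): every partial
        # result feeds one y-direction correlator per y-direction function.
        jobs = [(xi, yi) for xi in range(len(x_functions))
                for yi in range(len(y_functions))]
        results = pool.map(
            lambda job: correlate_cols(partial[job[0]], y_functions[job[1]]), jobs)
        return {(xi + 1, yi + 1): out for (xi, yi), out in zip(jobs, results)}
```

With three functions in each direction, the returned dictionary has nine entries keyed (1,1) to (3,3), matching the output labels used above.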

The input image has to be correlated with each basis function at every location in each line of the image. The correlation within each of the correlators of Figure 4 can be speeded up dramatically if the lines of the image are correlated in parallel with one another. This can be achieved by modifying the structure of Figure 4 so that each correlator has a parallel correlator structure.

Thus, referring to Figure 5, the video input signal at an input terminal 40 is applied to a serial-to-parallel converter 41 to convert the serial input image into 576 parallel lines. The lines of the image are supplied to a one-dimensional parallel correlator 42. The parallel correlator 42 incorporates a set of 576 x-direction correlators each of which correlates a respective line of the input image so as to produce the correlation results of all the lines in parallel. A similar parallel structure to that shown in Figure 5 for correlation in the x-direction may also be used for correlation in the y-direction.
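A software analogue of Figure 5 is sketched below. The names are hypothetical, the frame is a flat list of samples in raster order, and the thread pool only expresses that each line is correlated independently of the others; it is not a model of the hardware.

```python
from concurrent.futures import ThreadPoolExecutor


def serial_to_parallel(samples, line_length):
    """Converter 41: split a serial frame into lines (needs the whole frame)."""
    if len(samples) % line_length:
        raise ValueError("frame length is not a whole number of lines")
    return [samples[i:i + line_length]
            for i in range(0, len(samples), line_length)]


def correlate_line(line, f):
    """One x-direction correlator working on a single line."""
    n = len(f)
    return [sum(line[c + i] * f[i] for i in range(n))
            for c in range(len(line) - n + 1)]


def parallel_line_correlator(lines, f):
    """Correlator 42: one independent 1-D correlator per line."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda line: correlate_line(line, f), lines))
```

For a 720 x 576 frame, `serial_to_parallel(samples, 720)` would yield the 576 lines referred to above, each handled by its own correlator.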

Converting the video signal from serial to parallel form in the converter 41 introduces a delay because the final line in each frame must be loaded into the converter before it can be output to the correlator stages. Adding a serial to parallel converter inside the coding loop of the encoder will increase the latency beyond the point which can be tolerated for real-time encoding.

In Figure 6, the serial-to-parallel converter 41 is removed from the coding loop and placed at the input to the encoder.

What has been described is a method and apparatus for matching a set of basis functions to a digital signal in which the operation has been speeded up by the parallel architecture described without compromising the matching pursuits algorithm. The provision of a serial to parallel converter outside the encoding loop avoids introducing further latency into the encoding loop.

The matching pursuits algorithm may be simplified by making it block-based rather than pixel-based, e.g. the image is divided into 8 x 8 blocks for correlation. This would cause a deterioration in the performance of the compression engine as large basis functions could no longer be used and boundary effects would further reduce the performance of the compression algorithm.

The motion estimator/compensator may be adapted to use the previous input frame as the reference frame instead of the decompressed frame. This would remove the feedback loop and thus a pipelined solution would be sufficient as latency would not be a problem. However, using the previous input frame as the reference frame in the encoder may cause decoding synchronisation problems.

It will be understood by those skilled in the art that although the above examples are described in respect of 576 line parallelism, the method and apparatus are suitable for all other formats, e.g. 720 lines.

Claims

1. A method of matching a set of basis functions to a digital signal, the basis functions being separable along first and second dimensions, the method comprising the steps of performing, in parallel, correlations between the digital signal and each basis function in the first dimension, and using the results of each correlation in the first dimension to perform, in parallel, correlations between the said results and each basis function in the second dimension.
2. A method as claimed in claim 1, which is applied to matching a set of basis functions to a digital video signal comprising a succession of video frames.
3. A method as claimed in claim 2, further comprising the step of converting the digital signal from a serial signal representing the video frames to a parallel signal representing parallel lines of the video frames preparatory to the step of performing, in parallel, correlations between the digital signal and the basis functions in the first dimension.
4. Apparatus for matching a set of basis functions to a digital signal, the basis functions being separable along first and second dimensions, the apparatus comprising first correlation means to perform, in parallel, correlations between the digital signal and each basis function in the first dimension, and second correlation means for using the results of each correlation in the first dimension to perform, in parallel, correlations between the said results and each basis function in the second dimension.
5. Apparatus as claimed in claim 4, which is a video signal encoder adapted to match a set of basis functions to a digital signal comprising a succession of video frames.
6. Apparatus as claimed in claim 5, further comprising a serial to parallel converter for converting the digital signal from a serial signal representing the video frames to a parallel signal representing parallel lines of the video frames preparatory to supplying the digital signal to the first correlation means.
7. A method of matching a set of basis functions to a digital signal, the basis functions being separable along first and second dimensions, the method being substantially as hereinbefore described with reference to the accompanying drawings.
8. Apparatus for matching a set of basis functions to a digital signal, the basis functions being separable along first and second dimensions, the apparatus being substantially as hereinbefore described with reference to the accompanying drawings.