GB2344952A

GB2344952A - Matching basis functions

Info

Publication number: GB2344952A
Application number: GB9828034A
Authority: GB
Inventors: Derek William Brown; Colin Davies
Original assignee: NDS Ltd; Tandberg Television AS
Current assignee: Synamedia Ltd; Ericsson Television AS
Priority date: 1998-12-18
Filing date: 1998-12-18
Publication date: 2000-06-21
Also published as: GB9828034D0

Abstract

A set of basis functions is matched to a digital signal which may be a broadcast television signal. The basis functions are used for compression encoding the digital signal by a process known as matching pursuits. This is an iterative process including the steps of correlating coefficients of the basis functions to the input digital data, selecting the basis function having coefficients which best match the input data, and subtracting the selected basis function from the input data. The coefficients of the basis functions used in the correlating step have values which effect pre-normalisation of the correlation results in each iteration of the process.

Description

DIGITAL SIGNAL PROCESSING

This invention relates to improvements in or relating to the processing of digital signals, for example broadcast digital signals.

Broadcasting of digital signals is now a well-established technology.

Information in the digital domain can undergo a number of signal processing processes which essentially compress the amount of data that needs to be transmitted. For example in digital television, it is possible to compress the signal such that less than 1% of the required information needs to be transmitted. This means that many digital television channels can be transmitted in the same bandwidth that would previously be used for only one analogue television channel.

There are a number of well-known techniques for compressing digital data; these include the MPEG II standard for digital compression and wavelet compression techniques. A problem that exists with this type of signal processing is to find a numerical representation of signals that is well suited for performing the processes of coding, noise removal, feature enhancement or pattern recognition. It is desirable to decompose the signals into basis elements that are locally adapted to the signal structures and frequency content. The classical methods identified above build signal representations with linear transforms. These transforms build signal representations that are not adapted to the local signal properties.

Matching Pursuits is an alternative method of decomposing a signal into elementary structures that are locally adapted to the local signal properties.

Matching pursuits has excellent performance when compared with other compression techniques. Matching pursuits performs signal analysis using an over-complete dictionary set of basis waveforms. With a complete set of basis waveforms it is possible to completely represent a given signal by a unique linear combination of the waveforms. With an over-complete basis set there may be many different combinations of the basis waveforms which can represent the same given signal. This means that the most appropriate representation can be chosen for different circumstances.

Reference may be made to our co-pending UK Patent Application Numbers: 9802233.8 and 9823957.7 for further information on the application of matching pursuits to decompose a video signal into elementary structures using basis functions that are separable with respect to orthogonal axes.

Two main problems occur in designing a real-time matching pursuits encoder to encode television signals. Firstly, the correlation performed by the encoder is very complex and must be repeated many times within an overall time constraint of less than a frame period. Secondly, an entire frame has to be loaded into the encoder before the search to select the best basis functions can be made. The vertical blanking interval between frames is too short to allow currently available encoders to perform real-time matching pursuits encoding.

The present invention aims to speed the operation of matching a set of basis functions to a digital signal.

According to one aspect of the invention, there is provided a method of matching a set of basis functions to input digital data by an iterative process including the steps of: correlating coefficients of the basis functions to the input digital data; selecting the basis function having coefficients which best match the input data; and, subtracting the selected basis function from the input data, characterised in that the coefficients of the basis functions used in the correlating step have values which effect pre-normalisation of the correlation results in each iteration of the process.

According to a further aspect of the invention, there is provided apparatus for matching a set of basis functions to input digital data by an iterative process comprising: a correlator for correlating coefficients of the basis functions to the input digital data; a selecting module to select the basis function having coefficients which best match the input data; and, a subtractor for subtracting the selected basis function from the input data, characterised in that the coefficients of the basis functions used in the correlator have values which effect pre-normalisation of the correlation results in each iteration of the process.

The invention will now be described, by way of example, with reference to the accompanying drawings in which:
Figure 1 is a prior art block diagram illustrating signal decomposition using matching pursuits;
Figure 2 shows some example two-dimensional functions which are the products of separable one-dimensional basis functions in orthogonal x and y-axis directions;
Figure 3 is a block diagram of a known video signal encoder;
Figure 4 shows a compression engine, according to the present invention, for use in the encoder of Figure 3;
Figure 5 shows the concatenation of short basis functions to form longer concatenated basis functions;
Figure 6 shows how a set of one-dimensional basis functions can be generated; and,
Figure 7 shows modifications to the 2-dimensional functions of Figure 2.

The iterative matching pursuits process of decomposing a signal with elementary structures of an over-complete dictionary basis set is shown in the block diagram of Figure 1. The original signal is entered in step 10. In step 11, the algorithm initialises by calculating and storing the correlation results between the signal and the dictionary set of basis waveforms or atoms. In step 12, the data for one or more of the best correlation results is stored in a structure book. In step 13, a decision is made whether to continue coding. For compression purposes this may be based on whether a given bit budget has been reached. If the decision is yes, then the selected correlation elements of step 12 are subtracted, in step 14, from the original signal entered at step 10 and the stored correlation results are updated. The process loops back to step 12. If the decision in step 13 is no, then the signal can be reconstructed in step 15 from the stored structure book. The stored matching pursuits structure book will be an efficiently coded representation of the original signal which can be stored or transmitted for decoding by a decoder.
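The loop of Figure 1 can be sketched in a few lines for a one-dimensional signal. This is a minimal illustration only, not the encoder described here: the function names, the tiny two-atom dictionary and the fixed atom budget are all assumptions, the atoms are assumed to have unit energy, and the correlations are recomputed on every pass rather than stored and updated as in steps 11 and 14.

```python
def correlate_at(signal, atom, pos):
    """Inner product of `atom` with the signal window starting at `pos`."""
    return sum(a * signal[pos + i] for i, a in enumerate(atom))


def matching_pursuit(signal, dictionary, max_atoms):
    """Greedy decomposition of `signal`; returns (structure_book, residual)."""
    residual = list(signal)
    book = []  # entries are (atom index, position, amplitude)
    for _ in range(max_atoms):                  # step 13: stop when the budget is spent
        best = None
        for k, atom in enumerate(dictionary):   # steps 11 and 12: correlate, keep the best
            for pos in range(len(residual) - len(atom) + 1):
                c = correlate_at(residual, atom, pos)
                if best is None or abs(c) > abs(best[2]):
                    best = (k, pos, c)
        k, pos, c = best
        book.append(best)
        for i, a in enumerate(dictionary[k]):   # step 14: subtract the selected atom
            residual[pos + i] -= c * a
    return book, residual


def reconstruct(book, dictionary, length):
    """Step 15: rebuild the signal from the structure book alone."""
    out = [0.0] * length
    for k, pos, c in book:
        for i, a in enumerate(dictionary[k]):
            out[pos + i] += c * a
    return out


dictionary = [[1.0], [0.6, 0.8]]     # hypothetical unit-energy atoms (0.36 + 0.64 = 1)
signal = [0.0, 3.0, 4.0, 0.0]        # equals 5 * [0.6, 0.8] placed at position 1
book, residual = matching_pursuit(signal, dictionary, max_atoms=1)
rebuilt = reconstruct(book, dictionary, len(signal))
```

With this input a single iteration selects the second atom at position 1 with an amplitude of about 5, leaving a residual that is zero to within rounding error.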

The analysis of multi-dimensional signals with multi-dimensional functions is a computationally intensive task. Consider a discrete 2-dimensional signal, e.g. an image, and its correlation with a discrete 2-dimensional function of width 20 and height 20 units. Calculating the correlation at each location in the discrete 2-dimensional signal requires 20*20 = 400 multiplies and adds.

If the discrete multi-dimensional function is separable with respect to its coordinate axes, then the computation can be significantly reduced. Figure 2 shows some examples of two-dimensional functions which are the products of separable one-dimensional basis functions in the orthogonal x and y-axis directions.

With separable functions the correlation of the discrete two-dimensional signal and the discrete two-dimensional function can be broken into two stages. First the results of correlating the discrete two-dimensional signal with the one-dimensional function in the x-direction are calculated and stored.

Then the correlation of these intermediate results and the one-dimensional function in the y-direction are calculated to produce the two-dimensional correlation result.

Calculating the correlation of a separable discrete two-dimensional function of width 20 units and height 20 units at each location in the discrete two-dimensional signal requires 20+20 = 40 multiplies and adds. This is a significant saving in computation over the conventional two-dimensional correlation.
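The equivalence of the two-stage calculation and the direct two-dimensional correlation can be checked on a small example. The sketch below is illustrative only: the function names, the 5 x 6 test image and the three-tap functions are assumptions chosen to keep the example short, and only positions where the function lies wholly inside the image are evaluated.

```python
def correlate_rows(image, fx):
    """x pass: 1-D correlation along every row (fully overlapping positions only)."""
    n = len(fx)
    return [[sum(fx[i] * row[c + i] for i in range(n))
             for c in range(len(row) - n + 1)] for row in image]


def correlate_cols(rows, fy):
    """y pass: 1-D correlation down every column of the intermediate results."""
    n = len(fy)
    return [[sum(fy[j] * rows[r + j][c] for j in range(n))
             for c in range(len(rows[0]))] for r in range(len(rows) - n + 1)]


def correlate_2d(image, fx, fy):
    """Reference: direct correlation with the 2-D function fy[j] * fx[i]."""
    ny, nx = len(fy), len(fx)
    return [[sum(fy[j] * fx[i] * image[r + j][c + i]
                 for j in range(ny) for i in range(nx))
             for c in range(len(image[0]) - nx + 1)]
            for r in range(len(image) - ny + 1)]


image = [[(3 * r + 5 * c) % 7 for c in range(6)] for r in range(5)]  # arbitrary test data
fx, fy = [1, 2, 1], [1, 0, -1]                                       # hypothetical 1-D functions

separable = correlate_cols(correlate_rows(image, fx), fy)
direct = correlate_2d(image, fx, fy)
assert separable == direct

# Multiplies and adds per output location for a 20 x 20 function, as in the text.
direct_cost = 20 * 20       # 400
separable_cost = 20 + 20    # 40, given that the stored x-pass results are reused
```

The count of 40 per location holds because each x-pass result is computed once, stored, and then reused by every y-pass window that covers it.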

For the two-dimensional functions of Figure 2, the computation savings when using separable functions will be even greater since intermediate x-correlation results can be used in the calculations of multiple two-dimensional functions.

The savings discussed for the two-dimensional example will be proportionally greater for functions in higher dimensions.

Even though the computational complexity can be reduced by using separable basis functions, the fact remains that a correlation result must be calculated at every pixel location in the image for all basis functions.

Assuming an image size of 720 x 576 = 414,720 pixels and a dictionary size of 15 x 15 = 225 separable basis functions, the total number of correlation results is 225 x 414,720 = 93.3 million.
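The arithmetic can be reproduced directly. The sketch below also derives a sustained rate, assuming the 0.04 second frame period of a European standard television signal as given in the description of Figure 3; the rate itself is a derived figure and is not quoted in the text.

```python
width, height = 720, 576      # image size from the text
dictionary_size = 15 * 15     # separable two-dimensional basis functions
frame_period_s = 0.04         # assumed: European standard frame period

pixels = width * height                           # 414,720
results_per_frame = dictionary_size * pixels      # 93,312,000, i.e. about 93.3 million
results_per_second = results_per_frame / frame_period_s
print(pixels, results_per_frame, f"{results_per_second:.3g}")
```

On these assumptions the encoder would have to sustain roughly 2.3 thousand million correlation results per second for the initial search alone.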

As stated earlier, the time taken for the correlation iterations must be less than the frame period. This can be demonstrated by considering a simplified typical encoder as shown in Figure 3. In Figure 3, the input video signal at an input terminal 20 is made up from a series of still images that arrive at a constant frame rate. In the case where the input signal is an interlaced European standard television signal, the frames arrive every 0.04 seconds.

The frames are applied to a difference circuit 21 having an output connected to a compression engine 22. The output from the compression engine 22 is connected through a feedback loop including an inverse compression engine 23 to supply reference frames to a motion estimator/compensator 24. The motion estimator/compensator is also connected to receive each input frame from the input terminal 20. A motion compensated frame is supplied by the estimator/compensator 24 to the difference circuit 21 in response to each input frame and reference frame in a manner well known in the art. The difference circuit 21 generates a motion residual signal for application to the compression engine 22. The reference frame is obtained by compressing and then decompressing the previous frame and this implies that the compression and decompression algorithms must have a maximum latency or delay of less than the input frame period. The entire encoding, including the correlation iterations, and the decoding must be completed in less than a frame period.

The compression engine 22 is shown in greater detail in Figure 4. The data input to the compression engine is applied to a current data store or buffer 25.

The buffer 25 is connected to a correlator 26 which performs the correlation steps referred to in connection with Figure 1. The results of the correlation are subjected to normalisation in a normalising module 27. The maximum correlation value for the atom coefficients is found in a module 28. The atoms correlated with the data are stored in the output data store 29 and are subtracted from the original data in the buffer 25. The normalisation conducted in the module 27 is a division operation applied to the correlation coefficients after they have been calculated. Normalisation ensures that if a basis function is correlated with itself, the correlation coefficient is 1.
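The effect of the division in module 27 can be illustrated as follows. The sketch assumes that the normalisation factor of a basis function is the square root of the sum of its squared coefficients; the text does not define the factor explicitly, but this reading agrees with the factor of about 1.4 listed for the basis function formed from the coefficients [1 0 -1]. The function names are hypothetical.

```python
import math


def normalisation_factor(basis):
    """Square root of the summed squared coefficients (assumed definition)."""
    return math.sqrt(sum(c * c for c in basis))


def correlate(data, basis):
    """Raw, un-normalised correlation of `data` with the basis coefficients."""
    return sum(d * c for d, c in zip(data, basis))


edge = [1, 0, -1]                         # raw coefficients held in the correlator
factor = normalisation_factor(edge)       # about 1.41
unit_edge = [c / factor for c in edge]    # the same function scaled to unit energy

# Correlate with the raw coefficients first, then divide once afterwards.
normalised = correlate(unit_edge, edge) / factor
```

Here `normalised` comes out as 1 to within rounding error, which is the property stated above for a basis function correlated with itself.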

The matching pursuits algorithm searches an entire frame for the best basis function or functions to select. This means that a basis function or functions can only be selected after an entire frame has been loaded into the encoder through the input terminal 20. In the input video signal, there is a vertical blanking interval between successive frames but this interval, which is about 16% of the total frame period, has hitherto been insufficient to perform real-time matching pursuits encoding using conventional techniques.

One technique to reduce the complexity of the compression engine is to use concatenated basis functions. Reference may be made to our co-pending UK Patent Application Number: 9802233.8 for further explanation of concatenated basis functions. Briefly, one or more long basis functions are created by the repeated correlation or convolution of shorter functions. Figure 5 shows some examples of long functions which have been created by the repeated convolution of short functions. Function 1 is a short function and function 2 is the result of convolving function 1 with itself to form a longer function. Function 3 is the result of convolving function 1 with function 2.
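The construction of Figure 5 can be sketched with a plain discrete convolution. The short function [1 2 1] used below is a hypothetical example and is not taken from Figure 5 or Figure 6; the final check illustrates why concatenation saves work, since filtering twice with the short function gives the same result as filtering once with the longer one.

```python
def convolve(a, b):
    """Full discrete convolution of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


function_1 = [1, 2, 1]                          # hypothetical short function
function_2 = convolve(function_1, function_1)   # [1, 4, 6, 4, 1]
function_3 = convolve(function_1, function_2)   # [1, 6, 15, 20, 15, 6, 1]

# Two passes with the 3-tap function equal one pass with the 5-tap function.
signal = [3, -1, 4, 1, -5]
staged = convolve(convolve(signal, function_1), function_1)
assert staged == convolve(signal, function_2)
```

In the staged form, the intermediate result after the first pass is itself the output for function 1, so each longer function costs only one further short pass.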

Concatenated basis functions significantly reduce the correlation computations as compared to non-concatenated basis functions.

Figure 6 illustrates how a set of one-dimensional basis functions can be generated and lists the coefficients of each of the short functions that are convolved to form the basis functions.

The process of normalisation has already been mentioned in connection with the compression engine described with reference to Figure 4 as a division operation after the correlation coefficients have been calculated. This ensures that the time-consuming division process occurs only once for each of the basis functions. Table I lists the normalisation factors for each of the basis functions generated using the coefficients in Figure 6.

TABLE I

Basis Function    Normalisation Factor
Basis 0           1
Basis 1           4.4
Basis 2           33.3
Basis 3           227.4
Basis 4           1630.3
Basis 5           4.4
Basis 6           1.4
Basis 7           4.2
Basis 8           105.5

The normalisation factors in Table I cause the number of multiplications in each iteration of the process described with reference to the compression engine of Figure 4 to be too large for real-time encoding even when using concatenated basis functions to reduce the computational complexity within the correlator. Multiplication and division operations are time-consuming operations when compared to addition or subtraction. According to the invention, the computational complexity is reduced in a number of different ways. A first step is to modify the coefficients of the short functions that are listed in Figure 6. The modified coefficients are presented in Table II.

TABLE II

Function    Coefficients
Scale 0     [1 2 4 2 1]
Scale 1     [1 2 4 2 1]
Scale 2     [1 0 2 0 4 0 2 0 1]
Scale 3     [1 0 0 2 0 0 4 0 0 2 0 0 1]
Edge 1      [1 0 -1]
Edge 2      [1 0 -1]
Texture     [1 -2 4 -2 1]

As can be seen from Table II, the non-zero coefficients listed there are all powers of 2 in magnitude.

A consequence of this is that the multiplication operations within the correlator can be performed by a simple left shift. A left shift incurs no time penalty and the correlation is therefore considerably speeded up.
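The replacement of multiplications by shifts can be illustrated in software for the coefficients [1 2 4 2 1]. This is only a sketch with hypothetical function names; in hardware the shifts would be fixed wiring rather than operations, which is why they incur no time penalty.

```python
COEFFS = [1, 2, 4, 2, 1]
SHIFTS = [0, 1, 2, 1, 0]    # log2 of each coefficient


def correlate_multiply(window):
    """Reference correlation using one multiplication per tap."""
    return sum(c * x for c, x in zip(COEFFS, window))


def correlate_shift(window):
    """Same correlation using only left shifts and additions."""
    return sum(x << s for x, s in zip(window, SHIFTS))


window = [5, -3, 7, 0, -255]    # arbitrary motion-residual samples
assert correlate_shift(window) == correlate_multiply(window) == -228
```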

If the basis functions were not concatenated, they would need to have function coefficients that were restricted to powers of two to obtain the benefit of multiplication operations that could be performed using left shifts, i.e. the coefficients would be selected from coefficient values of 1, 2, 4, 8, 16, 32, 64, 128 and so on. The number of useful (smooth) basis functions that could be generated from such a restricted selection of coefficients is rather limited and the performance of the correlation operation would suffer. By using concatenated basis functions, however, it is only the short functions from which the concatenated basis functions are formed that need have coefficients that are restricted to values that are powers of two. The modification of the short functions has a relatively small modifying effect on the concatenated basis functions. Figure 7 shows the modified basis functions. A comparison of the basis functions in Figure 7 with those in Figure 2 shows that constraining the coefficients of the short functions to powers of two does not dramatically change the concatenated basis function shapes.

A second step to reducing the complexity of the correlator is to reduce the number of bits that are used to represent the correlation results. If the input data to the encoder represents an input image that is a motion residual where each pixel can take any integer value in the range of +/-255, it takes 9 bits to represent each pixel. The maximum correlation result for each of the concatenated basis functions is shown in Table III below.

The maximum number of bits for the two-dimensional basis functions can be deduced from Table III as follows. Knowing that 9 bit values are input to the correlator and basis 4 correlation results can be represented in 23 bits, it can be deduced that the basis 4 correlation requires an additional 14 bits to represent the result each time it is correlated with the input. Therefore the maximum number of bits for a two-dimensional basis function (Basis (4,4)) is 9 + 14 + 14 = 37 bits. All the other permutations can be calculated using the same reasoning.
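The bit counts used in this reasoning can be checked with a short calculation. The sketch assumes a signed (two's complement) representation, so that a range of +/-N needs the bit length of N plus one sign bit; the helper name is hypothetical.

```python
def signed_bits(max_magnitude):
    """Bits needed for a signed value in the range +/-max_magnitude."""
    return max_magnitude.bit_length() + 1


pixel_bits = signed_bits(255)               # 9 bits for a motion-residual pixel
basis4_bits = signed_bits(2_550_000)        # 23 bits for the Basis 4 result
growth = basis4_bits - pixel_bits           # 14 extra bits per one-dimensional pass
two_d_bits = pixel_bits + growth + growth   # 37 bits for Basis (4,4)
```

The same helper gives 13 bits for +/-2,550, 16 bits for +/-25,500 and 19 bits for +/-255,000.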

TABLE III

Basis Function    Maximum Correlation    Number of bits
Basis 0           +/-255                 9
Basis 1           +/-2,550               13
Basis 2           +/-25,500              16
Basis 3           +/-255,000             19
Basis 4           +/-2,550,000           23
Basis 5           +/-2,550               13
Basis 6           +/-510                 10
Basis 7           +/-3,060               13
Basis 8           +/-145,860             19

The total of 37 bits is substantially reduced according to the present invention by performing a partial normalisation in the coefficients of the short functions.

The partial normalisation is achieved by further modifying coefficients of the short functions to be fractional values where the denominator is a power of two. The further modified coefficients are shown in Table IV.

TABLE IV

Function    Coefficients
Scale 0     [1/4 1/2 1 1/2 1/4]
Scale 1     [1/8 1/4 1/2 1/4 1/8]
Scale 2     [1/8 0 1/4 0 1/2 0 1/4 0 1/8]
Scale 3     [1/8 0 0 1/4 0 0 1/2 0 0 1/4 0 0 1/8]
Edge 1      [1 0 -1]
Edge 2      [2 0 -2]
Texture     [1/4 -1/2 1 -1/2 1/4]

A power of two division can be implemented by a shift right instead of the shift left used for multiplication. The simplification obtained by employing coefficients that are based on powers of two is therefore retained. Table V shows the maximum number of bits required to accurately represent the correlation results when using the further modified coefficients of Table IV.
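The shift-right form of the correlation can be sketched for the Scale 1 coefficients [1/8 1/4 1/2 1/4 1/8]. This is an illustration under assumptions: the function name is hypothetical, each tap is shifted before the additions, and the low-order bits discarded by each shift are simply dropped, which may differ from the rounding used in an actual correlator.

```python
SCALE1_SHIFTS = [3, 2, 1, 2, 3]    # 1/8, 1/4, 1/2, 1/4, 1/8 as right-shift counts


def correlate_scale1(window):
    """Correlate five integer samples with Scale 1 using shifts and additions only."""
    return sum(x >> s for x, s in zip(window, SCALE1_SHIFTS))


exact_case = correlate_scale1([8, 8, 8, 8, 8])           # 1 + 2 + 4 + 2 + 1 = 10
worst_case = correlate_scale1([255, 255, 255, 255, 255])  # 31 + 63 + 127 + 63 + 31 = 315
```

For the all-255 window the exact value would be 318.75; the difference is the truncation introduced by shifting each tap separately.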

TABLE V

Basis Function    Maximum Correlation Value    No. of Bits
Basis 0           +/-255                       9
Basis 1           +/-638                       11
Basis 2           +/-796                       11
Basis 3           +/-996                       11
Basis 4           +/-1245                      12
Basis 5           +/-638                       11
Basis 6           +/-510                       10
Basis 7           +/-765                       11
Basis 8           +/-1140                      12

As can be seen from Table V, the maximum number of bits to represent the results from a one-dimensional basis function has been reduced from 23 to 12. In a pipeline encoder, the bit resolution cannot be easily changed throughout the encoder. This means that all the calculations must be performed at the maximum bit resolution to maintain the pipeline.
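The maxima for Basis 1 to Basis 4 can be reproduced under two assumptions that the text does not state outright: that each of these basis functions is the previous one convolved with the next Scale function, and that Scale 0 is the five-tap function [1/4 1/2 1 1/2 1/4]. For non-negative coefficients the worst-case output is the worst-case input multiplied by the sum of the coefficients.

```python
import math
from fractions import Fraction as F

SCALES = [
    [F(1, 4), F(1, 2), F(1), F(1, 2), F(1, 4)],            # Scale 0 (assumed five taps)
    [F(1, 8), F(1, 4), F(1, 2), F(1, 4), F(1, 8)],         # Scale 1
    [F(1, 8), 0, F(1, 4), 0, F(1, 2), 0, F(1, 4), 0, F(1, 8)],                         # Scale 2
    [F(1, 8), 0, 0, F(1, 4), 0, 0, F(1, 2), 0, 0, F(1, 4), 0, 0, F(1, 8)],             # Scale 3
]


def signed_bits(value):
    """Bits needed for a signed value whose magnitude can reach `value`."""
    return math.ceil(value).bit_length() + 1


peak = F(255)          # worst-case 9-bit motion-residual pixel
maxima, bits = [], []
for scale in SCALES:
    peak *= sum(abs(c) for c in scale)   # worst-case gain of this stage
    maxima.append(float(peak))
    bits.append(signed_bits(peak))
```

This gives maxima of 637.5, 796.875, about 996.1 and about 1245.1, needing 11, 11, 11 and 12 bits respectively, in line with the Basis 1 to Basis 4 rows of Table V.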

The adoption of the coefficients listed in Table IV reduces the number of bits needed to represent the results of correlation, and consequently the size of the hardware needed to implement the correlator, by about 50%. The reduced number of bits also increases the throughput of the correlator and reduces the size of the data storage required to store the results. The reduced number of bits does not cause a deterioration in the image quality. Where possible, the residual normalisation factor should be as close to one as possible. Table VI lists the residual normalisation factors for each of the one-dimensional basis functions.
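The residual factors of Table VI appear to be consistent with taking the square root of the summed squared coefficients of each concatenated function. The sketch below checks this for three entries; it assumes that definition, assumes Scale 0 to be [1/4 1/2 1 1/2 1/4], and assumes Basis 2 to be Scale 0 convolved with Scale 1. The helper names are hypothetical.

```python
import math


def convolve(a, b):
    """Full discrete convolution of two coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def residual_factor(coeffs):
    """Square root of the summed squared coefficients (assumed definition)."""
    return math.sqrt(sum(c * c for c in coeffs))


scale0 = [0.25, 0.5, 1.0, 0.5, 0.25]      # assumed five-tap Scale 0
scale1 = [0.125, 0.25, 0.5, 0.25, 0.125]
edge1 = [1.0, 0.0, -1.0]

factors = {
    "Basis 1": residual_factor(scale0),                    # about 1.27
    "Basis 2": residual_factor(convolve(scale0, scale1)),  # about 1.33
    "Basis 6": residual_factor(edge1),                     # about 1.41
}
```

The computed value for Basis 2 is about 1.325, which agrees with the tabulated 1.32 if the table entries are truncated rather than rounded.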

TABLE VI

Basis Function    Normalisation Factor
Basis 0           1
Basis 1           1.27
Basis 2           1.32
Basis 3           1.26
Basis 4           1.25
Basis 5           1.27
Basis 6           1.41
Basis 7           1.32
Basis 8           1.26

There is a similarity between some of the short functions from which the concatenated basis functions are derived. This similarity can be exploited to reduce the complexity of the correlation hardware. Thus in Figure 6, correlating the Basis 0 correlation output (input image) with Scale 0, Texture and Edge 1 generates the correlation results for Basis 1, Basis 5 and Basis 6 respectively. It can be seen from Table IV that Scale 0 and Texture have very similar coefficients. This means that the calculations can be shared on common hardware thus reducing the number of adders and latches required in the correlator. Exploiting the similarity between the coefficients in Table IV can reduce the hardware within the correlator by around 15%.
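One way the calculations for Scale 0 and Texture might be shared is sketched below. It assumes Scale 0 is [1/4 1/2 1 1/2 1/4], so that it differs from Texture [1/4 -1/2 1 -1/2 1/4] only in the sign of the two 1/2 taps; the function name and the grouping of the partial sums are illustrative and are not taken from the described hardware.

```python
def scale0_and_texture(window):
    """Return (Scale 0, Texture) correlation results from shared partial sums."""
    x0, x1, x2, x3, x4 = window
    outer = ((x0 + x4) >> 2) + x2    # taps 1/4, 1, 1/4 are common to both functions
    inner = (x1 + x3) >> 1           # the 1/2 taps differ only in sign
    return outer + inner, outer - inner


basis1_result, basis5_result = scale0_and_texture([4, 2, 3, 6, 8])
```

For this window the results are 10 and 2, and the second function costs only one extra subtraction once the two partial sums exist.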

Claims

1. A method of matching a set of basis functions to input digital data by an iterative process including the steps of: correlating coefficients of the basis functions to the input digital data; selecting the basis function having coefficients which best match the input data; and, subtracting the selected basis function from the input data, characterised in that the coefficients of the basis functions used in the correlating step have values which effect pre-normalisation of the correlation results in each iteration of the process.

2. A method as claimed in Claim 1 in which the set of basis functions used in the correlating step are concatenated basis functions.

3. A method as claimed in Claim 2 in which the correlating step includes division operations performed by binary shift right operations employing coefficients that are fractional values of which the denominator is a power of 2.

4. A method as claimed in any one of the preceding claims comprising a further step of subjecting the correlation results to a residual normalisation.

5. A method as claimed in any one of the preceding claims which is applied to matching a set of basis functions to a digital video signal comprising a succession of video frames.

6. Apparatus for matching a set of basis functions to input digital data by an iterative process comprising: a correlator for correlating coefficients of the basis functions to the input digital data; a selecting module to select the basis function having coefficients which best match the input data; and, a subtractor for subtracting the selected basis function from the input data, characterised in that the coefficients of the basis functions used in the correlator have values which effect pre-normalisation of the correlation results in each iteration of the process.

7. Apparatus as claimed in Claim 6 in which the set of basis functions used in the correlator are concatenated basis functions.

8. Apparatus as claimed in Claim 7 in which the correlator is adapted to correlate coefficients by performing division operations comprising binary shift right operations employing coefficients that are fractional values of which the denominator is a power of 2.

9. Apparatus as claimed in Claim 6,7 or 8 further comprising a normalising module to subject the correlation results to a residual normalisation.

10. Apparatus as claimed in Claim 6,7,8 or 9 which is a video signal encoder adapted to match a set of basis functions to a digital signal comprising a succession of video frames.

11. A method of matching a set of basis functions to input digital data by an iterative process, the method being substantially as hereinbefore described with reference to the accompanying drawings.

12. Apparatus for matching a set of basis functions to input digital data by an iterative process, the apparatus being substantially as hereinbefore described with reference to the accompanying drawings.