GB2139046A

GB2139046A - Video signal transmission

Info

Publication number: GB2139046A
Application number: GB8305355A
Authority: GB
Inventors: Chung Wong Andrew Chi; Anthony John Robson
Original assignee: Standard Telephone and Cables PLC
Current assignee: STC PLC
Priority date: 1983-02-25
Filing date: 1983-02-25
Publication date: 1984-10-31
Also published as: AU2468384A; GB2139046B; GB8305355D0

Abstract

Video data for transmission is compressed using a Discrete Cosine Transformation (DCT) implemented as a Discrete Fourier Transform (DFT). The data is encoded by being reordered into groups in ROMs according to a Discrete Fourier Transform. The ROMs include look-up tables and are arranged in cascaded stages, the outputs Z(0) etc. being subjected to the phase rotations necessary to make the DFT effective as a DCT, the final outputs being H'(0) etc.

Description

SPECIFICATION

Video signal transmission

This invention relates to a method and means for transmission of digitized video data using data compression techniques.

Television or other forms of picture displays are known to include large amounts of redundancy. This is due to strong correlations evident in the two-dimensional picture domain and for framed pictures between successive frames in the time domain. As a result it is possible to reduce the bandwidth (or data rate for digitized video) by using an algorithm which removes these correlations whilst maintaining most of the information content of the original picture.

According to the invention there is provided a method of transmitting digitized video data wherein said data is subjected to compression by transformation for transmission, characterised in that the data is subjected to a Discrete Cosine Transformation implemented as a Discrete Fourier Transform by a look-up table method before being encoded for transmission and to a corresponding inverse transformation after being decoded after transmission.

The invention also provides an apparatus for transmitting digitized video data including means for subjecting the digitized data to a Discrete Cosine Transformation, and means for encoding said transformed data.

Embodiments of the invention will be described with reference to the accompanying drawings, in which:
Figure 1 illustrates a comparison of various transforms;
Figure 2 illustrates a two-stage cascaded Discrete Fourier Transform arrangement used as a Discrete Cosine Transform;
Figure 3 illustrates in generalised block diagram form a transform coder;
Figures 4 and 5 are generalised block diagrams for a transmitter encoder and receiver decoder respectively for a compressed video transmission system; and
Figures 6 and 7 illustrate in generalised form a differential pulse code modulation encoder and decoder respectively.

Transform coding methods exploit the inherent correlation between adjacent picture elements. These transformations may operate in a 1-dimensional set of data or in two dimensions. However, to minimise memory requirements, only 1-dimensional techniques are considered here.

A general 1-dimensional transform is of the form:

$$y_i = \sum_{j} a_{ij} x_j$$

where, in this case, the elements $x_j$ represent a block of adjacent picture elements along a single line of the picture and $y_i$ are these same picture elements transformed into a feature domain under the action of a set of orthonormal basis vectors with coefficients $a_{ij}$. In matrix terms, $Y = AX$ and, since $A$ is orthonormal, the inverse relation is $X = A^{-1} Y$.

The applicability of transform methods to image processing stems from the ability of a class of these orthonormal transforms to compact the energy of the pixel block into a few components in the transform feature domain. Data compression is accomplished by discarding those transform components which contribute only marginal energy to the power spectra of the picture and quantising the remaining components according to the statistical likelihood of their presence. In so doing, errors are introduced into the reconstructed picture dependent on the degree of compression required.
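The relation Y = AX, its inverse, and the effect of discarding low-energy components can be illustrated with a short sketch. The choice of an orthonormal DCT matrix for A, the 16-pixel test block and the helper names are assumptions made for illustration only; they are not taken from the specification.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix, used here only as one example of A."""
    rows = []
    for i in range(n):
        scale = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * i / (2 * n))
                     for j in range(n)])
    return rows

def forward(a, x):
    """Y = A X: each output is the inner product of one basis vector with x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def inverse(a, y):
    """X = A^-1 Y; for an orthonormal A the inverse is the transpose."""
    n = len(a)
    return [sum(a[i][j] * y[i] for i in range(n)) for j in range(n)]

def keep_largest(y, count):
    """Zero all but the `count` largest-magnitude transform components."""
    keep = sorted(range(len(y)), key=lambda i: abs(y[i]), reverse=True)[:count]
    return [y[i] if i in keep else 0.0 for i in range(len(y))]

# Hypothetical block of 16 smoothly varying pixel values along one line.
block = [100 + 3 * j for j in range(16)]
a = dct_matrix(16)
y = forward(a, block)

# Reconstruct from only 4 of the 16 components and measure the worst error.
approx = inverse(a, keep_largest(y, 4))
max_error = max(abs(p - q) for p, q in zip(block, approx))
```

For a smooth block such as this one, most of the energy falls in a few components, so the reconstruction from a quarter of the coefficients stays close to the original; a block with sharp edges would compact less well.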

It is recognised that the Karhunen-Loève transform (KLT) is optimum in the sense that it minimises the mean squared error of the transmitted image. However, this method requires the eigenvector decomposition of the correlation matrix R of the picture to be found and no fast algorithm for this exists. It is therefore necessary to consider transforms which approach the optimality of the KLT but are realisable in hardware.

The Walsh-Hadamard transform (WHT), Discrete Fourier Transform (DFT), Haar Transform (HT) and Discrete Cosine Transform (DCT) are known to exhibit energy compaction. Figure 1 compares their performance to that of the optimum KLT in terms of the variance of each transform coefficient. From this it is apparent that the DCT performance is closest to that of the KLT.

The HT and DFT are of similar complexities to the DCT, each requiring of the order of $N^2$ multiplication operations for an Nth order 1-dimensional transform. These will not be considered further since they offer no advantage over the DCT. The WHT, however, has transform coefficients of ±1 and, thus, has an implicit simple hardware realisation. But, as illustrated in Figure 1, simulation results have shown the DCT gives superior performance to that of the WHT. As a linear transformation in 1-dimension the implementation of the DCT requires $N^2$ multiply-accumulate operations. The hardware requirement for a direct implementation would, therefore, be quite significant. J. G. McWhirter et al., "Compact low-power coder for extreme bit rate reduction of television pictures", IEE Proc., Vol. 127, Pt. F, No. 5, Oct. 1980, have described an implementation of the WHT which could be used in this type of application. However, the increased performance obtained using the DCT merits the examination of techniques for its implementation.

The DCT is defined as:

By defining a new sequence y(k) = x(2k) and y(N-1-k) = x(2k+1), for k = 0, 1, ..., N/2 - 1, the DCT may be expressed as:

Hence, the DCT may be expressed in terms of the complex DFT.
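The relationship between the DCT and the complex DFT of the reordered sequence can be checked numerically. The sketch below assumes an unnormalised DCT of the form X(n) = sum over m of x(m) cos(pi (2m+1) n / 2N); the scaling used in the specification is not recoverable from the text, so any constant factors are an assumption, as are the function names.

```python
import cmath
import math

def dct_direct(x):
    """Unnormalised DCT: X(n) = sum_m x(m) cos(pi (2m+1) n / (2N))."""
    n_pts = len(x)
    return [sum(x[m] * math.cos(math.pi * (2 * m + 1) * n / (2 * n_pts))
                for m in range(n_pts))
            for n in range(n_pts)]

def reorder(x):
    """y(k) = x(2k), y(N-1-k) = x(2k+1) for k = 0 .. N/2 - 1."""
    n_pts = len(x)
    y = [0.0] * n_pts
    for k in range(n_pts // 2):
        y[k] = x[2 * k]
        y[n_pts - 1 - k] = x[2 * k + 1]
    return y

def dft(y):
    """Direct complex DFT: Z(n) = sum_k y(k) exp(-j 2 pi n k / N)."""
    n_pts = len(y)
    return [sum(y[k] * cmath.exp(-2j * math.pi * n * k / n_pts)
                for k in range(n_pts))
            for n in range(n_pts)]

def dct_via_dft(x):
    """DCT obtained as the real part of a phase-rotated DFT of y(k)."""
    n_pts = len(x)
    z = dft(reorder(x))
    return [(cmath.exp(-1j * math.pi * n / (2 * n_pts)) * z[n]).real
            for n in range(n_pts)]

# Hypothetical 16-point block of pixel values.
samples = [float((7 * i) % 11) for i in range(16)]
difference = max(abs(p - q)
                 for p, q in zip(dct_direct(samples), dct_via_dft(samples)))
```

The reordering works because cos(pi (4N - 4k - 1) n / 2N) equals cos(pi (4k + 1) n / 2N), so both halves of y(k) share the single kernel exp(-j pi (4k + 1) n / 2N), which splits into the DFT kernel and one output phase rotation per coefficient.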

For many common video compression applications, a 16-point transform is found to be adequate. In order to make the application of the circuit as wide as possible, a transform time of, say, not more than 10 µs is highly desirable. This precludes the use of low-cost options such as standard FFT VLSI devices.

High speed methods based on complex FFT butterflies are not suitable because of their hardware complexity and size. A method which offers significant advantages in this application is a technique known as Linear Binary Decomposition (LBD). This is suitable for high speed implementation of any linear transformation on short-wordlength data, and is disclosed in e.g. British Patent Application 8116855 (Serial No. 2099615) and British Patent Application 8134091.

This technique decomposes the binary coded array of input data into 1-bit precision sub-arrays of shorter address width suitable for mapping through look-up tables. The mapped output data is recombined using proper weighting factors to give the transform coefficient.
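The bit-slice decomposition can be sketched for a single output point, that is, one inner product of fixed coefficients with the input samples. The sketch assumes unsigned integer samples and integer coefficients; the handling of signed data and the table contents used in the referenced applications are not described here, and the function names are hypothetical.

```python
def build_table(coeffs):
    """Table entry t holds the sum of coeffs[i] for every bit i set in t.

    The table has 2**n entries for n input points, independent of the
    sample wordlength.
    """
    n = len(coeffs)
    return [sum(coeffs[i] for i in range(n) if (t >> i) & 1)
            for t in range(1 << n)]

def lut_dot(table, samples, bits):
    """Inner product of the coefficients with unsigned `bits`-bit samples."""
    total = 0
    for b in range(bits):                       # one look-up per bit plane
        address = 0
        for i, s in enumerate(samples):
            address |= ((s >> b) & 1) << i      # 1-bit slice of every sample
        total += table[address] * (1 << b)      # recombine with weight 2**b
    return total

# Hypothetical 4-point example with 8-bit samples.
coeffs = [3, -1, 4, 2]
samples = [200, 17, 255, 0]
table = build_table(coeffs)
result = lut_dot(table, samples, 8)
direct = sum(c * s for c, s in zip(coeffs, samples))
```

No multiplier is needed: each bit plane costs one table read and one shifted addition, which is why the approach trades memory and adders for multiplication hardware.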

The width of memory address required for a look-up table performing an n-point linear transformation on m-bit data is $2^{mn}$ bits. By using the LBD technique this is reduced to $2^n$ bits at the expense of requiring m look-up operations. However, for a 16-point transform, say, $2^{15}$ words of memory are required for each output point and this is, therefore, still not a feasible proposition.

It has been shown that the DCT may be expressed as a phase shifted complex DFT. It is easily shown that the DFT may be factorised into the sum of DFT sequences of shorter length. Figure 2 shows the DFT implemented as two cascaded stages of 4 x 4-point transforms. Reordering the input sequence and applying phase shifts to the outputs gives a cascade DCT implementation.
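The factorisation of a 16-point DFT into two cascaded stages of four 4-point transforms can be sketched as follows. This uses the standard decimation index mapping with explicit inter-stage phase rotations computed in floating point; the specification instead folds the rotations into ROM look-up data, and the exact ordering shown in Figure 2 is not reproduced here, so the indexing below is an assumption.

```python
import cmath
import math

def dft(x):
    """Direct DFT of any length, used for the 4-point blocks and as a check."""
    n_pts = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * n * k / n_pts)
                for k in range(n_pts))
            for n in range(n_pts)]

def dft16_cascade(x):
    """16-point DFT as two stages of four 4-point DFTs."""
    # Stage 1: 4-point DFTs over the decimated inputs x[k2], x[k2+4], ...
    stage1 = [dft(x[k2::4]) for k2 in range(4)]          # stage1[k2][n1]

    # Inter-stage phase rotations exp(-j 2 pi n1 k2 / 16).
    rotated = [[stage1[k2][n1] * cmath.exp(-2j * math.pi * n1 * k2 / 16)
                for n1 in range(4)]
               for k2 in range(4)]

    # Stage 2: 4-point DFTs across k2 for each n1; output index n1 + 4*n2.
    out = [0j] * 16
    for n1 in range(4):
        column = dft([rotated[k2][n1] for k2 in range(4)])
        for n2 in range(4):
            out[n1 + 4 * n2] = column[n2]
    return out

# Hypothetical real input block.
block = [float((5 * i) % 13) for i in range(16)]
worst = max(abs(p - q) for p, q in zip(dft(block), dft16_cascade(block)))
```

Each stage only ever combines 4 values at a time, which is what keeps the look-up address width small when every 4-point block is realised as a table.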

If each 4-point DFT transformation is implemented using the LBD techniques and the output phase shifts are included in ROM, the total memory requirement (for real input and output data) is $2^9$ words for the 1st stage and $2^{12}$ words for the 2nd stage. This requires only 5 ROM devices to meet the memory organisation requirement.

The price paid for the decrease in memory size is an increase in the number of ALUs required for recombination of the output data and an increased number of shift registers to perform the bit-slice decomposition of the input data array. It has been calculated that these components, together with some control logic, can be integrated onto 2 customised 3000-gate Uncommitted Logic Arrays (ULA) giving a reasonably compact design.

A hardware scheme for this implementation is given in Figure 3. Input data from an analogue-to-digital convertor circuit (not shown) is read into a buffer store 30 capable of storing 2 blocks of 16 data points. These coefficients are transferred to the first stage ULA 31 in groups of 4 and in the required order. The inter-stage buffer 32 stores 2 sets of 32 coefficients (16 points with real and imaginary parts) and transfers these with appropriate ordering to the 2nd stage ULA 33 which calculates the output points. The values of selected significant output points are then ready for transmission as the compressed video data via output buffer 34.

The use of buffer stores between stages allows pipeline operation of the transform coder giving a throughput delay of the order of 10 µs. As shown, this scheme uses 13 device packages but some additional control circuitry, in the form of programmable logic arrays, may be required.

The inverse DCT (IDCT) may be implemented using a circuit identical to that described above, avoiding the need for further customised devices. The only difference from the DCT lies in the data stored in the look-up table ROMs 35 & 36.

It should be noted that, although the above description is given in the context of a 16-point transform, this is done purely for illustration purposes, and the method could be applied to any transform size in powers of 2.

The transform coding technique is applied only to 1-dimensional data from each line of the picture. By implementing a linear predictive coding method between adjacent line scans, the vertical picture correlation can be reduced, increasing the data compression. This method can also be used to reduce the wordlength of the digitised video data for direct transmission over the transmission channel.

A first order predictor is chosen to minimise memory requirements and is expressed as:

$$\hat{x}_{n+1} = a x_n$$

where $\hat{x}_{n+1}$ is the next predicted value of the sequence $x_i$ (i = 0, 1, ...), and $a$ is the update gain factor. It can be shown that the optimum value for $a$ to minimise the mean square prediction error,

$$\langle (x_{n+1} - a x_n)^2 \rangle$$

is the line correlation coefficient:

$$\rho = \frac{\langle x_{n+1} x_n \rangle}{\langle x_n^2 \rangle}$$

However, the computation and storage of these correlation coefficients is costly, in hardware terms, and a fixed value of $a = 0.875$ is assumed. Reasons for this choice are discussed in McWhirter et al (supra).
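The claim that the correlation coefficient minimises the mean square prediction error can be checked on sample data. The sequence below is hypothetical, and the averages are taken over the available pairs of successive values rather than over a true ensemble.

```python
def correlation_coefficient(seq):
    """Ratio <x[n+1] x[n]> / <x[n]^2> over successive pairs of the sequence."""
    pairs = list(zip(seq[:-1], seq[1:]))
    numerator = sum(cur * nxt for cur, nxt in pairs)
    denominator = sum(cur * cur for cur, _ in pairs)
    return numerator / denominator

def mean_square_error(seq, a):
    """Mean of (x[n+1] - a x[n])^2 for a first order predictor with gain a."""
    pairs = list(zip(seq[:-1], seq[1:]))
    return sum((nxt - a * cur) ** 2 for cur, nxt in pairs) / len(pairs)

# Hypothetical values of one pixel position on successive lines,
# with the mean level removed.
values = [12.0, 10.0, 9.0, 11.0, 8.0, 6.0, 7.0, 4.0, 5.0, 3.0]

rho = correlation_coefficient(values)
error_at_rho = mean_square_error(values, rho)
error_at_fixed = mean_square_error(values, 0.875)
```

Because the error is a quadratic in a with a positive leading term, any other gain, including the fixed value 0.875, gives an error at least as large as that at rho; the fixed value simply avoids computing and storing rho in hardware.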

Figure 4 illustrates a block diagram for a DPCM encoder. The video signal is fed to a digitizer 40, the output of which goes to the DCT coder 41. A prediction coefficient calculated using past data is subtracted from input data selected from either the video digitizer 40 or from the transform coder 41. This difference, or prediction error, is input to either a bit selection ROM 42 or a limiting ROM, depending on whether compression or direct mode is in operation.

The output from the ROM is in two parts: a 5-bit data word and a 3-bit number used to control the serial transfer 43 of the data word. The data word is decoded in ROM to simulate the decoder input data, after transmission through an error-free channel, and is used as the input to a first order prediction loop 44. The prediction coefficient is formed by multiplying, in ROM, the sum of the decoded data word and the previous prediction coefficient by a fixed binary number and is stored in read/write memory 45 for one line period.

Figure 5 illustrates a block diagram for a DPCM decoder. Data received in serial form is converted into parallel form using a programmable counter 50. This data is decoded in a decode ROM 51 identical to that used in the encoder and is added to a prediction coefficient to obtain the output data. The output data is transferred to an output serial store or to the inverse transform coder 52, depending on the current mode of operation. The next prediction coefficient is formed by multiplying the output data by the same factor as in the encoder and is stored in RAM 53 for one line period.
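The encoder and decoder prediction loops described above can be sketched in software. The uniform quantiser with a step of 4 and a 5-bit signed code range is a hypothetical stand-in for the bit selection ROM, whose actual contents are not given; the 3-bit serial transfer control and the limiting ROM of the direct mode are omitted.

```python
A = 0.875      # fixed update gain factor from the description
STEP = 4       # hypothetical quantiser step size (assumption)

def quantise(error):
    """Map a prediction error to a 5-bit signed code (-16 .. 15)."""
    return max(-16, min(15, round(error / STEP)))

def dequantise(code):
    """Decode a 5-bit code back to an error value, as the decode ROM would."""
    return code * STEP

def encode(lines):
    """DPCM between adjacent lines: one prediction stored per pixel position."""
    prediction = [0.0] * len(lines[0])
    coded = []
    for line in lines:
        codes = [quantise(x - p) for x, p in zip(line, prediction)]
        # The encoder decodes its own output so it tracks the decoder exactly.
        recon = [dequantise(c) + p for c, p in zip(codes, prediction)]
        prediction = [A * r for r in recon]     # held for one line period
        coded.append(codes)
    return coded

def decode(coded):
    """Mirror of the encoder loop, driven only by the received codes."""
    prediction = [0.0] * len(coded[0])
    output = []
    for codes in coded:
        recon = [dequantise(c) + p for c, p in zip(codes, prediction)]
        output.append(recon)
        prediction = [A * r for r in recon]
    return output

# Hypothetical data: three successive lines of four values each.
lines = [[10, 20, 30, 40], [12, 22, 28, 41], [15, 25, 27, 39]]
decoded = decode(encode(lines))
worst = max(abs(x - r)
            for line, out in zip(lines, decoded)
            for x, r in zip(line, out))
```

Forming the prediction from the decoded value rather than from the original input keeps quantisation errors from accumulating: as long as the code does not saturate, the error in each output stays within half a quantiser step.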

Figures 6 and 7 show implementations for the coder and decoder respectively using standard MSI logic units.

Claims

1. A method of transmitting digitized video data wherein said data is subjected to compression by transformation for transmission, characterised in that the data is subjected to a Discrete Cosine Transformation implemented as a Discrete Fourier Transform by a look-up table method before being encoded for transmission and to a corresponding inverse transformation after being decoded after transmission.

2. A method according to claim 1, characterised in that the digitised video data is encoded into a binary coded array of input data, said array is decomposed into sub-arrays of shorter address width, said sub-arrays are mapped through look-up tables according to a Discrete Fourier Transform algorithm, the mapped output data being recombined with weighting to give the transform coefficient for transmission.

3. A method according to claim 2 further characterised in that the look-up tables are further decomposed into a plurality of cascaded stages whereby the number of input addresses is further reduced.

4. A method according to claim 3 further characterised in that the phase rotations required between the stages of a Discrete Fourier Transform are incorporated into the look-up data of each stage preceding transfer of the stage outputs to the next stage.

5. A method according to claim 3 or 4 further characterised in that the phase rotations required in the outputs of the last stage to make the Discrete Fourier Transform effective as a Discrete Cosine Transform are incorporated in the look-up data of the last stage.

6. A method according to any preceding claim characterised in that the implementation of corresponding inverse transformation is the same as the encoding transmission.

7. Apparatus for transmitting digitized video data including means for compressing the data, said compression means comprising means for performing a Discrete Cosine Transformation on said data before the data is encoded for transmission and means for performing a corresponding inverse transformation on the data after the data is decoded after transmission, characterised in that the transformations are implemented as Discrete Fourier Transforms wherein data in the form of a binary coded array is decomposed into 1-bit precision sub-arrays of shorter address width which are presented for mapping through look-up tables and the mapped output data is recombined with weighting factors to give the transform coefficient.

8. Apparatus according to claim 7 wherein the look-up tables are further decomposed into a plurality of cascaded stages whereby the number of input addresses is further reduced.

9. Apparatus according to claim 8 wherein the look-up tables in each stage incorporate phase rotation data to effect the phase rotations required in the data outputs of successive stages of a Discrete Fourier Transform.

10. Apparatus for transmitting digitized video data substantially as described with reference to the accompanying drawings.

11. A method of transmitting digitized video data substantially as described with reference to the accompanying drawings.