GB2488094A

GB2488094A - Image compression using sum and difference pixel replacement and lowest bit discarding

Info

Publication number: GB2488094A
Application number: GB1021203.3A
Authority: GB
Inventors: Timothy Holroyd Glauert
Original assignee: DisplayLink UK Ltd
Current assignee: DisplayLink UK Ltd
Priority date: 2010-12-14
Filing date: 2010-12-14
Publication date: 2012-08-22
Anticipated expiration: 2030-12-14
Also published as: GB2488094B; GB201021203D0

Abstract

A method of compressing a frame of pixel data comprises receiving a frame of pixel data S1; performing a transform of the pixel data comprising: replacing adjacent pixels (A, B) by the sum (A+B) and difference (A-B) S2 of the pixel data; in a binary representation of the sum (A+B) and difference (A-B), discarding S3 the lowest bit of one of the sum (A+B) and difference (A-B); repeating S4 the transform for the entire frame of pixel data; and then performing entropy coding of transformed pixel data, S5. The frame of pixel data may be a frame of pixel tiles, with transform and entropy coding being performed on a tile by tile basis (Figure 1, S2, with colour transforming, S1). The step of performing a transform for the entire pixel data frame may be performed twice for a frame, once (first) using horizontally adjacent pixels and once (second) using pixels adjacent vertically. Thus, the method adapts the known Haar wavelet transform - which also uses replacement of adjacent pixels with their sums and differences - by discarding the lowest bit of the sum or difference, providing a greater level of compression and computational simplicity.

Description

DESCRIPTION

IMPROVED VIDEO COMPRESSION

This invention relates to a method of compressing a frame of pixel data.

The compression of video data is a large and wide-ranging technical field. In general, as display devices such as televisions and computer monitors have increased in size and resolution, and the number of sources of video has increased through the expansion of television channels and Internet sites, the importance of saving bandwidth by compressing video has correspondingly increased. Well-known technologies such as JPEG and MPEG provide compression technologies that are in wide use throughout various different industries, particularly television broadcast and computing.

These compression technologies operate on the principle that there are large temporal and spatial redundancies within video images that can be exploited to remove significant amounts of information without degrading the quality of the end user's experience of the resulting image.

For example, a colour image may have twenty-four bits of information per pixel, being eight bits each for three colour channels of red, green and blue. Using conventional compression techniques, this information can be reduced to two bits per pixel without the quality of the final image overly suffering. This can be achieved by dividing the image into rectangular blocks (or tiles), where each block is then subjected to a mathematical transform (such as the Discrete Cosine Transform) to produce a series of coefficients.

These coefficients are then quantized (effectively divided by predetermined numbers) and the resulting compressed image data can be transmitted. At the receiving end, the data is decompressed by performing reverse quantization and reversing the transform to reconstruct the original block. Other steps may also occur in the process, such as entropy encoding, to further reduce the amount of data that is actually transmitted.

In this field of data compression, wavelet transforms are often used as the mathematical transform to convert a stream of data symbols into a form that is more compressible by a subsequent entropy encoder. The Haar wavelet is a simple transform which replaces pairs of adjacent pixels with their sum and difference as follows: [A, B] => [(A+B)/√2, (A-B)/√2] where √2 is the square root of 2. This factor is applied in order to preserve the energy of the data stream and to ensure that the same transform can be used to restore the original data. The use of such transforms provides efficient compression of the original pixel data, but does not always provide enough compression for a specific application, nor are such transforms computationally as simple as they could be.
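As a rough illustration of the conventional Haar step described above, the following Python sketch applies the scaled sum and difference to a single pair of values. The function name `haar_pair` and the sample pixel values are illustrative assumptions and do not come from the source.

```python
import math

def haar_pair(a, b):
    """One conventional Haar step: sum and difference, each divided by sqrt(2)."""
    root2 = math.sqrt(2)
    return (a + b) / root2, (a - b) / root2

# Illustrative pixel values (assumed, not from the source).
a, b = 100, 96
s, d = haar_pair(a, b)

# Applying the same step to the transformed pair restores the original pair,
# but only up to floating-point rounding, because sqrt(2) is irrational.
a_back, b_back = haar_pair(s, d)

# The scaling preserves energy: a^2 + b^2 equals s^2 + d^2 up to rounding.
energy_in = a * a + b * b
energy_out = s * s + d * d
```

The reliance on floating-point values here is the source of the rounding loss that an integer formulation avoids.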

It is therefore an object of the invention to improve upon the known art.

According to a first aspect of the present invention, there is provided a method of compressing a frame of pixel data comprising receiving a frame of pixel data, performing a transform of the pixel data comprising replacing adjacent pixels (A, B) by the sum (A+B) and difference (A-B) of the pixel data, in a binary representation of the sum (A+B) and difference (A-B) discarding the lowest bit of one of the sum (A+B) and difference (A-B), repeating the transform for the entire frame of pixel data, and performing entropy coding of transformed pixel data.

According to a second aspect of the present invention, there is provided a device for compressing a frame of pixel data comprising an encoder arranged to receive a frame of pixel data, perform a transform of the pixel data comprising: replacing adjacent pixels (A, B) by the sum (A+B) and difference (A-B) of the pixel data, in a binary representation of the sum (A+B) and difference (A-B) discarding the lowest bit of one of the sum (A+B) and difference (A-B), repeating the transform for the entire frame of pixel data, and perform entropy coding of transformed pixel data.

Owing to the invention, it is possible to provide an improvement of the Haar transform which increases the efficiency for lossless encoding of data.

This invention describes two principal modifications to this transform, and these provide a more efficient lossless encoding of pixel data. The first modification is to adjust the transform to use integer calculations and thus remove the inherent data loss caused by use of the square root of 2. The second modification takes advantage of the fact that the binary representations of the values (A+B) and (A-B) always have the same value for the lowest bit (the two values differ by 2B, which is even), and this bit can therefore be discarded from the second value.

Embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings, in which:

Figure 1 is a schematic diagram showing the processing of a video frame,

Figure 2 is a schematic diagram of components of a computer, and

Figure 3 is a flowchart of a method of compressing a frame of pixel data.

Figure 1 shows a frame 10 comprised of tiles 12. The frame is compressed through the process steps S1 colour transform, S2 tile transform and S3 entropy coding. This type of compression is used, for example, when a server is running virtual machines or client sessions for remote client devices.

For example, a single server may have twenty remote clients connected, and the server must provide twenty images, each image representing the image of the client session that must be displayed at the respective client device. Owing to limits on current connection technologies, this type of server-client system will only work if the outgoing video data is highly compressed from the original size.

To compress a frame of video data that is expressed in conventional RGB format, it is desirable to perform a colour transform to a domain where the luma values are separated from the chroma values, for example Y, Cr, Cb.
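The source does not name a particular colour transform. Because the overall scheme is lossless, one plausible choice is an integer-reversible transform such as the reversible colour transform used in JPEG 2000. The sketch below shows that transform purely as an assumption, not as the transform actually used by the encoder; the function names are hypothetical.

```python
def rgb_to_ycbcr_reversible(r, g, b):
    """Integer luma/chroma split (JPEG 2000 style reversible transform, assumed here)."""
    y = (r + 2 * g + b) // 4   # luma approximation, floor division
    cb = b - g                 # blue chroma difference
    cr = r - g                 # red chroma difference
    return y, cb, cr

def ycbcr_to_rgb_reversible(y, cb, cr):
    """Exact inverse of rgb_to_ycbcr_reversible for integer inputs."""
    g = y - (cb + cr) // 4
    r = cr + g
    b = cb + g
    return r, g, b
```

Both directions use only integer additions, subtractions and floor divisions, so no information is lost before the tile transform is applied.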

There is then performed a tile transform, on each colour channel, using a mathematical transform which transforms the original pixel data into a set of numbers that can be more easily compressed. The compression uses a tile size of 8x8 pixels, for example. The transform turns a tile 12 into a series of coefficients and these are then entropy coded. No quantization is performed as the compression is a lossless encoding.

The entropy coding (also called entropy encoding) is a process of transposing the numbers that have been produced by the transform step so that the most commonly occurring numbers are mapped to the shortest bit-length codes. This further reduces the size of the data present at this point.
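The source does not specify which entropy coder is used. As one common example of mapping frequent values to short codes, the following sketch computes Huffman code lengths for a list of symbols. Huffman coding is an assumption made here for illustration, and the function name `huffman_code_lengths` is hypothetical.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built from `symbols`."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}
    # Heap entries are (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(n, i, {sym: 0}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, depths1 = heapq.heappop(heap)
        n2, _, depths2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**depths1, **depths2}.items()}
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

# Illustrative transformed values: small magnitudes dominate (assumed data).
lengths = huffman_code_lengths([0] * 8 + [1] * 4 + [-1] * 2 + [5] * 2)
```

In this example the most frequent value receives the shortest code, which is the behaviour the paragraph above describes.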

The end result is a set of data that is a compression of the original image 10 that can be transmitted to another location. The original image can be recreated by reversing the process shown in Figure 1. The entropy coding is reversed and the inverse of the transform function is applied. The resulting luma and chroma values are converted back to RGB data for display by a suitable display device.

Figure 2 illustrates schematically some of the components of a computer 14, which has a central processing unit (CPU) 16, a graphics processing unit (GPU) 18 and an encoder 20. Other components of the computer 14 would also be present, such as memory and other interfaces, but have been omitted for clarity. The GPU 18 is for controlling the output of a local display device connected to the computer and can be used in the compression of images if required. The encoder 20 is the principal component for compressing a video stream received from the CPU 16. The output of the encoder 20 can be used to control an additional display device or can be transmitted over a general-purpose network for display elsewhere.

Although the encoder 20 is shown as a separate component, it could also be a software process that is being executed by the CPU 16. The encoder takes un-coded video tiles as input and produces coded data messages as output. The encoder 20 performs all of the steps (colour transform, tile transform and entropy coding) because otherwise the IO bandwidth is increased. The encoder 20 does not need to hold an entire screen image at one time and the encoder 20 can therefore be built on an FPGA which does not need any external storage (DDR). None of the processing steps need vast storage, e.g. a 64x64 tile requires 12KB in RGB form. Holding more tiles would increase system throughput, but it need not be huge.
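The storage figure quoted above can be checked with a short calculation. The sketch below assumes three colour channels of eight bits each, as described earlier for RGB data; the helper name `tile_bytes` is hypothetical.

```python
def tile_bytes(width, height, channels=3, bits_per_channel=8):
    """Uncompressed size of one tile in bytes (assumes whole bytes per tile)."""
    return width * height * channels * bits_per_channel // 8

# A 64x64 RGB tile: 64 * 64 * 3 bytes.
size_64 = tile_bytes(64, 64)
size_64_kb = size_64 / 1024
```

Under these assumptions a 64x64 tile occupies 12288 bytes, which is 12KB.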

The encoder 20 may provide an output that is transmitted by the CPU 16 over a USB or Ethernet connection to an additional display device that will then perform a decode of the compressed pixel data. The decompression process is essentially the reverse of the compression steps and the original pixel data is recreated in real-time. The ultimate display device will receive the decompressed pixel data and display that. The compression of the pixel data by the encoder 20 prior to transmission to the display device reduces the bandwidth requirement of the pixel data and allows general-purpose network connections to be used as the transport medium for the pixel data.

The encoder 20 uses two modifications to the tile transform which allow the encoder 20 to be used for more efficient lossless encoding of data.

The first modification is to adjust the transform to use integer calculations and thus remove the inherent data loss caused by use of the square root of two.

This requires that the forward and reverse transforms are different:

Forward: [A, B] => [(A+B), (A-B)]

Reverse: [A, B] => [(A+B)/2, (A-B)/2]

The second modification takes advantage of the fact that the binary representations of the values (A+B) and (A-B) always have the same value for the lowest bit, and this bit can therefore be discarded from the second value.

This bit must be restored during the reverse transform as follows:

Forward: [A, B] => [(A+B), (A-B)>>1]

Reverse: [A, B] => [(A+(B<<1)+(A&1))/2, (A-(B<<1)-(A&1))/2]

where the operator ">>" shifts the bits of a value to the right and discards the bottom bit, and the operator "&" performs a logical AND between the two binary values.
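The forward and reverse transforms with the discarded bit can be sketched in Python as follows. The function names `forward_pair` and `reverse_pair` are illustrative assumptions. The sketch relies on Python's `>>` being an arithmetic (sign-preserving) shift on integers, which is what makes negative differences round-trip correctly.

```python
def forward_pair(a, b):
    """Replace (a, b) by their sum and their difference with the lowest bit dropped."""
    s = a + b
    d = (a - b) >> 1   # the dropped bit always equals the lowest bit of s
    return s, d

def reverse_pair(s, d):
    """Recover the original (a, b) from the sum and the shortened difference."""
    low = s & 1                 # lowest bit of the sum, shared with the difference
    full = (d << 1) + low       # rebuild the full difference a - b
    a = (s + full) // 2
    b = (s - full) // 2
    return a, b

# Illustrative values (assumed): the pair survives a round trip exactly.
s, d = forward_pair(100, 97)
restored = reverse_pair(s, d)
```

Because `s + full` and `s - full` are always even, the divisions by two are exact and the transform is lossless.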

The effect of this modified transform is to halve the magnitude of the second symbol in each pair of transformed values. When combined with an appropriate entropy coding scheme this provides a significant reduction in the number of bits required to encode the transformed values.

Figure 3 summarises this process, which comprises receiving a frame of pixel data (step S1), then performing a transform of the pixel data by replacing adjacent pixels by the sum and difference of the pixel data (step S2), in a binary representation of the sum and difference discarding the lowest bit of one of the sum and difference (step S3), repeating the transform for the entire frame of pixel data (step S4), and performing entropy coding of transformed pixel data (step S5).
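To illustrate the transform being repeated across pixel data, the following sketch applies it to every non-overlapping pair in one row and then restores the row. The function names, the even row length and the sample values are assumptions made for illustration; entropy coding is omitted.

```python
def transform_row(row):
    """Apply the sum / shortened-difference transform to each pair in an even-length row."""
    sums, diffs = [], []
    for i in range(0, len(row), 2):
        a, b = row[i], row[i + 1]
        sums.append(a + b)
        diffs.append((a - b) >> 1)   # lowest bit discarded
    return sums, diffs

def restore_row(sums, diffs):
    """Invert transform_row, restoring the discarded bit from each sum."""
    row = []
    for s, d in zip(sums, diffs):
        full = (d << 1) + (s & 1)
        row.extend([(s + full) // 2, (s - full) // 2])
    return row

# Illustrative 8-bit pixel values, including the extreme differences +255 and -255.
row = [12, 10, 200, 201, 0, 255, 255, 0]
sums, diffs = transform_row(row)
```

For 8-bit pixels the full difference ranges from -255 to 255, while the shortened difference stays within -128 to 127, which is the halving of magnitude described above.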

When wavelet transforms are used in image compression processing, they are usually applied first in one dimension (horizontally across the image) and then in a second dimension (vertically down the image). The resulting values are then segmented into low and high frequency components and the transform is applied again to the low frequency components to create a hierarchical series of successively smaller images. Applying this process to data encoded with the transform described above further increases the benefit of this transform over the original Haar formulation.
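A minimal sketch of this two-dimensional, hierarchical application is given below. It assumes a square tile whose side is divisible by 2 raised to the number of levels, and it assumes a layout in which sums are stored in the left (or top) half and shortened differences in the right (or bottom) half. The source does not specify the layout, and all function names are hypothetical.

```python
def fwd(a, b):
    return a + b, (a - b) >> 1

def inv(s, d):
    full = (d << 1) + (s & 1)
    return (s + full) // 2, (s - full) // 2

def rows_forward(tile):
    """Transform each row: sums go to the left half, differences to the right half."""
    out = []
    for row in tile:
        pairs = [fwd(row[i], row[i + 1]) for i in range(0, len(row), 2)]
        out.append([s for s, _ in pairs] + [d for _, d in pairs])
    return out

def rows_inverse(tile):
    out = []
    for row in tile:
        half = len(row) // 2
        restored = []
        for s, d in zip(row[:half], row[half:]):
            restored.extend(inv(s, d))
        out.append(restored)
    return out

def transpose(tile):
    return [list(col) for col in zip(*tile)]

def level_forward(tile):
    """One level: horizontal pass, then vertical pass (via transposition)."""
    return transpose(rows_forward(transpose(rows_forward(tile))))

def level_inverse(tile):
    """Undo the vertical pass first, then the horizontal pass."""
    return rows_inverse(transpose(rows_inverse(transpose(tile))))

def multilevel_forward(tile, levels):
    """Re-apply the transform to the low-frequency (top-left) quadrant at each level."""
    out = [row[:] for row in tile]
    size = len(out)
    for _ in range(levels):
        sub = level_forward([row[:size] for row in out[:size]])
        for y in range(size):
            out[y][:size] = sub[y]
        size //= 2
    return out

def multilevel_inverse(tile, levels):
    """Invert multilevel_forward, starting from the smallest quadrant."""
    out = [row[:] for row in tile]
    size = len(out) >> (levels - 1)
    for _ in range(levels):
        sub = level_inverse([row[:size] for row in out[:size]])
        for y in range(size):
            out[y][:size] = sub[y]
        size *= 2
    return out

# Illustrative 8x8 tile of 8-bit values (assumed data).
tile = [[(x * 7 + y * 13) % 256 for x in range(8)] for y in range(8)]
coded = multilevel_forward(tile, 3)
decoded = multilevel_inverse(coded, 3)
```

Each pair step is exactly invertible for any integers, so the round trip remains lossless even when sums and differences from one level are transformed again at the next.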