CN118158393A

CN118158393A - Image data decompression

Info

Publication number: CN118158393A
Application number: CN202311646158.1A
Authority: CN
Inventors: I·马丁内利; S·芬尼; K·马克思; P·希金博顿
Original assignee: Imagination Technologies Ltd
Current assignee: Imagination Technologies Ltd
Priority date: 2022-12-07
Filing date: 2023-12-01
Publication date: 2024-06-07
Also published as: GB2619573A; GB2619573B; GB202218417D0

Abstract

A method and a decompression unit are provided for performing decompression to determine one or more image element values from compressed data. The compressed data represents a block of image data comprising a plurality of image element values, each image element value comprising a plurality of data values associated with a respective plurality of channels, wherein the plurality of channels comprises at least one reference channel and a plurality of non-reference channels. Compressed channel data for each of the channels is read from the compressed data. For each of the channels, the compressed channel data for that channel is used to determine an initial data value related to that channel for each of the one or more image element values that are decompressed.

Description

Image data decompression

Technical Field

The present disclosure relates to data compression and data decompression.

Background

Data compression, whether lossless or lossy, is desirable in many applications where data is stored in and/or read from memory. By compressing the data before storing the data in the memory, the amount of data transferred to the memory may be reduced. An example of data for which data compression is particularly useful is image data. The term "image data" is used herein to refer to two-dimensional data having values corresponding to respective pixels or sampling locations of an image. For example, the image may be generated as part of a rendering process on a Graphics Processing Unit (GPU). The image data may include, but is not limited to: depth data to be stored in the depth buffer, pixel data (e.g., color data) to be stored in the frame buffer, texture data to be stored in the texture buffer, surface normal data to be stored in the surface normal buffer, and illumination data to be stored in the illumination buffer. These buffers may be any suitable type of memory, such as cache memory, separate memory subsystems, storage areas in a shared memory system, or some combination thereof.

The GPU may be used to process data to generate image data. For example, the GPU may determine pixel values (e.g., color values) of an image to be stored in a frame buffer, which may be output to a display. GPUs typically have a highly parallelized architecture for processing large blocks of data in parallel. There is significant commercial pressure to have GPUs, particularly those intended for implementation on mobile/embedded devices, operate with reduced latency, reduced power consumption, and reduced physical size (e.g., reduced silicon area). Competing against these goals is the desire to use higher quality rendering algorithms to produce higher quality images. Reducing the memory bandwidth (i.e., reducing the amount of data transferred between the GPU and memory) can significantly reduce latency and power consumption of the system, which is why it may be particularly useful to compress the data before transferring it. The same applies, to a lesser extent, to data that is moved around within the GPU itself. Furthermore, the same issues may be relevant to other processing units, such as Central Processing Units (CPUs), as well as GPUs.

FIG. 1 illustrates an exemplary graphics processing system 100 that may be implemented in an electronic device, such as a mobile/embedded device. Graphics processing system 100 includes a GPU 102 and a memory 104 (e.g., graphics memory). Data (which may be compressed data) may be transferred in either direction between GPU 102 and memory 104.

GPU 102 includes processing logic 106, memory interface 108, compression unit 110, and decompression unit 112. In some examples, the compression unit and decompression unit may be combined into a single unit that may perform both compression and decompression.

In operation, GPU 102 may process regions of image data individually. A region may, for example, represent a rectangular (including square) portion (or "tile") of the rendering space (i.e., a two-dimensional space representing, for example, an image region to be rendered). Processing logic 106 may perform rasterization of graphics primitives (such as, without limitation, triangles and lines) using known techniques such as depth testing and texture mapping. Processing logic 106 may include a cache unit to reduce memory traffic. Some data is read from or written to the memory 104 by the processing logic 106 through the memory interface 108. In the example shown in FIG. 1, data being written from processing logic 106 to memory 104 is transferred from processing logic 106 to memory interface 108 via compression unit 110. The compression unit 110 may compress the data before passing the data to the memory interface. Similarly, in the example shown in FIG. 1, data being read from memory 104 by processing logic 106 is passed from memory interface 108 to processing logic 106 via decompression unit 112. Decompression unit 112 may decompress the data (if it has been compressed) before passing the data to processing logic 106. The use of compression unit 110 and decompression unit 112 means that compressed data may be transferred between memory interface 108 and memory 104, thereby reducing the amount of data to be transferred to memory 104 through an external memory bus.

As is known to those skilled in the art, the processing logic 106 of the GPU 102 may generate a set of one or more color values (e.g., RGB or RGBA) for each pixel in the rendering space and cause the color values to be stored in a frame buffer (e.g., in memory 104). The set of color values for a frame may be referred to herein as color data or image data. The processing logic 106 may also generate other image data, e.g., depth data, surface normal data, illumination data, etc., and may store those image data values in one or more buffers in memory. In some cases, these buffers may be referred to as frame buffers, while in other cases, the term "frame buffer" may be reserved for buffers storing color values or storing data to be sent to a display. In some graphics rendering systems, processing logic 106 may use image data values stored in a buffer for a particular rendering in performing one or more subsequent renderings. For example, color values generated by one rendering may represent a texture that may be stored (e.g., in compressed form) in memory 104, and the texture may be read from memory 104 (and, e.g., decompressed) for application to a surface in one or more subsequent renderings. Similarly, surface normal values generated for a geometric model in one rendering may be used to apply lighting effects to the same model during one or more subsequent renderings. Further, the surface depth values generated and stored in one rendering may be read back for one or more subsequent renderings of the same model.

Since image data (e.g., color data) may be quite large, the memory bandwidth associated with writing and reading image data to and from the buffer in memory may be a substantial portion of the total memory bandwidth of the graphics processing system and/or GPU. Accordingly, the image data is generally compressed via the compression unit 110 before being stored in the buffer, and decompressed via the decompression unit 112 after being read from the buffer.

When data is compressed using a lossless compression technique and then decompressed using a complementary lossless decompression technique, the original data can be recovered without any loss of data (assuming no errors occur in the compression or decompression process). The degree to which data is compressed may be expressed as a compression ratio, where the compression ratio is obtained by dividing the size of the uncompressed data by the size of the compressed data. The compression ratio achieved by lossless compression techniques generally depends on the data being compressed. For example, lossless compression techniques tend to achieve relatively high compression ratios when compressing highly correlated data, whereas they tend to achieve relatively low compression ratios when compressing uncorrelated (e.g., random) data. It is therefore difficult to guarantee that a lossless compression technique will achieve a particular compression ratio (e.g., a compression ratio of 2:1). Consequently, if only lossless compression techniques are used, the system must generally be able to handle situations in which the desired compression ratio (e.g., 2:1) cannot be achieved, including situations in which the data cannot be compressed at all using the lossless compression technique.
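By way of illustration only, the compression ratio defined above can be computed as in the following minimal sketch. The function name and the block sizes are hypothetical and are not taken from this disclosure.

```python
def compression_ratio(uncompressed_bits: int, compressed_bits: int) -> float:
    """Size of the uncompressed data divided by the size of the compressed data."""
    if compressed_bits <= 0:
        raise ValueError("compressed size must be positive")
    return uncompressed_bits / compressed_bits


# Hypothetical block of 64 image element values at 32 bits each (2048 bits),
# compressed to 1024 bits: a ratio of 2.0, i.e. 2:1.
ratio = compression_ratio(64 * 32, 1024)
print(ratio)
```

A ratio of 1.0 corresponds to data that could not be compressed at all.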

In some cases, guaranteeing a compression ratio may be considered more important than guaranteeing that no data is lost during compression. For example, guaranteeing a compression ratio allows a reduction in the memory footprint that must be reserved to ensure that a compressed data block can be stored. Guaranteeing the compression ratio may also allow for a reduction in the size (e.g., silicon area) of the memory 104. In these cases, lossy compression techniques may be used, which can achieve a guaranteed compression ratio, although some data may be lost during compression.

Two British patents, GB2586531B and GB2586532B, describe methods for compressing and decompressing blocks of image data to meet a target compression level.

Disclosure of Invention

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

There is provided a computer-implemented method of performing decompression to determine one or more image element values from compressed data, wherein the compressed data represents a block of image data comprising a plurality of image element values, each image element value comprising a plurality of data values associated with a respective plurality of channels, wherein the plurality of channels comprises at least one reference channel and a plurality of non-reference channels, the method comprising:

reading compressed channel data for each of the channels from the compressed data;

for each of the channels, determining an initial data value related to the channel for each of the one or more image element values that are decompressed using the compressed channel data for the channel;

for each of the one or more image element values that are decompressed, determining a decompressed data value for each of the non-reference channels by:

reading an indication of a compression mode for the non-reference channel of the block from the compressed data, wherein the compression mode is a channel decorrelation mode or a non-channel decorrelation mode;

if the compression mode for the non-reference channel is the non-channel decorrelation mode, determining the decompressed data value for the non-reference channel to be the determined initial data value related to the non-reference channel for the image element value; and

if the compression mode for the non-reference channel is the channel decorrelation mode, determining the decompressed data value for the non-reference channel as a function of the determined initial data value related to the non-reference channel and the determined initial data value related to one of the at least one reference channel for the image element value.

The compression mode for a first non-reference channel of the plurality of non-reference channels may be a channel decorrelation mode and the compression mode for a second non-reference channel of the plurality of non-reference channels may be a non-channel decorrelation mode.

The function may be a sum.
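By way of illustration only, the per-channel mode handling described above may be sketched as follows for a single image element value, with the function taken to be a sum. The mode encodings, the function name, the choice of green as the reference channel, and the wrap-around modulo 2^bit_depth are assumptions made for this sketch and are not specified in the text above.

```python
from typing import Dict

# Hypothetical encodings of the per-channel mode indication (not from the source).
NON_CHANNEL_DECORRELATION = 0
CHANNEL_DECORRELATION = 1


def decompress_element(initial: Dict[str, int], modes: Dict[str, int],
                       reference: str = "G", bit_depth: int = 8) -> Dict[str, int]:
    """Turn initial data values into decompressed data values for one element."""
    mask = (1 << bit_depth) - 1  # assumption: results wrap modulo 2**bit_depth
    reference_value = initial[reference]
    # For the reference channel, the initial data value is used directly.
    decompressed = {reference: reference_value}
    for channel, value in initial.items():
        if channel == reference:
            continue
        if modes[channel] == CHANNEL_DECORRELATION:
            # Channel decorrelation mode: sum with the reference channel value.
            decompressed[channel] = (value + reference_value) & mask
        else:
            # Non-channel decorrelation mode: the initial value is the result.
            decompressed[channel] = value
    return decompressed


# Red was decorrelated against green; blue was not.
pixel = decompress_element(
    {"R": 253, "G": 100, "B": 90},
    {"R": CHANNEL_DECORRELATION, "B": NON_CHANNEL_DECORRELATION},
)
print(pixel)
```

In this sketch the red value 253 stands for a difference of -3 in 8-bit wrap-around arithmetic, so the decompressed red value is 97, while blue is passed through unchanged.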

For each of the one or more image element values that are decompressed, for each of the at least one reference channel, the determined initial data value related to the reference channel for the image element value may be a decompressed data value for the reference channel.

The plurality of channels may include a red channel, a green channel, a blue channel, and optionally an alpha channel.

The reading compressed channel data for each of the channels may include:

reading an indication of an origin value for each of the channels from the compressed data; and

For each of the channels, reading a representation of a difference value from the compressed data, wherein the difference value for the channel may represent a difference between the data value and the origin value for the channel for one or more image element values decompressed from the compressed data;

Wherein said determining, for each of the channels, an initial data value related to the channel for each of the one or more image element values that are decompressed using the compressed channel data for the channel may comprise:

For each of the channels, determining an initial data value related to the channel for each of the one or more image element values that are decompressed using: (i) An indication of an origin value for a channel read from the compressed data, and (ii) a representation of a difference value read from the compressed data.

The determining, for each of the channels, an initial data value related to the channel for each of the one or more image element values that are decompressed may comprise summing the origin value for the channel and the determined difference value for the channel for the image element value.

The determining, for each of the channels, an initial data value related to the channel for each of the one or more image element values that are decompressed may comprise subtracting the determined difference value for the channel for the image element value from an origin value for the channel.
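By way of illustration only, the two alternatives above (adding the difference value to the origin value, or subtracting it from the origin value) may be sketched as follows. The function name, the `subtract` flag, and the wrap-around modulo 2^bit_depth are assumptions made for this sketch.

```python
from typing import List


def initial_values(origin: int, differences: List[int],
                   bit_depth: int = 8, subtract: bool = False) -> List[int]:
    """Combine one channel's origin value with its per-element difference values."""
    mask = (1 << bit_depth) - 1  # assumption: values wrap modulo 2**bit_depth
    if subtract:
        # Origin treated as an upper bound: value = origin - difference.
        return [(origin - d) & mask for d in differences]
    # Origin treated as a lower bound: value = origin + difference.
    return [(origin + d) & mask for d in differences]


added = initial_values(40, [0, 3, 7, 1])
subtracted = initial_values(47, [7, 4, 0, 6], subtract=True)
print(added, subtracted)
```

Both calls describe the same four values, once relative to the minimum (40) and once relative to the maximum (47).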

For each of the non-reference channels, an indication of the origin value for the non-reference channel may be read from the compressed data as a number representing: (i) An origin value for a non-reference channel, or (ii) a difference between an origin value for a non-reference channel and an origin value for one of the at least one reference channel of the block.

The reading compressed channel data for each of the channels may further include:

For each of the channels, reading from the compressed data an indication of a first number of bits of the representation of the difference for the channel;

Wherein the determining an initial data value related to the channel for each of the one or more image element values that are decompressed may comprise:

based on a representation of the difference values read from the compressed data, for each of the channels and for each of the one or more image element values that are decompressed, a difference value is determined from the first number of bits for the channel.

The determined difference value for a channel of image element values may have the first number of bits for a channel.

For each of the channels, each of the representations of the differences for the channels may have a first number of bits for the channel.

The reading of the representation of the difference from the compressed data may comprise:

obtaining a second number of bits for each of the channels, wherein each of the representations of the difference for each of the channels has the second number of bits for the channel; and

The obtained second number of bits for the respective channel is used to read a representation of the difference value for one or more image element values decompressed from the compressed data.

The determining a difference value from the first number of bits for each of the channels and for each of the one or more image element values that are decompressed may include adding zero, one or more least significant bits to a representation of the difference value read from the compressed data, thereby determining a difference value having the first number of bits for each of the channels.

Zero, one, or more least significant bits added to the representation of the difference value read from the compressed data may be determined by bit copying of the corresponding zero, one, or more most significant bits of the representation of the difference value read from the compressed data.
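By way of illustration only, the bit replication described above may be sketched as follows: a stored representation having the second number of bits is widened to the first number of bits by appending least significant bits copied from its own most significant bits. The function name is hypothetical, and repeating the pattern when more bits must be added than the representation contains is an assumption of this sketch.

```python
def expand_difference(rep: int, second_bits: int, first_bits: int) -> int:
    """Widen a `second_bits`-wide representation to `first_bits` bits."""
    if not 0 <= second_bits <= first_bits:
        raise ValueError("need 0 <= second_bits <= first_bits")
    if rep < 0 or rep >> second_bits:
        raise ValueError("rep does not fit in second_bits")
    extra = first_bits - second_bits
    if extra == 0 or second_bits == 0:
        # Zero bits to add, or nothing stored to copy from (rep is 0 here).
        return rep << extra
    value = rep
    added = 0
    while added < extra:
        # Append up to `second_bits` bits, taken from the top of `rep`.
        take = min(second_bits, extra - added)
        value = (value << take) | (rep >> (second_bits - take))
        added += take
    return value


print(bin(expand_difference(0b101, 3, 5)))  # appends the two MSBs "10"
```

One property of replicating most significant bits, rather than padding with zeros, is that an all-ones representation expands to an all-ones value, so the top of the range remains reachable.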

The obtaining the second number of bits for each of the one or more channels may include determining the second number of bits for the channel using the first number of bits for each of the one or more channels according to a predetermined scheme.

The obtaining a second number of bits for each of the one or more channels may include reading an indication of the second number of bits for a channel from the compressed data.

The compressed data may be in a compressed data block, the compressed data block comprising:

A header portion having a fixed size and comprising: (i) an indication of an origin value for each of the channels, and (ii) an indication of a compression mode for each of the non-reference channels; and

A body portion having a variable size and comprising a representation of a difference value for each of the channels.

The header portion may also include an indication of a first number of bits for each of the channels.
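By way of illustration only, the fixed-size header portion and variable-size body portion may be modelled as in the following sketch. The class and field names are hypothetical, no field widths are implied, and the body-size calculation assumes one difference representation per channel per image element value with no padding.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Header:
    origin: Dict[str, int]      # indication of an origin value, per channel
    mode: Dict[str, int]        # compression mode, per non-reference channel
    first_bits: Dict[str, int]  # optional: first number of bits, per channel


@dataclass(frozen=True)
class CompressedBlock:
    header: Header
    body: Dict[str, List[int]]  # difference representations, per channel


def body_size_bits(bits_per_channel: Dict[str, int], num_elements: int) -> int:
    """Body size when every element stores one representation per channel."""
    return num_elements * sum(bits_per_channel.values())


# A hypothetical 64-element block storing 3, 2 and 0 bits per element.
size = body_size_bits({"R": 3, "G": 2, "B": 0}, 64)
print(size)
```

The body size varies with the per-channel bit counts, whereas the header fields are the same for every block.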

The image element value may be a pixel value, a texel value, a depth value, a surface normal or an illumination value.

The method may further comprise outputting the determined data values of the decompressed one or more image element values for further processing.

There is provided a decompression unit configured to perform decompression to determine one or more image element values from compressed data, wherein the compressed data represents an image data block comprising a plurality of image element values, each image element value comprising a plurality of data values relating to a respective plurality of channels, wherein the plurality of channels comprises at least one reference channel and a plurality of non-reference channels, the decompression unit comprising:

decompression logic configured to:

reading compressed channel data for each of the channels from the compressed data; and

For each of the channels, determining an initial data value related to the channel for each of the one or more image element values that are decompressed using the compressed channel data for the channel; and

Channel decorrelation logic configured to determine, for each of the one or more image element values that are decompressed, a decompressed data value for each of the non-reference channels by:

The decompression logic may include:

initial data value determination logic configured to read from the compressed data an indication of an origin value for each of the channels; and

Difference determination logic configured to read, for each of the channels, a representation of a difference from the compressed data, wherein the difference for the channel represents a difference for the channel between an initial data value and an origin value for one or more image element values decompressed from the compressed data;

Wherein the initial data value determination logic may be further configured to determine, for each of the channels, an initial data value related to the channel for each of the one or more image element values that are decompressed using: (i) An indication of an origin value for a channel read from the compressed data, and (ii) a representation of a difference value read from the compressed data.

The difference determination logic may be further configured to:

for each of the channels, reading from the compressed data an indication of a first number of bits of the representation of the difference for the channel; and

The difference determination logic may be further configured to:

A decompression unit may be provided, which is configured to perform any of the methods described herein.

A computer-implemented method of compressing a block of image data may be provided, wherein the block of image data comprises a plurality of image element values, each image element value comprising a plurality of data values associated with a respective plurality of channels, wherein the plurality of channels comprises a reference channel and one or more non-reference channels, the method comprising:

for each of the one or more non-reference channels:

determining a number of bits n_{non-decorrelated} for a non-channel decorrelation mode for losslessly representing a difference between a maximum value and a minimum value of the data values for the non-reference channel of the block;

determining decorrelated data values for the non-reference channel by finding, for each image element value in the block, a difference between the data value of the non-reference channel and the data value of the reference channel;

determining a number of bits n_{decorrelated} for a channel decorrelation mode for losslessly representing a difference between a maximum value and a minimum value of the decorrelated data values for the non-reference channel of the block;

comparing the determined number of bits n_{non-decorrelated} for the non-channel decorrelation mode with the determined number of bits n_{decorrelated} for the channel decorrelation mode; and

selecting the channel decorrelation mode or the non-channel decorrelation mode according to a result of the comparison, wherein if the channel decorrelation mode is selected, the decorrelated data values for the non-reference channel of the block are used instead of the data values for the non-reference channel of the block to determine compressed channel data for the non-reference channel;

Determining compressed channel data for each of the channels of the block; and

Forming compressed data, the compressed data comprising:

An indication of the selected mode for each of the one or more non-reference channels, and

The determined compressed channel data for each of the channels.
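By way of illustration only, the per-channel mode selection in the compression method above may be sketched as follows. The function names and mode labels are hypothetical. The sketch uses plain signed differences for the decorrelated values and keeps the non-channel decorrelation mode on a tie; both are assumptions, since the text above states only that the selection depends on the result of the comparison.

```python
from typing import List, Tuple


def bits_for_range(values: List[int]) -> int:
    """Bits needed to losslessly represent (max - min) of the values."""
    return (max(values) - min(values)).bit_length()


def select_mode(channel: List[int], reference: List[int]) -> Tuple[str, List[int], int]:
    """Choose the mode needing fewer bits for one non-reference channel."""
    n_non_decorrelated = bits_for_range(channel)
    # Decorrelate: per-element difference against the reference channel.
    decorrelated = [c - r for c, r in zip(channel, reference)]
    n_decorrelated = bits_for_range(decorrelated)
    if n_decorrelated < n_non_decorrelated:
        return "channel_decorrelation", decorrelated, n_decorrelated
    # Assumption: ties keep the original (non-decorrelated) data values.
    return "non_channel_decorrelation", list(channel), n_non_decorrelated


green = [100, 111, 119, 131]
red = [200, 210, 220, 230]  # follows green closely, so decorrelation helps
blue = [10, 12, 11, 13]     # nearly flat, so decorrelation hurts

red_choice = select_mode(red, green)
blue_choice = select_mode(blue, green)
print(red_choice[0], red_choice[2])
print(blue_choice[0], blue_choice[2])
```

Here red needs 5 bits directly but only 2 bits after decorrelation, whereas blue needs 2 bits directly and 5 bits after decorrelation, so a different mode is selected for each of the two non-reference channels.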

A compression unit may be provided, the compression unit being configured to compress a block of image data, wherein the block of image data comprises a plurality of image element values, each image element value comprising a plurality of data values relating to a respective plurality of channels, wherein the plurality of channels comprises a reference channel and one or more non-reference channels, the compression unit comprising:

analyzer logic configured to, for each of the one or more non-reference channels:

compression logic configured to:

Determining compressed channel data for each of the channels of the block; and

Forming compressed data, wherein the compressed data comprises:

The determined compressed channel data for each of the channels.

The compression unit and/or decompression unit may be embodied in hardware on an integrated circuit. A method of manufacturing a compression unit and/or a decompression unit in an integrated circuit manufacturing system may be provided. An integrated circuit definition data set may be provided that, when processed in an integrated circuit manufacturing system, configures the system to manufacture a compression unit and/or a decompression unit. A non-transitory computer-readable storage medium having stored thereon a computer-readable description of a compression unit and/or a decompression unit that, when processed in an integrated circuit manufacturing system, causes the integrated circuit manufacturing system to manufacture an integrated circuit embodying the compression unit and/or the decompression unit may be provided.

An integrated circuit manufacturing system may be provided, the integrated circuit manufacturing system comprising: a non-transitory computer readable storage medium having stored thereon a computer readable description of a compression unit and/or a decompression unit; a layout processing system configured to process the computer readable description to generate a circuit layout description of the integrated circuit embodying the compression unit and/or the decompression unit; and an integrated circuit generation system configured to manufacture the compression unit or the decompression unit according to the circuit layout description.

Computer program code for performing any of the methods described herein may be provided. A non-transitory computer-readable storage medium having stored thereon computer-readable instructions that, when executed at a computer system, cause the computer system to perform any of the methods described herein may be provided.

As will be apparent to those skilled in the art, the above features may be combined as appropriate, and may be combined with any of the aspects of the examples described herein.

Drawings

Examples will now be described in detail with reference to the accompanying drawings, in which:

FIG. 1 illustrates a graphics processing system in which compression and decompression units are implemented within the graphics processing unit;

FIG. 2 illustrates an image data array including image data blocks;

FIG. 3 shows a compression unit;

FIG. 4 is a flow chart of a method of compressing a block of image data;

FIG. 5 is a flowchart of some of the steps of a method of compressing blocks of image data in an example of meeting a target compression level;

FIG. 6a illustrates an example format for storing four compressed data blocks;

FIG. 6b illustrates an example format of data within a header portion of a compressed data block;

FIG. 7 shows a decompression unit;

FIG. 8 is a flow chart of a method of performing decompression to determine one or more image element values from compressed data;

FIG. 9 is a flowchart of some of the steps of a method of performing decompression to determine one or more image element values from compressed data in an example of meeting a target compression level;

FIG. 10 illustrates a computer system in which a graphics processing system is implemented; and

FIG. 11 illustrates an integrated circuit manufacturing system for generating an integrated circuit embodying a compression unit or a decompression unit as described herein.

The figures illustrate various examples. Skilled artisans will appreciate that element boundaries (e.g., blocks, groups of blocks, or other shapes) illustrated in the figures represent one example of boundaries. In some examples, it may be the case that one element may be designed as a plurality of elements, or that a plurality of elements may be designed as one element. Where appropriate, common reference numerals have been used throughout the various figures to indicate like features.

Detailed Description

The following description is presented by way of example to enable a person skilled in the art to make and use the invention. The invention is not limited to the embodiments described herein and various modifications to the disclosed embodiments will be apparent to those skilled in the art.

Embodiments will now be described by way of example only.

As described above, in a compression scheme, it is generally advantageous to reduce the size of the compressed data, but there may be a tradeoff between reducing the size of the compressed data and reducing the degree of error that may be introduced by using lossy compression techniques. Examples described herein provide opportunities to reduce the size of compressed data and/or to reduce the level of error that may be introduced by using lossy compression techniques without increasing the size of the compressed data. Specifically, in the examples described herein, a compression scheme may select between two compression modes based on which mode will produce compressed data having a smaller size. In the examples described herein, the two modes are: (i) a channel decorrelation mode and (ii) a non-channel decorrelation mode. An indication of the selected mode may be included in the compressed data so that the decompression unit can correctly decompress the compressed data.

For example, the image data block to be compressed may be represented as a 2D array of image element values (e.g., pixel values), where each of the image element values includes a plurality of data values associated with a respective plurality of channels. For example, there may be three channels (e.g., a red (R) channel, a green (G) channel, and a blue (B) channel) or four channels (e.g., a red (R) channel, a green (G) channel, a blue (B) channel, and an alpha (A) channel). One of these channels (e.g., the green channel) may be designated as a reference channel, and the other channels (e.g., the red and blue channels) may be designated as non-reference channels. In the channel decorrelation mode, decorrelated data values may be determined for a non-reference channel and then used instead of the data values for the non-reference channel to determine compressed channel data for the non-reference channel. The decorrelated data values for a non-reference channel may be determined by finding, for each image element value in the block, a difference between the data value of the non-reference channel and the data value of the reference channel. The channel decorrelation mode or the non-channel decorrelation mode may be selected for each non-reference channel, such that different modes may be selected for different non-reference channels of the block of image element values being compressed. Selecting at the granularity of channels allows the appropriate mode to be selected for each non-reference channel independently. To facilitate this, an indication of the selected mode for each of the non-reference channels is included in the compressed data.

Some of the examples described herein implement lossless compression and decompression techniques. Lossless compression and decompression techniques cannot guarantee that the target compression level is met, but (assuming no errors occur during compression or decompression) when data is compressed using a lossless compression technique and then decompressed using a complementary lossless decompression technique, the original data can be restored without losing the data.

However, as described above, it may be useful to have compression techniques (and complementary decompression techniques) that guarantee that a target compression level is met (e.g., that a compression ratio, such as 2:1, is met) without excessive loss of data quality (e.g., without introducing visually perceptible artifacts in the image due to compression-induced losses). The target compression level is satisfied if the achieved compression ratio is equal to or greater than the target compression ratio. Some previous high-end lossy compression algorithms may achieve a fixed compression ratio without excessive loss of data quality, but these algorithms involve, for example, performing complex operations on fixed-point values (such as filtering, multiplication, and division operations), and may require internal buffering of values during the compression process. Thus, these previous high-end lossy compression algorithms are generally considered unsuitable for use in small, low-cost and/or low-power computing systems, such as may be implemented in mobile devices (e.g., smartphones and tablet computers) or in other devices that are particularly constrained in size, cost, and/or power.
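By way of illustration only, the condition that the achieved compression ratio is equal to or greater than the target compression ratio may be checked as in the following sketch. The function and parameter names are hypothetical. The target ratio is expressed as `target_num:target_den`, and the two sides are cross-multiplied so that the check uses only integer multiplication and comparison, which is a design choice of this sketch rather than something stated in the text.

```python
def meets_target(uncompressed_bits: int, compressed_bits: int,
                 target_num: int = 2, target_den: int = 1) -> bool:
    """True if uncompressed/compressed >= target_num/target_den."""
    # Cross-multiplied form of the ratio comparison; no division is needed.
    return uncompressed_bits * target_den >= compressed_bits * target_num


print(meets_target(2048, 1024))  # exactly 2:1
print(meets_target(2048, 1100))  # worse than 2:1
```

A block compressed to exactly half its original size satisfies a 2:1 target, while one compressed to 1100 bits out of 2048 does not.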

All of the compression and decompression examples described herein (both lossless and lossy) can be implemented in a small, low-cost, and low-power computing system. The examples described herein are simple (i.e., efficient) to implement, for example, in fixed-function circuitry. For example, these examples may be performed using simple operations, such as addition, subtraction, and comparison operations, without performing more complex operations such as multiplication or division operations (which may be expensive, e.g., in terms of the amount of data required to represent the values or the size of the hardware logic required to implement them) and, unlike the previous high-end compression and decompression algorithms mentioned above, without requiring internal buffering of values during the compression or decompression process. In these examples, these simple operations (e.g., addition operations, subtraction operations, and comparison operations) are "integer operations," meaning that they operate on integers, such as image element values in integer format.

The compression techniques described herein may be implemented in a compression unit implemented in dedicated hardware (e.g., using fixed-function circuitry). Similarly, the decompression techniques described herein may be implemented in a decompression unit implemented in dedicated hardware (e.g., using fixed function circuitry). In these examples, the compression unit and decompression unit may be smaller in physical size (e.g., silicon area) when compared to previous high-end compression and decompression units implemented in hardware. The compression and decompression units described herein are suitable for implementation in small low cost processing units (e.g., GPUs or CPUs) with small silicon area and low power consumption and low latency. This is achieved without unduly degrading the data quality (e.g., image quality).

Furthermore, in some examples described herein, the compression technique allows the compression of an image data block to be lossless if lossless compression would meet a target compression level for the block, but if the lossless compression of the image data block does not meet the target compression level, the compression may become lossy in order to ensure that the target compression level is met. In these examples, rather than having separate lossless and lossy compression units, a single compression unit may be used to perform lossless or lossy compression to compress the image data blocks. Performing lossless compression and lossy compression using a single compression unit may reduce the overall hardware implemented in the device (e.g., may reduce the silicon area implemented in the device) and may reduce the power consumption of the device, as compared to implementing separate units for lossless compression and for lossy compression. Similarly, in the examples described herein, rather than having separate lossless and lossy decompression units, a single decompression unit may be used to perform lossless decompression or lossy decompression to decompress compressed blocks of image data. Performing lossless decompression and lossy decompression using a single decompression unit may reduce the overall hardware implemented in the device (e.g., may reduce the silicon area implemented in the device) and may reduce the power consumption of the device, as compared to implementing separate units for lossless decompression and for lossy decompression.

The image data to be compressed may be represented as a 2D array of image element values (e.g., pixel values). Each of the image element values includes a plurality of data values related to a respective plurality of channels, such as a red (R) channel, a green (G) channel, a blue (B) channel, and an alpha (A) channel. Fig. 2 shows an image data array 200 comprising image data blocks. In this example, image data array 200 represents an image represented by pixel values, where each image data block is an 8×8 block of pixel values, denoted in fig. 2 as 202₀, 202₁, 202₂ and 202₃. In other examples, the image data blocks may be of different sizes and/or shapes (e.g., 16×4 blocks of pixel values). Further, in other examples, there may be more (or fewer) than four image data blocks in the image data array.

Examples described herein relate to compressing and decompressing blocks of pixel values, where the pixel values represent an image. However, it should be understood that pixel values are only one example of image element values that may be compressed using the techniques described herein. More generally, a block of image data may be compressed to form a compressed block of data, and the compressed block of data may be decompressed to form a block of image data, wherein the image data includes a plurality of image element values. To give some examples, the image element values may be: (i) texel values representing a texture; (ii) pixel values representing an image; (iii) depth values representing surface depths at different sampling locations within the scene; (iv) surface normal values representing directions of normal vectors of surfaces at different sampling positions within the scene; or (v) illumination values representing illumination on the surface at different sampling locations within the scene. The illumination values represent a "light map". The light map may be considered a texture such that the light map may be used and processed in the same manner as a texture. Pixel values and texel values are examples of color values (where pixel values represent images and texel values represent textures). These color values are multi-channel values. For example, the color values may be in RGB format, where there are a red channel (R), a green channel (G), and a blue channel (B). In other examples, the color values may be in RGBA format, where there are a red channel (R), a green channel (G), a blue channel (B), and an alpha channel (A). In other examples, the color values may be in YCbCr format, where the color values have a luminance (luma) channel (Y), a first chrominance channel (Cb), and a second chrominance channel (Cr). As known in the art, the multi-channel color values may have many other formats.
Each channel of the multi-channel color value includes a data value associated with that particular channel. The depth values, surface normal values, and illumination values are generally considered single-channel values, but they may be processed like multi-channel values and packed into a multi-channel format according to the principles of the examples described herein. As a simple example, a 2×4 block of depth values (A to H) may be considered equivalent to a 2×2 block of depth values, where each depth value includes two channels [arrays showing the two arrangements omitted]. In the examples described herein, when we refer to a "pixel value," we may refer to the value of one of the channels of the multi-channel value.

In examples described herein, compressed data is stored as compressed data blocks. For example, the header and the difference may be stored in the same consecutive compressed data block. More generally, however, it should be understood that compressed data need not be stored as blocks of data, e.g., as contiguous blocks of data. For example, the header and the difference may be stored separately, e.g., in different sections of memory.

Compression

An example of a compression technique is now described with reference to figs. 3 to 6b. Fig. 3 shows a compression unit 302 configured to perform compression on a block of image data. Compression unit 302 may be implemented as compression unit 110 in the graphics processing system shown in fig. 1. Compression unit 302 includes analyzer logic 304 and compression logic 305. Compression logic 305 includes difference determination logic 306, difference size determination logic 308, and compressed data formation logic 310. Each of the logic blocks 304-310 implemented in the compression unit 302 may be implemented in hardware (e.g., dedicated hardware implemented with fixed function circuitry), software (e.g., as a software module executing on a processor), or a combination thereof. Implementing logic blocks in hardware generally provides lower latency operations than implementing logic blocks in software. However, implementing the logic blocks in software allows more flexibility in changing the functionality of the logic blocks after the compression unit 302 is manufactured. Thus, in some systems, a hardware implementation may be more appropriate than a software implementation (e.g., when compression needs to be performed quickly), while in some other systems, a software implementation may be more appropriate than a hardware implementation (e.g., when the functionality of the compression unit needs to be variable).

Fig. 4 shows a flow chart of a method of compressing a block of image data using the compression unit 302.

In step S402, an image data block is received at the compression unit 302. As described above, the image data block comprises a plurality of image element values, each image element value comprising a plurality of data values relating to a respective plurality of channels. The plurality of channels includes a reference channel and one or more non-reference channels. In the examples described below, each image element value includes data values related to a red (R) channel, a green (G) channel, and a blue (B) channel, and the reference channel is a green channel. The green channel is typically better correlated with the brightness of the image element values than the red and blue channels, which is why the green channel is selected as the reference channel, but in other examples channels other than the green channel (e.g. the red or blue channel) may also be the reference channel.

Steps S404 to S412 are performed for each of the non-reference channels (e.g., for the red channel and for the blue channel) in order to select a channel decorrelation mode or a non-channel decorrelation mode to determine compressed channel data for the non-reference channels.

Specifically, in step S404, the analyzer logic 304 determines a number of bits n_{non-decorrelated} for the non-channel decorrelation mode, for use in losslessly representing the difference between the maximum and minimum of the data values for the non-reference channel of the block. To this end, the analyzer logic 304 may find the maximum data value in the non-reference channel for the block, find the minimum data value in the non-reference channel for the block, and subtract the minimum data value from the maximum data value to determine the difference between the maximum data value and the minimum data value (Δ_{non-decorrelated}).

In the examples described herein, the data values are in an unsigned integer format, where if the data values each have m bits, the data values range from 0 to 2^m - 1. For example, m may be 8 such that the data values range from 0 to 255. In other examples, m may be a different value and/or the data values may be in a signed format. Note that one skilled in the art will know how to map signed-format values to equivalent offset unsigned values, e.g., by inverting the Most Significant Bit (MSB). The determined difference (Δ_{non-decorrelated}) between the maximum data value and the minimum data value for the non-reference channel is represented in an unsigned integer format, and the number of bits n_{non-decorrelated} for the non-channel decorrelation mode used to represent the determined difference losslessly is the number of bits used to represent the determined difference in an unsigned integer format without any leading zeros. Thus, n_{non-decorrelated} will range from 0 to m. If the determined difference (Δ_{non-decorrelated}) is 0, then n_{non-decorrelated} = 0. If the determined difference is greater than zero and lies in the range 2^{k-1} ≤ Δ_{non-decorrelated} < 2^k, the analyzer logic 304 determines that n_{non-decorrelated} = k, where k is an integer such that 1 ≤ k ≤ m.
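
As an illustration of step S404, the bit count can be sketched in Python. This is a minimal sketch rather than the described hardware: the function names `bits_needed`, `non_decorrelated_bits` and `signed_to_offset_unsigned` are hypothetical, and the data values are assumed to be plain Python integers.

```python
def bits_needed(delta):
    """Number of bits needed to represent a non-negative delta without leading zeros."""
    if delta < 0:
        raise ValueError("delta must be non-negative")
    # int.bit_length() returns 0 for 0, and k when 2**(k-1) <= delta < 2**k.
    return delta.bit_length()


def non_decorrelated_bits(values):
    """Bits needed to losslessly represent (max - min) for one channel of a block."""
    return bits_needed(max(values) - min(values))


def signed_to_offset_unsigned(x, m=8):
    """Map an m-bit two's complement value to offset unsigned form by inverting the MSB."""
    return (x & ((1 << m) - 1)) ^ (1 << (m - 1))


red = [10, 12, 200, 37]            # example 8-bit data values for one channel
print(non_decorrelated_bits(red))  # delta = 190, so 8 bits
```

Only a subtraction, comparisons and a leading-zero count are involved, which is consistent with the integer-only operations described above.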

In step S406, the analyzer logic 304 determines a decorrelated data value for the non-reference channel by finding a difference between the data value of the non-reference channel and the data value of the reference channel for each image element value in the block. For example, where R_i, G_i and B_i are the data values for the red, green, and blue channels, respectively, of the i-th image element value of a block, and where the green channel is the reference channel, the decorrelated data value (R'_i) for the red channel of the i-th image element value of the block may be determined such that R'_i = R_i - G_i, and the decorrelated data value (B'_i) for the blue channel of the i-th image element value of the block may be determined such that B'_i = B_i - G_i.
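
The per-element subtraction of step S406 can be sketched as follows. This is an illustrative sketch only: the helper name `decorrelate` and the sample values are assumptions, and each channel of a block is modelled as a flat Python list.

```python
def decorrelate(non_ref, ref):
    """Return non_ref[i] - ref[i] for each image element value i of the block."""
    if len(non_ref) != len(ref):
        raise ValueError("channels must have the same number of data values")
    return [x - g for x, g in zip(non_ref, ref)]


red = [100, 110, 120, 130]
green = [98, 109, 117, 129]   # reference channel
blue = [40, 60, 50, 70]

red_dec = decorrelate(red, green)    # R'_i = R_i - G_i
blue_dec = decorrelate(blue, green)  # B'_i = B_i - G_i
print(red_dec)   # [2, 1, 3, 1]
print(blue_dec)  # [-58, -49, -67, -59]
```

In this made-up block the red channel tracks the green channel closely, so its decorrelated values span a much smaller range than the raw red values.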

In step S408, the analyzer logic 304 determines a number of bits n_{decorrelated} for the channel decorrelation mode, for use in losslessly representing the difference between the maximum and minimum of the decorrelated data values for the non-reference channel of the block. To this end, the analyzer logic 304 may find the maximum decorrelated data value in the non-reference channel for the block, find the minimum decorrelated data value in the non-reference channel for the block, and subtract the minimum decorrelated data value from the maximum decorrelated data value to determine a difference (Δ_{decorrelated}) between the maximum and minimum decorrelated data values.

The decorrelated data values may be positive, zero or negative. In the examples described herein, the decorrelated data values are in a signed integer format (e.g., a two's complement format), where the initial decorrelated data values each have m+1 bits, and the decorrelated data values range from -(2^m - 1) to 2^m - 1. For example, as mentioned above, m may be 8 such that the decorrelated data values range from -255 to 255. In other examples, m may be a different value. Note, however, that since a non-decorrelated data value will never need more than m bits, the decorrelation case requiring m+1 bits is superfluous because it is automatically inferior to the non-decorrelation case. Thus, this example will assume that up to m bits may be required. Still further, the example may choose to consider only those cases where up to m-1 bits are needed, since for m bits there is no memory saving relative to the worst-case non-decorrelated data values. The determined difference (Δ_{decorrelated}) between the maximum and minimum decorrelated data values for the non-reference channel is represented in an unsigned integer format, and the number of bits n_{decorrelated} for the channel decorrelation mode used to represent the determined difference losslessly is the number of bits used to represent the determined difference in an unsigned integer format without any leading zeros. Thus, n_{decorrelated} will range from 0 to n_{decorrelated,max}, where n_{decorrelated,max} may be m+1, or n_{decorrelated,max} may be m if the decorrelation case where m+1 bits are needed is never selected. Furthermore, n_{decorrelated,max} may be m-1 if a decorrelation case requiring at least m bits is never selected. If the determined difference (Δ_{decorrelated}) is 0, then n_{decorrelated} = 0.
If the determined difference is greater than zero and lies in the range 2^{k-1} ≤ Δ_{decorrelated} < 2^k, the analyzer logic 304 determines that n_{decorrelated} = k, where k is an integer such that 1 ≤ k ≤ n_{decorrelated,max}.
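
To make the comparison between the two bit counts concrete, the following sketch computes both for one small block. The names `range_bits` and `decorrelated_bits` are hypothetical, and returning `None` when the cap n_{decorrelated,max} would be exceeded is an assumption made here to mark a case that would never be selected; it is not a behaviour stated in this description.

```python
def range_bits(values):
    """Bits needed to represent (max - min) of the values without leading zeros."""
    return (max(values) - min(values)).bit_length()


def decorrelated_bits(non_ref, ref, n_max):
    """Return n_decorrelated, or None if it would exceed the cap n_max (assumption)."""
    decorrelated = [x - g for x, g in zip(non_ref, ref)]
    n = range_bits(decorrelated)
    return n if n <= n_max else None


m = 8
red = [100, 110, 120, 130]
green = [98, 109, 117, 129]   # reference channel

print(range_bits(red))                           # delta = 30, so 5 bits
print(decorrelated_bits(red, green, m))          # delta = 2, so 2 bits
print(decorrelated_bits([0, 255], [255, 0], m))  # delta = 510 needs 9 bits: None
```

The last call shows the worst case in which decorrelation widens the range to m+1 bits, which is the case the text describes as automatically inferior.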

In step S410, the analyzer logic 304 compares the determined number of bits n_{non-decorrelated} for the non-channel decorrelation mode with the determined number of bits n_{decorrelated} for the channel decorrelation mode. Specifically, the analyzer logic 304 may determine whether n_{decorrelated} < n_{non-decorrelated}. If the analyzer logic 304 finds that n_{decorrelated} < n_{non-decorrelated}, this indicates that the spread (or "range") of the decorrelated data values is smaller than the spread (or "range") of the (non-decorrelated) data values in the block for the channel in question, and thus the decorrelated data values may be compressed to a greater extent than the (non-decorrelated) data values in the block for the channel in question. For example, if n_{decorrelated} < n_{non-decorrelated}, as described in more detail below, the difference values representing the differences between the decorrelated data values in the block for the channel may be represented (losslessly) using fewer bits than the difference values representing the differences between the (non-decorrelated) data values in the block for the channel.

In step S412, the analyzer logic 304 selects either the channel decorrelation mode or the non-channel decorrelation mode based on the result of the comparison. For example, if the determined number of bits n_{non-decorrelated} for the non-channel decorrelation mode is greater than the determined number of bits n_{decorrelated} for the channel decorrelation mode, the channel decorrelation mode may be selected; whereas the non-channel decorrelation mode may be selected if n_{non-decorrelated} is smaller than n_{decorrelated}. In some examples, the non-channel decorrelation mode may be selected if n_{non-decorrelated} is equal to n_{decorrelated}. This may be considered beneficial because any error in the data value of the reference channel will propagate into multiple channels in the channel decorrelation mode, which is not the case in the non-channel decorrelation mode. However, in some alternative examples, if n_{non-decorrelated} is equal to n_{decorrelated}, the channel decorrelation mode may be selected.

The minimum value (d_min) of the (non-decorrelated) data values for the non-reference channel in question may be any value in the range from 0 to 2^m - 1 (e.g., in the range from 0 to 255, where m=8), and may thus be represented in m bits. The minimum value (d'_min) of the decorrelated data values for the non-reference channel in question may be any value in the range from -(2^m - 1) to 2^m - 1 (e.g., in the range from -255 to 255, where m=8), and thus may be represented by (m+1) bits. In the examples described below, a minimum value (hereinafter referred to as an "origin value") is stored in the compressed data using m bits. Thus, in some examples, if the minimum value (d'_min) of the decorrelated data values for the non-reference channel in question cannot be represented by m bits only, the channel decorrelation mode is not selected, i.e., if d'_min is outside the range from -2^{m-1} to 2^{m-1} - 1 (e.g., outside the range from -128 to 127, where m=8), the channel decorrelation mode is not selected. Stated another way, the selection of the channel decorrelation mode or the non-channel decorrelation mode in step S412 may depend on whether the minimum value (d'_min) of the decorrelated data values for the non-reference channel is in the range from -⌈x_max/2⌉ to ⌊x_max/2⌋ inclusive, where x_max is the maximum value that the (non-decorrelated) data values can have, i.e., x_max = 2^m - 1. Specifically, if: (i) the determined number of bits n_{non-decorrelated} for the non-channel decorrelation mode is less than the determined number of bits n_{decorrelated} for the channel decorrelation mode, or (ii) the minimum value (d'_min) of the decorrelated data values for the non-reference channel is not in the range from -⌈x_max/2⌉ to ⌊x_max/2⌋ inclusive, then the non-channel decorrelation mode may be selected; and if: (i) the determined number of bits n_{non-decorrelated} for the non-channel decorrelation mode is greater than the determined number of bits n_{decorrelated} for the channel decorrelation mode, and (ii) the minimum value (d'_min) of the decorrelated data values for the non-reference channel is in the range from -⌈x_max/2⌉ to ⌊x_max/2⌋ inclusive, then the channel decorrelation mode may be selected.
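
The selection rule of step S412, including the check that the decorrelated origin fits in m bits, can be sketched as below. The function name `select_mode` and the string labels are hypothetical. Resolving ties in favour of the non-channel decorrelation mode is one of the two options mentioned in the description and is chosen here purely as an assumption.

```python
def select_mode(n_non_dec, n_dec, d_min_dec, m=8):
    """Return "decorrelated" or "non-decorrelated" for one non-reference channel."""
    lo = -(1 << (m - 1))        # -2^(m-1), e.g. -128 for m = 8
    hi = (1 << (m - 1)) - 1     # 2^(m-1) - 1, e.g. 127 for m = 8
    origin_fits = lo <= d_min_dec <= hi
    if n_dec < n_non_dec and origin_fits:
        return "decorrelated"
    # Assumption: ties fall back to the non-channel decorrelation mode.
    return "non-decorrelated"


print(select_mode(5, 2, 1))     # fewer bits and origin fits: decorrelated
print(select_mode(5, 2, -200))  # origin needs m+1 bits: non-decorrelated
print(select_mode(4, 4, 0))     # tie: non-decorrelated (assumed policy)
```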

If the channel decorrelation mode is selected in step S412, the data values of the non-reference channel for the block are replaced with the decorrelated data values of the non-reference channel for the block to determine compressed channel data for the non-reference channel (in step S414). For example, in the case where the green channel is the reference channel and the channel decorrelation mode is selected for the red channel in step S412, then to determine compressed channel data for the red channel, the data value for the red channel (R) is replaced with a decorrelated data value R' (e.g., where R' = R - G), as described in more detail below. As another example, in the case where the green channel is the reference channel and the channel decorrelation mode is selected for the blue channel in step S412, then to determine compressed channel data for the blue channel, the data value for the blue channel (B) is replaced with a decorrelated data value B' (e.g., where B' = B - G), as described in more detail below. As another example in which there is also an alpha channel, in the case where the green channel is the reference channel and the channel decorrelation mode is selected for the alpha channel in step S412, then to determine compressed channel data for the alpha channel, the data value for the alpha channel (A) is replaced by a decorrelated data value A' (e.g., where A' = A - G), as described in more detail below. In the case where there are multiple non-reference channels, different compression modes may be selected for different non-reference channels, e.g., a channel decorrelation mode may be selected for a first non-reference channel of the multiple non-reference channels and a non-channel decorrelation mode may be selected for a second non-reference channel of the multiple non-reference channels.

In step S414, the compression logic 305 determines compressed channel data for each of the channels of the block. In different examples, the compression scheme used by compression logic 305 to compress the data values for each of the channels of the block may be different. One such example of using the origin value + delta value scheme is described in detail below with reference to fig. 5, 6a and 6b.

In step S416, the compression logic 305 forms compressed data including: (i) An indication of the selected mode for each of the one or more non-reference channels, and (ii) the determined compressed channel data for each of the channels.

The compressed data is output from the compression unit 302 and may be transferred to the memory 104, for example, via the memory interface 108. The compressed data may then be stored in the memory 104. Alternatively, the compressed data may be stored in a memory other than the memory 104, e.g., the compressed data may be stored in a local memory on the GPU 102.

Fig. 5 is a flow chart showing how the compression logic 305 may implement steps S414 and S416 in an example. In this example, the image data block may be compressed to meet the target compression level. As described above, the image data block comprises a plurality of image element values, each comprising a plurality of data values relating to a respective plurality of channels, wherein if a channel decorrelation mode has been selected for a particular non-reference channel, the data values for that non-reference channel are replaced with decorrelated data values in steps S414 and S416 (which may be referred to simply as "data values" in the following description of steps S414 and S416). The data values within the block for a channel may be represented using an origin value that is common to all data values within the block for the channel and a difference value that is specific to each data value. The difference values for the data values within the block for a channel may each be represented by a first number of bits. This first number of bits is chosen such that it can be used, without any leading zeros, to losslessly represent the maximum difference value for the channel of the block. An indication of the selected compression mode for each of the one or more non-reference channels of the block, and an indication of the origin value and the first number of bits for each of the channels of the block, are included in a header portion of the compressed data block. The header portion of the compressed data block has a fixed size. The representations of the difference values are included in a body portion of the compressed data block. The body portion of the compressed data block does not have a fixed size (i.e., the size of the body portion depends on the actual data being compressed), but it has a maximum size that will not be exceeded by the representations of the difference values stored in it. The maximum size of the body portion may depend on the target compression level.

If lossless compression is implemented, then for each of the channels, the representations of the difference values for that channel each have the first number of bits determined for that channel. However, in the example of implementing lossy compression described in detail below, for each of the channels, a second number of bits is determined based on the first numbers of bits for all of the channels. For each of the channels, each of the representations of the difference values for the channel is stored in the compressed data block using the second number of bits for the channel. If using the first numbers of bits to represent the difference values in the compressed data block would meet the target compression level for compressing the image data block, the first numbers of bits are used, i.e., for each of the channels, the second number of bits is determined to be equal to the first number of bits for the channel. However, if using the first numbers of bits to represent the difference values in the compressed data block would not meet the target compression level for compressing the image data block, the second number of bits for at least one of the channels is determined to be less than the first number of bits for that channel, thereby ensuring that storing the representations of the difference values in the compressed data block using the second numbers of bits meets the target compression level.

In step S504, for each of the channels, the analyzer logic 304 determines an origin value for the channel of the image data block. For example, the origin value of a channel may be determined by identifying the minimum of the data values related to the channel of the block.

In step S506, for each of the channels, the analyzer logic 304 determines a maximum difference value for the block. A difference value (which may also be referred to as a "delta value") for a channel of a block represents the difference between a data value and the origin value determined for the channel of the block. The maximum difference value for the channel for the block may be determined by identifying a minimum data value and a maximum data value associated with the channel for the block and then subtracting the identified minimum data value from the identified maximum data value.
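
Steps S504 and S506 can be illustrated per channel as follows. This is a sketch under assumptions: the block is modelled as a list of (R, G, B) tuples, the origin value is taken to be the channel minimum (as in the example given above), and the helper name `channel_stats` is hypothetical.

```python
def channel_stats(block):
    """Return, per channel, (origin value, maximum difference value) for a block."""
    channels = zip(*block)   # transpose pixels into per-channel tuples
    return [(min(c), max(c) - min(c)) for c in channels]


block = [(100, 98, 40), (110, 109, 60), (120, 117, 50), (130, 129, 70)]  # (R, G, B)
print(channel_stats(block))  # [(100, 30), (98, 31), (40, 30)]
```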

In step S508, the analyzer logic 304 determines a first number of bits for each of the channels. The first number of bits determined for a channel is a number; in particular, it is the number of bits that can be used to represent the maximum difference value for the channel without loss. In the examples described herein, the first number of bits determined for a channel is the smallest number of bits that can be used to losslessly represent the maximum difference value determined for the channel. The first number of bits for each of the channels is determined by determining how many bits are needed, without any leading zeros, to represent the maximum difference value obtained for that channel of the block, wherein the determined number of bits is the first number of bits. Note that the first number of bits for a channel of a block applies to all of the data values relating to that channel in the block; the channel does not have different first numbers of bits for different data values in the block. In some examples, step S508 may be performed for the reference channel but not for the non-reference channels, because for each of the non-reference channels the first number of bits would be equal to n_{non-decorrelated} determined in step S404 or to n_{decorrelated} determined in step S408, and thus it may not be necessary to determine the first number of bits again for the non-reference channels. Specifically, if the non-channel decorrelation mode is selected in step S412, the first number of bits for the non-reference channel will be equal to n_{non-decorrelated}, and if the channel decorrelation mode is selected in step S412, the first number of bits for the non-reference channel will be equal to n_{decorrelated}.

The first number of bits determined for each of the one or more channels is passed to difference size determination logic 308. In step S510, the difference size determination logic 308 uses (e.g., analyzes) the determined first number of bits for each of the channels to determine a corresponding second number of bits for each of the channels. The second numbers of bits are determined such that (e.g., to ensure that) representing each of the difference values for the channels with the corresponding second number of bits satisfies the target compression level for compressing the block of image data. The second number of bits determined for a channel is a number. In examples described herein, each of the second numbers of bits for the respective channels of the block is determined from all of the first numbers of bits for the channels of the block. Note that the second number of bits for a channel of a block applies to all of the data values relating to that channel in the block; the channel does not have different second numbers of bits for different data values in the block.

Step S510 may include the difference size determination logic 308 determining whether representing the difference values for the channels with the respective determined first numbers of bits would meet the target compression level. If it is determined that it would, then for each of the channels the second number of bits is equal to the first number of bits for that channel. In other words, if the lossless compression of the image element values in the block meets the target compression level, then the image element values are losslessly compressed. However, if it is determined that representing the difference values for the channels with the respective determined first numbers of bits would not meet the target compression level, then for at least one of the channels the second number of bits is less than the first number of bits for that channel. In other words, if lossless compression of the image element values in the block fails to meet the target compression level, the data values for at least one of the channels are compressed in a lossy manner, thereby ensuring that the target compression level is met. For example, the amount of loss introduced into the data values of the compressed image element values may be no greater than that required to meet the target compression level.
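
The decision in step S510 can be sketched with a stand-in reduction rule. The predetermined scheme actually used is described elsewhere with reference to the later figures and is not reproduced here; the rule below (repeatedly trimming one bit from the channel with the largest bit count) is a hypothetical substitute chosen only to show the lossless-if-it-fits behaviour. The function name `second_bits`, the 48-bit header size and the resulting 720-bit body budget are all assumptions.

```python
def second_bits(first_bits, num_values, body_budget_bits):
    """Hypothetical scheme: trim the largest bit count until the body fits the budget."""
    if body_budget_bits < 0:
        raise ValueError("budget must be non-negative")
    second = list(first_bits)
    while sum(second) * num_values > body_budget_bits:
        widest = max(range(len(second)), key=lambda i: second[i])
        second[widest] -= 1   # lossy: this channel loses one bit of precision
    return second


# Assumed figures: 8 x 8 block, three 8-bit channels (1536 bits raw), 2:1 target
# (768 bits), and a hypothetical 48-bit header, leaving 720 bits for the body.
print(second_bits([3, 4, 2], 64, 720))  # fits losslessly: [3, 4, 2]
print(second_bits([5, 6, 4], 64, 720))  # trimmed to fit: [3, 4, 4]
```

Because the rule is deterministic and depends only on the first numbers of bits and the budget, a decompressor applying the same rule to the same inputs would arrive at the same second numbers of bits.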

In examples described herein, the determined first number of bits for a channel is used to determine a second number of bits for a channel according to a predetermined scheme. Determining the second number of bits from the first number of bits according to a predetermined scheme means determining the second number of bits in a deterministic manner. In these examples, compression unit 302 and decompression unit 702 use the same predetermined scheme (as described in more detail below with reference to fig. 7, 8, and 9). Since the same predetermined scheme of determining the second number of bits using the first number of bits is used in the compression unit 302 and the decompression unit 702, the indication of the second number of bits need not be included in the compressed data. Note that the compression unit 302 and the decompression unit 702 both use the same target compression rate. As described below, an indication of the first number of bits is included in the compressed data, so the decompression unit 702 may determine the same second number of bits as determined in the compression unit 302 using the first number of bits and a predetermined scheme. The predetermined scheme may be referred to as a predetermined technique or a predetermined algorithm.

In step S512, for each of the channels, the difference determination logic 306 determines a difference value representing a difference between the data value and the determined origin value for the channel of the block. For example, if the origin value is the smallest of the data values of the channels in the block, the difference value may be determined by subtracting the origin value from the data values of the channels in the block.

In the example shown in fig. 5, the difference value is determined after the first number of bits and the second number of bits have been determined. In some other examples, the difference may be determined before the first number of bits and the second number of bits are determined. Specifically, in these other examples, the determined difference may be used to determine the first number of bits, for example, by identifying a maximum difference.
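As an illustrative, non-limiting sketch of the variant in which the differences are determined first, the following Python fragment determines the origin value, the difference values and the first number of bits for one channel of a block. The function name `analyse_channel` and the sample data values are hypothetical and are not taken from the figures; the origin value is assumed to be the minimum data value.

```python
def analyse_channel(values):
    """Return (origin, differences, first_bits) for one channel of a block.

    Assumes the origin value is the smallest data value of the channel.
    """
    origin = min(values)                    # origin value for the channel
    diffs = [v - origin for v in values]    # non-negative difference values
    first_bits = max(diffs).bit_length()    # bits needed for the maximum difference
    return origin, diffs, first_bits


origin, diffs, first_bits = analyse_channel([20, 37, 54, 41])
print(origin, diffs, first_bits)  # 20 [0, 17, 34, 21] 6
```

If all the data values of the channel are equal, the maximum difference is zero and the sketch yields a first number of bits of zero.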

The compressed data forming logic 310 receives the origin value, the first number of bits, the second number of bits, and the difference value for each of the channels. The compressed data forming logic 310 also receives an indication of the selected compression mode (i.e., an indication of a channel decorrelation mode or a non-channel decorrelation mode) for each of the one or more non-reference channels. In step S514, the compressed data forming logic 310 forms compressed data, for example, as compressed data blocks, wherein the compressed data includes:

- an indication of the selected compression mode for each of the non-reference channels;

- an indication of the determined origin value for each of the channels;

- an indication of the determined first number of bits for each of the channels; and

- for each of the channels, a representation of the determined differences, wherein each of the representations of the determined differences for the channel uses the determined second number of bits for the channel such that the target compression level is met.

The example of step S514 described above implements a lossy compression scheme in which the representation of the determined difference value for each of the channels has a second number of bits determined for that channel. However, in other examples where a lossless compression scheme is implemented, the representation of the determined difference value for each of the channels may have a first number of bits determined for that channel (and step S510 may not be performed in these lossless examples).

The target compression level corresponds to a target compressed block size. For image data blocks having a fixed size, the target compression level for compressing the image data blocks means that the compressed data blocks do not exceed the target compressed block size. If the compression of the image data block "meets the target compression level", this means that the resulting compressed data block does not exceed the target compressed block size. For example, if lossless compression of a block of image data already yields a result that is smaller than the target compressed block size, the compressed data block may simply be smaller than the target compressed block size. In this sense, the target compressed block size represents the maximum size of the compressed block. Compression unit 302 may have a set of fixed target compression levels. For example, a set of predetermined target compression levels may include:

(i) A lossless compression level, in which all image element values are compressed losslessly, even though this does not result in any reduction in data size (this may be considered to be equivalent to a target compression ratio of 1:1, which may refer to a target compressed block size being 100% of the uncompressed image data block size), such that at the lossless compression level, the space reserved in memory (which may be referred to as the 'memory footprint') remains unchanged, but the amount of data transferred to and from memory (which may be referred to as the 'memory bandwidth') may be reduced;

(ii) A 75% compression level, wherein the target compressed block size is 75% of the uncompressed image data block size. This corresponds to a compression ratio of 4:3;

(iii) A 50% compression level, wherein the target compressed block size is 50% of the uncompressed image data block size. This corresponds to a compression ratio of 2:1;

(iv) A 37.5% compression level, wherein the target compressed block size is 37.5% of the uncompressed image data block size. This corresponds to a compression ratio of 8:3; and

(v) A 25% compression level, wherein the target compressed block size is 25% of the uncompressed image data block size. This corresponds to a compression ratio of 4:1.

In the examples described herein, the target compression level refers to the amount of data used to store the differences. The size of the data used to store the indication of the origin value and the indication of the first number of bits is fixed, does not depend on the target compression level used by the compression unit 302, and is not included in determining the compression ratios "1:1", "4:3", "2:1", "8:3", and "4:1" in the examples listed above.
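By way of a worked illustration only, the bit budget available for the differences at each of the listed target compression levels can be computed as follows. The dictionary `LEVELS` and the function `difference_budget_bits` are hypothetical names introduced for this sketch; the block dimensions in the usage lines (64 image element values, four 8-bit channels) are one possible configuration.

```python
from fractions import Fraction

# Target compressed block size as a fraction of the uncompressed block size.
LEVELS = {
    "lossless": Fraction(1),
    "75%": Fraction(3, 4),
    "50%": Fraction(1, 2),
    "37.5%": Fraction(3, 8),
    "25%": Fraction(1, 4),
}


def difference_budget_bits(num_values, channels, bits_per_value, level):
    """Maximum number of bits available for the differences of one block.

    Header data (origin values, first numbers of bits) is not counted here.
    """
    uncompressed = num_values * channels * bits_per_value
    return int(uncompressed * LEVELS[level])


print(difference_budget_bits(64, 4, 8, "50%"))    # 1024
print(difference_budget_bits(64, 4, 8, "37.5%"))  # 768
```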

If the compression unit is implemented in dedicated hardware (e.g., as a fixed function circuit), the compression unit 302 may be configured in hardware to be able to perform compression according to any one of a set of target compression levels. A selection of a target compression level may be made from a set of target compression levels for use by compression unit 302 in compressing a block of image data. For example, the target compression level implemented by compression unit 302 may be configured prior to runtime, such as by firmware instructions executed at GPU initialization. In this way, the compression unit 302 will compress all image data blocks according to the same target compression level unless the configuration of the compression unit 302 is subsequently altered. Alternatively, when a block of image data is provided to the compression unit 302 to be compressed, an indication may be provided to the compression unit 302 along with the block of image data to indicate which target compression level is used to compress the block of image data. This will allow the compression unit 302 to compress different image data blocks according to different target compression levels without being reconfigured, but will add a little additional complexity to the system since an indication of the target compression level will be sent to the compression unit 302 together with the image data blocks. During decompression, a driver for the GPU sends an indication of the target compression level to the decompression unit. This may be relatively simple if the target compression level for the different image data blocks is unchanged at run-time. 
However, if the target compression levels for different image data blocks may change at run-time, the driver may keep track of the target compression levels for the different compressed data blocks and indicate these to the decompression unit so that the decompression unit may correctly decompress the compressed data blocks according to its target compression levels. In some other examples, the indication of the target compression level for the block may be included in a header portion of the compressed data block such that upon decompressing the compressed data block, the decompression unit may read the indication of the target compression level from the header portion of the compressed data block to determine the target compression level for the block instead of the driver tracking the target compression level.

In a different example, in step S510, the second number of bits may be differently determined based on the first number of bits. For example, to determine the second number of bits, the first number of bits may be reduced by zero, one or more, such that each of the differences for the channel is represented with the respective second number of bits by removing zero, one or more Least Significant Bits (LSBs) from the representation of the difference having the determined first number of bits, so that the target compression level for compressing the block of image data is met. The amount by which the first number of bits is reduced to determine the second number of bits may be defined by a predetermined scheme. If the first number of bits of the channel is reduced by zero, this means that the second number of bits of the channel will be equal to the first number of bits of the channel. Similarly, if zero LSBs are removed from the representation of the difference value having the first number of bits, the representation of the difference value is not changed. The LSBs are removed from the representation of the difference (rather than the other bits) because these are the least significant bits in terms of the value representing the difference, so that removing the LSBs will lose less information than removing the other bits from the difference. In this way, although some information loss is introduced into the compression process by removing the LSBs from the difference, the information loss is small and, for example, only to the extent that ensures that the target compression level is met.
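As a minimal sketch of the removal of LSBs described above, the following Python fragment maps a difference from its first number of bits to its second number of bits with a right shift. The function name `drop_lsbs` is hypothetical.

```python
def drop_lsbs(diff, first_bits, second_bits):
    """Remove (first_bits - second_bits) least significant bits from diff."""
    return diff >> (first_bits - second_bits)


print(bin(drop_lsbs(0b100110, 6, 4)))  # 0b1001
print(drop_lsbs(38, 6, 6))             # 38 (zero LSBs removed, unchanged)
```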

The previous paragraph gives a method (which may be referred to as truncation) for mapping the representation of the difference from a first number of bits (N) to a second number of bits (M), where M < N, but other methods may be used in other examples. In some of these other methods, the average error may be smaller when decompressing the representation of the difference from M bits back to N bits. For example, in one other approach, there may be a pre-computed look-up table that includes a mapping from an N-bit pattern of bits to an M-bit pattern of bits, which will reduce errors when the M-bit representation is decompressed back into an N-bit representation. For example, the mappings in the look-up table may be determined by performing an exhaustive search for the "best" M-bit pattern for each N-bit pattern for a given decompression scheme (e.g., bit replication) and storing these mappings in the look-up table. Here, the "best" M-bit pattern for an N-bit pattern is the M-bit pattern that has minimal error relative to the original N-bit pattern when decompressed back to N bits (according to the decompression scheme used in the system). In this 'look-up table' method, the first number of bits is reduced by zero, one or more to determine the second number of bits, and each of the differences for the channels is represented with the respective second number of bits by mapping a representation of the difference having the first number of bits to a representation of the difference having the second number of bits according to a predetermined look-up table, such that the target compression level for compressing the image data block is satisfied.
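The exhaustive search mentioned above may be sketched as follows, assuming (for illustration only) that the decompression scheme is bit replication. The function names `expand_bit_replication` and `build_lut` are hypothetical, and ties between equally good M-bit patterns are resolved here in favour of the smallest pattern, which is an assumption of this sketch.

```python
def expand_bit_replication(code, m_bits, n_bits):
    """Expand an m_bits code to n_bits by repeating its bit pattern."""
    if m_bits == 0:
        return 0
    out, filled = 0, 0
    while filled < n_bits:
        out = (out << m_bits) | code
        filled += m_bits
    return out >> (filled - n_bits)  # discard surplus low-order bits


def build_lut(n_bits, m_bits):
    """For every N-bit pattern, find the M-bit pattern with minimal error."""
    codes = range(1 << m_bits)
    return [
        min(codes, key=lambda c: abs(expand_bit_replication(c, m_bits, n_bits) - p))
        for p in range(1 << n_bits)
    ]


lut = build_lut(4, 2)
# Truncation maps 3 (0011) to 0, which expands to 0 (error 3);
# the table maps 3 to 1, which expands to 0101 = 5 (error 2).
print(lut[3])  # 1
```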

As another method, for each of the differences, a truncation method may be used to map the representation of the difference from a first number of bits (N) to a second number of bits (M) to determine a first candidate map. The second candidate map is determined to be 1 greater than the first candidate map. The third candidate map is determined to be 1 smaller than the first candidate map. Each of the three candidate maps may be re-expanded into an N-bit representation and the three re-expanded representations compared to the original N-bit representation of the difference. A candidate map that produces a re-expanded representation closest to the original N-bit representation of the difference value is selected for use as the representation of the difference value having the second number of bits (M).
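The three-candidate method may be sketched as below. This is an illustrative sketch only: re-expansion by bit replication is assumed, candidates falling outside the range representable with M bits are discarded (an assumption not stated above), and the names `replicate_bits` and `best_of_three` are hypothetical.

```python
def replicate_bits(code, m_bits, n_bits):
    """Re-expand an m_bits code (m_bits >= 1) to n_bits by bit replication."""
    out, filled = 0, 0
    while filled < n_bits:
        out = (out << m_bits) | code
        filled += m_bits
    return out >> (filled - n_bits)


def best_of_three(diff, n_bits, m_bits):
    """Pick the truncated code, or its neighbour +/-1, closest after re-expansion."""
    first = diff >> (n_bits - m_bits)                 # first candidate: truncation
    candidates = [c for c in (first, first + 1, first - 1)
                  if 0 <= c < (1 << m_bits)]          # keep representable codes only
    return min(candidates,
               key=lambda c: abs(replicate_bits(c, m_bits, n_bits) - diff))


print(best_of_three(3, 4, 2))  # 1 (truncation alone would give 0)
```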

The predetermined scheme may be directed to uniformly reducing each of the channels, for example by the same or similar amounts. For example, in the above truncation method, the least significant bits may be sequentially discarded from the respective channels according to a predetermined scheme until the target compression level is satisfied. In other words, one bit is removed from each of the channels in turn until the total number of bits meets the target compression level. The order in which LSBs are discarded from different channels is defined by a predetermined scheme. The reference channel may be prioritized over the non-reference channels, for example by setting the order in which the LSBs are discarded from the different channels, such that the reference channel is the last channel to discard the LSBs.
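One possible form of such a scheme is sketched below in Python. It is a non-authoritative illustration: the function name `second_bit_counts`, the channel ordering, and the sample bit counts are assumptions made for this sketch, and an actual predetermined scheme may differ. Because the function is deterministic, a compression unit and a decompression unit running the same function on the same first numbers of bits and the same target would arrive at the same second numbers of bits.

```python
def second_bit_counts(first_bits, num_values, budget_bits, drop_order):
    """Drop one bit per channel in turn until the differences fit the budget.

    first_bits  : first number of bits per channel
    num_values  : number of image element values in the block
    budget_bits : maximum total number of bits for the differences
    drop_order  : channel indices in the order in which LSBs are discarded
    """
    bits = list(first_bits)
    i = 0
    while sum(bits) * num_values > budget_bits:
        ch = drop_order[i % len(drop_order)]
        if bits[ch] > 0:          # a channel with zero bits has nothing to drop
            bits[ch] -= 1
        i += 1
    return bits


# Channels R, G, B, A with G (index 1) assumed to be the reference channel,
# so G is placed last in the drop order.
print(second_bit_counts([5, 6, 4, 3], 64, 1024, [0, 2, 3, 1]))  # [4, 6, 3, 3]
```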

In the example described above, in step S406, the analyzer logic 304 obtains the maximum difference value for the channel for the block by identifying a minimum data value and a maximum data value related to the channel for the block, and subtracting the identified minimum data value from the identified maximum data value. However, in other examples, the maximum difference value for a channel may be determined for a block by determining all the difference values for the channel for the block and then identifying the maximum of those determined difference values. In these other examples, the analyzer logic may obtain the maximum difference for the channels of the block by receiving the differences of the channels determined for the block and determining which of the determined differences is the largest.

Note that if all the data values of a channel in a block have the same value, then the difference value for that channel is not stored in the compressed data block. In this case, the first number of bits of the channel and the second number of bits of the channel are both zero, and the origin value of the channel stored in the header indicates the single value for the channel that each data value has in the block.

As described above, in some examples, the origin value of a channel for a block is determined by identifying the minimum of the data values related to the channel for the block, and the maximum difference value of the difference values of the channel for the block is determined by identifying the maximum of the data values related to the channel for the block and subtracting the identified minimum from the identified maximum. In these examples, the difference value for the channel is determined by subtracting the origin value of the channel from the data value associated with the channel. In these examples, the origin value of a channel represents the lower limit (i.e., minimum) of the data values associated with the channel for the block, and the difference value of the channel represents an addition to be made to the origin value of the channel when the compressed data block is decompressed.

However, in other examples, the origin value of the channel for the block is determined by identifying the maximum of the data values related to the channel for the block, and the maximum difference value of the difference values of the channel for the block is determined by identifying the minimum of the data values related to the channel for the block and subtracting the identified minimum from the identified maximum. In these examples, the difference value for the channel is determined by subtracting the data value related to the channel from the origin value of the channel. In these examples, the origin value of a channel represents the upper limit (i.e., maximum) of the data values associated with the channel for the block, and the difference value of the channel represents a subtraction to be made from the origin value of the channel when decompressing the compressed data block.

In some examples, the origin value of a channel may be set within a range of data values for the channel of the block (e.g., a value intermediate the range of data values for the channel of the block, e.g., a value halfway between the maximum data value and the minimum data value associated with the channel of the block), where the difference value may be a signed value.

In some examples, it may be beneficial to clamp the incoming data values to a slightly reduced range so that fewer bits may be used for the representation of the differences. For example, if the minimum data value for the channel of a block is 20 and the maximum data value for the channel of a block is 54, the maximum difference for the channel of the block is 34. Since 34 is just above 32 (note that 2⁵ = 32), it is too large to be represented by 5 bits, and thus 6 bits are used (i.e., the first number of bits for the channel of the block is 6). However, almost half of the possible encodings (i.e., from 35 to 63) are unused. If the incoming data values are clamped to a reduced range, e.g. 21 to 52 (i.e. so the minimum data value is 21 and the maximum data value is 52), the maximum difference for the channel of the block will be 31, which can be represented by 5 bits (i.e. the first number of bits for the channel of the block is 5), so the compression ratio can be increased and the additional error caused by the clamping is very small.
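The arithmetic of the clamping example can be checked with the following short sketch. The function name `clamp_block` and the two intermediate data values (33 and 41) are hypothetical; only the minimum (20), the maximum (54) and the clamping range (21 to 52) come from the example above.

```python
def clamp_block(values, lo, hi):
    """Clamp every data value of a channel to the range [lo, hi]."""
    return [min(max(v, lo), hi) for v in values]


values = [20, 33, 54, 41]
bits_before = (max(values) - min(values)).bit_length()    # max difference 34
clamped = clamp_block(values, 21, 52)
bits_after = (max(clamped) - min(clamped)).bit_length()   # max difference 31
worst_error = max(abs(a - b) for a, b in zip(values, clamped))

print(bits_before, bits_after, worst_error)  # 6 5 2
```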

In some examples, the origin value and the difference values for each of the channels of the block may be determined using a modulo operation. In these examples, the origin value for each of the channels may be determined as the one of the data values of the block associated with the channel which yields the smallest maximum difference value when the difference values are determined relative to the origin value according to the modulus of the modulo operation. For example, the difference values may be determined modulo 2^m, where m is the number of bits in each of the data values. For example, if the data values are 8-bit values, the difference values may be determined modulo 256. For example, if the data values related to the channel for the block are 8-bit values representing decimal values of 251, 255, 7, and 16 (i.e., in binary, the data values are 11111011, 11111111, 00000111, and 00010000), the origin value may be determined to be 251 (when the values are interpreted modulo 256, this value may be considered to represent a value of -5). Taking 251 as the origin value of the four data values, the differences may be determined as 0, 4, 12 and 21, such that the first number of bits determined for these differences is five, i.e. the maximum difference (21) may be represented with five bits without loss (10101). Note that if the modulo operation is not used to compress a block of these four data values, then the differences would each need to have eight bits, e.g., if the origin value is determined to be the smallest value in the block (i.e., 7), then the largest difference would be 248, requiring eight bits to be represented losslessly (as 11111000).
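The modulo-based selection of the origin value can be sketched as follows, using the four data values of the example above. The function name `modular_origin` is hypothetical, and the search simply tries every data value of the block as a candidate origin.

```python
def modular_origin(values, m_bits):
    """Choose the origin giving the smallest maximum difference modulo 2**m_bits."""
    mod = 1 << m_bits
    origin = min(values, key=lambda o: max((v - o) % mod for v in values))
    diffs = [(v - origin) % mod for v in values]
    return origin, diffs, max(diffs).bit_length()


origin, diffs, first_bits = modular_origin([251, 255, 7, 16], 8)
print(origin, diffs, first_bits)  # 251 [0, 4, 12, 21] 5

# Decompression adds each difference back to the origin, again modulo 256.
restored = [(origin + d) % 256 for d in diffs]
print(restored)  # [251, 255, 7, 16]
```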

In the above example, a single origin value is determined for each of the channels for the block. In some other examples, a plurality of origin values may be determined for at least one of the channels. In these other examples, each of the differences for at least one of the channels for which there are multiple origin values is determined relative to one of the multiple origin values, and, for each of the differences, an indication is included in the compressed block to indicate from which of the multiple origin values the difference was determined. This may be useful if the data values related to the channel within the block form a plurality of groups of data values, wherein each group of data values has a small range. For example, if the image element values within a block of image data relate to a plurality of different objects in a scene, this may occur relatively frequently. For example, if the data values related to the channel for the block are 3, 5, 3, 4, 6, 132, 132, 133, then the first origin value may be determined to be 3 and the second origin value may be determined to be 132, and then the differences for the first five data values may be determined to be 0, 2, 0, 1, and 3 with respect to the first origin value and the differences for the last three data values may be determined to be 0, 0, and 1 with respect to the second origin value. In this case, the first number of bits determined for these differences is two, i.e. the maximum difference (3) can be represented losslessly in two bits (11). Note that if this block of data values is compressed using a single origin value, the differences each need to have eight bits, e.g., if the origin value is determined to be the minimum (i.e., 3), then the maximum difference would be 130, requiring eight bits to be represented losslessly (10000010).
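A minimal sketch of encoding with a plurality of origin values is given below. The function names and the sample data values are illustrative assumptions; in particular, each data value is assumed to be encoded relative to the largest origin value that does not exceed it, and how the origin values themselves are chosen is outside the scope of the sketch.

```python
def encode_multi_origin(values, origins):
    """Encode each value as (origin index, non-negative difference).

    Assumes every value is >= at least one of the origin values.
    """
    encoded = []
    for v in values:
        idx = min((i for i, o in enumerate(origins) if o <= v),
                  key=lambda i: v - origins[i])   # nearest origin not above v
        encoded.append((idx, v - origins[idx]))
    return encoded


def decode_multi_origin(encoded, origins):
    """Add each difference back to the origin value it was determined from."""
    return [origins[idx] + diff for idx, diff in encoded]


values = [3, 5, 3, 4, 6, 132, 132, 133]   # illustrative data values
origins = [3, 132]
encoded = encode_multi_origin(values, origins)
first_bits = max(d for _, d in encoded).bit_length()
print(first_bits)  # 2
```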

As mentioned above, in the examples described herein, the compressed data is stored as compressed data blocks, e.g., the header and the difference may be stored in the same consecutive compressed data blocks. More generally, however, it should be understood that compressed data need not be stored as blocks of data, e.g., as contiguous blocks of data. For example, the header and the difference may be stored separately, e.g., in different sections of memory.

Fig. 6a and 6b show example formats of compressed data blocks. In particular, fig. 6a shows an example format 600 for storing four compressed data blocks (denoted block 0, block 1, block 2, and block 3 in fig. 6a). Each of the compressed data blocks includes a header portion (stored in section 602) having a fixed size and comprising: an indication of a compression mode for each of the non-reference channels, an indication of the determined origin value for each of the channels, and an indication of the determined first number of bits for each of the channels of the block. Each of the compressed data blocks also includes a body portion (stored in section 604) having a variable size and including a representation of the determined difference value for each of the channels. In the example shown in fig. 6a, a plurality of (i.e., four) compressed data blocks are stored together such that the header portions of the compressed data blocks are stored together in section 602 and the body portions of the compressed data blocks are stored together in section 604. In other examples, each compressed data block may be stored separately, i.e., in a different address space, such that the header portion and body portion of each compressed data block are stored in a contiguous address range. In fig. 6a, the size of the body portion (stored in section 604) of the compressed block is a multiple of 128 bits. In the example described herein, this is the case when the target is an even number of bits per value (e.g., 4 bits/pixel) because the image data blocks (202₀, 202₁, 202₂ and 202₃) have 64 image element values, but if the target is an odd number of bits per value (e.g., 5 bits/pixel), the size of the body portion (stored in section 604) of the compressed block may be a multiple of 64 bits (and not necessarily a multiple of 128 bits).

The four compressed blocks represented in fig. 6a may correspond to the respective four image data blocks 202₀, 202₁, 202₂ and 202₃ shown in fig. 2, such that each image data block includes 64 image element values. In this example, each image element value includes four 8-bit data values relating to the red, green, blue and alpha channels, respectively, such that each image element value is represented with 32 bits. Thus, each uncompressed image data block is represented with 2048 bits. In the example shown in fig. 6a and 6b, a lossy compression scheme has been used and the target compression level is 50%. This means that in this example the maximum total number of bits that can be used to represent the difference values for each compressed data block is 1024 (i.e. 16 bits per image element value, divided between four channels). In the example shown in fig. 6a, each of the header portions of the compressed data blocks comprises 64 bits. In the example shown in fig. 6a, each row of the address space represents 128 bits, so the header portions of two compressed blocks may be on the same row, and each of the body portions of the different compressed data blocks is provided with a maximum memory footprint corresponding to eight rows of the address space, i.e. 1024 bits (fig. 6a is not drawn to scale). Because the compression technique described herein in this example ensures that the target compression level (e.g., 50% compression target) is met, the body portion of the compressed data block will not exceed the 1024 bits provided for it in the example shown in fig. 6a. Each compressed data block has a fixed memory allocation (i.e., a fixed memory footprint) that it does not exceed and that is determined by the target compression level.

As an example with c channels (channel 0 to channel (c-1)), the difference values related to d data values (data value 0 to data value (d-1)) are stored in the body portion of the compressed data block in the following order: data value 0, channel 0; data value 0, channel 1; … data value 0, channel (c-1); data value 1, channel 0; … data value 1, channel (c-1); … data value (d-1), channel 0; … data value (d-1), channel (c-1). In other words, a difference value related to a particular data value is stored for each of the different channels, and then a difference value related to a next data value is stored for each of the different channels, and so on. This allows the difference values associated with the data values of each of the different channels for a particular image element value to be read from the compressed data block in one chunk. This may be useful if some, but not all, of the image element values in the compressed data block are to be decompressed.
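Under the assumption that the differences are packed tightly in this order, with each channel using its second number of bits, the position of a particular difference within the body portion can be computed directly, which is what permits the selective decompression mentioned above. The function name `body_bit_offset` and the sample bit counts are hypothetical.

```python
def body_bit_offset(value_index, channel, second_bits):
    """Bit offset, from the start of the body portion, of one difference.

    second_bits : second number of bits per channel, in storage order.
    Assumes tightly packed, value-major then channel-minor ordering.
    """
    bits_per_value = sum(second_bits)               # all channels of one value
    return value_index * bits_per_value + sum(second_bits[:channel])


second_bits = [4, 6, 3, 3]                          # 16 bits per image element value
print(body_bit_offset(2, 1, second_bits))           # 36
```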

Fig. 6b shows an example format 610 of data within a header portion of a compressed data block. In this example, 64 bits of data are included in the header portion of the compressed data block. The header portion includes an indication of the origin values for the four channels 612. In this example, 32 bits are used for the indications of these origin values, such that 8 bits are used for each indication of the origin value of the channel. In this way, for each of the channels, the indication of the origin value for the channel has the same number of bits as one of the data values associated with the channel of one of the image element values in the image data block. In this way, the accuracy of the origin value is not reduced, and therefore no data is lost in the origin value even if lossy compression is being performed. If a channel decorrelation mode is selected for the non-reference channel, the origin value for the non-reference channel is indicated by an indication in the compressed data in signed format. If a non-channel decorrelation mode is selected for the non-reference channel, the origin value for the non-reference channel is indicated by an indication in the compressed data in unsigned format. For each of the non-reference channels, an indication of the determined origin value for the non-reference channel may be included in the compressed data as a number representing: (i) The determined origin value for the non-reference channel, or (ii) the difference between the determined origin value for the non-reference channel and the determined origin value for the reference channel of the block.

The header portion also includes some bits 614 (e.g., three bits) that are used to indicate the compression mode for the non-reference channels. For example, a first bit of the three bits 614 may indicate a channel decorrelation mode or a non-channel decorrelation mode for a first non-reference channel (e.g., a red channel), a second bit of the three bits 614 may indicate a channel decorrelation mode or a non-channel decorrelation mode for a second non-reference channel (e.g., a blue channel), and a third bit of the three bits 614 may indicate a channel decorrelation mode or a non-channel decorrelation mode for a third non-reference channel (e.g., an alpha channel). Fig. 6b shows that the header portion also comprises some error correction indications 616, e.g. Cyclic Redundancy Check (CRC) bits, which will be described below. In the example shown in fig. 6b, 13 bits are used for the error correction indications. Error correction indications 616 are included for some use cases (e.g., for safety critical cases), but for other use cases they may not be needed and may not be included in the header portion. The header portion also includes an indication of the first number of bits for each of the channels 618. In this example, there are four channels and each indication of the first number of bits has four bits, such that there are sixteen bits for the indications of the first number of bits. Using four bits to indicate the first number of bits allows a value from 0 to 15 to be represented by each indication. Many of these codes are not used to indicate the first number of bits, for example when all data values are 8-bit values, values from 9 to 15 are not used to indicate the first number of bits, and these redundant codes may be used for other purposes.
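The field widths described above (32 + 3 + 13 + 16 = 64 bits) can be illustrated with the following packing sketch. The order of the fields within the 64-bit word and the function names `pack_header` and `unpack_header` are assumptions made purely for illustration; the description above does not fix a particular bit ordering.

```python
def pack_header(origins, modes, crc, first_bits):
    """Pack header fields into a 64-bit integer (field order is an assumption)."""
    assert len(origins) == 4 and len(modes) == 3 and len(first_bits) == 4
    word = 0
    for o in origins:                  # 4 x 8-bit origin indications (612)
        word = (word << 8) | (o & 0xFF)
    for mode in modes:                 # 3 x 1-bit compression mode indications (614)
        word = (word << 1) | (mode & 1)
    word = (word << 13) | (crc & 0x1FFF)   # 13 error correction bits (616)
    for b in first_bits:               # 4 x 4-bit first-number-of-bits indications (618)
        word = (word << 4) | (b & 0xF)
    return word


def unpack_header(word):
    """Inverse of pack_header; origins are returned as raw 8-bit patterns."""
    first_bits = [(word >> s) & 0xF for s in (12, 8, 4, 0)]
    crc = (word >> 16) & 0x1FFF
    modes = [(word >> s) & 1 for s in (31, 30, 29)]
    origins = [(word >> s) & 0xFF for s in (56, 48, 40, 32)]
    return origins, modes, crc, first_bits


header = pack_header([20, 100, 251, 255], [1, 0, 1], 0x1ABC, [6, 5, 4, 0])
print(header.bit_length() <= 64)  # True
```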

In examples described herein, the compressed data block has a base address and the header portion begins at a first address defined by the base address, wherein data of the header portion begins at the first address in a first direction, and wherein the body portion begins at a second address defined by the base address, wherein data of the body portion begins at the second address in a second direction, wherein the first direction is opposite to the second direction in the address space. For example, if each compressed data block is stored separately, the base address may point to the beginning of the body portion of the compressed data block, and the address of the body portion increases from that point, the header portion may begin with a base address minus 1, and the address of the header portion decreases from that point. The example shown in fig. 6a is somewhat more complex, because four compressed data blocks are stored together. However, even in this case, the start address of the header portion of each of the different compressed data blocks is defined by the base address, and the start address of the body portion of each of the different compressed data blocks is defined by the base address.

In the above example, the compressed data block does not include an indication of the determined second number of bits for each of the channels. Instead, an indication of the first number of bits for each of the channels is included in the compressed data block, and the same predetermined scheme used to determine the second number of bits from the first number of bits may be used in the decompression unit 112 to determine the second number of bits, as described in more detail below.

However, in other examples, the compressed data block may also include an indication of the determined second number of bits for the channel (e.g., in a header portion of the compressed data block) for each of the channels. For example, the header portion of the compressed data block may not include the error correction indication 616, in which case those bits may be used to store an indication of the second number of bits for the corresponding channel.

As mentioned above, in some examples, one or more error correction indications may be included in the compressed data block. These error correction indications may be determined by the compression unit 302 by determining the image element values that will be obtained by correctly decompressing the compressed data blocks. The error correction indication may be CRC bits that may be calculated in a known manner. If the differences are represented losslessly in the compressed data block, e.g. if the first number of bits is equal to the second number of bits for each of the one or more channels, the CRC bits can be calculated from the original data values of the image data block, as these are the values that should be obtained by correctly decompressing the compressed data block. However, if any of the differences is represented lossily in the compressed data block, for example, if the first number of bits is not equal to the second number of bits for any of the channels, the compression unit 302 may determine the data values that the decompression unit 702 should obtain by correctly decompressing the compressed data block, and then may calculate the CRC bits from the data values determined by the compression unit 302. The compression unit may, for example, determine the data values that the decompression unit should obtain by analyzing the input data blocks and the various determinations made during the compression process. Alternatively, the compression unit may decompress the compressed data block to determine the data values that the decompression unit should obtain. In this way, even when lossy compression is being performed, error correction techniques may be used to determine if there are any errors in the transmission of data representing the compressed data blocks, for example, in the transmission from the compression unit 110 to the memory 104 via the memory interface 108, and from the memory 104 back to the decompression unit 112 via the memory interface 108.

As mentioned in the previous paragraph, the compression unit 302 may determine the data values that the decompression unit should obtain when decompressing the compressed data. If lossy compression has been performed, the decompressed data values for the reference channel may not exactly match the original data values in the reference channel before compression (which may be referred to as "uncompressed" data values). When the channel decorrelation mode is selected, the decompression unit will use the decompressed data values for the reference channel (rather than the uncompressed data values) to determine the data values for the non-reference channels. However, in the above example, the decorrelated data values for the non-reference channels are determined in the compression unit 302 based on the uncompressed data values for the reference channel (rather than the decompressed data values). Thus, in some examples, after the data values have been compressed as described above, the compression unit 302 may re-determine the decorrelated data values for the non-reference channels based on the decompressed data values for the reference channel (rather than the uncompressed data values), and then compress these re-determined decorrelated data values for the non-reference channels as described above. This reduces the extent to which errors introduced into the data values of the reference channel by lossy compression propagate into the data values of the non-reference channels.
More specifically, if the channel decorrelation mode is selected, the compression unit 302 may determine decompressed data values for the reference channel using the indication of the origin value for the reference channel and the representation of the determined difference values, and then, for each of the one or more non-reference channels, the compression unit 302 may: (i) determine new decorrelated data values for the non-reference channel by finding, for each image element value in the block, the difference between the data value of the non-reference channel and the decompressed data value of the reference channel; and (ii) replace the data values of the non-reference channel for the block with the new decorrelated data values in order to determine the compressed channel data for the non-reference channel.
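A minimal sketch of this re-determination is given below. It assumes, for illustration only, that the origin is the block minimum, that lossy compression of the reference channel truncates LSBs, and that the decompressor restores them as zeros. The function names and the choice of 5 bits for the green channel are hypothetical.

```python
def lossy_round_trip(values, second_bits):
    """Return the values a decompressor would obtain for one channel when the
    differences are truncated to second_bits and the dropped LSBs are zero-filled."""
    origin = min(values)
    diffs = [v - origin for v in values]
    first_bits = max(diffs).bit_length()
    shift = max(first_bits - second_bits, 0)
    return [origin + ((d >> shift) << shift) for d in diffs]

def redecorrelate(channel, reference, reference_second_bits):
    """Decorrelate against the decompressed (not the uncompressed) reference."""
    ref_decompressed = lossy_round_trip(reference, reference_second_bits)
    return [c - g for c, g in zip(channel, ref_decompressed)]

R = [17, 24, 9, 45]
G = [89, 100, 84, 119]

G_decompressed = lossy_round_trip(G, 5)   # hypothetical: green stored with 5 bits
R_new = redecorrelate(R, G, 5)
print(G_decompressed, R_new)
```

If the re-determined red values are then stored losslessly, adding them to the decompressed green values recovers the original red values exactly, so the error in the green channel does not leak into the red channel.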

Exemplary compression

An example will now be described in which each image element value has data values related to four color channels: a red channel (R), a green channel (G), a blue channel (B) and an alpha channel (A). In this simplified example, a block of four pixel values is to be compressed. Each pixel value is represented with 32 bits in R8G8B8A8 format (i.e., each data value is represented with 8 bits), and thus the uncompressed image data block is represented with 128 bits. The target compression level is 50%, and therefore the body portion of the compressed data block cannot exceed 64 bits. The red, green, blue and alpha values in the block may be represented as separate data sets. In this example, the values of the four pixels in the block in the different channels are:

R = [17, 24, 9, 45]

G = [89, 100, 84, 119]

B = [240, 228, 215, 198]

A = [255, 255, 250, 255]

In this example, the green channel is the reference channel, so the decorrelated data values are:

R' = R - G = [-72, -76, -75, -74]

G' = G = [89, 100, 84, 119]

B' = B - G = [151, 128, 131, 79]

A' = A - G = [166, 155, 166, 136]
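The decorrelated values listed above can be reproduced with a few lines of code. This is a sketch for checking the arithmetic only; the function name `decorrelate` is illustrative.

```python
# Channel values for the four-pixel block in the example.
R = [17, 24, 9, 45]
G = [89, 100, 84, 119]   # reference channel
B = [240, 228, 215, 198]
A = [255, 255, 250, 255]

def decorrelate(channel, reference):
    """Subtract the reference channel from a non-reference channel, per pixel."""
    return [c - g for c, g in zip(channel, reference)]

R_dec = decorrelate(R, G)
B_dec = decorrelate(B, G)
A_dec = decorrelate(A, G)
print(R_dec, B_dec, A_dec)
```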

The difference between the maximum and minimum of the red data values is 36 (45-9=36), which can be represented in binary with 6 bits (e.g. 100100), so that n_{non-decorrelated,red} = 6. The difference between the maximum and minimum of the decorrelated red data values is 4 (-72+76=4), which can be represented in binary with 3 bits (e.g. 100), so that n_{decorrelated,red} = 3. Since n_{decorrelated,red} < n_{non-decorrelated,red} and the minimum value (-76) of the decorrelated data values for the red channel is in the range of -128 to +127, the channel decorrelation mode is selected for the red channel and the (non-decorrelated) data values R are replaced with the decorrelated data values R' for compression.

The difference between the maximum and minimum of the blue data values is 42 (240-198=42), which can be represented in binary with 6 bits (e.g. 101010), so that n_{non-decorrelated,blue} = 6. The difference between the maximum and minimum of the decorrelated blue data values is 72 (151-79=72), which can be represented in binary with 7 bits (e.g. 1001000), so that n_{decorrelated,blue} = 7. Since n_{decorrelated,blue} > n_{non-decorrelated,blue}, the non-channel decorrelation mode is selected for the blue channel, and the (non-decorrelated) data values B are used for compression.

The difference between the maximum and minimum of the alpha data values is 5 (255-250=5), which can be represented in binary with 3 bits (e.g. 101), so that n_{non-decorrelated,alpha} = 3. The difference between the maximum and minimum of the decorrelated alpha data values is 30 (166-136=30), which can be represented in binary with 5 bits (e.g. 11110), so that n_{decorrelated,alpha} = 5. Since n_{decorrelated,alpha} > n_{non-decorrelated,alpha}, the non-channel decorrelation mode is selected for the alpha channel, and the (non-decorrelated) data values A are used for compression.
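The per-channel mode selection described in the preceding three paragraphs can be sketched as follows. The function names are illustrative, and the handling of a tie (equal bit counts) is an assumption, since the example only covers the strictly smaller and strictly larger cases.

```python
def bits_for_range(values):
    """Number of bits needed to represent (max - min) of the values in binary."""
    return (max(values) - min(values)).bit_length()

def select_mode(channel, reference):
    """Return (use_decorrelation, n_non_decorrelated, n_decorrelated)."""
    decorrelated = [c - g for c, g in zip(channel, reference)]
    n_plain = bits_for_range(channel)
    n_dec = bits_for_range(decorrelated)
    # The minimum decorrelated value must fit the signed 8-bit origin field.
    origin_fits = -128 <= min(decorrelated) <= 127
    # Assumption: on a tie the non-channel decorrelation mode is kept.
    return (n_dec < n_plain and origin_fits), n_plain, n_dec

G = [89, 100, 84, 119]
print(select_mode([17, 24, 9, 45], G))        # red
print(select_mode([240, 228, 215, 198], G))   # blue
print(select_mode([255, 255, 250, 255], G))   # alpha
```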

In this example, the origin value is determined as the minimum value in the block for each channel. Therefore, the origin value and the difference value are:

R': origin = -76 (10110100 in two's complement format), differences = [4, 0, 1, 2]. The lossless differences for the red channel can be represented in binary with 3 bits (i.e., the first number of bits for the red channel is 3), i.e., they can be represented as: [100, 000, 001, 010].

G: origin = 84, differences = [5, 16, 0, 35]. The lossless differences for the green channel can be represented in binary with 6 bits (i.e., the first number of bits for the green channel is 6), i.e., they can be represented as: [000101, 010000, 000000, 100011].

B: origin = 198, differences = [42, 30, 17, 0]. The lossless differences for the blue channel can be represented in binary with 6 bits (i.e., the first number of bits for the blue channel is 6), i.e., they can be represented as: [101010, 011110, 010001, 000000].

A: origin = 250, differences = [5, 5, 0, 5]. The lossless differences for the alpha channel can be represented in binary with 3 bits (i.e., the first number of bits for the alpha channel is 3), i.e., they can be represented as: [101, 101, 000, 101].
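The origin values, difference values and first numbers of bits listed above can be checked with the following sketch, which assumes (as in this example) that the origin is the minimum value in the block. The function name is illustrative.

```python
def origin_and_differences(values):
    """Return (origin, differences, first_number_of_bits) for one channel,
    taking the block minimum as the origin."""
    origin = min(values)
    diffs = [v - origin for v in values]
    first_bits = max(diffs).bit_length()
    return origin, diffs, first_bits

channels = {
    "R'": [-72, -76, -75, -74],   # decorrelated red
    "G": [89, 100, 84, 119],
    "B": [240, 228, 215, 198],
    "A": [255, 255, 250, 255],
}
for name, values in channels.items():
    origin, diffs, n = origin_and_differences(values)
    # Mask to 8 bits to show a negative origin in two's complement form.
    print(name, format(origin & 0xFF, "08b"), [format(d, f"0{n}b") for d in diffs])
```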

The 50% compression target allows 16 bits per pixel value for the differences, but if all of the channels were compressed losslessly in this example, 18 bits per pixel value would be needed for the differences (i.e. 3+6+6+3=18). According to a predetermined scheme, we can choose to drop LSBs from the alpha channel and the blue channel first, such that the second numbers of bits for the channels are determined to be R = 3 bits, G = 6 bits, B = 5 bits and A = 2 bits. The indication of the origin value for each channel comprises 8 bits, the indication of the first number of bits for each channel comprises 4 bits, and the indication of the selected compression mode for each non-reference channel comprises 1 bit.

Thus, in the compressed block, we store in binary an indication of the compression mode for each of the non-reference channels (where '0' indicates the non-channel decorrelation mode and '1' indicates the channel decorrelation mode), an indication of the origin value for each of the channels, an indication of the first number of bits for each of the channels, and the difference values for each of the channels, as:

R': origin value = 10110100, compression mode indicator = 1, first number of bits = 0011, differences: [100, 000, 001, 010]

G: origin value = 01010100, first number of bits = 0110, differences: [000101, 010000, 000000, 100011]

B: origin value = 11000110, compression mode indicator = 0, first number of bits = 0110, differences: [10101, 01111, 01000, 00000]

A: origin value = 11111010, compression mode indicator = 0, first number of bits = 0011, differences: [10, 10, 00, 10]
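The stored difference fields above follow from truncating each lossless difference to the second number of bits for its channel. A minimal sketch of that truncation is shown below; the function name is illustrative and the sketch assumes the second number of bits is at least 1.

```python
def truncate(diffs, first_bits, second_bits):
    """Drop (first_bits - second_bits) LSBs from each difference and
    return the stored fields as binary strings of width second_bits."""
    shift = first_bits - second_bits
    return [format(d >> shift, f"0{second_bits}b") for d in diffs]

print(truncate([4, 0, 1, 2], 3, 3))       # red: stored losslessly
print(truncate([5, 16, 0, 35], 6, 6))     # green: stored losslessly
print(truncate([42, 30, 17, 0], 6, 5))    # blue: one LSB dropped
print(truncate([5, 5, 0, 5], 3, 2))       # alpha: one LSB dropped
```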

Note that in this example, using the channel decorrelation mode for the red channel means that the first number of bits for the red channel is 3, whereas if the non-channel decorrelation mode were used for the red channel, the first number of bits for the red channel would be 6. Thus, if the channel decorrelation mode were not available for the red channel in this example, the second number of bits for three of the four channels would each have to be reduced by one more bit, which would increase the error introduced by the lossy compression technique. In the example given above, a simple truncation method is used to reduce the differences to the second number of bits, but in other examples other methods may be used, for example the look-up table method described above may be used instead of the truncation method.

Decompression

Examples of decompression techniques are now described with reference to Figs. 7, 8 and 9. Fig. 7 shows a decompression unit 702 configured to perform decompression to determine one or more image element values from compressed data (e.g., compressed data blocks). Decompression unit 702 may be implemented as decompression unit 112 in the graphics processing system shown in FIG. 1. The decompression unit 702 includes decompression logic 703. Decompression logic 703 includes difference determination logic 704 and initial data value determination logic 706. The difference determination logic 704 includes difference size determination logic 708 and unpacker logic 710. Decompression unit 702 also includes channel decorrelation logic 712. Each of the logic blocks 704-712 implemented in the decompression unit 702 may be implemented in hardware (e.g., dedicated hardware implemented with fixed function circuitry), software (e.g., as a software module executing on a processor), or a combination thereof. Implementing logic blocks in hardware generally provides lower latency operation than implementing them in software. However, implementing the logic blocks in software allows more flexibility in changing the functionality of the logic blocks after the decompression unit 702 has been manufactured. Thus, in some systems, a hardware implementation may be more appropriate than a software implementation (e.g., when decompression needs to be performed quickly), while in some other systems, a software implementation may be more appropriate than a hardware implementation (e.g., when the functionality of the decompression unit needs to be variable).

Fig. 8 shows a flow chart of a method of performing decompression to determine one or more image element values from a compressed data block that has been compressed as described above. As described above, each image element value comprises a plurality of data values associated with a respective plurality of channels, wherein the plurality of channels comprises a reference channel and a plurality of non-reference channels. In this example, the compressed data is in a compressed data block, but in other examples, the compressed data is not necessarily in a compressed data block.

In step S802, a compressed data block is received at the decompression unit 702. For example, if decompression unit 702 is implemented as decompression unit 112 in the graphics processing system shown in FIG. 1, compressed data blocks may be received from memory 104 via memory interface 108.

In step S804, the decompression logic 703 reads compressed channel data for each of the channels from the compressed data. For example, the compressed channel data may have the format shown in fig. 6a and 6b and described above. In step S806, for each of the channels, the decompression logic 703 uses the compressed channel data for the channel to determine an initial data value related to the channel for each of the one or more image element values that are decompressed. For each of the one or more image element values that are decompressed, the determined initial data value related to the reference channel for the image element value is a decompressed data value for the reference channel. However, depending on the compression mode for the non-reference channel, the determined initial data value related to the non-reference channel for the image element value may or may not be a decompressed data value for the non-reference channel. Details of how steps S804 and S806 are performed in the example are described in detail below with reference to the flowchart of fig. 9.

In steps S808 through S814, for each of the one or more image element values that are decompressed, the channel decorrelation logic 712 determines a decompressed data value for each of the non-reference channels. Specifically, in step S808, for each of the non-reference channels, the channel decorrelation logic 712 reads an indication of the compression mode for the non-reference channel of the block from the compressed data. As described above, the compression mode is either a channel decorrelation mode or a non-channel decorrelation mode.

In step S810, the channel decorrelation logic 712 identifies the compression mode for the non-reference channel as either the channel decorrelation mode or the non-channel decorrelation mode. In one example (matching the compression example above), an indication of '1' indicates the channel decorrelation mode and an indication of '0' indicates the non-channel decorrelation mode; in other examples, '0' may indicate the channel decorrelation mode and '1' may indicate the non-channel decorrelation mode. If the compression mode for the non-reference channel is the channel decorrelation mode, the method goes to step S812, and if the compression mode for the non-reference channel is the non-channel decorrelation mode, the method goes to step S814.

In step S812 (which is performed for a non-reference channel if the compression mode for that non-reference channel is the channel decorrelation mode), channel decorrelation logic 712 determines the decompressed data value for the non-reference channel as a function (e.g., the sum) of the determined initial data value related to the non-reference channel and the determined initial data value related to the reference channel for the image element value. In the examples detailed herein, the function is a summation, but it should be understood that in other examples, other mathematical functions may be used. A summation is appropriate when the decorrelation operation performed during compression takes the difference between the non-reference channel and the reference channel, which is beneficial when there is (on average) a strong positive correlation between the reference and non-reference channels, i.e. the covariance of the two channels is about 1.0. This is often the case in image data, as the signal is often subject to brightness variations, where if one channel increases (or decreases), the likelihood that the other channel also increases (or decreases) is high.

In other examples, alternative or additional 'decorrelation' schemes may be used or included to cope with other data behaviors. For example, in RGB or YUV image data, a block of pixels may sometimes exhibit "anti-correlation" of channels, e.g., as the U channel increases, the V channel decreases. Similarly, in RGB data, an image of a parrot with red and green feathers may show an inverse relationship between the R and G channels. In these examples, where the covariance of the channels is about -1.0, the sum of the channels may be used during compression instead of the difference between the reference and non-reference channels. Accordingly, a difference is then used in decompression.

In another example, the magnitude of the covariance between the channels may be less than one. For example, if the green channel changes by "X" while the blue channel changes by "X/2", then the decorrelation may be B_decorrelated = B - floor(G/2) at compression, with the inverse operation (adding floor(G/2)) at decompression. Additional analysis and signalling may be included in the scheme to select the best decorrelation mode.
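The two alternative decorrelation functions described above (a sum for anti-correlated channels, and a half-scaled difference) can be sketched as follows. The function names and the sample values are illustrative only; both pairs are exact inverses of each other as long as the reference values used at decompression equal those used at compression.

```python
def compress_sum(channel, reference):
    """Anti-correlated channels: store the sum at compression."""
    return [c + r for c, r in zip(channel, reference)]

def decompress_sum(stored, reference):
    """Inverse of compress_sum: take the difference at decompression."""
    return [s - r for s, r in zip(stored, reference)]

def compress_half(channel, reference):
    """Channel varying at half the rate of the reference: subtract floor(ref / 2)."""
    return [c - r // 2 for c, r in zip(channel, reference)]

def decompress_half(stored, reference):
    """Inverse of compress_half: add floor(ref / 2) back."""
    return [s + r // 2 for s, r in zip(stored, reference)]

U = [100, 110, 120, 130]
V = [160, 150, 140, 130]          # falls as U rises
print(compress_sum(V, U))         # range collapses to zero

G = [10, 20, 30, 40]
B = [105, 110, 115, 120]          # rises at half the rate of G
print(compress_half(B, G))        # range collapses to zero
```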

In step S814 (which is performed for a non-reference channel if the compression mode for that non-reference channel is the non-channel decorrelation mode), channel decorrelation logic 712 determines the decompressed data value for the non-reference channel as the determined initial data value related to the non-reference channel for the image element value.
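Steps S808 to S814 can be summarised in a short sketch. It assumes the summation function, the green channel as reference, and the mode encoding of the worked compression example ('1' for channel decorrelation, '0' otherwise). The dictionary layout and the function name `recombine` are illustrative.

```python
def recombine(initial, modes, reference="G"):
    """Turn initial data values into decompressed data values.

    initial: channel name -> list of initial data values (one per pixel)
    modes:   non-reference channel name -> 1 (channel decorrelation mode)
             or 0 (non-channel decorrelation mode)
    """
    out = {reference: list(initial[reference])}
    for channel, values in initial.items():
        if channel == reference:
            continue
        if modes[channel] == 1:
            # S812: add the reference channel's initial value back.
            out[channel] = [v + g for v, g in zip(values, initial[reference])]
        else:
            # S814: the initial value already is the decompressed value.
            out[channel] = list(values)
    return out

initial = {
    "R": [-72, -76, -75, -74],   # decorrelated red (lossless in this sketch)
    "G": [89, 100, 84, 119],
    "B": [240, 228, 215, 198],
    "A": [255, 255, 250, 255],
}
decompressed = recombine(initial, {"R": 1, "B": 0, "A": 0})
print(decompressed["R"])
```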

The determined decompressed data values represent decompressed image element values. The method may further include outputting the determined decompressed data values of the decompressed one or more image element values for further processing. For example, where decompression unit 702 is implemented as decompression unit 112 in graphics processing system 100, the determined decompressed data values may be output from decompression unit 112 to processing logic 106 of GPU 102, e.g., for processing by processing logic 106.

As described above, different compression modes may be used for different non-reference channels, e.g., a compression mode for a first one of the non-reference channels may be a channel decorrelation mode and a compression mode for a second one of the non-reference channels may be a non-channel decorrelation mode. Further, as described above, the plurality of channels may include a red channel, a green channel, a blue channel, and in some examples an alpha channel, and the green channel may be a reference channel.

Fig. 9 is a flow chart showing how decompression logic 703 may implement steps S804 and S806 in an example. In this example, an image data block that has been compressed to meet a target compression level is decompressed. In this example, the compressed data includes an indication of the compression mode for each of the non-reference channels, an indication of an origin value for each of the channels, an indication of a first number of bits with which the difference values for each of the channels can be represented losslessly, and a representation of the difference values for each of the channels. A second number of bits is obtained for each of the channels, wherein the second number of bits is the number of bits used for the representation of each difference value included in the compressed data. If the second number of bits is less than the first number of bits for the channel, one or more LSBs are added (i.e., appended) to the difference value from the compressed data to determine a difference value having the first number of bits. Then, for each channel, an initial data value related to the channel may be determined by combining the origin value for the channel with the difference value, having the first number of bits, for the channel.

The compressed data may be stored as compressed data blocks having the above-described format (e.g., as shown in Figs. 6a and 6b). As mentioned above, in the examples described herein, the compressed data is stored as compressed data blocks, e.g., the header and the differences may be stored in the same contiguous compressed data block. More generally, however, it should be understood that compressed data need not be stored as blocks of data, e.g., as contiguous blocks of data. For example, the header and the differences may be stored separately, e.g., in different sections of memory. In the example format of the compressed data block shown in Figs. 6a and 6b, the header portion of the compressed data block has a fixed size. Furthermore, the second numbers of bits may be determined based on information in the header portion, without the need to read any data from the body portion of the compressed data block. When the second numbers of bits have been determined, the location of each difference value can be determined. For example, the sum of the second numbers of bits for the channels gives the total compressed size of the difference values relating to the data values of one image element value, which can be used to determine the offset to the difference values needed for the image element value being decompressed. Thus, the difference values for only some of the data values may be read, rather than all of the difference values. This means that the decompression unit 112 may determine the decompressed values for one or more of the image element values in the image data block without decompressing the entire compressed data block. For example, an image data block may include 64 texel values, and a bilinear texture filter operation may require only four of the texel values to be determined from the compressed data block.
Using the decompression techniques described in the examples herein, decompression unit 112 may determine the four desired texel values without decompressing the other 60 texel values in the compressed data block. In this way, the compressed data blocks are "randomly accessible". Furthermore, an individual data value relating to an individual channel may even be decompressed from the compressed data block without having to decompress other data values (even data values relating to different channels of the same image element value) from the compressed data block.
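The random-access property can be illustrated with a sketch that locates a single difference field in the body of the block. The body layout used here (fields stored pixel by pixel, with channels in R, G, B, A order inside each pixel) is an assumption made for illustration, since the format of Figs. 6a and 6b is not reproduced in this section. The body is modelled as a string of '0' and '1' characters, and the function names are hypothetical.

```python
ORDER = ("R", "G", "B", "A")

def difference_offset(pixel_index, channel, second_bits, order=ORDER):
    """Bit offset, within the body, of one channel's difference for one pixel."""
    per_pixel = sum(second_bits[c] for c in order)
    within_pixel = sum(second_bits[c] for c in order[:order.index(channel)])
    return pixel_index * per_pixel + within_pixel

def read_difference(body, pixel_index, channel, second_bits, order=ORDER):
    """Read just one stored difference without touching any other field."""
    start = difference_offset(pixel_index, channel, second_bits, order)
    field = body[start:start + second_bits[channel]]
    return int(field, 2) if field else 0

# Second numbers of bits and stored fields from the worked compression example.
second_bits = {"R": 3, "G": 6, "B": 5, "A": 2}
pixels = [
    ("100", "000101", "10101", "10"),
    ("000", "010000", "01111", "10"),
    ("001", "000000", "01000", "00"),
    ("010", "100011", "00000", "10"),
]
body = "".join("".join(fields) for fields in pixels)

print(len(body))
print(difference_offset(2, "B", second_bits))
print(read_difference(body, 2, "B", second_bits))
```

With 16 bits per pixel, the body of the four-pixel example occupies exactly the 64 bits allowed by the 50% target.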

In this example, step S804 includes steps S902 to S908, and step S806 includes steps S910 and S912. In step S902, the difference size determination logic 708 reads an indication of the first number of bits from the compressed data blocks for each of the channels. As described above, the first number of bits of a channel is the number of bits that can be used to losslessly represent the difference of the channel for the block.

In step S904, the difference size determination logic 708 obtains the difference size of the difference value stored in the compressed block for each of the channels. In other words, in step S904, the difference size determination logic 708 obtains (e.g., determines) a second number of bits for each of the channels, wherein a representation of the difference for each of the channels is included in the compressed data block using the second number of bits for that channel. As described above, the second number of bits is determined to ensure that the compressed data block meets the target compression level for compressing the image data block. The determined difference size (i.e., the second number of bits) and an indication of the first number of bits of the channel are provided to the unpacker logic 710.

In the example described herein in which the indication of the second number of bits is not included in the compressed data block, step S904 includes using the first numbers of bits for the channels to determine the second number of bits for each of the channels according to a predetermined scheme. Specifically, the predetermined scheme is the same scheme as that used in the compression unit 302 described above, so that given the same set of first numbers of bits, the decompression unit 702 will determine the same set of second numbers of bits as was determined in the compression unit 302. In other words, the compression unit 302 and the decompression unit 702 use a common function to determine how many LSBs were discarded from the difference values for each of the one or more channels.

Specifically, similar to that described above with respect to the compression process, step S904 may include determining whether representing the difference values for the channels with the respective determined first numbers of bits would satisfy the target compression level. If so, then for each of the channels the second number of bits is equal to the first number of bits for that channel. However, if it is determined that representing the difference values for the channels with the respective first numbers of bits would not satisfy the target compression level, then for at least one of the channels the second number of bits is less than the first number of bits for that channel.
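A sketch of such a common function is given below. The text states only that the compression and decompression units apply the same predetermined scheme and, in the worked example, that LSBs are dropped first from the alpha and blue channels. The drop order used here (A, B, R, G, one bit at a time) is therefore hypothetical, as is the function name.

```python
def second_bit_counts(first_bits, budget, drop_order=("A", "B", "R", "G")):
    """Derive the second numbers of bits from the first numbers of bits.

    first_bits: channel name -> first number of bits
    budget:     bits available per image element value for the differences
    drop_order: HYPOTHETICAL priority order in which one bit is removed at a time
    """
    second = dict(first_bits)
    while sum(second.values()) > budget:
        progressed = False
        for channel in drop_order:
            if sum(second.values()) <= budget:
                break
            if second[channel] > 0:
                second[channel] -= 1
                progressed = True
        if not progressed:
            raise ValueError("budget cannot be met")
    return second

first = {"R": 3, "G": 6, "B": 6, "A": 3}
print(second_bit_counts(first, 16))
```

Because the function depends only on the first numbers of bits (read from the header) and the fixed budget, the decompressor can run it without any indication of the second numbers of bits being stored.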

Further, as described above, in some examples, an indication of the second number of bits for each of the channels may be included in the compressed data block. In these examples, the difference size determination logic 708 need not be implemented in the decompression unit. In these examples, for each of the channels, the difference determination logic 704 obtains the second number of bits by reading an indication of the second number of bits of the channel from the compressed data block.

In step S906, the unpacker logic 710 reads a representation of the difference value for the decompressed one or more image element values from the compressed data block using the second number of bits for the corresponding channel.

Note that in some other examples (not shown in Fig. 9) in which only lossless compression and decompression are implemented, the difference values for a channel may always be included in the compressed data using the first number of bits for that channel, so that the second number of bits need not be obtained, i.e., step S904 need not be performed.

In step S908, the initial data value determination logic 706 reads an indication of the origin value for each of the one or more channels from the compressed data block. An indication of the origin value for the reference channel is read from the compressed data as a number representing the origin value for the reference channel. For each of the non-reference channels, an indication of the origin value for the non-reference channel may be read from the compressed data as a number representing: (i) An origin value for a non-reference channel, or (ii) a difference between an origin value for a non-reference channel and an origin value for a reference channel of a block.

In step S910, based on the representation of the difference values read from the compressed data block, for each of the channels and for each of the one or more image element values that are decompressed, the unpacker logic 710 determines a difference value in accordance with the first number of bits for the channel. In other words, the difference value determined for a channel has the first number of bits indicated for that channel. The determined difference values, each having the appropriate first number of bits for the corresponding channel, are provided to the initial data value determination logic 706.

In the example described herein, in step S910, for each of the channels and for each of the one or more image element values that are decompressed, determining a difference value from the first number of bits of the channel includes adding (i.e., appending) zero, one or more Least Significant Bits (LSBs) to a representation of the difference value read from the compressed data block, thereby determining a difference value having the first number of bits for each of the channels. Zero LSBs are added to the representation of the difference value read from the compressed data block if the first number of bits is the same as the second number of bits of the channel. If the first number of bits is greater than the second number of bits of the channel, one or more LSBs are added to the representation of the difference read from the compressed data block to determine a difference having the first number of bits for the channel.

In different examples, the value of the added bits may be different. In a simple example, zero, one or more LSBs added to the representation of the difference value read from the compressed data block are all zero. Using zero as the added LSB means that when adding the difference to the origin value, the result is guaranteed not to overflow, so that a simpler (e.g., smaller in silicon area) adder can be used to add the difference to the origin value than if other values were used for the added LSB. When decorrelation is used, it is possible to get an underflow, so to avoid an underflow, intermediate results smaller than zero can be clamped to zero. Furthermore, using zero as the added LSB means that a data value of zero that is compressed and then decompressed will still be zero. This may be useful, for example, to represent red, blue, and green channel values, such that a completely black region (e.g., having red, green, and blue values of zero) will remain completely black after compression and decompression of the image element values. This may also be useful for representing alpha channel values such that fully transparent image element values (i.e., image element values with alpha values of zero) will remain fully transparent after compression and decompression of the image element values.

In another example, zero, one or more LSBs added to the representation of the difference value read from the compressed data block are all ones. Using one as the added LSB means that the maximum data value that is compressed and then decompressed (e.g., value 255 for an 8-bit data value) will still be the maximum data value. This may be useful, for example, to represent red, blue, and green channel values such that a fully white region (e.g., having maximum red, green, and blue values (e.g., 255 for an 8-bit value)) will remain fully white after compression and decompression of the image element values. This may also be useful to represent the alpha channel value such that a completely opaque image element value (i.e., the image element value with the largest alpha value (e.g., an alpha value of 255 for an 8-bit value)) will remain completely opaque after compression and decompression of the image element value. If one is used as the LSB added to the difference, precautions can be taken to ensure that the sum of the difference and the origin value does not overflow. For example, the carry out bit may be used to indicate whether there is a potential overflow (e.g., by performing a logical OR (OR) operation on the summed bits to determine the carry out bit). If the carry out bit indicates that there is no potential overflow, the method proceeds as described herein, but if the carry out bit indicates that there is a potential overflow for the sum, the result of the sum is clamped to a maximum value (e.g., 255 for an 8-bit value).

In another example, zero, one or more LSBs added to the representation of the difference value read from the compressed data block are random or pseudo random bits. The use of random or pseudo-random bits as the added LSBs may help reduce the visually perceptible banding introduced into the image element values by compressing and decompressing the image element values.

In another example, zero, one or more LSBs added to the representation of the difference value read from the compressed data block are determined by bit copying corresponding zero, one or more MSBs of the representation of the difference value read from the compressed data block. In this way, if n LSBs are added to the representation of the difference, those n bits match the n MSBs of the representation of the difference read from the compressed data block. This is a simple way of determining the LSB to be added and does not add systematic bias to the decompressed value, but it pushes the decompressed value slightly off center.

In yet another example, the zero, one or more LSBs added to the representation of the difference value read from the compressed data block are different for different ranges of origin values or for different ranges of difference values. For example, the zero, one or more LSBs added to the representation of the difference value are zeros for low origin values, e.g., for origin values below a threshold, and are ones for high origin values, e.g., for origin values above the threshold. The threshold may be any suitable value, such as any value between 1 and (2^m - 1) for an m-bit origin value. This may help to maintain true black (represented by zero) and true white (represented by the maximum value). As another example, the zero, one or more LSBs added to the representation of the difference value are zeros for difference values whose MSB is zero, and are ones for difference values whose MSB is one.

The method may comprise: performing a left shift of zero, one or more bit positions on bits of the representation of the difference value read from the compressed data block; and then adding zero, one or more LSBs in the LSB positions after shifting, wherein zero, one or more LSBs are determined as described above to determine a difference value having the first number of bits for each of the one or more channels.
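The left shift followed by filling of the vacated LSB positions can be sketched as follows for three of the fill policies described above (all zeros, all ones, and copying the MSBs). Random fill is omitted to keep the sketch deterministic. The function name and the `fill` parameter are illustrative, and the MSB-copy branch assumes that no more bits are added than are stored.

```python
def expand_difference(stored, first_bits, second_bits, fill="zeros"):
    """Rebuild a difference with first_bits from its second_bits representation."""
    n = first_bits - second_bits          # number of LSBs to add
    if n <= 0:
        return stored                     # lossless: nothing to add
    if fill == "zeros":
        lsbs = 0
    elif fill == "ones":
        lsbs = (1 << n) - 1
    elif fill == "msb_copy":
        if n > second_bits:
            raise ValueError("this sketch copies at most second_bits MSBs")
        lsbs = stored >> (second_bits - n)   # top n bits of the stored field
    else:
        raise ValueError(f"unknown fill policy: {fill}")
    return (stored << n) | lsbs

# Blue channel from the worked example: 6 lossless bits, 5 stored bits.
print([expand_difference(d, 6, 5) for d in (0b10101, 0b01111, 0b01000, 0b00000)])
# Alpha channel: 3 lossless bits, 2 stored bits, ones fill keeps 255 at 255.
print(expand_difference(0b10, 3, 2, fill="ones"))
```

With zero fill the third blue difference comes back as 16 rather than the original 17, which is the loss introduced by dropping one LSB.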

In step S912, for each of the channels, the initial data value determination logic 706 uses (i) the origin value for the channel and (ii) the determined difference value for the channel for the image element value to determine an initial data value related to the channel for each of the one or more image element values that are decompressed. The manner in which the initial data values are determined using the difference and origin values for the channels matches the manner in which the difference and origin values were determined from the data values in the compression unit 302. As described above, the compression unit 302 may determine the origin value and the difference values from the data values in different ways in different examples. For example, if the origin value is the smallest data value within the block, then for each of the channels, an initial data value relating to the channel for each of the one or more image element values that are decompressed is determined by summing the origin value for the channel and the determined difference value for the channel for the image element value. As another example, if the origin value is the largest data value within the block, then for each of the channels, an initial data value relating to the channel for each of the one or more image element values that are decompressed is determined by subtracting the determined difference value for the channel for the image element value from the origin value for the channel.

In both of the examples given in the preceding paragraph, precautions may be taken to ensure that the LSBs that have been added to the difference value do not cause a carry out of the most significant bit of the decompressed data value. For example, an overflow may be detected, and in response to detecting the overflow, the decompressed data value may be clamped to an appropriate maximum value. Similarly, an underflow may be detected, and in response to detecting the underflow, the decompressed data value may be clamped to an appropriate minimum value.
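As a minimal sketch of the two cases above (origin as the block minimum, or origin as the block maximum) with clamping on overflow and underflow: the function name `initial_value`, the `origin_is_min` flag, and the assumption of an unsigned `bits`-wide output range are all hypothetical choices for illustration.

```python
def initial_value(origin, diff, origin_is_min=True, bits=8):
    """Combine an origin value and a difference value, clamping the result.

    Assumes an unsigned output range of [0, 2**bits - 1]; this range and
    the parameter names are illustrative assumptions.
    """
    max_value = (1 << bits) - 1
    # Add when the origin is the block minimum, subtract when it is the maximum.
    value = origin + diff if origin_is_min else origin - diff
    # Clamp instead of letting a carry or borrow wrap the result around.
    return max(0, min(max_value, value))


print(initial_value(84, 35))                      # prints 119
print(initial_value(250, 7))                      # overflow clamped: prints 255
print(initial_value(3, 5, origin_is_min=False))   # underflow clamped: prints 0
```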

In another example, for each of the channels, an initial data value related to the channel for each of the one or more image element values that are decompressed may be determined by combining the origin value for the channel and the determined difference value for the channel for the image element value using addition or subtraction in modulo arithmetic. In this example, precautions may be taken to ensure that the decompressed value is on the correct side of the modulus. For example, if the four 8-bit data values in a block for a channel are 251, 255, 0, and 1, then the origin value may be set to 251 (i.e., binary representation 11111011) and the difference values may be 0, 4, 5, and 6, according to an example using modulo arithmetic. These difference values can be represented losslessly with three bits as 000, 100, 101, and 110, so the first number of bits is three. In this example, the second number of bits for the channel of the block is determined to be two in order to meet the target compression level. Naively, the difference values may be stored as 00, 10, 10, and 11, but this would result in decompressed data values of 251, 255, 255, and 1 (if the LSB added in the decompression unit is zero). The data loss in the third decompressed data value, which has changed from a value of 0 to a value of 255, may be very noticeable.

Thus, when the difference values are compressed, the compression unit may identify whether rounding of a difference value will cause the resulting decompressed data value to cross the modulus of the modulo operation, and if so, may modify the rounding of the difference value to avoid crossing the modulus. It should be noted that the compression unit knows how the compressed data will be decompressed, so it can determine whether rounding of a difference value will cause the resulting decompressed data value to cross the modulus. For example, the third difference value in the above example may be rounded up instead of rounded down (while the other three difference values are still rounded down), so that the four difference values may be stored as 00, 10, 11, and 11 in the compressed data. This results in decompressed values of 251, 255, 1, and 1. The data loss in the third data value is now a change from a value of 0 to a value of 1, which is not as significant as a change from 0 to 255.
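The numbers in this modulo example can be reproduced with a short sketch. The helper names `decompress` and `quantize`, and the specific test used to detect a crossing of the modulus (comparing whether the exact and the rounded sums reach 256), are assumptions made for illustration; the description only states that the compression unit detects and avoids such crossings.

```python
MOD = 256  # 8-bit data values


def decompress(origin, stored, dropped_bits):
    """Rebuild data values with zero-filled LSBs, using modulo arithmetic."""
    return [(origin + (s << dropped_bits)) % MOD for s in stored]


def quantize(origin, diffs, full_bits, stored_bits):
    """Round differences down, except where that would cross the modulus."""
    n = full_bits - stored_bits
    max_stored = (1 << stored_bits) - 1
    out = []
    for d in diffs:
        q = d >> n                                   # default: round down
        wraps_exact = origin + d >= MOD              # original value wrapped?
        wraps_rounded = origin + (q << n) >= MOD     # decompressed value wraps?
        if wraps_exact != wraps_rounded and q < max_stored:
            q += 1                                   # round up instead
        out.append(q)
    return out


origin, diffs = 251, [0, 4, 5, 6]        # data values 251, 255, 0, 1

naive = [d >> 1 for d in diffs]          # stored as 00, 10, 10, 11
print(decompress(origin, naive, 1))      # prints [251, 255, 255, 1]

adjusted = quantize(origin, diffs, 3, 2) # stored as 00, 10, 11, 11
print(decompress(origin, adjusted, 1))   # prints [251, 255, 1, 1]
```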

In yet another example, the compressed data block includes a plurality of origin values for at least one of the channels, and an indication is included in the compressed data block for each of the difference values for the at least one of the channels to indicate which of the plurality of origin values that difference value was determined from. In this example, for the at least one of the channels, an initial data value relating to the channel for each of the one or more image element values that are decompressed is determined using (i) the origin value for the channel that is identified by the indication associated with the determined difference value for the channel for the image element value, and (ii) the determined difference value for the channel for the image element value.
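A minimal sketch of decompression with several origin values per channel is given below. The representation of the indications as a list of integer indices (`selectors`), the use of addition to combine origin and difference, and all of the numeric values are hypothetical and chosen only for illustration.

```python
def initial_values_multi_origin(origins, selectors, diffs):
    """Add each difference to the origin value selected by its indication.

    `selectors[i]` is assumed to be an index into `origins` for element i.
    """
    return [origins[s] + d for s, d in zip(selectors, diffs)]


# Hypothetical block: two origin values, four image element values.
print(initial_values_multi_origin([10, 200], [0, 1, 0, 1], [3, 5, 0, 20]))
# prints [13, 205, 10, 220]
```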

The method proceeds from step S912 to step S808 as described above.

As mentioned above, in some examples, one or more error correction indications may be included in the compressed data block. The decompression unit 702 may determine one or more error correction indications based on the determined decompressed data values of the decompressed one or more image element values. The error correction indications may be CRC bits, which may be calculated in a known manner. The decompression unit 702 may read one or more error correction indications from the compressed data block. The decompression unit 702 may then compare the determined one or more error correction indications with the one or more error correction indications read from the compressed data block to determine whether there are errors in the determined decompressed data values of the decompressed one or more image element values. If it is determined that there is no error in the decompressed image element values, the image element values can be used with confidence. This is particularly useful in systems that implement functional safety. For example, if the graphics processing system 102 is implemented in an automobile, it may be configured to operate according to a safety standard (such as the ISO 26262 standard) for rendering images that are considered to be safety critical (e.g., images that include warning symbols displayed on the dashboard of the automobile). If the decompression unit 702 determines that there is an error in the determined decompressed data values, this means that an error occurred in the data transfer between the compression unit 302 and the decompression unit 702 (e.g., when the compressed data was transferred to or from memory). In this case, the decompression unit 702 may output an error signal to indicate that an error has occurred. The remainder of the system may respond to this error signal in any suitable manner, such as by discarding the decompressed block of image data and/or requesting that the block of image data be compressed and transmitted again.
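The check can be sketched as follows. The description does not state which CRC polynomial or width is used, so the use of the 32-bit CRC from Python's standard `zlib` module, the function name `check_block`, and the flat byte layout of the decompressed values are all assumptions made purely for illustration.

```python
import zlib


def check_block(decompressed_values, stored_crc):
    """Return True if the CRC of the decompressed values matches stored_crc.

    Assumes 8-bit values serialised as one byte each; the real layout and
    CRC parameters are not specified in the text.
    """
    computed = zlib.crc32(bytes(decompressed_values)) & 0xFFFFFFFF
    return computed == stored_crc


# The compressor would compute the CRC over the values it expects the
# decompressor to produce, and store it in the compressed data block.
expected = [17, 89, 241, 255]
stored_crc = zlib.crc32(bytes(expected)) & 0xFFFFFFFF

print(check_block(expected, stored_crc))            # prints True
print(check_block([17, 89, 240, 255], stored_crc))  # corrupted: prints False
```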

Exemplary decompression

In the example of a block of compressed image data given above, where the reference channel is the green channel, the values of four pixels in the block in the different channels are:

R = [17, 24, 9, 45]

G = [89, 100, 84, 119]

B = [240, 228, 215, 198]

A = [255, 255, 250, 255]

And the resulting compressed data block has the following four channels of data:

We now describe how all image element values of the compressed data block can be decompressed by the decompression unit 702. In step S902, the first numbers of bits are read from the compressed data block, and these first numbers of bits are 3, 6, 6, and 3 for the respective red, green, blue, and alpha channels. In step S904, the decompression unit 702 determines that if lossless compression were used, there would be 18 bits per image element value for the difference values, whereas in this example the target compression level (50%) only allows a maximum of 16 bits per image element value for the difference values. The same algorithm as in the compression technique described above is used, so the difference size determination logic 708 determines the difference sizes (i.e., the second numbers of bits) for the channels as R = 3 bits, G = 6 bits, B = 5 bits, and A = 2 bits.

In step S906, the unpacker logic 710 reads the representations of the difference values from the compressed data block according to the determined second numbers of bits. In step S910, the unpacker logic 710 adds LSBs to the representations of the difference values, where appropriate, to form difference values having the first number of bits for each of the channels. In this example, the added bits are determined by bit replication (i.e., by replicating one or more MSBs of the difference value), so the difference values are determined as follows (the final bit of each blue and alpha difference value is the bit added by the unpacker logic; no bits are added for the red and green channels):

Red difference values: [100, 000, 001, 010] (decimal: [4, 0, 1, 2])

Green difference values: [000101, 010000, 000000, 100011] (decimal: [5, 16, 0, 35])

Blue difference values: [101011, 011110, 010000, 000000] (decimal: [43, 30, 16, 0])

Alpha difference values: [101, 101, 000, 101] (decimal: [5, 5, 0, 5])

In step S908, the initial data value determination logic 706 reads the origin value from the compressed data block. In step S912, the initial data value determination logic 706 adds the difference value to the origin value to determine an initial data value of the decompressed image element value as:

Initial red values (signed two's complement format): 10110100 + [100, 000, 001, 010] = [10111000, 10110100, 10110101, 10110110] (or in decimal: -76 + [4, 0, 1, 2] = [-72, -76, -75, -74])

Initial green values (unsigned integer format): 01010100 + [000101, 010000, 000000, 100011] = [01011001, 01100100, 01010100, 01110111] (or in decimal: 84 + [5, 16, 0, 35] = [89, 100, 84, 119])

Initial blue values (unsigned integer format): 11000110 + [101011, 011110, 010000, 000000] = [11110001, 11100100, 11010110, 11000110] (or in decimal: 198 + [43, 30, 16, 0] = [241, 228, 214, 198])

Initial alpha values (unsigned integer format): 11111010 + [101, 101, 000, 101] = [11111111, 11111111, 11111010, 11111111] (or in decimal: 250 + [5, 5, 0, 5] = [255, 255, 250, 255])
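The additions in this worked example can be checked with a short sketch. The helper names `to_signed` and `to_unsigned` and the dictionary layout are hypothetical; the origin bit patterns and difference values are those given in the example, with the red origin interpreted as a signed two's complement number and the other origins as unsigned integers.

```python
def to_unsigned(bits):
    """Interpret a bit string as an unsigned integer."""
    return int(bits, 2)


def to_signed(bits):
    """Interpret a bit string as a signed two's complement integer."""
    value = int(bits, 2)
    return value - (1 << len(bits)) if bits[0] == "1" else value


# Origin values and expanded difference values from the worked example.
origins = {
    "red": to_signed("10110100"),      # decorrelated channel, so signed
    "green": to_unsigned("01010100"),
    "blue": to_unsigned("11000110"),
    "alpha": to_unsigned("11111010"),
}
diffs = {
    "red": ["100", "000", "001", "010"],
    "green": ["000101", "010000", "000000", "100011"],
    "blue": ["101011", "011110", "010000", "000000"],
    "alpha": ["101", "101", "000", "101"],
}

# Initial data value = origin value + difference value, per channel.
initial = {
    ch: [origins[ch] + to_unsigned(d) for d in diffs[ch]] for ch in origins
}

for ch, values in initial.items():
    print(ch, values)
# prints:
# red [-72, -76, -75, -74]
# green [89, 100, 84, 119]
# blue [241, 228, 214, 198]
# alpha [255, 255, 250, 255]
```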

The green, blue and alpha channels are compressed using the non-channel decorrelation mode, so in step S814, the channel decorrelation logic 712 determines that the decompressed data values for the green, blue and alpha channels are the initial data values for the green, blue and alpha channels, respectively. However, the red channel is compressed using the channel decorrelation mode, so in step S812, the channel decorrelation logic 712 determines that the decompressed data values for the red channel are the sum of the initial data values for the red and green channels. The decompressed data values for the red channel are thus determined as [-72, -76, -75, -74] + [89, 100, 84, 119] = [17, 24, 9, 45]. In this example, there is no underflow in the red channel, i.e. all values are ≥ 0. An underflow may occur when using the channel decorrelation mode, and if this is the case, any negative decompressed data value is clamped to zero.
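The re-correlation of the red channel against the green reference channel can be sketched as below. The function name `recorrelate` is hypothetical, and the clamp at the upper end of the 8-bit range is an assumption added for symmetry; the passage above only states that negative values are clamped to zero.

```python
def recorrelate(initial_non_ref, initial_ref, bits=8):
    """Sum non-reference and reference initial values, clamping the result."""
    max_value = (1 << bits) - 1
    return [
        # Negative results are clamped to zero; the upper clamp is an
        # illustrative assumption.
        max(0, min(max_value, n + r))
        for n, r in zip(initial_non_ref, initial_ref)
    ]


initial_red = [-72, -76, -75, -74]
initial_green = [89, 100, 84, 119]
print(recorrelate(initial_red, initial_green))  # prints [17, 24, 9, 45]
```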

By comparing these decompressed values with the original input values, it can be seen that in this example some small errors have been introduced into the blue channel, but not the green, red and alpha channels. The error in all channels is smaller than if the red channel were compressed using the non-channel decorrelation mode.

As described above, some examples are described in detail herein with reference to a block of pixel data including pixel values, but more generally, the compression and decompression processes may be performed for a block of image data including image element values, which may be pixel values, texel values, depth values, surface normal values, or illumination values, to name a few examples.

All the operations described above as being performed by the compression unit 302 and the decompression unit 702 can be implemented efficiently in hardware, for example, using shifters, adders, and comparators. The compression unit 302 and the decompression unit 702 do not perform complex operations such as division. Furthermore, the operations of the compression unit 302 and the decompression unit 702 do not require a large cache or other type of large local memory. For these reasons, the compression unit 302 and the decompression unit 702 can be implemented efficiently in hardware, which may result in less hardware (e.g., less silicon area), lower power consumption, and/or lower operating latency than more complex compression and decompression units. Furthermore, the same compression unit (e.g., compression unit 302) may perform both lossless and lossy compression, so two separate compression units are not required if both lossless and lossy compression are desired. Similarly, the same decompression unit (e.g., decompression unit 702) may perform both lossless and lossy decompression on the compressed data blocks, so two separate decompression units are not required if both lossless and lossy decompression are desired. When both lossless and lossy techniques are desired, having a single compression unit and a single decompression unit configured to operate in both a lossless and a lossy manner (as opposed to implementing a lossless compression unit, a separate lossy compression unit, a lossless decompression unit, and a separate lossy decompression unit) reduces the silicon area needed to implement the compression unit and the decompression unit.

In the above example, each image element value includes four 8-bit data values relating to the red, green, blue and alpha channels, respectively, such that each image element value is represented by 32 bits. In other examples, the data values may have different numbers of bits. As another example, each image element value may include three 10-bit data values associated with the red, green, and blue channels and a 2-bit data value associated with the alpha channel, respectively, such that each image element value is represented with 32 bits. In these examples, the green channel may be used as the reference channel, decorrelation may be performed for the red and blue channels (as non-reference channels), but decorrelation may not be performed for the alpha channel. As another example, each image element value includes four 10-bit data values associated with a red channel, a green channel, a blue channel, and an alpha channel, respectively, such that each image element value is represented with 40 bits.

Furthermore, in the above example, there is a single reference channel (e.g., the green channel). However, in other examples, there may be more than one reference channel. For example, there may be four channels (e.g., RGBA), where two of the channels (e.g., the green channel and the blue channel) are reference channels and two of the channels (e.g., the red channel and the alpha channel) are non-reference channels. For channel decorrelation (and channel re-correlation), a first non-reference channel (e.g., a red channel) of the non-reference channels may reference a first reference channel (e.g., a green channel) of the reference channels, and a second non-reference channel (e.g., an alpha channel) of the non-reference channels may reference a second reference channel (e.g., a blue channel) of the reference channels.

FIG. 10 illustrates a computer system in which the graphics processing system described herein may be implemented. The computer system includes a CPU 1002, a GPU 1004, a memory 1006, a Neural Network Accelerator (NNA) 1008, and other devices 1014, such as a display 1016, speakers 1018, and a camera 1022. Processing blocks 1010 (corresponding to processing logic 106 and compression unit 110 and decompression unit 112) are implemented on GPU 1004. In other examples, one or more of the depicted components may be omitted from the system and/or the processing block 1010 may be implemented on the CPU 1002 or within the NNA 1008. The components of the computer system may communicate with each other via a communication bus 1020. Storage 1012 (corresponding to memory 104) is implemented as part of memory 1006.

The compression unit and decompression unit of Figs. 1, 3 and 7 are shown as comprising several functional blocks. This is merely illustrative and is not intended to define a strict division between the different logic elements of such entities. Each of the functional blocks may be provided in any suitable manner. It should be understood that intermediate values described herein as being formed by the compression unit and/or the decompression unit need not be physically generated by the compression unit and/or the decompression unit at any point, and may merely represent logical values that conveniently describe the processing performed by the compression unit and/or the decompression unit between its inputs and outputs.

The compression and/or decompression units described herein may be embodied in hardware on an integrated circuit. The compression unit and/or decompression unit described herein may be configured to perform any of the methods described herein. In general, any of the functions, methods, techniques or components described above may be implemented in software, firmware, hardware (e.g., fixed logic circuitry) or any combination thereof. The terms "module," "functionality," "component," "element," "unit," "block," and "logic" may be used herein to generally represent software, firmware, hardware, or any combination thereof. In the case of a software implementation, the module, functionality, component, element, unit, block or logic represents program code that performs specified tasks when executed on a processor. The algorithms and methods described herein may be executed by one or more processors executing code that causes the processors to perform the algorithms/methods. Examples of a computer-readable storage medium include Random Access Memory (RAM), read-only memory (ROM), optical disks, flash memory, hard disk memory, and other memory devices that can store instructions or other data using magnetic, optical, and other techniques and that can be accessed by a machine.

The terms computer program code and computer readable instructions as used herein refer to any kind of executable code for a processor, including code expressed in a machine language, an interpreted language, or a scripting language. Executable code includes binary code, machine code, byte code, code defining an integrated circuit (e.g., a hardware description language or netlist), and code expressed in a programming language such as C, Java or OpenCL. The executable code may be, for example, any kind of software, firmware, script, module, or library that, when suitably executed, processed, interpreted, compiled, or run in a virtual machine or other software environment, causes the processor of the computer system on which the executable code is supported to perform the tasks specified by the code.

The processor, computer, or computer system may be any kind of device, machine, or special purpose circuit, or a set or portion thereof, that has processing capabilities such that it can execute instructions. The processor may be or include any kind of general purpose or special purpose processor, such as CPU, GPU, NNA, a system on a chip, a state machine, a media processor, an Application Specific Integrated Circuit (ASIC), a programmable logic array, a Field Programmable Gate Array (FPGA), or the like. The computer or computer system may include one or more processors.

The present invention is also intended to cover software defining a configuration of hardware as described herein, such as HDL (hardware description language) software, as used for designing integrated circuits or for configuring programmable chips to perform desired functions. That is, a computer readable storage medium may be provided having encoded thereon computer readable program code in the form of an integrated circuit definition data set that, when processed (i.e., run) in an integrated circuit manufacturing system, configures the system to manufacture a compression unit and/or decompression unit configured to perform any of the methods described herein, or to manufacture a compression unit and/or decompression unit comprising any of the apparatus described herein. The integrated circuit definition data set may be, for example, an integrated circuit description.

Accordingly, a method of manufacturing a compression unit and/or a decompression unit as described herein in an integrated circuit manufacturing system may be provided. Furthermore, an integrated circuit definition data set may be provided which, when processed in an integrated circuit manufacturing system, causes a method of manufacturing a compression unit and/or a decompression unit to be performed.

The integrated circuit definition data set may be in the form of computer code, for example, as a netlist, as code for configuring a programmable chip, or as a hardware description language defining hardware suitable for fabrication in an integrated circuit at any level, including as Register Transfer Level (RTL) code, as a high-level circuit representation (such as Verilog or VHDL), and as a low-level circuit representation (such as OASIS (RTM) and GDSII). A higher-level representation (e.g., RTL) that logically defines hardware suitable for fabrication in an integrated circuit may be processed at a computer system configured to generate a manufacturing definition of an integrated circuit in the context of a software environment that includes definitions of circuit elements and rules for combining these elements, in order to generate the manufacturing definition of the integrated circuit defined by the representation. As is typically the case with software executing at a computer system so as to define a machine, one or more intermediate user steps (e.g., providing commands, variables, etc.) may be required in order for a computer system configured to generate a manufacturing definition of an integrated circuit to execute the code defining the integrated circuit so as to generate the manufacturing definition of that integrated circuit.

An example of processing an integrated circuit definition data set at an integrated circuit manufacturing system to configure the system to manufacture a compression unit and/or a decompression unit will now be described with respect to fig. 11.

Fig. 11 illustrates an example of an Integrated Circuit (IC) manufacturing system 1102 configured to manufacture a compression unit and/or a decompression unit as described in any of the examples herein. Specifically, IC fabrication system 1102 includes layout processing system 1104 and integrated circuit generation system 1106. The IC fabrication system 1102 is configured to receive an IC definition data set (e.g., defining a compression unit and/or a decompression unit as described in any of the examples herein), process the IC definition data set, and generate an IC (e.g., embodying a compression unit and/or a decompression unit as described in any of the examples herein) from the IC definition data set. Processing of the IC definition data set configures IC fabrication system 1102 to fabricate an integrated circuit embodying the compression unit and/or decompression unit as described in any of the examples herein.

Layout processing system 1104 is configured to receive and process the IC definition data set to determine a circuit layout. Methods of determining a circuit layout from an IC definition dataset are known in the art and may involve, for example, synthesizing RTL codes to determine a gate level representation of a circuit to be generated, for example, in terms of logic components (e.g., NAND, NOR, AND, OR, MUX and FLIP-FLOP components). By determining the location information of the logic components, the circuit layout may be determined from the gate level representation of the circuit. This may be done automatically or with the participation of a user in order to optimize the circuit layout. When the layout processing system 1104 determines a circuit layout, it may output the circuit layout definition to the IC generation system 1106. The circuit layout definition may be, for example, a circuit layout description.

The IC generation system 1106 generates ICs according to circuit layout definitions, as is known in the art. For example, the IC generation system 1106 may implement a semiconductor device fabrication process that generates ICs, which may involve a multi-step sequence of photolithography and chemical processing steps during which electronic circuits are gradually created on wafers made of semiconductor material. The circuit layout definition may be in the form of a mask that may be used in a lithographic process to generate an IC according to the circuit definition. Alternatively, the circuit layout definition provided to the IC generation system 1106 may be in the form of computer readable code that the IC generation system 1106 can use to form a suitable mask for generating the IC.

The different processes performed by the IC fabrication system 1102 may all be implemented at one location, e.g., by one party. Alternatively, the IC fabrication system 1102 may be a distributed system such that some of the processes may be performed at different locations and by different parties. For example, some of the following stages may be performed at different locations and/or by different parties: (i) synthesizing RTL code representing the IC definition data set to form a gate level representation of the circuit to be generated; (ii) generating a circuit layout based on the gate level representation; (iii) forming a mask according to the circuit layout; and (iv) using the mask to fabricate the integrated circuit.

In other examples, processing of the integrated circuit definition data set at the integrated circuit manufacturing system may configure the system to manufacture the compression unit and/or the decompression unit without processing the integrated circuit definition data set to determine the circuit layout. For example, an integrated circuit definition dataset may define a configuration of a reconfigurable processor such as an FPGA, and processing of the dataset may configure the IC manufacturing system to generate (e.g., by loading configuration data into the FPGA) the reconfigurable processor having the defined configuration.

In some embodiments, the integrated circuit manufacturing definition data set, when processed in the integrated circuit manufacturing system, may cause the integrated circuit manufacturing system to generate an apparatus as described herein. For example, an apparatus as described herein may be manufactured by configuring an integrated circuit manufacturing system in the manner described above with reference to fig. 11 through an integrated circuit manufacturing definition dataset.

In some examples, the integrated circuit definition dataset may include software running on or in combination with hardware defined at the dataset. In the example shown in fig. 11, the IC generation system may also be further configured by the integrated circuit definition data set to load firmware onto the integrated circuit in accordance with the program code defined in the integrated circuit definition data set at the time of manufacturing the integrated circuit or to otherwise provide the integrated circuit with the program code for use with the integrated circuit.

Embodiments of the concepts set forth in the present application in apparatuses, devices, modules, and/or systems (and in methods implemented herein) may provide improved performance over known embodiments. Performance improvements may include one or more of increased computational performance, reduced latency, increased throughput, and/or reduced power consumption. During the manufacture of such devices, apparatuses, modules and systems (e.g., in integrated circuits), performance improvements can be traded off against the physical implementation, thereby improving the manufacturing method. For example, a performance improvement can be traded off against layout area, thereby matching the performance of a known implementation but using less silicon. This may be accomplished, for example, by reusing functional blocks in a serial fashion or by sharing functional blocks among elements of an apparatus, device, module, and/or system. Conversely, the concepts described herein that lead to improvements in the physical implementation of devices, apparatuses, modules and systems (e.g., reduced silicon area) can be traded off for improved performance. This may be accomplished, for example, by fabricating multiple instances of a module within a predefined area budget.

The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention.

Claims

1. A computer-implemented method of performing decompression to determine one or more image element values from compressed data, wherein the compressed data represents a block of image data comprising a plurality of image element values, each image element value comprising a plurality of data values associated with a respective plurality of channels, wherein the plurality of channels comprises at least one reference channel and a plurality of non-reference channels, the method comprising:

If the compressed mode for the non-reference channel is the channel decorrelation mode, determining the decompressed data values for the non-reference channel as a function of the determined initial data values relating to the non-reference channel and the determined initial data values relating to one of the at least one reference channel for the image element values.

2. The method of claim 1, wherein the compressed mode for a first non-reference channel of the plurality of non-reference channels is the channel decorrelation mode, and wherein the compressed mode for a second non-reference channel of the plurality of non-reference channels is the non-channel decorrelation mode.

3. A method as claimed in claim 1 or 2, wherein the function is a summation function.

4. A method as claimed in any preceding claim, wherein for each of the one or more image element values being decompressed, for each of the at least one reference channel, the determined initial data value relating to the reference channel for the image element value is a decompressed data value for the reference channel.

5. The method of any preceding claim, wherein the reading compressed channel data for each of the channels comprises:

For each of the channels, reading a representation of a difference value from the compressed data, wherein the difference value for the channel represents a difference for the channel between the data value and the origin value for the one or more image element values decompressed from the compressed data;

wherein said determining, for each of said channels, an initial data value related to said channel for each of said one or more image element values that are decompressed using said compressed channel data for said channel comprises:

For each of the channels, determining the initial data value related to the channel for each of the one or more image element values that are decompressed using: (i) The indication of the origin value for the channel read from the compressed data, and (ii) one of the representations of the difference value read from the compressed data.

6. The method of claim 5, wherein said determining, for each of the channels, the initial data value related to the channel for each of the one or more image element values that are decompressed comprises: summing the origin value for the channel and the determined difference value for the channel for the image element value.

7. The method of claim 5, wherein the determining, for each of the channels, the initial data value related to the channel for each of the one or more image element values that are decompressed comprises subtracting the determined difference value for the channel for the image element value from the origin value for the channel.
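The two alternatives of claims 6 and 7 can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `initial_values` and the `subtract` flag are hypothetical, and the difference values are taken to be already determined for one channel of the block.

```python
def initial_values(origin, differences, subtract=False):
    """Return the initial data values for one channel of a block.

    origin      -- origin value for the channel
    differences -- determined difference values, one per image element value
    subtract    -- hypothetical flag: False sums origin and difference
                   (claim 6), True subtracts the difference from the
                   origin (claim 7)
    """
    if subtract:
        # Claim 7: origin value minus the determined difference value.
        return [origin - d for d in differences]
    # Claim 6: origin value plus the determined difference value.
    return [origin + d for d in differences]
```

With an origin value of 10 and difference values 0, 3 and 7, the summing variant gives 10, 13 and 17, and the subtracting variant gives 10, 7 and 3.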

8. The method of any of claims 5 to 7, wherein for each of the non-reference channels, the indication of the origin value for the non-reference channel is read from the compressed data as a number representing: (i) the origin value for the non-reference channel, or (ii) a difference between the origin value for the non-reference channel and the origin value for one of the at least one reference channel of the block.

9. The method of any of claims 5 to 8, wherein the reading compressed channel data for each of the channels further comprises:

wherein said determining the initial data value relating to the channel for each of the one or more image element values that are decompressed comprises:

Determining, for each of the channels and for each of the one or more image element values that are decompressed, a difference value in accordance with the first number of bits for the channel, based on the representation of the difference value read from the compressed data.

10. The method of claim 9, wherein the determined difference value for a channel for an image element value has the first number of bits for the channel.

11. The method of claim 9 or 10, wherein for each of the channels, each of the representations of the differences for the channel has the first number of bits for the channel.

12. The method of claim 9 or 10, wherein said reading a representation of a difference from said compressed data comprises:

Obtaining a second number of bits for each of the channels, wherein each of the representations of the differences for each of the channels has the second number of bits for the channel; and

The representation of the difference value for the one or more image element values decompressed from the compressed data is read using the obtained second number of bits for the respective channel.

13. The method of claim 12, wherein the determining a difference value from the first number of bits for each of the channels and for each of the one or more image element values that are decompressed comprises adding zero, one, or more least significant bits to the representation of the difference value read from the compressed data, thereby determining the difference value having the first number of bits for each of the channels.

14. The method of claim 13, wherein the zero, one, or more least significant bits added to the representation of the difference value read from the compressed data are determined by bit replication of a corresponding zero, one, or more most significant bits of the representation of the difference value read from the compressed data.
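The bit replication of claims 13 and 14 can be sketched as follows. This is a minimal sketch, not the claimed implementation: the function name `expand_difference` is hypothetical, the representation is treated as an unsigned bit pattern, and two behaviours are assumptions not recited in the claims: repeating the pattern when more least significant bits are needed than the representation holds, and returning zero when the second number of bits is zero.

```python
def expand_difference(rep, second_bits, first_bits):
    """Expand a second_bits-wide representation to first_bits bits.

    The added least significant bits are copies of the most significant
    bits of the representation read from the compressed data.
    """
    if not 0 <= second_bits <= first_bits:
        raise ValueError("second_bits must lie in [0, first_bits]")
    if second_bits == 0:
        # Nothing was read for this channel; zero is an assumption.
        return 0
    out = 0
    remaining = first_bits
    while remaining > 0:
        if remaining >= second_bits:
            # Append a whole copy of the representation.
            out = (out << second_bits) | rep
            remaining -= second_bits
        else:
            # Append only the top `remaining` bits of the representation.
            out = (out << remaining) | (rep >> (second_bits - remaining))
            remaining = 0
    return out
```

For example, the 3-bit representation `0b101` expanded to 5 bits becomes `0b10110`: the original three bits followed by a copy of its two most significant bits. When the second number of bits equals the first number of bits, zero bits are added and the representation is returned unchanged.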

15. The method of any of claims 12-14, wherein the obtaining a second number of bits for each of the one or more channels comprises determining the second number of bits for the channel according to a predetermined scheme using the first number of bits for each of the one or more channels.

16. The method of any of claims 12 to 14, wherein the obtaining a second number of bits for each of the one or more channels comprises reading an indication of the second number of bits for the channel from the compressed data.

17. The method of any of claims 5 to 16, wherein the compressed data is in a compressed data block, the compressed data block comprising:

A head portion having a fixed size and comprising: (i) the indication of the origin value for each of the channels, and (ii) the indication of the compressed mode for each of the non-reference channels; and

A body portion having a variable size and comprising the representation of the difference value for each of the channels.

18. A method as claimed in claim 17 when dependent on any of claims 9 to 16, wherein the head portion further comprises the indication of the first number of bits for each of the channels.
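The block layout of claims 17 and 18 can be pictured with the following sketch. All type and field names (`ChannelHeader`, `CompressedBlock`, `body_size_bits`) are hypothetical, the field widths are not taken from the claims, and the per-channel second number of bits is passed in separately because the claims allow it either to be derived by a predetermined scheme or to be read from the compressed data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChannelHeader:
    origin: int           # indication of the origin value for the channel
    first_num_bits: int   # indication of the first number of bits (claim 18)
    decorrelated: bool    # compressed mode; meaningful for non-reference channels only


@dataclass
class CompressedBlock:
    head: List[ChannelHeader]   # fixed-size head portion, one entry per channel
    body: List[List[int]]       # variable-size body: difference representations per channel


def body_size_bits(block: CompressedBlock, second_num_bits: List[int]) -> int:
    """Size of the body portion in bits, given the per-channel second number of bits."""
    return sum(len(reps) * n for reps, n in zip(block.body, second_num_bits))
```

The sketch shows why the body portion is variable in size while the head portion is not: the head holds a fixed set of fields per channel, whereas the body grows with the number of bits used for each difference representation.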

19. A decompression unit configured to perform decompression to determine one or more image element values from compressed data, wherein the compressed data represents a block of image data comprising a plurality of image element values, each image element value comprising a plurality of data values associated with a respective plurality of channels, wherein the plurality of channels comprises at least one reference channel and a plurality of non-reference channels, the decompression unit comprising:

Decompression logic configured to:

If the compressed mode for the non-reference channel is the channel decorrelation mode, determine the decompressed data values for the non-reference channel as a function of the determined initial data values relating to the non-reference channel and the determined initial data values relating to one of the at least one reference channel for the image element values.

20. A computer readable storage medium having stored thereon computer readable code configured such that when the code is run, the method of any of claims 1 to 18 is performed.

21. A computer readable storage medium having stored thereon an integrated circuit definition dataset that, when processed in an integrated circuit manufacturing system, configures the integrated circuit manufacturing system to manufacture a decompression unit according to claim 19.