CN117156148A - Compression and decompression of sub-primitive presence indication for use in a rendering system - Google Patents

Compression and decompression of sub-primitive presence indication for use in a rendering system

Publication number
CN117156148A
Authority
CN
China
Prior art keywords
values
entropy
value
entropy decoded
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310605511.5A
Other languages
Chinese (zh)
Inventor
S·芬尼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Imagination Technologies Ltd
Original Assignee
Imagination Technologies Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Imagination Technologies Ltd

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/06 Ray-tracing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/004 Predictors, e.g. intraframe, interframe coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/20 Contour coding, e.g. using detection of edges
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/91 Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00 Indexing scheme for image generation or computer graphics
    • G06T2210/21 Collision detection, intersection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Graphics (AREA)
  • Signal Processing (AREA)
  • Image Generation (AREA)
  • Investigating Strength Of Materials By Application Of Mechanical Stress (AREA)

Abstract

The present application relates to the compression and decompression of sub-primitive presence indications for use in a rendering system. A method and a decompression unit are provided for decompressing compressed data to determine one or more sub-primitive presence indications. A block of compressed data is received. Entropy encoded data is read from the block of compressed data. Entropy decoding is performed on the entropy encoded data to determine a block of entropy decoded values. Spatial re-correlation is performed on the block of entropy decoded values to determine one or more sub-primitive presence indications. For each of a plurality of rows of entropy decoded values in a first dimension, the spatial re-correlation comprises, for one or more entropy decoded values in the row: determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and replacing the entropy decoded value with the sum of the entropy decoded value and the determined predicted value. For each of a plurality of rows of entropy decoded values in a second dimension, the spatial re-correlation further comprises, for one or more entropy decoded values in the row: determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and replacing the entropy decoded value with the sum of the entropy decoded value and the determined predicted value.

Description

Compression and decompression of sub-primitive presence indication for use in a rendering system
Cross reference to related applications
The present application claims priority from UK patent applications 2207939.6 and 2207936.2, filed on 30 May 2022, which are incorporated herein by reference in their entirety.
Technical Field
The present disclosure relates to techniques for compressing and/or decompressing sub-primitive presence indications for use in a rendering system, such as a ray tracing system.
Background
The rendering system may be used to generate an image of the scene. Two common rendering techniques are ray tracing and rasterization. In particular, ray tracing is a computational rendering technique for generating an image of a scene (e.g., a 3D scene) by tracing the path of light ("rays") through the scene, typically from the perspective of a camera. Each ray is modeled as originating from a camera and entering the scene through a pixel. As a ray traverses a scene, it may intersect objects within the scene. The intersection between a ray and its intersecting object can be modeled to create a realistic visual effect. For example, in response to determining that a ray intersects an object, a shader program (i.e., a portion of computer code) may be executed for the intersection. A programmer may write a shader program to define how the system reacts to an intersection (which may, for example, result in one or more secondary rays being emitted into the scene), for example, to represent reflection of a ray from an intersecting object or refraction of a ray through an object (e.g., if the object is transparent or translucent). As another example, the shader program may cause one or more rays to be emitted into the scene for determining whether an object is in a shadow at an intersection. The result of executing the shader program (and processing the associated secondary ray) may be to calculate the color value of the pixel through which the ray passed.
Rendering an image of a scene using ray tracing may involve performing many intersection tests, e.g., billions of intersection tests for a single image. To reduce the number of intersection tests that need to be performed, the ray tracing system may generate an acceleration structure, where each node of the acceleration structure represents a region within the scene. The acceleration structure is typically hierarchical (e.g., has a tree structure) such that it contains multiple levels of nodes, where nodes near the top of the acceleration structure represent relatively large regions in the scene (e.g., the root node may represent the entire scene), and nodes near the bottom of the acceleration structure represent relatively small regions in the scene. The leaf nodes of the acceleration structure represent regions in the scene that bound at least one primitive or part of a primitive, and they hold pointers to the bounded primitives.
An acceleration structure may be used to perform intersection testing of a ray (e.g., in a recursive manner) by first testing the ray for intersection with the root node of the acceleration structure. If a ray is found to intersect a parent node (e.g., the root node), then testing may proceed to the child nodes of that parent node. In contrast, if a ray is found not to intersect a parent node, intersection testing of the child nodes of that parent node may be avoided, thereby saving computational effort. If a ray is found to intersect a leaf node, the ray may be tested against the objects within the region represented by the leaf node to determine which object(s) the ray intersects. "Primitives" may be used to represent objects. Primitives represent geometric units in the system and may be, for example, convex polygons. Primitives are typically triangles, but they may also be other shapes, such as rectangles (the term "rectangle" is used herein to include "square"), pentagons, hexagons, or non-planar shapes such as spheres or bicubic patches, or shapes having curved edges.
Primitives are typically simple geometric shapes, which simplifies the intersection tests that determine whether a ray intersects a primitive. However, primitives may be used to represent more complex shapes. For example, a texture (e.g., a 2D image or a 3D volume) may be applied to the primitive, where the texture may have alpha values that determine the opacity at different locations on the primitive. For example, a maximum sampled alpha value (e.g., 255 for an 8-bit alpha value) means that the primitive is completely opaque at the sampling location, and a minimum sampled alpha value (e.g., 0) means that the primitive is completely transparent at the sampling location. Values between the minimum and maximum alpha values may represent partial opacity. For the purposes of intersection testing in a ray tracing system, if a ray intersects a primitive at a location where the primitive is completely transparent (i.e., where the alpha value is zero), then the intersection is not accepted, i.e., the ray passes straight through the primitive. In this way, setting the alpha value to zero may be used to represent a hole in a primitive, i.e., a location on the primitive that is considered "absent" for the purposes of intersection testing. For intermediate alpha values, the system may choose to compute a weighted sum of the objects behind the primitive and the shaded surface itself, or to apply a threshold, which is commonly referred to in the art as an alpha test. Textures that include absent regions may be referred to as "punch-through textures," "alpha-tested textures," or "masked textures," and primitives to which these textures are applied may be referred to as "punch-through primitives," "alpha-tested primitives," or "masked primitives." Punch-through primitives may be used to represent geometry with a complex outline or a large number of holes, such as foliage and chain-link fences, using a small number of primitives.
Note that the "texture" is not necessarily an actual image; it may be calculated "on the fly". Such calculations may be performed by executing a "shader" program. Thus, "checking the texture" may also be understood to include these calculation methods.
FIG. 1 shows an example of two triangle primitives 102₁ and 102₂ that share an edge to form a quadrilateral. A texture representing a leaf is applied to both primitives. The texture has some regions (e.g., 104) that are completely transparent, so that they are absent for the purposes of intersection testing. The texture also has some regions (e.g., 106) that are completely opaque, so that they are present for the purposes of intersection testing. Finally, there may be a small number of regions (e.g., along the boundary between regions 104 and 106) that are partially transparent, which may be handled using, for example, either of the two approaches mentioned above for intermediate alpha values. Different ray tracing systems may react differently to finding an intersection of a ray with a partially transparent region, e.g., the intersection may be treated as a hit, a miss, or a partial hit. One or more additional rays may be generated as a result of a partial hit.
When the intersection testing process finds that a ray intersects a punch-through primitive, the intersection testing process for the ray may be stalled while a shader program is executed on a programmable execution unit to determine whether the primitive is present at the point where the ray intersects it. The presence of a primitive at an intersection point is typically determined by the alpha channel of the texture mapped onto the primitive. The hand-off between the intersection testing process (which may be implemented in fixed-function hardware) and the shader program (which is executed on the programmable execution unit) introduces latency into the ray tracing system. For example, fixed-function hardware implementing the intersection testing process may stall for thousands of clock cycles while a shader program is executed on a programmable execution unit to determine the presence of a primitive at an intersection point. Thus, reducing the number of times a shader program needs to be executed to determine the presence of a punch-through primitive at an intersection point would significantly improve the performance of the ray tracing system. It would be particularly beneficial to achieve this reduction without increasing the number of primitives used to represent the geometry, because increasing the number of primitives would increase processing costs in the ray tracing system, such as the costs of rendering, of modeling, and of updating the acceleration structure.
A paper entitled "Sub-triangle opacity masks for faster ray tracing of transparent objects" by Holger Gruen, Carsten Benthin, and Sven Woop (Proceedings of the ACM on Computer Graphics and Interactive Techniques, vol. 3, issue 2, article 18) proposes using sub-triangle opacity masks to accelerate ray tracing of alpha-tested transparent primitives. Each triangle primitive is subdivided into a set of uniformly sized sub-primitives. For example, FIG. 2 shows a triangle primitive 202 that is subdivided into 64 uniformly sized sub-primitives, labeled 0 through 63. The barycentric coordinates of the three vertices of triangle primitive 202 are labeled b=(0,0,1), b=(0,1,0), and b=(1,0,0). Any position within triangle primitive 202 may be uniquely identified by its barycentric coordinates, which indicate which of the sub-primitives (0 to 63) contains the position. For each sub-primitive (0 to 63), an evaluation is made in a preprocessing step to determine a sub-primitive presence indication, which indicates whether the sub-primitive is: (i) completely present, (ii) completely absent, or (iii) partially present. If a sub-primitive is partially present, then the texture needs to be checked, e.g., by executing a shader program, to determine whether a particular point within the sub-primitive is present or absent. The preprocessing step may be performed by an Application Programming Interface (API), or by a user as part of a process such as creating the primitives and textures. Each sub-primitive presence indication is represented with 2 bits to indicate one of the three presence states. The "partially present" state may be referred to as a "check texture" state because presence at a location within a partially present sub-primitive is determined by checking the texture, i.e., by executing a shader program.
When an intersection is found between a ray and a primitive, the presence indications may be queried to determine whether to accept the intersection. The location of the intersection within the primitive, for example as indicated by barycentric coordinates, is used to identify the sub-primitive in which the intersection lies. If the presence indication of the identified sub-primitive indicates that the sub-primitive is completely present or completely absent, the intersection testing process may proceed without executing a shader program to determine the presence of the primitive at the intersection point. However, if the presence indication of the identified sub-primitive indicates that the sub-primitive is partially present, then the texture is checked by executing a shader program to determine the presence of the primitive at the intersection point.
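The query flow described above can be sketched as follows. This is only an illustrative sketch: the names `Presence`, `accept_intersection`, and `run_alpha_shader` are hypothetical, the numeric values assigned to the three states are arbitrary here, and the mapping from barycentric coordinates to a sub-primitive index is assumed to have been done already.

```python
from enum import IntEnum
from typing import Callable, Sequence


class Presence(IntEnum):
    # Illustrative numbering only; the three states are what matters.
    PARTIAL = 0   # "check texture": the shader must decide
    PRESENT = 1   # completely present: accept without a shader
    ABSENT = 2    # completely absent: reject without a shader


def accept_intersection(
    indications: Sequence[int],
    sub_primitive_index: int,
    run_alpha_shader: Callable[[], bool],
) -> bool:
    """Decide whether to accept a hit, running the shader only if needed."""
    state = Presence(indications[sub_primitive_index])
    if state is Presence.PRESENT:
        return True
    if state is Presence.ABSENT:
        return False
    # Partially present: fall back to the expensive alpha test.
    return run_alpha_shader()
```

Only hits that land in a partially present sub-primitive reach `run_alpha_shader`, which is where the latency saving comes from.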
The use of presence indications reduces the number of times a shader program needs to be executed to check the texture in order to determine whether a primitive is present at an intersection point, and hence whether to accept the intersection. In other words, the presence indications identify the regions of a primitive that are completely absent or completely present, so that the more expensive alpha test operation can be skipped where possible. Alpha testing (i.e., running a shader program to check the alpha value of the texture at the intersection point) is an expensive operation in terms of latency and power consumption.
If a primitive is subdivided into K sub-primitives, then 2K bits are used for the presence indication of that primitive and these bits will be included with the remaining primitive data for the primitive during the intersection test. In the example shown in fig. 2, K is 64 such that 128 bits are used for the presence indication of primitive 202. This is a significant increase in the amount of primitive data used to describe the primitives.
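To make the storage cost concrete, the following sketch packs K two-bit presence indications into a single integer of 2K bits and reads one back. The function names and the least-significant-first bit layout are assumptions chosen for illustration; the source does not specify a layout.

```python
def pack_indications(states):
    """Pack 2-bit presence indications into one integer, 2 bits per entry."""
    word = 0
    for i, state in enumerate(states):
        if not 0 <= state <= 3:
            raise ValueError("each indication must fit in 2 bits")
        word |= state << (2 * i)   # assumed layout: entry i at bits 2i..2i+1
    return word


def unpack_indication(word, i):
    """Read back the i-th 2-bit indication."""
    return (word >> (2 * i)) & 0b11


K = 64                                  # sub-primitives per primitive, as in FIG. 2
states = [i % 3 for i in range(K)]      # arbitrary example data
packed = pack_indications(states)
uncompressed_bits = 2 * K               # 128 bits per primitive
```

With K = 64 this is 16 bytes of presence data per primitive before any compression.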
Furthermore, UK patents GB2538856B and GB2522868B describe a rasterization rendering technique in which an opacity state map is used to indicate whether a block of texels of a texture is completely opaque, completely transparent, partially transparent, or a mixture of these states. The indications in the opacity state map may be used to accelerate the processing of punch-through primitives in a rasterization system. Similar to the presence indications described above with reference to a ray tracing system, each opacity state in the rasterization systems of GB2538856B and GB2522868B is represented by two bits.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
There is provided a method of decompressing compressed data to determine one or more sub-primitive presence indications for use in a rendering system (e.g. for use in intersection testing in a rendering system), the method comprising:
receiving a block of compressed data for a block of sub-primitive presence indications;
reading entropy encoded data from the compressed data block;
performing entropy decoding on the entropy encoded data to determine a block of entropy decoded data values; and
performing spatial re-correlation on the block of entropy decoded values to determine one or more of the sub-primitive presence indications in the sub-primitive presence indication block, the performing spatial re-correlation comprising:
(a) For each of a plurality of rows of entropy decoded values in a first dimension within a block of entropy decoded values:
for one or more of the entropy decoded values in the row: (i) Determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and (ii) replacing the entropy decoded value with a value of a sum of the entropy decoded value and the determined predicted value of the entropy decoded value; and
(b) For each of a plurality of rows of entropy decoded values in a second dimension within the block of entropy decoded values:
for one or more of the entropy decoded values in the row: (i) Determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and (ii) replacing the entropy decoded value with a value of a sum of the entropy decoded value and the determined predicted value of the entropy decoded value.
The one or more entropy decoded values in a row in the first dimension for which predicted values are determined may be alternate entropy decoded values in that row, and the one or more entropy decoded values in a row in the second dimension for which predicted values are determined may be alternate entropy decoded values in that row.
The one or more other entropy decoded values in the row on which the determination of the predicted value is based may be adjacent to the entropy decoded value for which the predicted value is determined.
The plurality of rows of entropy decoded values in the first dimension may comprise all rows of entropy decoded values in the first dimension within the block of entropy decoded values, and wherein the plurality of rows of entropy decoded values in the second dimension may comprise all rows of entropy decoded values in the second dimension within the block of entropy decoded values.
Each of the presence indications may indicate a presence state that is one of: (i) complete presence, (ii) complete absence, and (iii) partial presence.
The partially present presence state may be represented by a value of zero in the block of sub-primitive presence indications.
In the sub-primitive presence indication block, a completely present presence state may be represented by a value of one and a completely absent presence state may be represented by a value of two. Alternatively, in the sub-primitive presence indication block, the presence state of complete presence may be represented by a value of two, and the presence state of complete absence may be represented by a value of one.
Replacing the entropy decoded value with the sum of the entropy decoded value and its determined predicted value may comprise performing the sum using a modulo-3 operation.
Determining the predicted value of the entropy decoded value based on one or more other entropy decoded values in the row may be performed using a prediction function, Predict, that operates according to the following equation:
and replacing the entropy decoded value with the sum of the entropy decoded value and its determined predicted value may be performed according to:
$d_{2i+1} := \left(d_{2i+1} + \mathrm{Predict}(d_{2i}, d_{2i+2})\right) \bmod 3,$
where $d_{2i}$ is the $(2i)$th entropy decoded value in the row, $d_{2i+1}$ is the $(2i+1)$th entropy decoded value in the row, and $d_{2i+2}$ is the $(2i+2)$th entropy decoded value in the row.
If $d_{2i+1}$ is the last entropy decoded value in the row, such that $d_{2i+2}$ is not present in the block of entropy decoded values, the value of $d_{2i}$ may be used in place of $d_{2i+2}$ in the prediction function.
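The per-row update can be sketched as below. The prediction function's equation is not reproduced in this text, so `predict` here is purely a placeholder assumption (it returns the shared neighbor value when both neighbors agree, and 0 otherwise); it should not be read as the prediction function of the disclosure. The name `recorrelate_line` is likewise hypothetical.

```python
def predict(left, right):
    # ASSUMPTION: placeholder predictor, not taken from the source.
    # Returns the neighbors' common value if they agree, else 0.
    return left if left == right else 0


def recorrelate_line(d):
    """Apply d[2i+1] := (d[2i+1] + Predict(d[2i], d[2i+2])) mod 3 along one row."""
    out = list(d)
    for odd in range(1, len(out), 2):
        left = out[odd - 1]
        # If d[2i+2] falls outside the row, reuse d[2i] in its place.
        right = out[odd + 1] if odd + 1 < len(out) else left
        out[odd] = (out[odd] + predict(left, right)) % 3
    return out


residuals = [1, 0, 1, 0, 2, 1, 2, 0]
restored = recorrelate_line(residuals)   # [1, 1, 1, 0, 2, 0, 2, 2]
```

Only the odd-indexed (alternate) values are modified, and each is predicted from even-indexed neighbors that the pass leaves untouched, which is what makes the step exactly reversible.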
The rows of entropy decoded values in the first dimension may be horizontal rows of the block of entropy decoded values, and the rows of entropy decoded values in the second dimension may be columns of the block. Alternatively, the rows of entropy decoded values in the first dimension may be columns of the block of entropy decoded values, and the rows of entropy decoded values in the second dimension may be horizontal rows of the block.
Prior to (a) and (b), the performing spatial re-correlation may further comprise:
(c) For each of a plurality of rows of entropy decoded values in a first dimension within a sub-block of entropy decoded values within the block of entropy decoded values:
for one or more of the entropy decoded values in the row: (i) Determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and (ii) replacing the entropy decoded value with a value of a sum of the entropy decoded value and the determined predicted value of the entropy decoded value; and
(d) For each row of the plurality of rows of entropy decoded values in the second dimension within the sub-block of entropy decoded values:
for one or more of the entropy decoded values in the row: (i) Determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and (ii) replacing the entropy decoded value with a value of a sum of the entropy decoded value and the determined predicted value of the entropy decoded value; and
(e) interleaving the entropy decoded values from the sub-block with entropy decoded values in the block of entropy decoded values that have not yet been spatially re-correlated.
Performing entropy decoding on the entropy encoded data to determine a block of entropy decoded data values may comprise:
for each of a plurality of subsets of encoded data values, reading from the entropy encoded data an indication identifying a number of bits for each of the encoded data values in the subset; and
parsing the encoded data values in the entropy encoded data based on the identified number of bits, thereby interpreting the encoded data values.
The entropy decoded data values may be determined by selectively prepending leading zeros to the interpreted encoded data values such that each of the entropy decoded data values has the same number of bits as each of the presence indications in the block of sub-primitive presence indications.
Each of the subsets of encoded data values may be a 2 x 2 subset of encoded data values.
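One way this variable-width scheme could look is sketched below. The bitstream layout is an assumption made for illustration: each 2 x 2 subset is preceded by a 2-bit field giving the number of bits per value (0, 1 or 2), the subsets are visited in raster order, and the bitstream is passed as a string of '0' and '1' characters for readability. `BitReader` and `entropy_decode` are hypothetical names.

```python
class BitReader:
    """Reads fixed-width unsigned fields from a string of '0'/'1' characters."""

    def __init__(self, bits):
        self.bits = bits
        self.pos = 0

    def read(self, n):
        if n == 0:
            return 0                      # zero-width field decodes to 0
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) != n:
            raise ValueError("bitstream exhausted")
        self.pos += n
        return int(chunk, 2)


def entropy_decode(bits, height, width):
    """Decode a height x width block (both even) of values in the range 0..3."""
    reader = BitReader(bits)
    block = [[0] * width for _ in range(height)]
    for y in range(0, height, 2):
        for x in range(0, width, 2):
            n = reader.read(2)            # ASSUMPTION: 2-bit "bits per value" field
            if n > 2:
                raise ValueError("invalid bit-count indication")
            for dy in range(2):
                for dx in range(2):
                    # Narrow values are implicitly zero-extended to 2 bits.
                    block[y + dy][x + dx] = reader.read(n)
    return block
```

For example, `entropy_decode("00" + "10" + "01001001", 2, 4)` spends 12 bits on eight values: the first subset is all zeros and costs only its 2-bit header, while the second stores four full 2-bit values.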
Performing entropy decoding on the entropy encoded data to determine a block of entropy decoded data values may comprise performing Huffman decoding.
Performing entropy decoding on the entropy encoded data to determine a block of entropy decoded data values may comprise performing run-length decoding.
Run-length decoding may be performed according to raster scan order or Morton order.
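The difference between the two scan orders can be illustrated with a small sketch. The run format used here, a list of (value, run length) pairs, is an assumption for illustration, as are the function names; the Morton variant assumes a square block whose side is a power of two.

```python
def morton_to_xy(index):
    """De-interleave a Morton (Z-order) index: even bits give x, odd bits give y."""
    x = y = 0
    bit = 0
    while index:
        x |= (index & 1) << bit
        index >>= 1
        y |= (index & 1) << bit
        index >>= 1
        bit += 1
    return x, y


def run_length_decode(runs, size, order="raster"):
    """Expand (value, run_length) pairs into a size x size block."""
    flat = [value for value, length in runs for _ in range(length)]
    if len(flat) != size * size:
        raise ValueError("runs do not cover the block exactly")
    block = [[0] * size for _ in range(size)]
    for i, value in enumerate(flat):
        if order == "morton":
            x, y = morton_to_xy(i)
        else:
            y, x = divmod(i, size)        # raster: row by row, left to right
        block[y][x] = value
    return block


runs = [(1, 4), (0, 8), (2, 4)]
raster_block = run_length_decode(runs, 4, "raster")
morton_block = run_length_decode(runs, 4, "morton")
```

With the same runs, raster order fills whole rows, whereas Morton order fills 2 x 2 quadrants in turn, which tends to keep spatially close values in the same run.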
The method may further comprise:
receiving an indication of a sample location within a sub-primitive presence indication block for which a presence indication is to be determined; and
a presence indication of the sample location is determined using one or more of the determined sub-primitive presence indications in the sub-primitive presence indication block.
The rendering system may be a ray tracing system and the method may further include determining a presence of the primitive at an intersection with the ray using the determined presence indication at the sample location as part of performing an intersection test on the ray in the ray tracing system.
The rendering system may be a ray tracing system or a rasterization system.
There is provided a decompression unit configured to decompress compressed data to determine one or more sub-primitive presence indications for use in a rendering system (e.g. for use in intersection testing in a rendering system), the decompression unit being configured to receive a block of compressed data for a block of sub-primitive presence indications, the decompression unit comprising:
an entropy decoding module configured to perform entropy decoding on entropy encoded data that has been read from the compressed data blocks to determine blocks of entropy decoded data values; and
a spatial re-correlation module configured to perform spatial re-correlation on the block of entropy decoded values to determine one or more of the sub-primitive presence indications in the sub-primitive presence indication block, the performing spatial re-correlation comprising:
(a) For each of a plurality of rows of entropy decoded values in a first dimension within a block of entropy decoded values:
for one or more of the entropy decoded values in the row: (i) Determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and (ii) replacing the entropy decoded value with a value of a sum of the entropy decoded value and the determined predicted value of the entropy decoded value; and
(b) For each of a plurality of rows of entropy decoded values in a second dimension within the block of entropy decoded values:
for one or more of the entropy decoded values in the row: (i) Determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and (ii) replacing the entropy decoded value with a value of a sum of the entropy decoded value and the determined predicted value of the entropy decoded value.
There is provided a decompression unit configured to perform any one of the decompression methods described herein.
A method of compressing a block of sub-primitive presence indications for use in a rendering system (e.g., for use in intersection testing in a rendering system) into a compressed data block may be provided, the method comprising:
performing spatial decorrelation on the sub-primitive presence indications in the sub-primitive presence indication block to determine spatially decorrelated presence indications, the performing spatial decorrelation comprising:
(a) For each of a plurality of rows of presence indications in a first dimension within the presence indication block:
for one or more of the presence indications in the row: (i) Determining a predicted value of the presence indication based on one or more other presence indications in the row, and (ii) replacing the presence indication with a difference between the presence indication and the determined predicted value of the presence indication; and
(b) For each of a plurality of rows of presence indications in a second dimension within the presence indication block:
for one or more of the presence indications in the row: (i) Determining a predicted value of the presence indication based on one or more other presence indications in the row, and (ii) replacing the presence indication with a difference between the presence indication and the determined predicted value of the presence indication;
performing entropy encoding on the spatially decorrelated presence indications to determine entropy encoded data; and
storing the entropy encoded data in the compressed data block.
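The spatial decorrelation stage and its inverse can be sketched together to show that the round trip is lossless. Several things here are assumptions rather than details from the source: the `predict` function is a placeholder, differences are taken modulo 3 so that values stay in the range 0 to 2, alternate (odd-indexed) values are the ones replaced, and the encoder processes horizontal rows first and then columns. Under those assumptions the decoder must undo the column pass first and the row pass second.

```python
def predict(a, b):
    # ASSUMPTION: placeholder predictor, not taken from the source.
    return a if a == b else 0


def _lift_line(line, sign):
    """sign=-1 subtracts predictions (encode); sign=+1 adds them back (decode)."""
    out = list(line)
    for odd in range(1, len(out), 2):
        left = out[odd - 1]
        right = out[odd + 1] if odd + 1 < len(out) else left
        out[odd] = (out[odd] + sign * predict(left, right)) % 3
    return out


def _apply_rows(block, sign):
    return [_lift_line(row, sign) for row in block]


def _apply_cols(block, sign):
    cols = [_lift_line(col, sign) for col in zip(*block)]
    return [list(row) for row in zip(*cols)]


def decorrelate(block):
    """Encoder side: horizontal rows first, then columns."""
    return _apply_cols(_apply_rows(block, -1), -1)


def recorrelate(block):
    """Decoder side: undo the passes in reverse order."""
    return _apply_rows(_apply_cols(block, +1), +1)
```

For a 4 x 4 block in which every indication is 1, `decorrelate` leaves only four non-zero values, so the later entropy encoding stage has many zeros to exploit; `recorrelate(decorrelate(block))` returns the original block.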
A compression unit may be provided that is configured to compress blocks of sub-primitive presence indications for use in a rendering system (e.g. for use in intersection testing in a rendering system) into compressed data blocks, the compression unit comprising:
a spatial decorrelation module configured to perform spatial decorrelation on the sub-primitive presence indications in the sub-primitive presence indication block to determine spatially decorrelated presence indications, the spatial decorrelation comprising:
(a) For each of the plurality of rows of presence indications in the first dimension within the presence indication block:
for one or more of the presence indications in the row: (i) Determining a predicted value of the presence indication based on one or more other presence indications in the row, and (ii) replacing the presence indication with a difference between the presence indication and the determined predicted value of the presence indication; and
(b) For each of the plurality of rows of presence indications in the second dimension within the presence indication block:
for one or more of the presence indications in the row: (i) Determining a predicted value of the presence indication based on one or more other presence indications in the row, and (ii) replacing the presence indication with a difference between the presence indication and the determined predicted value of the presence indication; and
an entropy encoding module configured to perform entropy encoding on the spatially decorrelated presence indication to determine entropy encoded data, and store the entropy encoded data in the compressed data block.
The compression unit and/or decompression unit may be embodied in hardware on an integrated circuit. A method of manufacturing a compression unit or a decompression unit in an integrated circuit manufacturing system may be provided. An integrated circuit definition data set may be provided that, when processed in an integrated circuit manufacturing system, configures the system to manufacture either a compression unit or a decompression unit. A non-transitory computer readable storage medium may be provided having stored thereon a computer readable description of a compression unit or a decompression unit, which when processed in an integrated circuit manufacturing system, causes the integrated circuit manufacturing system to manufacture an integrated circuit embodying the compression unit or the decompression unit.
An integrated circuit manufacturing system may be provided, the integrated circuit manufacturing system comprising: a non-transitory computer readable storage medium having stored thereon a computer readable description of a compression unit or a decompression unit; a layout processing system configured to process the computer readable description to generate a circuit layout description of an integrated circuit embodying the compression unit or the decompression unit; and an integrated circuit generation system configured to manufacture the compression unit or the decompression unit according to the circuit layout description.
A computer program code for performing any of the methods described herein may be provided. A non-transitory computer readable storage medium may be provided having stored thereon computer readable instructions that, when executed at a computer system, cause the computer system to perform any of the methods described herein.
As will be apparent to those skilled in the art, the above features may be suitably combined, and may be combined with any of the aspects of the examples described herein.
Drawings
Examples will now be described in detail with reference to the accompanying drawings, in which:
FIG. 1 shows a pass-through texture applied to two primitives forming a quadrilateral;
FIG. 2 shows a triangle primitive subdivided into 64 sub-primitives;
FIG. 3 illustrates a ray tracing system according to examples described herein;
FIG. 4 is a flow chart of a method of compressing a sub-primitive presence indication block into a compressed data block;
FIG. 5a shows a partially present object;
FIG. 5b illustrates a presence indication block of the object shown in FIG. 5a, the presence indication block comprising 256 presence indications arranged in a 16X 16 arrangement;
FIG. 6 is a flow chart illustrating an exemplary process for performing spatial decorrelation on sub-primitive presence indication blocks;
FIG. 7a illustrates the result of applying spatial decorrelation to the rows of presence indications in the presence indication block illustrated in FIG. 5b;
FIG. 7b illustrates the result of applying spatial decorrelation to the columns of presence indications in the presence indication block illustrated in FIG. 7a;
FIG. 7c illustrates the result of de-interleaving the columns of presence indications in the presence indication block illustrated in FIG. 7b;
FIG. 7d illustrates the result of de-interleaving the rows of presence indications in the presence indication block illustrated in FIG. 7c;
FIG. 7e illustrates the result of applying spatial decorrelation to the rows of presence indications in the presence indication sub-block illustrated in FIG. 7d;
FIG. 7f illustrates the result of applying spatial decorrelation to the columns of presence indications in the presence indication sub-block illustrated in FIG. 7e;
FIG. 7g illustrates the result of de-interleaving the rows and columns of presence indications in the presence indication sub-block illustrated in FIG. 7f;
FIG. 7h illustrates different regions within a spatial decorrelated presence indication block;
FIG. 8 is a flow chart illustrating an exemplary process for performing entropy encoding on spatially decorrelated presence indications;
FIG. 9 illustrates compressed data blocks;
FIG. 10 is a flow chart of a method of decompressing compressed data to determine sub-primitive presence indications for use in intersection testing;
FIG. 11 is a flow chart illustrating an exemplary process for performing entropy decoding on entropy encoded data to determine blocks of entropy decoded data values;
FIG. 12 is a flow chart illustrating an exemplary process for performing spatial re-correlation on blocks of entropy decoded values;
FIG. 13 illustrates a computer system in which a compression unit and/or a decompression unit is implemented; and
FIG. 14 illustrates an integrated circuit manufacturing system for generating an integrated circuit embodying a compression unit or a decompression unit.
The figures illustrate various examples. Skilled artisans will appreciate that element boundaries (e.g., blocks, groups of blocks, or other shapes) illustrated in the figures represent one example of boundaries. In some examples, it may be the case that one element may be designed as a plurality of elements, or that a plurality of elements may be designed as one element. Where appropriate, common reference numerals have been used throughout the various figures to indicate like features.
Detailed Description
The following description is presented by way of example to enable a person skilled in the art to make and use the invention. The invention is not limited to the embodiments described herein and various modifications to the disclosed embodiments will be apparent to those skilled in the art.
Embodiments will now be described by way of example only. In the present disclosure, the sub-primitive presence indication represents the presence state of the corresponding sub-primitive.
In the ray tracing system described in the background section above, each presence indication is stored with 2 bits, such that if a primitive is subdivided into K sub-primitives, 2K bits are used for the presence indication of the primitive. Reducing the amount of data used to represent the presence indication is beneficial in reducing the amount of memory required to store the presence indication and reducing the amount of data transferred between different components in the ray tracing system. Thus, a reduction in the amount of data used to represent the presence indication may reduce the latency, power consumption, and silicon area of the ray tracing system.
As a simple example of how the presence indications could be compressed, note that two bits are used for each presence indication to indicate one of only three presence states (fully present, fully absent or partially present), so if the presence information of multiple sub-primitives is combined, the presence indications of a group of sub-primitives can be represented with fewer than 2 bits per sub-primitive on average. As an example, the presence indications for a group of 5 sub-primitives (i.e., $3^5 = 243$ possible combinations of presence states) may be stored in 8 bits (i.e., $2^8 = 256$ possible encodings). In this simple example, if the primitive is subdivided into K sub-primitives, approximately 1.6K bits are used for the presence indications of the primitive. Compressing 2K bits to 1.6K bits represents a compression ratio of 80%, where the compression ratio is defined as the size of the compressed data divided by the size of the uncompressed data. Compressing the data to a greater extent results in a smaller compression ratio.
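The simple grouping scheme above can be sketched as follows. This is only an illustration of the arithmetic, assuming each presence state is given as an integer 0, 1 or 2; the function names `pack5` and `unpack5` are hypothetical and do not appear in the described system.

```python
def pack5(states):
    """Pack five ternary presence states (each 0, 1 or 2) into one 8-bit code."""
    assert len(states) == 5 and all(s in (0, 1, 2) for s in states)
    code = 0
    for s in reversed(states):
        code = code * 3 + s  # base-3 positional encoding
    return code  # at most 3**5 - 1 = 242, so the code fits in 8 bits


def unpack5(code):
    """Recover the five ternary presence states from a packed 8-bit code."""
    states = []
    for _ in range(5):
        code, s = divmod(code, 3)  # peel off the least significant base-3 digit
        states.append(s)
    return states
```

Five presence indications occupy 8 bits instead of 10, which gives the 1.6 bits per sub-primitive mentioned above.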
In the following examples, compression and decompression techniques are described that compress the presence indication to a greater extent (i.e., achieve a lower compression ratio) than the simple examples described above.
It is noted that having three states is attractive from a quality perspective, compared with a simpler scheme having only "fully present" and "fully absent" states, because a two-state system is likely to result in visible aliasing (i.e., jagged edges) unless a very high resolution, and therefore memory-intensive, mask is used. Furthermore, while a two-state scheme may benefit from never having to run a shader to "check the texture", this also means that some use cases that do require partial transparency (e.g., modeling tinted glass) will be handled suboptimally. Nevertheless, one skilled in the art could adapt the examples described below, which use a wavelet approach, to a system with only two states.
FIG. 3 illustrates a ray tracing system 300 including a ray tracing unit 302 and a memory 304. Ray tracing system 300 also includes a geometric data source 303 and a ray data source 305. Ray tracing unit 302 includes a processing module 306, an intersection test module 308, and processing logic 310. The intersection test module 308 includes one or more box intersection test units 312, one or more primitive intersection test units 314, and a decompression unit 318. The geometric data source 303 includes a compression unit 316. The compression unit 316 includes a spatial decorrelation module 320 and an entropy encoding module 322. The decompression unit 318 includes an entropy decoding module 324 and a spatial re-correlation module 326. In operation, ray tracing unit 302 receives geometric data defining objects within the 3D scene from geometric data source 303. Ray tracing unit 302 also receives, from ray data source 305, ray data defining rays that are to be tested for intersection. The rays may be primary rays or secondary rays. The processing module 306 is configured to generate an acceleration structure based on the geometric data and to send the acceleration structure to memory 304 for storage therein. After the acceleration structure has been stored in the memory 304, the intersection test module 308 can retrieve nodes of the acceleration structure (e.g., including data defining the axis-aligned boxes corresponding to the nodes) from the memory 304 to perform ray intersection tests for the retrieved nodes. The box intersection test unit 312 performs intersection tests to determine whether a ray intersects each of the bounding boxes corresponding to the nodes of the acceleration structure (where a miss may cull a large portion of the hierarchical acceleration structure). If an intersection with a leaf node is determined, primitive intersection test unit 314 performs one or more primitive intersection tests to determine which object(s), if any, the ray intersects.
In this example, the primitives are triangles or pairs of triangles, but it is noted that in other examples, the primitives may be other shapes, e.g. other convex planar polygons, such as rectangles (including squares), pentagons, hexagons, etc., or (parametric) curved surfaces. The result of the intersection test indicates which primitive in the scene the ray intersects, and may also indicate other intersection data, such as the location on the object at which the ray intersects the object (e.g., defined in terms of barycentric coordinates), and may also indicate the distance along the ray at which the intersection occurs, e.g., as a Euclidean distance or as a (signed) multiple of the ray's length. In some cases, the intersection determination may be based on whether the distance along the ray at which the intersection occurs lies between the minimum clipping distance and the maximum clipping distance of the ray (which may be referred to as $t_{min}$ and $t_{max}$). The results of the intersection test are provided to processing logic 310. Processing logic 310 is configured to process the results of the intersection test to determine rendering values for an image representing the 3D scene. The rendering values determined by processing logic 310 may be returned to memory 304 for storage therein to represent the image of the 3D scene.
In the examples described herein, the ray tracing system uses an acceleration structure in order to reduce the number of intersection tests that need to be performed for a ray against primitives. It should be noted, however, that some other examples may not use an acceleration structure, and rays may simply be tested against the primitives without first attempting to reduce the number of intersection tests that need to be performed.
When the primitive intersection test unit 314 of the intersection test module 308 determines that a ray intersects a primitive that is partially present, the intersection test module 308 will typically need to stall while a shader program is executed on the processing logic 310 to resolve the presence of the primitive at the intersection point. Some of these stalls may be avoided by using sub-primitive presence indications as described herein.
In the examples described below, compression and decompression of the sub-primitive presence indications is performed using a wavelet approach, which is typically a lossless compression approach. In this approach, blocks of sub-primitive presence indications are compressed into compressed data blocks for use in intersection testing in a ray tracing system. The inventors have realized that the distribution of presence indications is rarely random, because the primitives represent physical structures. Sub-primitives with a particular presence state are typically next to sub-primitives with the same presence state. This order (i.e., non-randomness) in the distribution of the presence states may be exploited to achieve better compression of the presence indication block.
Note that in the example shown in FIG. 3, compression unit 316 is implemented in geometric data source 303, but in other examples compression unit 316 may be implemented in a different component than geometric data source 303, and in some examples may be implemented in ray tracing unit 302, for example, as part of intersection test module 308. Furthermore, in the example shown in FIG. 3, decompression unit 318 is implemented as part of intersection test module 308, but in other examples it may be implemented somewhere other than as part of intersection test module 308.
The method of compressing a block of sub-primitive presence indications into a compressed data block is described at a high level with reference to the flowchart of FIG. 4. The compression is performed by the compression unit 316.
In step S402, the compression unit 316 receives a block of sub-primitive presence indications to be compressed. For example, FIG. 5a shows a partially present object 500. In this example, the object is a portion of a leaf and is represented by a pair of triangle primitives $502_1$ and $502_2$, which form a quadrilateral because they share an edge. FIG. 5b shows the presence indication block 508 for the object, as received at compression unit 316. The object is divided into 256 sub-primitives arranged in a 16 x 16 square. In other examples, the blocks may have different numbers of sub-primitives, and they may be arranged in other shapes (e.g., rectangular or triangular). In FIG. 5b, each presence indication is represented by one of three shadings to represent one of the three possible presence states. In particular, presence indications indicating that the respective sub-primitive is fully present are represented by dark shading; presence indications indicating that the respective sub-primitive is fully absent are represented by light shading; and presence indications indicating that the respective sub-primitive is partially present are represented by medium shading. As described above, the presence indications received in step S402 may be determined in a preprocessing step, which may be performed by an Application Programming Interface (API) or as part of a process of creating the primitives and textures, for example, by a user. Each (uncompressed) sub-primitive presence indication is represented with 2 bits to indicate one of three presence states: (i) fully present, (ii) fully absent, or (iii) partially present. The presence indications are ternary data, i.e. each may have one of three possible values.
In step S404, the spatial decorrelation module 320 of the compression unit 316 performs spatial decorrelation on the sub-primitive presence indications in the sub-primitive presence indication block 508 to determine spatially decorrelated presence indications. Each of the three different presence states (partially present, fully present, and fully absent) is assigned a value. In the examples described herein, the partially present state is assigned a value of zero. In some cases, the fully present state is assigned a value of one and the fully absent state is assigned a value of two; in other cases, the fully present state is assigned a value of two and the fully absent state is assigned a value of one. As will be seen below, modulo-3 arithmetic is used in the spatial decorrelation process performed in step S404, and under modulo-3 arithmetic the value 2 is congruent to the value -1.
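As a small illustration of the modulo-3 convention, the following sketch uses one of the two value assignments mentioned above (fully present = 1, fully absent = 2). The constant and function names are hypothetical, and `add_mod3` is included only to show that the difference is invertible.

```python
# Assumed value assignment: one of the two options described in the text.
PARTIALLY_PRESENT = 0
FULLY_PRESENT = 1
FULLY_ABSENT = 2  # congruent to -1 modulo 3


def difference_mod3(value, predicted):
    """Difference between a presence value and its prediction, reduced to 0, 1 or 2."""
    # Python's % returns a non-negative result for a positive modulus,
    # so a difference of -1 is stored as 2.
    return (value - predicted) % 3


def add_mod3(difference, predicted):
    """Inverse of difference_mod3: recover the original value from the difference."""
    return (difference + predicted) % 3
```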
In step S406, the entropy encoding module 322 of the compression unit 316 performs entropy encoding on the spatial decorrelated presence indication to determine entropy encoded data.
In step S408, the compression unit 316 stores the entropy-encoded data in the compressed data block.
In step S410, the compression unit 316 outputs the compressed data block for storage. The compressed data blocks may be stored with the primitive data for the primitives, for example, in the geometric data source 303. The compressed data blocks may be passed to ray tracing unit 302 along with primitive data for the primitives and may be stored in memory 304 and/or memory within intersection test module 308 for use by primitive intersection test unit 314 as part of performing intersection tests of the rays with respect to the primitives.
Performing spatial decorrelation (e.g., in step S404) tends to reduce the magnitude of the values in the presence indication block. In particular, as will be seen in the more detailed description below, after spatial decorrelation many of the spatially decorrelated presence indications have a value of zero. Entropy encoding produces entropy encoded data having a variable length, where the length (or "size") of the entropy encoded data depends on the values of the data on which the entropy encoding is performed. In particular, in some entropy coding schemes (e.g., biased Elias coding), the entropy encoded data will tend to have a relatively small size if the values on which entropy encoding is performed have a relatively low magnitude, and a relatively large size if those values have a relatively high magnitude. Thus, by performing spatial decorrelation (which reduces the magnitude of the values of the presence indications) and then performing entropy encoding on the spatially decorrelated presence indications, the data representing the presence indications is typically compressed. In particular, if the presence indications have an ordered structure, such that many presence indications have the same state as their neighboring presence indications in the presence indication block 508, the spatially decorrelated values will tend to be low and the entropy encoded data will tend to be small; whereas if the presence indications have little structure (e.g., if they are mostly random or pseudo-random), such that fewer presence indications have the same state as their neighboring presence indications in the presence indication block 508, the spatially decorrelated values will tend to be high and the entropy encoded data will tend to be larger.
As described above, primitives processed in a ray tracing system represent physical structures, and the distribution of sub-primitive presence indications of primitives is not random.
FIG. 6 is a flowchart illustrating an exemplary process performed by spatial decorrelation module 320 for performing spatial decorrelation on the sub-primitive presence indication block in step S404. Spatial decorrelation module 320 receives a presence indication block, such as block 508 shown in FIG. 5b.
In step S602, for each of a plurality of rows of presence indications (e.g., for all rows), for one or more of the presence indications in the row (e.g., for alternate presence indications in the row), spatial decorrelation module 320: (i) determines a predicted value of the presence indication based on one or more other presence indications in the row (e.g., based on neighboring (or adjacent) presence indications in the row), and (ii) replaces the presence indication with a difference between the presence indication and the determined predicted value of the presence indication.
In step S604, for each of a plurality of columns of presence indications (e.g., for all columns), for one or more of the presence indications in the column (e.g., for alternate presence indications in the column), spatial decorrelation module 320: (i) determines a predicted value of the presence indication based on one or more other presence indications in the column (e.g., based on neighboring (or adjacent) presence indications in the column), and (ii) replaces the presence indication with a difference between the presence indication and the determined predicted value of the presence indication.
In the example shown in FIG. 6, spatial decorrelation is performed on the rows of the presence indication block and then on the columns, but in other examples spatial decorrelation may be performed on the columns of the presence indication block and then on the rows (i.e., steps S602 and S604 may be reordered).
As described above, with reference to the presence indication block 508 shown in fig. 5b, in an example, the value of the presence indication of the presence state representing the partial presence is zero, the value of the presence indication of the presence state representing the complete presence is one, and the value of the presence indication of the presence state representing the complete absence is two.
In each of steps S602 and S604, i.e., where spatial decorrelation is performed on the lines of presence indications in either the first dimension or the second dimension, each of the alternate presence indications may be replaced with the difference between the presence indication and its predicted value, where that difference is calculated using modulo-3 arithmetic. The use of modulo-3 arithmetic means that the value 2 is congruent to the value -1.
For example, the predicted value of a presence indication may be determined from the two adjacent presence indications in the same row using a prediction function, Predict, which operates according to the following equation:
$$\mathrm{Predict}(L, R) = \begin{cases} L, & \text{if } L = R \\ 0, & \text{otherwise} \end{cases}$$
and each alternate presence indication ($p_{2i+1}$) can be replaced with the difference between the presence indication ($p_{2i+1}$) and the predicted value of the presence indication according to the following equation:
$$p_{2i+1} := \left(p_{2i+1} - \mathrm{Predict}(p_{2i}, p_{2i+2})\right) \bmod 3,$$
where $p_{2i}$ is the (2i)th presence indication in the row, $p_{2i+1}$ is the (2i+1)th presence indication in the row, and $p_{2i+2}$ is the (2i+2)th presence indication in the row. Here, i is an integer ranging from 0 to (N/2)-1, where N is the number of lines (e.g., rows or columns) in the block or sub-block on which spatial decorrelation is being performed. In this way, each "odd" presence indication in each row may be replaced with its difference from a prediction based on the values of the adjacent (i.e., neighboring) "even" presence indications in that row.
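Table 1 below shows the output of the Predict function for the different inputs (L, R), where the value -1 is congruent to the value 2 (modulo 3).

| L  | R  | Predict(L, R) |
|----|----|---------------|
| -1 | -1 | -1            |
| -1 | 0  | 0             |
| -1 | 1  | 0             |
| 0  | -1 | 0             |
| 0  | 0  | 0             |
| 0  | 1  | 0             |
| 1  | -1 | 0             |
| 1  | 0  | 0             |
| 1  | 1  | 1             |

TABLE 1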
As can be seen from Table 1, the output of the prediction function is zero unless L = R ≠ 0, in which case the output of the prediction function is equal to L (which is equal to R). It can also be seen from Table 1 that the Predict function is functionally equivalent to $\operatorname{trunc}\left((L + R)/2\right)$, which represents the average of the input values rounded towards zero.
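The equivalence between the two descriptions of the prediction function can be checked exhaustively over the nine input pairs of Table 1. The sketch below is illustrative only; `predict` and `trunc_average` are hypothetical names, and the inputs are taken in the signed form -1, 0, 1.

```python
def predict(l, r):
    """Prediction from two neighbours: L when both agree, otherwise zero."""
    return l if l == r else 0


def trunc_average(l, r):
    """Average of the two inputs, rounded towards zero."""
    return int((l + r) / 2)  # int() truncates towards zero for floats


# Rebuild Table 1 as a dictionary keyed by the input pair (L, R).
TABLE_1 = {(l, r): predict(l, r) for l in (-1, 0, 1) for r in (-1, 0, 1)}

# Both formulations agree on every row of the table.
assert all(trunc_average(l, r) == out for (l, r), out in TABLE_1.items())
```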
If $p_{2i+1}$ is the last presence indication in the row, such that $p_{2i+2}$ does not exist in the presence indication block, the value of $p_{2i}$ is used in place of $p_{2i+2}$ in the prediction function. For example, for a presence indication on the right boundary of the presence indication block, step S602 predicts the presence indication based only on the adjacent presence indication to its left.
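The decorrelation of a single row, including the boundary rule just described, might be sketched as follows. Values are assumed to be stored as 0, 1 or 2 (with 2 standing for -1), and the names `predict` and `decorrelate_line` are hypothetical.

```python
def predict(l, r):
    """Prediction from two neighbours: L when both agree, otherwise zero."""
    return l if l == r else 0


def decorrelate_line(line):
    """Replace each odd-indexed value with (value - prediction) mod 3.

    Even-indexed values are left untouched, so the predictions always use
    original (unreplaced) neighbours.
    """
    n = len(line)
    out = list(line)
    for k in range(1, n, 2):
        left = line[k - 1]
        # At the end of the line there is no right neighbour: reuse the left one.
        right = line[k + 1] if k + 1 < n else left
        out[k] = (line[k] - predict(left, right)) % 3
    return out
```

For instance, the row `[1, 1, 1, 0, 2, 2, 2, 2]` becomes `[1, 0, 1, 0, 2, 0, 2, 0]`: every replaced (odd-indexed) entry is zero, because each agrees with its prediction.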
FIG. 7a shows the result 702 of applying spatial decorrelation, in step S602, to the rows of presence indications in the presence indication block shown in FIG. 5b. As in FIG. 5b, in FIG. 7a dark shading indicates a presence indication with the fully present state (whose value is 1), medium shading indicates a presence indication with the partially present state (whose value is 0), and light shading indicates a presence indication with the fully absent state (whose value is 2, which is congruent to -1). Furthermore, in FIG. 7a, a white square represents a spatially decorrelated presence indication of zero, a square with horizontal line shading represents a spatially decorrelated presence indication of 1, and a square with vertical line shading represents a spatially decorrelated presence indication of 2 (congruent to -1).
FIG. 7b shows the result 704 of applying spatial decorrelation to the columns of presence indications in the presence indication block shown in FIG. 7a.
In step S606, the spatial decorrelation module 320 de-interleaves the presence indications in the block, thereby grouping the presence indications that have not been replaced into a sub-block within the presence indication block. FIG. 7c shows the result 706 of de-interleaving the columns of presence indications in the presence indication block shown in FIG. 7b, i.e., all of the columns whose presence indications were not replaced in step S602 are grouped into the left half of the block 706. FIG. 7d shows the result 708 of de-interleaving the rows of presence indications in the presence indication block shown in FIG. 7c, i.e., all of the rows whose presence indications were not replaced in step S604 are grouped into the top half of the block 708. In this way, the spatial decorrelation module 320 de-interleaves the presence indications, thereby grouping the presence indications that have not been replaced into the sub-block 710 located in the upper left quadrant of the presence indication block. In other words, in step S606 the values are de-interleaved by moving the even rows and columns to the upper left of the block. The upper right quadrant of the presence indication block 708 shown in FIG. 7d includes presence indications that: (i) were replaced when spatial decorrelation was applied to the rows (step S602), and (ii) were not replaced when spatial decorrelation was applied to the columns (step S604). The lower left quadrant of the presence indication block 708 shown in FIG. 7d includes presence indications that: (i) were replaced when spatial decorrelation was applied to the columns (step S604), and (ii) were not replaced when spatial decorrelation was applied to the rows (step S602). The lower right quadrant of the presence indication block 708 shown in FIG. 7d includes presence indications that were replaced both when spatial decorrelation was applied to the rows and when it was applied to the columns.
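The de-interleaving of step S606 amounts to a reordering of rows and columns so that even indices come first. A minimal sketch, assuming a square block held as a list of lists and using the hypothetical name `deinterleave`:

```python
def deinterleave(block):
    """Move even-indexed rows and columns to the top-left of a square block."""
    n = len(block)
    # Even indices first (unreplaced positions), then odd indices (replaced positions).
    order = list(range(0, n, 2)) + list(range(1, n, 2))
    return [[block[r][c] for c in order] for r in order]
```

After this reordering, the entry originally at an even row and even column lands in the upper left quadrant, even row and odd column in the upper right, odd row and even column in the lower left, and odd row and odd column in the lower right.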
The spatial decorrelation process may then be repeated on the top left quarter, i.e. on sub-block 710. In step S608, the spatial decorrelation module 320 determines whether another level of spatial decorrelation is to be performed. If another level of spatial decorrelation is to be performed, the method returns from step S608 to step S602 and spatial decorrelation is performed on sub-block 710.
In this iteration, in step S602, for each of a plurality of rows of presence indications within the presence indication sub-block 710 (e.g., for all of the rows of presence indications within the sub-block 710), for one or more of the presence indications in the row (e.g., for alternate presence indications in the row), the spatial decorrelation module 320: (i) determines a predicted value of the presence indication based on one or more other presence indications (e.g., adjacent presence indications) in the row within the sub-block 710, and (ii) replaces the presence indication with a difference between the presence indication and the determined predicted value of the presence indication. FIG. 7e shows the result 712 of applying spatial decorrelation to the rows of presence indications in the presence indication sub-block 710 shown in FIG. 7d.
In this iteration, in step S604, for each of a plurality of columns of presence indications within the presence indication sub-block 710 (e.g., for all of the columns of presence indications within the sub-block 710), for one or more of the presence indications in the column (e.g., for alternate presence indications in the column), the spatial decorrelation module 320: (i) determines a predicted value of the presence indication based on one or more other presence indications (e.g., adjacent presence indications) in the column within the sub-block 710, and (ii) replaces the presence indication with a difference between the presence indication and the determined predicted value of the presence indication. FIG. 7f shows the result 714 of applying spatial decorrelation to the columns of presence indications in the presence indication sub-block 710 shown in FIG. 7e.
In this iteration, in step S606, spatial decorrelation module 320 deinterleaves the presence indication in sub-block 710, thereby grouping the presence indication that is not replaced into a second level sub-block within the presence indication block. Fig. 7g shows the result 716 of de-interleaving the rows and columns of presence indications in the presence indication sub-block 710 shown in fig. 7f, i.e. grouping presence indications that are not replaced into a second level sub-block 718 located in the upper left quadrant of the presence indication sub-block. In other words, in step S606, the values are deinterleaved by moving the even rows and columns to the upper left of the sub-block.
In step S608, the spatial decorrelation module 320 determines whether another level of spatial decorrelation is to be performed. This process may be repeated one or more times. If another level of spatial decorrelation is to be performed, the method returns from step S608 to step S602 and spatial decorrelation is performed on the second level sub-block 718. If another level of spatial decorrelation is not to be performed, the spatial decorrelation process (of step S404) ends and the method proceeds from step S608 to step S406. In this example, there are two levels of spatial decorrelation. In other examples there may be (only) one level of spatial decorrelation, or more than two levels, e.g., three or four levels of spatial decorrelation.
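The loop over steps S602 to S608 can be sketched end to end as follows. This is an illustrative software model only, under the assumptions that the block is square with a power-of-two side, that values are stored as 0, 1 or 2, and that every row and column is processed at each level; all function names are hypothetical.

```python
def predict(l, r):
    """Prediction from two neighbours: L when both agree, otherwise zero."""
    return l if l == r else 0


def decorrelate_line(line):
    """Replace odd-indexed values in a line with (value - prediction) mod 3."""
    n = len(line)
    out = list(line)
    for k in range(1, n, 2):
        left = line[k - 1]
        right = line[k + 1] if k + 1 < n else left  # boundary rule
        out[k] = (line[k] - predict(left, right)) % 3
    return out


def decorrelate_block(block, levels=2):
    """Apply `levels` rounds of row pass, column pass and de-interleave."""
    size = len(block)
    out = [list(row) for row in block]  # work on a copy
    for _ in range(levels):
        # Current (sub-)block: the top-left size x size region.
        sub = [row[:size] for row in out[:size]]
        sub = [decorrelate_line(row) for row in sub]          # step S602: rows
        cols = [decorrelate_line(col) for col in zip(*sub)]   # step S604: columns
        sub = [list(row) for row in zip(*cols)]               # transpose back
        order = list(range(0, size, 2)) + list(range(1, size, 2))
        sub = [[sub[r][c] for c in order] for r in order]     # step S606
        for r in range(size):
            out[r][:size] = sub[r]
        size //= 2  # step S608: the next level works on the top-left quadrant
    return out
```

For a 16 x 16 block in which every sub-primitive is fully present (all values 1), two levels of this sketch leave zeros in all 240 replaced positions and the value 1 in the 4 x 4 upper-left sub-block.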
FIG. 7h shows different regions within a spatially decorrelated presence indication block 720 to which three levels of spatial decorrelation have been applied. These three levels may be referred to as "high", "medium" and "low". The upper right quadrant 722 of block 720 is a representation of the high frequency information in the horizontal direction. The lower left quadrant 724 of block 720 is a representation of the high frequency information in the vertical direction. The lower right quadrant 726 of block 720 is a representation of the high frequency information in the diagonal direction. Recursively subdividing the upper left quadrant into four smaller quadrants yields: (i) in the upper right smaller quadrant 732, a representation of "medium" frequency information in the horizontal direction, (ii) in the lower left smaller quadrant 734, a representation of "medium" frequency information in the vertical direction, and (iii) in the lower right smaller quadrant 736, a representation of "medium" frequency information in the diagonal direction. This process may be repeated depending on how many wavelet steps are applied. FIG. 7h shows that the process has been repeated a third time, wherein the upper left "smaller" quadrant has been subdivided into four "even smaller" quadrants, to yield: (i) in the upper right even smaller quadrant 742, a representation of "low" frequency information in the horizontal direction, (ii) in the lower left even smaller quadrant 744, a representation of "low" frequency information in the vertical direction, and (iii) in the lower right even smaller quadrant 746, a representation of "low" frequency information in the diagonal direction. The remaining upper left even smaller quadrant 750 represents the sub-sampled source data, i.e., it contains the presence indications that have not been replaced by the spatial decorrelation process.
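The region layout of FIG. 7h can be expressed as a lookup from a position in the decorrelated block to its frequency level and direction. The following sketch assumes a 16 x 16 block with three levels; the function name `subband` and its return convention (level 0 for "high", 1 for "medium", 2 for "low") are hypothetical.

```python
def subband(row, col, size=16, levels=3):
    """Return (level, direction) for a position in the decorrelated block.

    Level 0 corresponds to "high", 1 to "medium" and 2 to "low" frequency
    information. Positions in the final upper-left region, which hold
    unreplaced presence indications, are reported as (levels, "unreplaced").
    """
    for level in range(levels):
        half = size // 2
        right, bottom = col >= half, row >= half
        if right and bottom:
            return (level, "diagonal")
        if right:
            return (level, "horizontal")
        if bottom:
            return (level, "vertical")
        size = half  # descend into the upper-left quadrant
    return (levels, "unreplaced")
```

With these assumptions, the unreplaced region corresponding to quadrant 750 is the 2 x 2 area at the upper left of the block.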
In the encoding scheme, the non-decorrelated portion of the data, i.e., the upper left portion (shown as sub-block 718 in Fig. 7g or as even smaller quadrant 750 in Fig. 7h) corresponding to the sub-sampled source data, may be stored directly at 2 bits per presence indication.
Considering only the decorrelated results, the result of the spatial decorrelation process (step S404) is that the block comprises many more zeros than ones or twos, for example as shown in Fig. 7g, wherein a white square represents a zero value. In the example shown in Fig. 7g, 189 of the decorrelated values (i.e. 79% of the 240 decorrelated values) are zero and 51 of the decorrelated values (i.e. 21% of the 240 decorrelated values) are non-zero. The greater the spatial correlation in the original image, the more zeros there are in the block and thus the greater the compression that the entropy coding can achieve.
In different examples, the entropy encoding performed by entropy encoding module 322 may be different.
One exemplary entropy encoding scheme is based on Huffman coding. Based on an evaluation of a set of sample data, the probability $P_0$ of a decorrelated value being zero is typically 0.7 to 0.8, and the probability $P_1$ of it being one is typically 0.1 to 0.15. Similarly, the probability $P_2$ of it being two is typically 0.1 to 0.15. Note that $P_0 + P_1 + P_2 = 1$. With these probabilities, a Huffman technique that parses from LSB to MSB may suggest codes such as: a value of zero is represented by the binary string "0", a value of one by "01", and a value of two by "11". Those skilled in the art will appreciate that equivalent codes exist, such as "1", "00", "10". Those skilled in the art will understand how to adapt the following examples to such alternative encodings. With this scheme, the expected number of encoded bits per value is $P_0 + 2(P_1 + P_2)$.
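As a worked illustration of the expected code length, assume $P_0 = 0.75$ and $P_1 = P_2 = 0.125$. These figures are chosen from within the ranges quoted above purely for illustration and are not measured data. A zero costs one bit and a one or a two costs two bits, so:

$$E[\text{bits per value}] = 1 \cdot P_0 + 2\,(P_1 + P_2) = 0.75 + 2\,(0.125 + 0.125) = 1.25$$

Under these assumed probabilities, the code would use 1.25 bits per value on average, compared with 2 bits per value when the indications are stored directly.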
However, a common problem with entropy encoding is the difficulty of random access at decoding, i.e. it can be challenging to quickly derive an arbitrary value (or a consecutive set of values starting from an arbitrary index) from the encoded string. As a simple example, the following 6 values "one, zero, one, two, zero, one" (listed in right-to-left order, from the "first" value, with an index of 0, on the "right" to the last value, with an index of 5, on the "left") will be represented by the binary string "0100111001" in the Huffman coding described above. For example, if it is desired to obtain the value at index 3 without sequentially decoding the values corresponding to indices 0, 1 and 2, it is not immediately clear where in the string the one or more encoded bits of that value reside, as the length of each encoded value is not known in advance. This becomes increasingly difficult as the number of encoded values increases: the cost of decoding item N is O(N).
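The six-value example can be reproduced with a short Python sketch. The function names and the use of character strings to hold bits are assumptions made for readability. The codes "0", "01" and "11" are those given above, parsed from the right-hand (LSB) end.

```python
# Codes are written MSB-first as strings but parsed from LSB to MSB,
# so the first bit read for each value is its rightmost character.
CODES = {0: "0", 1: "01", 2: "11"}

def encode(values):
    """Encode values (index 0 first); index 0 ends up at the right-hand end."""
    return "".join(CODES[v] for v in reversed(values))

def decode_at(bits, index):
    """Decode sequentially from the LSB end until `index` is reached."""
    pos = len(bits) - 1  # the rightmost character is the first bit
    value = None
    for _ in range(index + 1):
        if bits[pos] == "0":
            value, pos = 0, pos - 1
        else:
            # The second bit distinguishes one ("01") from two ("11").
            value = 1 if bits[pos - 1] == "0" else 2
            pos -= 2
    return value

values = [1, 0, 2, 1, 0, 1]       # index 0 first
bits = encode(values)             # "0100111001"
```

The loop in `decode_at` shows the problem: reaching the value at index 3 requires stepping over the values at indices 0, 1 and 2 first.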
To help alleviate this problem, in another exemplary encoding, the Huffman code is instead divided into prefix bits, which are always present, and optional suffix bits, which are present only for values of one or two. All prefix bits are stored contiguously, as are all suffix bits. For Fig. 7g, there are always 240 prefix bits in this coding scheme. They are stored at known offsets in the data structure and can therefore be randomly accessed for a given value. The beginning of the suffix bits is also stored at a known offset, but the content has a variable length; in the example of Fig. 7g, there will be 51 suffix bits.
The decoding process for the value at index k may proceed as follows. Bit k of the prefix bits can be accessed randomly (either in hardware by a MUX or in software by shift and mask) because a prefix bit is present for every value. If the bit has a value of "0", the corresponding value is zero and no further decoding is required. However, if the prefix bit is "1", the suffix bit corresponding to the kth value needs to be found. This can be achieved as follows. A bit mask is constructed in which all bits below (i.e. up to but not including) bit k are set to 1 and all other bits are cleared. A bitwise AND is performed between the mask and the prefix bits, and counting (summing) the number of set bits in the result produces a numerical offset. The offset represents the position of the suffix bit of the kth value within the suffix bits, which can then be accessed by a MUX or shift/mask.
A short run of consecutive values starting from the kth value may also use the method described above to obtain the position of the kth value in the data, and may then revert to standard value-by-value decoding to obtain the remaining values.
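The prefix/suffix lookup can be sketched in Python as follows. The function names and the packing of the bits into Python integers (bit i of `prefix` belonging to value i) are assumptions made for illustration. The suffix bit is taken to be 0 for a value of one and 1 for a value of two, matching the codes "01" and "11" when parsed from the LSB.

```python
def encode(values):
    """Split each value (0, 1 or 2) into a prefix bit and an optional suffix bit.

    Returns (prefix, suffix, n_suffix). Bit i of `prefix` belongs to value i;
    suffix bits are packed contiguously, in index order, for non-zero values.
    """
    prefix = suffix = n_suffix = 0
    for i, v in enumerate(values):
        if v != 0:
            prefix |= 1 << i
            suffix |= (v - 1) << n_suffix   # 0 for a one, 1 for a two
            n_suffix += 1
    return prefix, suffix, n_suffix

def decode_at(prefix, suffix, k):
    """Randomly access the value at index k without decoding earlier values."""
    if (prefix >> k) & 1 == 0:
        return 0                              # prefix bit "0": value is zero
    mask = (1 << k) - 1                       # bits 0..k-1 set, others cleared
    offset = bin(prefix & mask).count("1")    # count of set prefix bits below k
    return 1 + ((suffix >> offset) & 1)

prefix, suffix, n_suffix = encode([1, 0, 2, 1, 0, 1])
```

For the six values used earlier, this yields six prefix bits and four suffix bits, and `decode_at(prefix, suffix, 3)` reaches the value at index 3 with a single mask and bit count.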
Referring to Figs. 8 and 9, in another coding scheme, random access may be further simplified, but possibly at the cost of less compression. Fig. 8 is a flowchart illustrating an exemplary process for performing entropy encoding on the spatially decorrelated presence indications in step S406 to determine entropy encoded data. Fig. 9 shows an entropy encoded data block 900 comprising a size indication 902 and encoded data values 904.
A block of spatially decorrelated presence indications is received at entropy encoding module 322 (e.g., as shown in Fig. 7g). The block of spatially decorrelated presence indications 716 is subdivided into a plurality of subsets of spatially decorrelated presence indications. For example, each subset may be a 2 x 2 subset of the spatially decorrelated presence indications. In other examples, the subsets may be different sizes and/or shapes, e.g., they may be 4 x 4 or 2 x 4 subsets of spatially decorrelated presence indications. Steps S802, S804 and S806 shown in Fig. 8 are performed for each subset of the spatially decorrelated presence indications.
In step S802, the entropy encoding module 322 determines a number of bits that can be used to represent the maximum value of the spatially decorrelated presence indications in the subset. For example, the determined number of bits may be the minimum number of bits that can be used to losslessly represent the maximum value of the spatially decorrelated presence indications in the subset. Alternatively, module 322 may identify the set of values used in the subset and classify the subset into one of four categories: all of the decorrelated presence values are 0; all of the values are 0 or 1; all of the values are 0 or 2; or the values may be any of 0, 1 or 2. These categories mean that the subset will require 0, N, N or 2N bits, respectively, where N is the number of values present in the subset.
In step S804, the entropy encoding module 322 includes an indication of the determined number of bits for the subset in the size indication field 902 of the entropy encoded data 900.
In step S806, the entropy encoding module 322 includes encoded data values representing the spatially decorrelated presence indications in the subset in the encoded data value field 904 of the entropy encoded data 900, wherein each encoded data value has the number of bits determined for the subset.
For example, if the four values in the 2 x 2 subset are 2, 1, and 0, the minimum number of bits that can be used to losslessly represent the maximum value in the subset is two. Thus, for the subset, the four encoded data values may be represented (in binary) as 10, 01, and 00 in encoded data value field 904, and a size indication indicating 2 bits may be stored for the subset in size indication field 902. As another example, if the four values in the 2 x 2 subset are 0, 1, and 0, the minimum number of bits that can be used to losslessly represent the maximum value in the subset is one. Thus, for the subset, the four encoded data values may be represented (in binary) as 0, 1, and 0 in encoded data value field 904, and a size indication indicating 1 bit may be stored for the subset in size indication field 902. As another example, if all four values in the 2 x 2 subset are 0, the minimum number of bits that can be used to losslessly represent the maximum value in the subset is zero. Thus, for this subset, the four values may be represented without storing any bits in the encoded data value field 904, and a size indication indicating 0 bits may be stored for this subset in the size indication field 902. Finally, since the size indication has 2 bits and can therefore distinguish four cases, the fourth case (i.e., a subset comprising only values of 0 and 2) may also be encoded with 4 bits. It will thus be appreciated that entropy encoding tends to achieve higher levels of compression (i.e., lower compression ratios) when there are more zeros in the spatially decorrelated presence indications being entropy encoded.
In the above example, where a 16 x 16 block of spatially decorrelated presence indications is encoded using 2 x 2 subsets, there are 64 subsets. Thus, the size indications 902 will use 128 bits in the entropy encoded data 900, whereas the size of the encoded data value field 904 is variable.
The exemplary entropy encoding process described with reference to fig. 8 and 9 allows for fast decompression because, although the encoded data values have variable lengths, the size indications have fixed lengths, so they can be used to identify where the data is for each encoded data value within the encoded data value field 904.
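The encoding side of this scheme (steps S802 to S806) can be sketched as follows. This is a minimal illustration that uses only the minimum-bit rule, not the four-category alternative. The function names, the use of bit strings, and the subset values in the usage lines are assumptions chosen for illustration.

```python
def bits_needed(max_value):
    """Minimum bits to losslessly represent max_value (0 -> 0, 1 -> 1, 2 -> 2)."""
    return max_value.bit_length()

def encode_subsets(subsets):
    """Encode a list of subsets, each a list of values in the range 0..2.

    Returns (sizes, data): `sizes` holds one fixed 2-bit size indication per
    subset (field 902); `data` holds the variable-length values (field 904).
    """
    sizes, data = [], []
    for subset in subsets:
        n = bits_needed(max(subset))          # step S802
        sizes.append(format(n, "02b"))        # step S804
        if n:                                 # step S806 (nothing stored for n == 0)
            data.extend(format(v, "0{}b".format(n)) for v in subset)
    return "".join(sizes), "".join(data)

# Illustrative subsets (assumed values): 2 bits, 1 bit and 0 bits per value.
sizes, data = encode_subsets([[2, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]])
```

Because every size indication occupies exactly 2 bits, a 16 x 16 block split into 64 subsets always yields 128 bits of size indications, whatever the length of the data field.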
As another example, the entropy encoding module 322 may perform run-length encoding on the spatially decorrelated presence indications to determine the entropy encoded data. Run-length encoding is a known entropy encoding process. The run-length encoding may be performed according to any order, such as raster scan order or Morton order. Run-length encoding may achieve good (i.e., low) compression ratios, but fast decompression may be difficult because it is hard to decode an individual encoded value without decoding all previous encoded values.
When the compression unit 316 has compressed the block of sub-primitive presence indications into a compressed data block, the size of the compressed data block may be compared to a threshold size corresponding to the budget (i.e., maximum acceptable size) of the compressed data block. The compression is acceptable if the size of the compressed data block is less than or equal to the threshold size. However, if the size of the compressed data block is greater than the threshold size, the source (i.e., the block of sub-primitive presence indications) may be filtered, and the filtered block of sub-primitive presence indications may then be compressed using the same method as described above. In this context, "filtering" means applying a low-pass filter to the input data, in the sense that it replaces some of the fully present indications and/or fully absent indications with partially present indications. This may be repeated until the size of the compressed data block is not greater than the threshold size. Looking at the compression process in Figs. 7a to 7g, if the size of the resulting data is less than the specified budget, the compression process may stop at the step indicated in Fig. 7d. If not, the process proceeds as planned and the data size is assessed again at the step indicated in Fig. 7g.
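A minimal run-length sketch illustrates the random-access difficulty. The representation as [value, run length] pairs and the function names are assumptions; the text does not specify a run-length format.

```python
def rle_encode(values):
    """Collapse a sequence into [value, run_length] pairs (assumed format)."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

def rle_value_at(runs, index):
    """Find the value at `index` by walking the runs from the start."""
    for value, length in runs:
        if index < length:
            return value
        index -= length
    raise IndexError("index beyond encoded data")

runs = rle_encode([0, 0, 0, 1, 0, 0, 2, 2])
```

`rle_value_at` has to walk every earlier run before reaching the requested index, which is the sequential dependency noted above.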
With reference to the flowchart in fig. 10, a method of decompressing compressed data performed by decompression unit 318 to determine one or more sub-primitive presence indications for use in intersection testing in a ray tracing system is described at a high level. The decompression process described with reference to fig. 10 is easy to implement, so the latency and power consumption of the decompression unit 318 are low, and the silicon area is small if the decompression unit is implemented in hardware.
In step S1002, the decompression unit 318 receives a compressed data block for a block of sub-primitive presence indications.
In step S1004, the decompression unit 318 receives an indication of a sample position, within the block of sub-primitive presence indications, at which a presence indication is to be determined. The sample position indication may comprise two coordinates (x, y) to indicate the position within the block of sub-primitive presence indications.
In step S1006, the entropy decoding module 324 of the decompression unit 318 reads entropy encoded data from the compressed data block. In step S1008, the entropy decoding module 324 of the decompression unit 318 performs entropy decoding on the entropy encoded data to determine a block of entropy decoded data values. The entropy decoding module 324 is configured to perform an entropy decoding process that is complementary to an entropy encoding process that is performed to encode data (e.g., by the entropy encoding module 322 as described above).
In step S1010, the spatial re-correlation module 326 of the decompression unit 318 performs spatial re-correlation on the block of entropy decoded values to determine one or more (e.g., all) of the sub-primitive presence indications in the block of sub-primitive presence indications. The output of the spatial re-correlation module 326 (i.e., the spatially re-correlated block of entropy decoded values) represents one or more (e.g., all) of the sub-primitive presence indications in the block of sub-primitive presence indications.
In step S1012, the decompression unit 318 uses one or more of the determined sub-primitive presence indications in the sub-primitive presence indication block to determine a presence indication of the sample location for which an indication was received in step S1004.
In step S1014, the decompression unit 318 outputs the presence indication at the determined sample position. The determined presence indication at the sample location may be used to determine the presence of a primitive at an intersection with a ray as part of performing an intersection test on a ray in a ray tracing system.
In different examples, the entropy decoding performed by entropy decoding module 324 may be different. Fig. 11 is a flowchart showing a first exemplary process for performing entropy decoding on the entropy-encoded data 900 to determine blocks of entropy-decoded data values in step S1008. This example entropy decoding process decodes data that has been encoded as described above with reference to fig. 8 and 9. In particular, fig. 9 shows an entropy encoded data block 900 comprising a size indication 902 and encoded data values 904, which may be decoded by the entropy decoding module 324 in step S1008. In particular, there is a subset of encoded data values corresponding to the subset of presence indications. For example, each subset may be a 2 x 2 subset of the encoded data values. In other examples, the subsets may be different sizes and/or shapes, e.g., they may be 4 x 4 or 2 x 4 subsets of encoded data values. For each subset, a size indication is stored in the size indication field 902, and for each encoded data value in the subset, the encoded data value is stored in the encoded data value field 904. Each size indication may have 2 bits. Each encoded data value in the subset has a number of bits indicated by the size indication of the subset.
In step S1102, the entropy encoded data 900 is received at the entropy decoding module 324.
In step S1104, the entropy decoding module 324 reads an indication (i.e., a size indication) identifying, for each subset of encoded data values, the number of bits for each encoded data value in the subset from the size indication field 902 of the entropy encoded data 900. In step S1104, each size indication of the different subsets may be read in parallel. This is possible because the size indication has a fixed bit length (e.g., 2 bits each) and is stored in a field separate from the variable length encoded data value (size indication field 902).
In step S1106, the entropy decoding module 324 parses the encoded data value in the encoded data value field 904 of the entropy encoded data 900 based on the identified number of bits (i.e., based on the size indication), thereby interpreting the encoded data value.
In step S1108, the entropy decoding module 324 determines entropy decoded data values by selectively prepending leading zeros to the interpreted encoded data values such that each entropy decoded data value has the same number of bits (e.g., 2 bits) as each presence indication in the block of sub-primitive presence indications. The entropy decoded data values may then be output from the entropy decoding module 324 and passed to the spatial re-correlation module 326 (i.e., after step S1108, the method may pass to step S1010).
The exemplary entropy decoding process described with reference to fig. 11 allows for fast decompression because, although the encoded data values have variable lengths, the size indications have fixed lengths, so they can be used to identify where the data is for each encoded data value within the encoded data value field 904. This allows each encoded data value to be read in parallel.
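Steps S1104 to S1108 can be sketched as follows. The bit-string representation, the function name and the example inputs are assumptions made for illustration. The sketch handles only size indications of 0, 1 or 2 bits, not the optional fourth category mentioned earlier.

```python
def decode_subsets(sizes, data, values_per_subset=4):
    """Decode fixed 2-bit size indications plus variable-length data values.

    Returns one list per subset, each value padded to 2 bits.
    """
    # Step S1104: the size indications have a fixed length, so each one
    # can be read independently of the others.
    n_bits = [int(sizes[i:i + 2], 2) for i in range(0, len(sizes), 2)]
    # The start of each subset in the data field is a running sum of sizes.
    offsets, total = [], 0
    for n in n_bits:
        offsets.append(total)
        total += n * values_per_subset
    decoded = []
    for n, start in zip(n_bits, offsets):
        subset = []
        for j in range(values_per_subset):
            chunk = data[start + j * n:start + (j + 1) * n]   # step S1106
            value = int(chunk, 2) if n else 0
            subset.append(format(value, "02b"))               # step S1108: pad
        decoded.append(subset)
    return decoded

decoded = decode_subsets("100100", "100101000110")
```

Once the offsets are known, each subset is decoded independently of the others, which is what would allow the values to be read in parallel in a hardware implementation.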
As another example, the entropy decoding module 324 may perform Huffman decoding on the entropy encoded data to determine the block of entropy decoded data values. This is done if a Huffman encoding process was used as the entropy encoding process that produced the entropy encoded data. Huffman decoding is a known entropy decoding process. Huffman decoding may be used to decompress sets of multiple entropy encoded data values rather than individual entropy encoded data values.
As another example, the entropy decoding module 324 may perform run-length decoding on the entropy encoded data to determine the block of entropy decoded data values. This is done if a run-length encoding process was used as the entropy encoding process that produced the entropy encoded data. Run-length decoding is a known entropy decoding process. The run-length decoding may be performed according to any order, such as raster scan order or Morton order, as long as it matches the order in which the run-length encoding was performed. With run-length decoding, it may be difficult to decode an individual encoded value without decoding all previously encoded values.
Fig. 12 is a flowchart illustrating an example process performed by spatial re-correlation module 326 for performing spatial re-correlation on blocks of entropy decoded values. The spatial re-correlation module 326 receives the block of entropy decoded values. As an example, the entropy decoded value block may have the values shown in fig. 7 g. The spatial re-correlation process performs a process that is the inverse of the spatial de-correlation process performed during compression.
In step S1202, the spatial re-correlation module 326 interleaves the entropy-decoded values from the second-level sub-block 718 with the entropy-decoded values in the block of entropy-decoded values that have not been spatially re-correlated. If the input to step S1202 is an entropy decoded value block 716 having a value as shown in fig. 7g, the output of step S1202 is an entropy decoded value block 714 having a value as shown in fig. 7 f. As described above, sub-block 710 is located in the upper left quadrant of block 714.
In step S1204, for each of a plurality of columns of entropy decoded values (e.g., for all columns within the sub-block 710), for one or more of the entropy decoded values in the column (e.g., for alternate entropy decoded values in the column within the sub-block 710), the spatial re-correlation module 326: (i) determines a predicted value of the entropy decoded value based on one or more other entropy decoded values in the column (e.g., based on neighboring (or adjacent) entropy decoded values in the column within the sub-block 710), and (ii) replaces the entropy decoded value with the sum of the entropy decoded value and the determined predicted value of the entropy decoded value. The result of step S1204 is an entropy decoded value block 712, the values of which are shown in Fig. 7e.
In step S1206, for each of a plurality of rows of entropy decoded values (e.g., for all of the rows within the sub-block 710), for one or more of the entropy decoded values in the row (e.g., for alternate entropy decoded values in the row within the sub-block 710), the spatial re-correlation module 326: (i) determines a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row (e.g., based on neighboring (or adjacent) entropy decoded values in the row within the sub-block 710), and (ii) replaces the entropy decoded value with the sum of the entropy decoded value and the determined predicted value of the entropy decoded value. The result of step S1206 is an entropy decoded value block 708, the values of which are shown in Fig. 7d. It can be seen in Fig. 7d that all entropy decoded values within sub-block 710 have now been spatially re-correlated.
In the example shown in Fig. 12, spatial re-correlation is performed on the columns of the block of entropy decoded values and then on the rows, but in other examples, spatial re-correlation may be performed on the rows of the block of entropy decoded values and then on the columns (i.e., steps S1204 and S1206 may be reordered).
In step S1208, the spatial re-correlation module 326 determines whether another level of spatial re-correlation is to be performed. If so, the method returns from step S1208 to step S1202 and spatial re-correlation is performed at the other level. In this example, spatial re-correlation is to be performed at one more level, i.e. at the level of the entire block rather than on the sub-block, so the method returns to step S1202.
In this iteration, in step S1202, the spatial re-correlation module 326 interleaves the entropy-decoded values from the sub-block 710 with the entropy-decoded values in the block of entropy-decoded values that have not been spatially re-correlated. Interleaving may be performed in two stages, for example by interleaving rows and then by interleaving columns. If the input to step S1202 in this iteration is a block of entropy decoded values 708 having values as shown in fig. 7d, the block will be as shown in fig. 7c after the rows are interleaved. Then after the columns are interleaved the block will be as shown in figure 7 b. Thus, in this iteration, the output of step S1202 is an entropy decoded value block 704, the value of which is shown in fig. 7 b.
In this iteration, in step S1204, for each of a plurality of columns of entropy decoded values (e.g., for all columns within block 704), for one or more of the entropy decoded values in the column (e.g., for alternate entropy decoded values in the column within block 704), the spatial re-correlation module 326: (i) determines a predicted value of the entropy decoded value based on one or more other entropy decoded values in the column (e.g., based on neighboring (or adjacent) entropy decoded values in the column within block 704), and (ii) replaces the entropy decoded value with the sum of the entropy decoded value and the determined predicted value of the entropy decoded value. The result of step S1204 in this iteration is an entropy decoded value block 702, the values of which are shown in Fig. 7a.
In this iteration, in step S1206, for each of a plurality of rows of entropy decoded values (e.g., for all rows within block 702), for one or more of the entropy decoded values in the row (e.g., for alternate entropy decoded values in the row within block 702), the spatial re-correlation module 326: (i) determines a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row (e.g., based on neighboring (or adjacent) entropy decoded values in the row within the block 702), and (ii) replaces the entropy decoded value with the sum of the entropy decoded value and the determined predicted value of the entropy decoded value. The result of step S1206 in this iteration is an entropy decoded value block 508, the values of which are shown in Fig. 5b. As can be seen in Fig. 5b, all entropy decoded values within block 508 have now been spatially re-correlated.
In this iteration, in step S1208, the spatial re-correlation module 326 determines that no more levels of spatial re-correlation are to be performed, so the spatial re-correlation process (in step S1010) ends, and the method passes from step S1208 to step S1012. Block 508 represents the decompressed sub-primitive presence indications of the block of sub-primitive presence indications, which may then be used in step S1012 to determine a presence indication at the sample location. In this example, there are two levels of spatial re-correlation. In other examples there may be (only) one level of spatial re-correlation or more than two levels of spatial re-correlation, e.g. there may be three or four levels of spatial re-correlation.
As described above, in the example, the value of the presence indication of the presence state representing the partial presence is zero, the value of the presence indication of the presence state representing the complete presence is one, and the value of the presence indication of the presence state representing the complete absence is two.
In each of steps S1204 and S1206, in which spatial re-correlation is performed on lines of entropy decoded values in the first dimension or the second dimension, the alternate entropy decoded values may be replaced by performing a summation using modulo 3 arithmetic to determine the value of the sum of the entropy decoded value and the determined predicted value of the entropy decoded value. The use of modulo 3 arithmetic means that the value 2 is equivalent to the value -1.
For example, a predicted value of an entropy decoded value based on two adjacent entropy decoded values in the same line (e.g., row or column) may be determined using a prediction function that operates according to the following equation:
and the alternate entropy decoded values ($d_{2i+1}$) may be replaced with the value of the sum of the entropy decoded value ($d_{2i+1}$) and the predicted value of the entropy decoded value according to the following formula:
$$d_{2i+1} := \left(d_{2i+1} + \mathrm{Predict}(d_{2i}, d_{2i+2})\right) \bmod 3,$$
wherein $d_{2i}$ is the (2i)th entropy decoded value in the line, $d_{2i+1}$ is the (2i+1)th entropy decoded value in the line, and $d_{2i+2}$ is the (2i+2)th entropy decoded value in the line. i is an integer ranging from 0 to (N/2) - 1, where N is the number of lines (e.g., rows or columns) in the block or sub-block on which spatial re-correlation is being performed. In this way, each "odd" entropy decoded value in each line may be re-correlated using a prediction of its value based on the neighboring (i.e., adjacent) "even" entropy decoded values in that line.
If $d_{2i+1}$ is the last entropy decoded value in the line, such that $d_{2i+2}$ is not present in the block, the value of $d_{2i}$ is used in place of $d_{2i+2}$ in the prediction function. For example, for entropy decoded values on the right boundary of the block of entropy decoded values, step S1206 predicts the entropy decoded value based only on the neighboring entropy decoded value to the left of the entropy decoded value being predicted.
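The modulo 3 re-correlation of a single line, including the boundary rule, can be sketched in Python. The equation defining the prediction function is not reproduced in this text, so `predict` below is a hypothetical stand-in (predict the shared value when both neighbors agree, otherwise 0) and should not be read as the function used in the described system. The matching `decorrelate_line` is included only to check that re-correlation inverts it.

```python
def predict(a, b):
    # Hypothetical predictor (an assumption, not the equation referred to in
    # the text): the shared value if both neighbors agree, otherwise 0.
    return a if a == b else 0

def _neighbours(line, i):
    left = line[2 * i]
    # At the end of the line, the missing right neighbor is replaced by the left.
    right = line[2 * i + 2] if 2 * i + 2 < len(line) else left
    return left, right

def decorrelate_line(line):
    """Replace each odd-indexed value with (value - prediction) mod 3."""
    out = list(line)
    for i in range(len(line) // 2):
        out[2 * i + 1] = (line[2 * i + 1] - predict(*_neighbours(line, i))) % 3
    return out

def recorrelate_line(line):
    """Replace each odd-indexed value with (value + prediction) mod 3."""
    out = list(line)
    for i in range(len(line) // 2):
        out[2 * i + 1] = (line[2 * i + 1] + predict(*_neighbours(line, i))) % 3
    return out
```

Only the odd-indexed values are modified and the predictions depend only on even-indexed values, so the same predictions are available in both directions. The round trip is therefore lossless for any choice of `predict`.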
In the examples described herein, the sub-primitives are squares, for example as shown in fig. 5b, but in other examples they may be other shapes, for example triangles.
It should be understood that the specific numbers in the examples described herein (e.g., the number of sub-primitive presence indications in the sub-primitive presence indication block) are given by way of example, and that in other implementations these numbers may be different.
Furthermore, the examples provided herein use triangles and barycentric coordinates, but the solutions presented herein are also applicable to (parts of) surfaces that can be represented parametrically, e.g. tensor product patches (such as bicubic patches), spheres, or surfaces of revolution or extrusion. The parameters may be used to index into the presence indications.
The primary examples described herein have used presence indications to accelerate ray tracing, but the method is also applicable to other rendering techniques, such as rasterization. As described in the background section above, GB patents 2538856 and 2522868 describe the use of opacity state maps in rasterization systems to accelerate the processing of punch-through primitives. In particular, an opacity state map is used to indicate whether a block of texels of a texture is fully opaque, fully transparent, partially transparent, or a mixture of these states. The indications in the opacity state map may be used to accelerate the processing of punch-through polygons in a rasterization system. Similar to the presence indications described above with reference to the ray tracing system, each opacity state in the rasterization systems of GB2538856B and GB2522868B is represented by two bits. The compression/decompression methods for presence indications described herein may also be applied to compress/decompress indications of opacity states in rasterization systems such as those described in GB2538856B and GB2522868B. The "partially transparent" state and the "mixture" state may be combined into a single state such that there are only three states, which may then be compressed/decompressed in the same manner as the fully present, partially present, and fully absent states in the ray tracing system described above.
FIG. 13 illustrates a computer system in which the compression and decompression units described herein may be implemented. The computer system includes a CPU 1302, GPU 1304, memory 1306, and other devices 1314, such as a display 1316, speakers 1318, and a camera 1322. Processing block 1310 (corresponding to ray tracing unit 302) is implemented on GPU 1304 and Neural Network Accelerator (NNA) 1311. In other examples, the processing block 1310 may be implemented on the CPU 1302 or within the NNA 1311. The components of the computer system may communicate with each other via a communication bus 1320. Storage 1312 (corresponding to memory 304) is implemented as part of memory 1306.
While Fig. 13 illustrates one implementation of a graphics processing system, it should be appreciated that a similar block diagram may be drawn for an artificial intelligence accelerator system, for example, by replacing the CPU 1302 or GPU 1304 with a Neural Network Accelerator (NNA) 1311, or by adding the NNA as a separate unit. In such cases, the processing block 1310 may also be implemented in the NNA.
The ray tracing unit of Fig. 3 is shown as including several functional blocks. This is merely illustrative and is not intended to define a strict division between the different logic elements of such entities. Each of the functional blocks may be provided in any suitable manner. It should be understood that intermediate values described herein as being formed by the compression and/or decompression unit need not be physically generated by the compression and/or decompression unit at any point, and may merely represent logical values that conveniently describe the processing performed by the compression and/or decompression unit between its input and output.
The compression and/or decompression units described herein may be implemented in hardware on an integrated circuit. The compression and/or decompression unit described herein may be configured to perform any of the methods described herein. Generally, any of the functions, methods, techniques, or components described above may be implemented in software, firmware, hardware (e.g., fixed logic circuitry) or any combination thereof. The terms "module," "functionality," "component," "element," "unit," "block," and "logic" may be used herein to generally represent software, firmware, hardware, or any combination thereof. In the case of a software implementation, the module, functionality, component, element, unit, block or logic represents program code that performs specified tasks when executed on a processor. The algorithms and methods described herein may be executed by one or more processors executing code that causes the processors to perform the algorithms/methods. Examples of a computer-readable storage medium include Random Access Memory (RAM), read-only memory (ROM), optical disks, flash memory, hard disk memory, and other memory devices that can store instructions or other data using magnetic, optical, and other techniques and that can be accessed by a machine.
The terms computer program code and computer readable instructions as used herein refer to any kind of executable code for a processor, including code expressed in a machine language, an interpreted language, or a scripting language. Executable code includes binary code, machine code, byte code, code defining an integrated circuit (such as a hardware description language or netlist), and code expressed in a programming language such as C, Java or OpenCL. The executable code may be, for example, any kind of software, firmware, script, module, or library that, when suitably executed, processed, interpreted, compiled, or run in a virtual machine or other software environment, causes the processor of the computer system at which the executable code is supported to perform the tasks specified by the code.
The processor, computer, or computer system may be any kind of device, machine, or special purpose circuit, or a collection or portion thereof, that has processing capabilities such that instructions can be executed. The processor may be or include any kind of general purpose or special purpose processor, such as CPU, GPU, NNA, a system on a chip, a state machine, a media processor, an Application Specific Integrated Circuit (ASIC), a programmable logic array, a Field Programmable Gate Array (FPGA), or the like. The computer or computer system may include one or more processors.
The present invention is also intended to cover software defining a configuration of hardware as described herein, such as Hardware Description Language (HDL) software, for designing integrated circuits or for configuring programmable chips to perform desired functions. That is, a computer readable storage medium may be provided having encoded thereon computer readable program code in the form of an integrated circuit definition data set that, when processed (i.e., run) in an integrated circuit manufacturing system, configures the system to manufacture a compression and/or decompression unit configured to perform any of the methods described herein, or to manufacture a compression and/or decompression unit comprising any of the apparatus described herein. The integrated circuit definition data set may be, for example, an integrated circuit description.
Accordingly, a method of manufacturing a compression and/or decompression unit as described herein in an integrated circuit manufacturing system may be provided. Furthermore, an integrated circuit definition data set may be provided that, when processed in an integrated circuit manufacturing system, causes a method of manufacturing a compression and/or decompression unit to be performed.
The integrated circuit definition data set may be in the form of computer code, for example, as a netlist, as code for configuring a programmable chip, or as a hardware description language defining hardware suitable for fabrication in an integrated circuit at any level, including as Register Transfer Level (RTL) code, as a high-level circuit representation (such as Verilog or VHDL), and as a low-level circuit representation (such as OASIS (RTM) and GDSII). A higher-level representation that logically defines hardware suitable for fabrication in an integrated circuit (such as RTL) may be processed at a computer system configured to generate a manufacturing definition of an integrated circuit, in the context of a software environment that includes definitions of circuit elements and rules for combining those elements, in order to generate the manufacturing definition of the integrated circuit defined by the representation. As is typically the case when software is executed at a computer system to define a machine, one or more intermediate user steps (e.g., providing commands, variables, etc.) may be required before a computer system configured to generate a manufacturing definition of an integrated circuit can execute the code defining the integrated circuit and so generate the manufacturing definition of that integrated circuit.
An example of processing an integrated circuit definition data set at an integrated circuit manufacturing system to configure the system to manufacture compression and/or decompression units will now be described with respect to fig. 14.
Fig. 14 illustrates an example of an Integrated Circuit (IC) manufacturing system 1402 configured to manufacture compression and/or decompression units as described in any of the examples herein. Specifically, IC fabrication system 1402 includes a layout processing system 1404 and an integrated circuit generation system 1406. The IC fabrication system 1402 is configured to receive an IC definition data set (e.g., defining a compression and/or decompression unit as described in any of the examples herein), process the IC definition data set, and generate an IC (e.g., embodying a compression and/or decompression unit as described in any of the examples herein) from the IC definition data set. Processing of the IC definition data set configures the IC fabrication system 1402 to fabricate an integrated circuit embodying the compression and/or decompression unit as described in any of the examples herein.
Layout processing system 1404 is configured to receive and process the IC definition data set to determine a circuit layout. Methods of determining a circuit layout from an IC definition data set are known in the art and may involve, for example, synthesizing RTL code to determine a gate level representation of the circuit to be generated, for example in terms of logic components (e.g., NAND, NOR, AND, OR, MUX and FLIP-FLOP components). The circuit layout may be determined from the gate level representation of the circuit by determining location information for the logic components. This may be done automatically or with the participation of a user in order to optimize the circuit layout. When the layout processing system 1404 has determined a circuit layout, it may output a circuit layout definition to the IC generation system 1406. The circuit layout definition may be, for example, a circuit layout description.
As is known in the art, the IC generation system 1406 generates ICs according to a circuit layout definition. For example, the IC generation system 1406 may implement a semiconductor device fabrication process for generating ICs that may involve a multi-step sequence of photolithography and chemical processing steps during which electronic circuits are gradually formed on wafers made of semiconductor material. The circuit layout definition may be in the form of a mask that may be used in a lithographic process to generate an IC from the circuit definition. Alternatively, the circuit layout definition provided to the IC generation system 1406 may be in the form of computer readable code that the IC generation system 1406 may use to form an appropriate mask for generating the IC.
The different processes performed by IC fabrication system 1402 may all be implemented in one location, e.g., by one party. Alternatively, the IC manufacturing system 1402 may be a distributed system such that some of the processes may be performed at different locations and by different parties. For example, some of the following stages may be performed at different locations and/or by different parties: (i) synthesizing RTL code representing the IC definition data set to form a gate level representation of the circuit to be generated; (ii) generating a circuit layout based on the gate level representation; (iii) forming a mask according to the circuit layout; and (iv) using the mask to fabricate the integrated circuit.
In other examples, processing of the integrated circuit definition data set at the integrated circuit manufacturing system may configure the system to manufacture the compression and/or decompression unit without processing the integrated circuit definition data set to determine the circuit layout. For example, an integrated circuit definition dataset may define a configuration of a reconfigurable processor, such as an FPGA, and processing of the dataset may configure the IC manufacturing system to generate (e.g., by loading configuration data into the FPGA) the reconfigurable processor having the defined configuration.
In some embodiments, the integrated circuit manufacturing definition data set, when processed in the integrated circuit manufacturing system, may cause the integrated circuit manufacturing system to generate an apparatus as described herein. For example, an apparatus as described herein may be manufactured by configuring an integrated circuit manufacturing system in the manner described above with reference to fig. 14 through an integrated circuit manufacturing definition dataset.
In some examples, the integrated circuit definition data set may include software that runs on, or in combination with, the hardware defined by the data set. In the example shown in fig. 14, the IC generation system may be further configured by the integrated circuit definition data set to load firmware onto the integrated circuit when the integrated circuit is manufactured, in accordance with program code defined in the integrated circuit definition data set, or to otherwise provide the integrated circuit with program code for use with the integrated circuit.
Implementing the concepts set forth in this application in devices, apparatus, modules, and/or systems (and in the methods implemented herein) may give rise to performance improvements when compared with known implementations. The performance improvements may include one or more of increased computational performance, reduced latency, increased throughput, and/or reduced power consumption. During manufacture of such devices, apparatus, modules, and systems (e.g., in integrated circuits), performance improvements can be traded off against the physical implementation, thereby improving the method of manufacture. For example, a performance improvement may be traded against layout area, thereby matching the performance of a known implementation but using less silicon. This may be done, for example, by reusing functional blocks in a serialized fashion or by sharing functional blocks between elements of the devices, apparatus, modules, and/or systems. Conversely, concepts set forth in this application that give rise to improvements in the physical implementation of the devices, apparatus, modules, and systems (such as reduced silicon area) may be traded for improved performance. This may be done, for example, by manufacturing multiple instances of a module within a predefined area budget.
The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the application.

Claims (24)

1. A method of decompressing compressed data to determine an indication of the presence of one or more sub-primitives for use in a rendering system, the method comprising:
receiving a compressed data block for a sub-primitive presence indication block;
reading entropy encoded data from the compressed data block;
performing entropy decoding on the entropy encoded data to determine a block of entropy decoded data values; and
performing spatial re-correlation on the block of entropy decoded values to determine one or more of the sub-primitive presence indications in the sub-primitive presence indication block, the performing spatial re-correlation comprising:
(a) For each row of a plurality of rows of entropy decoded values in a first dimension within the block of entropy decoded values:
for one or more of the entropy decoded values in the row: (i) Determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and (ii) replacing the entropy decoded value with a value of a sum of the entropy decoded value and the determined predicted value of the entropy decoded value; and
(b) For each row of a plurality of rows of entropy decoded values in a second dimension within the block of entropy decoded values:
for one or more of the entropy decoded values in the row: (i) Determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and (ii) replacing the entropy decoded value with a value of a sum of the entropy decoded value and the determined predicted value of the entropy decoded value.
2. The method of claim 1, wherein the one or more entropy decoded values in a row in the first dimension for which predicted values are determined are alternate entropy decoded values in the row in the first dimension, and wherein the one or more entropy decoded values in a row in the second dimension for which predicted values are determined are alternate entropy decoded values in the row in the second dimension.
3. The method of claim 1 or 2, wherein the one or more other entropy decoded values in the row on which the determination of a predicted value is based are adjacent to the entropy decoded value for which the predicted value is determined.
4. A method as claimed in any preceding claim, wherein the plurality of rows of entropy decoded values in a first dimension comprise all rows of entropy decoded values in the first dimension within the block of entropy decoded values, and wherein the plurality of rows of entropy decoded values in a second dimension comprise all rows of entropy decoded values in the second dimension within the block of entropy decoded values.
5. The method of any preceding claim, wherein each of the presence indications indicates a presence state, the presence state being one of: (i) complete presence, (ii) complete absence, and (iii) partial presence.
6. The method of claim 5, wherein the presence state of partial presence is represented by a value of zero in the sub-primitive presence indication block.
7. The method of claim 5 or 6, wherein either:
in the sub-primitive presence indication block, the presence state of complete presence is represented by a value of one and the presence state of complete absence is represented by a value of two; or
in the sub-primitive presence indication block, the presence state of complete presence is represented by a value of two and the presence state of complete absence is represented by a value of one.
8. The method of any preceding claim, wherein said replacing the entropy decoded value with a value of a sum of the entropy decoded value and the determined predicted value of the entropy decoded value comprises performing the summation using a modulo 3 operation to determine the value of the sum.
9. A method as claimed in any preceding claim, wherein said determining a predicted value of said entropy decoded value based on one or more other entropy decoded values in said row is performed using a prediction function, said function operating according to:
and wherein said replacing the entropy decoded value with a value of the sum of the entropy decoded value and the determined predicted value of the entropy decoded value is performed according to:
$d_{2i+1} := d_{2i+1} + \mathrm{Predict}(d_{2i}, d_{2i+2}) \bmod 3$,
wherein $d_{2i}$ is the $(2i)$th entropy decoded value in the row, $d_{2i+1}$ is the $(2i+1)$th entropy decoded value in the row, and $d_{2i+2}$ is the $(2i+2)$th entropy decoded value in the row.
10. The method of claim 9, wherein if $d_{2i+1}$ is the last entropy decoded value in the row, such that $d_{2i+2}$ is not present in the block of entropy decoded values, $d_{2i}$ is used in place of $d_{2i+2}$ in the prediction function.
11. The method of any preceding claim, wherein either:
the rows of entropy decoded values in the first dimension within the block of entropy decoded values are horizontal rows of the block of entropy decoded values, and the rows of entropy decoded values in the second dimension within the block of entropy decoded values are columns of the block of entropy decoded values; or
the rows of entropy decoded values in the first dimension within the block of entropy decoded values are columns of the block of entropy decoded values, and the rows of entropy decoded values in the second dimension within the block of entropy decoded values are horizontal rows of the block of entropy decoded values.
12. The method of any preceding claim, wherein prior to (a) and (b), the performing spatial re-correlation further comprises:
(c) For each row of a plurality of rows of entropy decoded values in the first dimension within an entropy decoded value sub-block within the entropy decoded value block:
for one or more of the entropy decoded values in the row: (i) Determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and (ii) replacing the entropy decoded value with a value of a sum of the entropy decoded value and the determined predicted value of the entropy decoded value; and
(d) For each row of a plurality of rows of entropy decoded values in the second dimension within the sub-block of entropy decoded values:
for one or more of the entropy decoded values in the row: (i) Determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and (ii) replacing the entropy decoded value with a value of a sum of the entropy decoded value and the determined predicted value of the entropy decoded value; and
(e) interleaving the entropy decoded values from the sub-block with entropy decoded values in the block of entropy decoded values that have not yet been spatially re-correlated.
13. The method of any preceding claim, wherein the performing entropy decoding on the entropy encoded data to determine a block of entropy decoded data values comprises:
for each of a plurality of subsets of encoded data values, reading from the entropy encoded data an indication identifying a number of bits for each of the encoded data values in the subset; and
parsing the encoded data values in the entropy encoded data based on the identified number of bits, thereby interpreting the encoded data values.
14. The method of claim 13, wherein the entropy decoded data values are determined by selectively prepending leading zeros to the interpreted encoded data values such that each of the entropy decoded data values has the same number of bits as each of the presence indications in the sub-primitive presence indication block.
15. The method of claim 13 or 14, wherein each of the subsets of encoded data values is a 2 x 2 subset of encoded data values.
16. The method of any of claims 1-12, wherein the performing entropy decoding on the entropy encoded data to determine a block of entropy decoded data values comprises performing Huffman decoding.
17. The method of any of claims 1-12, wherein the performing entropy decoding on the entropy encoded data to determine a block of entropy decoded data values comprises performing run-length decoding.
18. The method of claim 17, wherein the run-length decoding is performed according to a raster scan order or a Morton order.
19. The method of any preceding claim, further comprising:
receiving an indication of a sample location within the sub-primitive presence indication block for which a presence indication is to be determined; and
determining a presence indication for the sample location using one or more of the determined sub-primitive presence indications in the sub-primitive presence indication block.
20. The method of claim 19, wherein the rendering system is a ray tracing system, and wherein the method further comprises using the determined presence indication for the sample location to determine whether a primitive is present at a determined intersection with a ray, as part of performing an intersection test on the ray in the ray tracing system.
21. The method of any preceding claim, wherein the rendering system is a ray tracing system or a rasterization system.
22. A decompression unit configured to decompress compressed data to determine one or more sub-primitive presence indications for use in a rendering system, the decompression unit being configured to receive a compressed data block for a sub-primitive presence indication block, the decompression unit comprising:
an entropy decoding module configured to perform entropy decoding on entropy encoded data that has been read from the compressed data block to determine a block of entropy decoded data values; and
a spatial re-correlation module configured to perform spatial re-correlation on the block of entropy decoded values to determine one or more of the sub-primitive presence indications in the sub-primitive presence indication block, the performing spatial re-correlation comprising:
(a) For each row of a plurality of rows of entropy decoded values in a first dimension within the block of entropy decoded values:
for one or more of the entropy decoded values in the row: (i) Determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and (ii) replacing the entropy decoded value with a value of a sum of the entropy decoded value and the determined predicted value of the entropy decoded value; and
(b) For each row of a plurality of rows of entropy decoded values in a second dimension within the block of entropy decoded values:
for one or more of the entropy decoded values in the row: (i) Determining a predicted value of the entropy decoded value based on one or more other entropy decoded values in the row, and (ii) replacing the entropy decoded value with a value of a sum of the entropy decoded value and the determined predicted value of the entropy decoded value.
23. A computer readable storage medium having computer readable code stored thereon, the computer readable code being configured to cause the method of any of claims 1 to 21 to be performed when the code is run.
24. A computer readable storage medium having stored thereon a computer readable dataset description of an integrated circuit which, when processed in an integrated circuit manufacturing system, causes the integrated circuit manufacturing system to manufacture a decompression unit as claimed in claim 22.
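
The spatial re-correlation recited in claims 1 and 8 to 10 can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the implementation of the decompression unit. The definition of the prediction function in claim 9 is not reproduced in the text above, so the `predict` function below is a placeholder chosen only for illustration. The helper names (`recorrelate`, `decorrelate`, `_pass_1d`) are likewise hypothetical, and `decorrelate` is an assumed compression-side counterpart included only so that the round trip can be checked. The sketch takes the first dimension to be the horizontal rows and the second dimension to be the columns, which is one of the two options in claim 11.

```python
from typing import Callable, List

Block = List[List[int]]
Predictor = Callable[[int, int], int]


def predict(left: int, right: int) -> int:
    """Placeholder predictor (an assumption, not taken from the claims):
    predict the shared value when both neighbours agree, otherwise 0."""
    return left if left == right else 0


def _pass_1d(line: List[int], sign: int, fn: Predictor) -> List[int]:
    """Update every odd-indexed value d[2i+1] from its even-indexed
    neighbours d[2i] and d[2i+2], using modulo 3 arithmetic (claim 8)."""
    out = list(line)
    for k in range(1, len(out), 2):
        left = out[k - 1]
        # Claim 10: if d[2i+2] does not exist, d[2i] is used in its place.
        right = out[k + 1] if k + 1 < len(out) else left
        out[k] = (out[k] + sign * fn(left, right)) % 3
    return out


def _rows(block: Block, sign: int, fn: Predictor) -> Block:
    """Apply the 1D pass to every horizontal row of the block."""
    return [_pass_1d(row, sign, fn) for row in block]


def _cols(block: Block, sign: int, fn: Predictor) -> Block:
    """Apply the 1D pass to every column by transposing, then transposing back."""
    transposed = [list(col) for col in zip(*block)]
    return [list(row) for row in zip(*_rows(transposed, sign, fn))]


def recorrelate(block: Block, fn: Predictor = predict) -> Block:
    """Steps (a) and (b) of claim 1: add predictions along the first
    dimension, then along the second dimension."""
    return _cols(_rows(block, +1, fn), +1, fn)


def decorrelate(block: Block, fn: Predictor = predict) -> Block:
    """Assumed inverse for testing: subtract predictions in the reverse order."""
    return _rows(_cols(block, -1, fn), -1, fn)


if __name__ == "__main__":
    # 0 = partial presence, 1 and 2 = the two complete states (claims 6 and 7).
    presence = [
        [1, 1, 1, 1],
        [1, 1, 0, 2],
        [1, 0, 2, 2],
        [0, 2, 2, 2],
    ]
    residuals = decorrelate(presence)
    assert recorrelate(residuals) == presence
    print(residuals)
```

Because only the odd-indexed values in each line are modified, and each prediction reads only even-indexed values, every pass can be undone exactly by subtracting the same prediction. The round trip therefore holds for any deterministic choice of `predict`, which is why the placeholder is sufficient for this sketch.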
CN202310605511.5A 2022-05-30 2023-05-26 Compression and decompression of sub-primitive presence indication for use in a rendering system Pending CN117156148A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
GB2207936.2 2022-05-30
GB2207939.6 2022-05-30
GB2207939.6A GB2614350A (en) 2022-05-30 2022-05-30 Compression and decompression of sub-primitive presence indications for use in intersection testing in a rendering system

Publications (1)

Publication Number Publication Date
CN117156148A true CN117156148A (en) 2023-12-01

Family

ID=82324254

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310605511.5A Pending CN117156148A (en) 2022-05-30 2023-05-26 Compression and decompression of sub-primitive presence indication for use in a rendering system

Country Status (2)

Country Link
CN (1) CN117156148A (en)
GB (1) GB2614350A (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11164359B2 (en) * 2019-12-27 2021-11-02 Intel Corporation Apparatus and method for using alpha values to improve ray tracing efficiency

Also Published As

Publication number Publication date
GB202207939D0 (en) 2022-07-13
GB2614350A (en) 2023-07-05

Similar Documents

Publication Publication Date Title
CN109660261A (en) Data compression
US12020362B2 (en) Methods and control stream generators for generating a control stream for a tile group in a graphics processing system
CN111508056B (en) Graphics processing system using extended transform level masks
CN113256477B (en) Method and tiling engine for storing tiling information in a graphics processing system
CN117788676A (en) Ray tracing
EP4116924A1 (en) Mapping multi-dimensional coordinates to a 1d space
CN117156148A (en) Compression and decompression of sub-primitive presence indication for use in a rendering system
CN117152277A (en) Compression and decompression of sub-primitive presence indication for use in a rendering system
EP4287133A1 (en) Compression and decompression of sub-primitive presence indications for use in a rendering system
US20210352292A1 (en) Methods and Decompression Units for Decompressing Image Data Compressed Using Pattern-Based Compression
EP4287128A1 (en) Compression and decompression of sub-primitive presence indications for use in a rendering system
GB2593708A (en) Methods and decompression units for decompressing image data compressed using pattern-based compression
EP4290461A1 (en) Compression of sub-primitive presence indications for use in a rendering system
US20240119634A1 (en) Compression and decompression of sub-primitive presence indications for use in a rendering system
CN117156149A (en) Compression and decompression of sub-primitive presence indication for use in a rendering system
CN117152276A (en) Compression and decompression of sub-primitive presence indication for use in a rendering system
GB2613418A (en) Compression and decompression of sub-primitive presence indications for use in intersection testing in a rendering system
CN117152278A (en) Compression and decompression of sub-primitive presence indication for use in a rendering system
GB2614351A (en) Compression and decompression of sub-primitive presence indications for use in intersection testing in a rendering system
CN117152279A (en) Compression and decompression of sub-primitive presence indication for use in a rendering system
GB2593706A (en) Pattern-based image data compression
GB2616637A (en) Methods and apparatus for processing graphics data
CN117788675A (en) Ray tracing

Legal Events

Date Code Title Description
PB01 Publication