GB2609246A - Anisotropic texture filtering - Google Patents
Anisotropic texture filtering
- Publication number
- GB2609246A GB2110744.6A GB202110744A
- Authority
- GB
- United Kingdom
- Prior art keywords
- filter
- anisotropic
- sampling points
- texture
- filtering
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/04—Texture mapping
Abstract
A method of performing anisotropic texture filtering. The method includes generating one or more parameters describing an elliptical footprint in texture space, performing isotropic filtering at each sampling point of a set of sampling points in an ellipse to be sampled to produce a plurality of isotropic filter results with the ellipse to be sampled being based on the elliptical footprint. Weights of an anisotropic filter are then selected, based on one or more parameters of the set of sampling points and one or more parameters of the ellipse to be sampled with the weights being selected to minimize a cost function that penalises high frequencies in the filter response of the anisotropic filter under a constraint that the variance of the anisotropic filter is related to an anisotropic ratio squared, the anisotropic ratio being the ratio of a major radius of the ellipse to be sampled and a minor axis of the ellipse to be sampled. The plurality of isotropic filter results are then combined using the selected weights of the anisotropic filter to generate at least a portion of a filter result.
Description
ANISOTROPIC TEXTURE FILTERING
BACKGROUND
[0001] A graphics processing unit (GPU) may be used to process geometry data (e.g. vertices defining primitives or patches) generated by an application in order to generate image data. Specifically, a GPU may determine pixel values (e.g. colour values) of an image to be stored in a frame buffer which may be output to a display.
[0002] A GPU may process the received geometry data in two phases: a geometry processing phase and a rasterization phase. In the geometry processing phase a vertex shader is applied to the geometry data (e.g. vertices defining primitives or patches) received from an application (e.g. a game application) to transform the geometry data into the rendering space (e.g. screen space). Other functions such as clipping and culling to remove geometry (e.g. primitives or patches) that falls outside of a viewing frustum, and/or lighting/attribute processing may also be performed in the geometry processing phase.
[0003] In the rasterization phase the transformed primitives are mapped to pixels and the colour is identified for each pixel. This may comprise rasterizing the transformed geometry data (e.g. by performing scan conversion) to generate primitive fragments. The term "fragment" is used herein to mean a sample of a primitive at a sampling point, which is to be processed to render pixels of an image. In some examples, there may be a one-to-one mapping of pixels to fragments. However, in other examples there may be more fragments than pixels, and this oversampling can allow for higher quality rendering of pixel values.
[0004] The primitive fragments that are hidden (e.g. hidden by other fragments) may then be removed through a process called hidden surface removal. Texturing and/or shading may then be applied to primitive fragments that are not hidden to determine pixel values of a rendered image. For example, in some cases, the colour of a fragment may be identified by applying a texture (e.g. an image) to the fragment. As is known to those of skill in the art, a texture, which may also be referred to as a texture map, is an image which is used to represent precomputed colour, lighting, shadows etc. Texture maps are formed of a plurality of texels (i.e. colour values), which may also be referred to as texture elements or texture pixels. Applying a texture to a fragment generally comprises mapping the location of the fragment in the render space to a position or location in the texture and using the colour at that position in the texture as the texture colour for the fragment. The texture colour may then be used to determine the final colour for the fragment. A fragment whose colour is determined from a texture may be referred to as a texture mapped fragment.
[0005] As fragment positions rarely map directly to a specific texel, the texture colour of a fragment is typically identified through a process called texture filtering. In the simplest case, which may be referred to as point sampling, point filtering or nearest-neighbour interpolation, a fragment in screen space is mapped to a position in the texture (i.e. to a position in texture space) and the value (i.e. colour) of the closest texel to the identified position in the texture is used as the texture colour of the fragment. However, in most cases, the texture colour for a fragment is determined using more complicated filtering techniques which combine a plurality of texels close to the identified position in the texture. Examples of more complicated filtering techniques include isotropic filtering techniques and anisotropic filtering techniques. Isotropic filtering techniques uniformly filter textures across perpendicular axes, whereas anisotropic filtering techniques do not uniformly filter textures, instead filtering textures based on the local (i.e. anisotropic) warping that the texture undergoes in the neighbourhood of a fragment. In some cases, the warping may take into account the texture's location on the screen relative to the camera angle. Examples of isotropic filtering techniques include, but are not limited to, bilinear filtering and trilinear filtering.
[0006] In bilinear filtering the four nearest texels to the identified position in the texture are combined by a pairwise linear weighted average according to distance. Compared with point sampling, this generally provides a smoother reconstruction of a continuous image from the bitmapped texture. Bilinear filtering has proven to be particularly suitable for applications in which textures, as a result of texture mapping, are magnified. However, neither point sampling nor bilinear filtering provides an adequate solution when textures are minified, as neither takes into account the size of the fragment footprint in texture space.
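The pairwise weighted average described in paragraph [0006] can be illustrated with a minimal Python sketch. The function name `bilinear_sample`, the list-of-rows texture layout, the convention that texel centres sit at integer coordinates, and the clamp-to-edge behaviour are all assumptions made for illustration; they are not taken from the specification.

```python
import math

def bilinear_sample(texture, u, v):
    """Bilinearly filter a 2D texture (a list of rows) at texel-space position (u, v).

    Assumption: texel centres sit at integer coordinates and lookups clamp to the edge.
    """
    height, width = len(texture), len(texture[0])
    # Integer coordinates of the top-left texel of the 2x2 neighbourhood.
    x0 = min(max(math.floor(u), 0), width - 1)
    y0 = min(max(math.floor(v), 0), height - 1)
    x1 = min(x0 + 1, width - 1)
    y1 = min(y0 + 1, height - 1)
    # Fractional distances used as the linear weights.
    fx = min(max(u - x0, 0.0), 1.0)
    fy = min(max(v - y0, 0.0), 1.0)
    # Interpolate along u on both rows, then along v between the rows.
    top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx
    bottom = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx
    return top * (1 - fy) + bottom * fy
```

Sampling halfway between four texels returns their plain average, while sampling exactly on a texel centre returns that texel unchanged.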
[0007] Point sampling and bilinear filtering can be combined with a technique referred to as mipmapping. In mipmapping, a series (or pyramid) of mipmaps is pre-computed (e.g. generated in advance and/or offline). Each mipmap is a lower resolution version of the original texture. Specifically, according to standards, each mipmap has a height and width that are a factor of 2 smaller than the previous level, wherein odd dimensions are rounded down, and any dimension less than one is rounded up to one. The standards assign an integer level of detail (LOD) to each mipmap (zero for the highest resolution and increasing by one for each subsequent level). Mipmaps allow an appropriate level of detail to be selected for a fragment, in the sense that the mipmap level whose texel footprints most closely match the fragment's footprint is a good candidate for filtering. Specifically, higher resolution mipmaps can be used for fragments/objects that are closer to the screen/viewer, and lower resolution mipmaps can be used for fragments/objects that are further from the screen/viewer. Mipmaps thus provide an efficient solution to enable texture minification without having to introduce additional filtering, with potentially unbounded computation and memory bandwidth cost. When point sampling and bilinear filtering are used with mipmapping, the texel(s) are selected from the closest mipmap level (or a scaled version of the closest mipmap level).
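As an illustration of the sizing rule in paragraph [0007], the following Python sketch lists the dimensions of each mipmap level. The function name `mip_chain_dimensions` is hypothetical; the index of each entry in the returned list corresponds to the integer level of detail.

```python
def mip_chain_dimensions(width, height):
    """Return [(width, height), ...] for every mipmap level, LOD 0 first."""
    dims = [(width, height)]
    while width > 1 or height > 1:
        # Halve each dimension, rounding odd sizes down and never going below one.
        width = max(width // 2, 1)
        height = max(height // 2, 1)
        dims.append((width, height))
    return dims
```

For an 8x3 texture this gives levels of 8x3, 4x1, 2x1 and 1x1, showing both the rounding down of the odd height and the clamp to one.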
[0008] Trilinear filtering comprises performing bilinear filtering on the two closest mipmap levels (one higher resolution and one lower resolution) and then linearly interpolating between the results of the bilinear filtering. In analogy with bilinear filtering, trilinear filtering provides a smoother approximation of the continuous range of minification that a texture may undergo.
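The two-step structure of trilinear filtering described in paragraph [0008] can be sketched as follows. This is an illustrative Python sketch only: the names `bilinear` and `trilinear`, the use of normalised coordinates (s, t) with texel centres at (i + 0.5)/size, the clamp-to-edge lookups, and the representation of the mipmap pyramid as a list of 2D lists are assumptions, not details from the specification.

```python
import math

def bilinear(texture, s, t):
    """Bilinear filter at normalised coordinates (s, t); clamps at the edges."""
    height, width = len(texture), len(texture[0])
    # Convert to texel space, assuming texel centres at i + 0.5.
    x = s * width - 0.5
    y = t * height - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def texel(ix, iy):
        ix = min(max(ix, 0), width - 1)
        iy = min(max(iy, 0), height - 1)
        return texture[iy][ix]

    top = texel(x0, y0) * (1 - fx) + texel(x0 + 1, y0) * fx
    bottom = texel(x0, y0 + 1) * (1 - fx) + texel(x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bottom * fy

def trilinear(mips, s, t, lod):
    """Blend bilinear results from the two mipmap levels that bracket `lod`."""
    lod = min(max(lod, 0.0), len(mips) - 1)
    lower = int(math.floor(lod))            # higher-resolution level
    upper = min(lower + 1, len(mips) - 1)   # lower-resolution level
    frac = lod - lower
    return bilinear(mips[lower], s, t) * (1 - frac) + bilinear(mips[upper], s, t) * frac
```

A fractional level of detail of 0.5 weights the two bracketing levels equally; an integer level of detail reduces to plain bilinear filtering on that level.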
[0009] Neither bilinear nor trilinear filtering takes into account the fact that a fragment footprint may be warped by different amounts in different directions (e.g. when the texture is at a receding angle with respect to the screen/viewer), making it difficult to approximate the fragment footprint in texture space using a single parameter (e.g. the level of detail). In such cases, bilinear or trilinear filtering can produce blurry results.
[0010] Anisotropic filtering addresses this problem by combining several texels around the identified position in the texture, but on a sample pattern mapped according to the projected shape of the fragment in screen space onto the texture (i.e. in texture space). While anisotropic filtering can reduce blur at extreme viewing angles, anisotropic filtering is more computationally intensive than isotropic filtering.
[0011] The texture colour(s) output by the texture filtering may then be used as input to a fragment shader. As is known to those of skill in the art, a fragment shader (which may alternatively be referred to as a pixel shader) is a program (e.g. a set of instructions) that operates on individual fragments to determine the colour, brightness, contrast etc. thereof. A fragment shader may receive as input a fragment (e.g. the position thereof) and one or more other input parameters (e.g. texture co-ordinates) and output a colour value in accordance with a specific shader program. In some cases, the output of a pixel shader may be further processed. For example, where there are more samples than pixels, an anti-aliasing technique, such as multi-sample anti-aliasing (MSAA), may be used to generate the colour for a particular pixel from multiple samples (which may be referred to as sub-samples). Anti-aliasing techniques apply a filter, such as, but not limited to, a box filter to the multiple samples to generate a single colour value for a pixel.
[0012] A GPU which performs hidden surface removal prior to performing texturing and/or shading is said to implement 'deferred' rendering. In other examples, a GPU might not implement deferred rendering in which case texturing and shading may be applied to fragments before hidden surface removal is performed on those fragments. In either case, the rendered pixel values may be stored in memory (e.g. frame buffer).
[0013] The embodiments described below are provided by way of example only and are not limiting of implementations which solve any or all of the disadvantages of known methods and hardware for performing anisotropic texture filtering.
SUMMARY
[0014] This summary is provided to introduce a selection of concepts that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
[0015] Described herein are methods of performing anisotropic texture filtering. The methods include: generating one or more parameters describing an elliptical footprint in texture space; performing isotropic filtering at each sampling point of a set of sampling points in an ellipse to be sampled to produce a plurality of isotropic filter results, the ellipse to be sampled based on the elliptical footprint; selecting, based on one or more parameters of the set of sampling points and one or more parameters of the ellipse to be sampled, weights of an anisotropic filter that minimize a cost function that penalises high frequencies in the filter response of the anisotropic filter under a constraint that the variance of the anisotropic filter is related to an anisotropic ratio squared, the anisotropic ratio being the ratio of a major radius of the ellipse to be sampled and a minor axis of the ellipse to be sampled; and combining the plurality of isotropic filter results using the selected weights of the anisotropic filter to generate at least a portion of a filter result.
[0016] A first aspect provides a method of performing anisotropic texture filtering, the method comprising: generating one or more parameters describing an elliptical footprint in texture space; performing isotropic filtering at each sampling point of a set of sampling points in an ellipse to be sampled to produce a plurality of isotropic filter results, the ellipse to be sampled based on the elliptical footprint; selecting, based on one or more parameters of the set of sampling points and one or more parameters of the ellipse to be sampled, weights of an anisotropic filter that minimize a cost function that penalises high frequencies in the filter response of the anisotropic filter under a constraint that the variance of the anisotropic filter is related to an anisotropic ratio squared, the anisotropic ratio being the ratio of a major radius of the ellipse to be sampled and a minor axis of the ellipse to be sampled; and combining the plurality of isotropic filter results using the selected weights of the anisotropic filter to generate at least a portion of a filter result.
[0017] A second aspect provides a method of generating an image, the method comprising performing the method of the first aspect, and generating an image based on the at least a portion of the filter result.
[0018] A third aspect provides a texture filtering unit for use in a graphics processing system, the texture filtering unit configured to perform the method of the first aspect.
[0019] A fourth aspect provides a graphics processing system comprising the texture filtering unit of the third aspect.
[0020] The texture filtering units and/or the graphics processing systems described herein may be embodied in hardware on an integrated circuit. There may be provided a method of manufacturing, at an integrated circuit manufacturing system, a texture filtering unit and/or a graphics processing system as described herein. There may be provided an integrated circuit definition dataset that, when processed in an integrated circuit manufacturing system, configures the system to manufacture a texture filtering unit and/or a graphics processing system as described herein. There may be provided a non-transitory computer readable storage medium having stored thereon a computer readable description of a texture filtering unit or a graphics processing system that, when processed in an integrated circuit manufacturing system as described herein, causes the integrated circuit manufacturing system to manufacture an integrated circuit embodying the texture filtering unit or the graphics processing system.
[0021] There may be provided an integrated circuit manufacturing system comprising: a non-transitory computer readable storage medium having stored thereon a computer readable description of a texture filtering unit or a graphics processing system as described herein; a layout processing system configured to process the computer readable description so as to generate a circuit layout description of an integrated circuit embodying the texture filtering unit or the graphics processing system; and an integrated circuit generation system configured to manufacture the texture filtering unit or the graphics processing system according to the circuit layout description.
[0022] There may be provided computer program code for performing a method as described herein. There may be provided a non-transitory computer readable storage medium having stored thereon computer readable instructions that, when executed at a computer system, cause the computer system to perform the methods as described herein.
[0023] The above features may be combined as appropriate, as would be apparent to a skilled person, and may be combined with any of the aspects of the examples described herein.
BRIEF DESCRIPTION OF THE DRAWINGS
[0024] Examples will now be described in detail with reference to the accompanying drawings in which:
[0025] FIG. 1 is a schematic diagram illustrating mapping a circular footprint in screen space to an elliptical footprint in texture space;
[0026] FIG. 2 is a schematic diagram illustrating a first example method of performing anisotropic texture filtering;
[0027] FIG. 3 is a schematic diagram illustrating a second example method of performing anisotropic texture filtering;
[0028] FIG. 4 is a flow diagram of an example method of performing anisotropic texture filtering in accordance with an embodiment;
[0029] FIG. 5 is a schematic diagram illustrating the method of FIG. 4;
[0030] FIG. 6 is a schematic diagram illustrating a symmetric anisotropic texture filtering method in which a desired elliptical footprint is approximated using two concentric ellipses, one smaller than the desired ellipse and one larger than the desired ellipse;
[0031] FIG. 7 is a schematic diagram illustrating an asymmetric anisotropic texture filtering method in which a desired elliptical footprint is approximated using two ellipses of differing eccentricity;
[0032] FIG. 8 is an example method of combining the isotropic filtering results of the method of FIG. 4 using a Gaussian filter;
[0033] FIG. 9 is a schematic diagram illustrating an example set of isotropic filtering results for an example set of sampling points of a texture;
[0034] FIG. 10 is a graph illustrating linear interpolation factors for different anisotropic ratios;
[0035] FIG. 11 is a graph illustrating linear interpolation factors for different anisotropic ratios for a truncated set of samples;
[0036] FIG. 12 is a flow diagram of a second example method of performing anisotropic texture filtering in accordance with an embodiment;
[0037] FIG. 13 is a graph showing the impulse response for two example filters;
[0038] FIG. 14 is a graph showing the frequency response for the two example filters of FIG. 13;
[0039] FIG. 15 is a block diagram of an example graphics processing system which comprises a texture filtering unit configured to perform the method of FIG. 4, the method of FIG. 8 and/or the method of FIG. 12;
[0040] FIG. 16 is a block diagram of an example computer system in which the texture filtering units and/or the graphics processing systems described herein may be implemented; and
[0041] FIG. 17 is a block diagram of an example integrated circuit manufacturing system for generating an integrated circuit embodying a texture filtering unit and/or a graphics processing system as described herein.
[0042] The accompanying drawings illustrate various examples. The skilled person will appreciate that the illustrated element boundaries (e.g., boxes, groups of boxes, or other shapes) in the drawings represent one example of the boundaries. It may be that in some examples, one element may be designed as multiple elements or that multiple elements may be designed as one element. Common reference numerals are used throughout the figures, where appropriate, to indicate similar features.
DETAILED DESCRIPTION
[0043] The following description is presented by way of example to enable a person skilled in the art to make and use the invention. The present invention is not limited to the embodiments described herein and various modifications to the disclosed embodiments will be apparent to those skilled in the art. Embodiments are described by way of example only.
[0044] As described above, texture mapping is a process in which a texture (i.e. an image) is mapped onto an object in a 3D scene. For example, a texture representing a brick pattern may be applied to a wall object to make it look as if the wall is made of brick. The term "screen space" is used herein to represent the 3D coordinate system of the display in which 3D objects such as primitives and patches are defined. Each pixel in screen space is defined by pixel coordinates (x, y) and a depth z. The term "texture space" is used herein to represent the 2D coordinate system of a texture. Each texel in texture space is defined by texture coordinates (u, v).
[0045] In anisotropic texture filtering, several texels around an identified position in a texture are combined, but on a sample pattern mapped according to the projected shape of the filter in screen space onto the texture (i.e. in texture space). Anisotropic texture filtering can improve the look of textures that are angled and farther from the camera compared to other filtering methods, such as isotropic texture filtering methods. One method known for implementing anisotropic texture filtering is the elliptical weighted average (EWA) filter technique first proposed by Paul S. Heckbert and Ned Greene. In the EWA technique, which is described with reference to FIG. 1, pixels are treated as having a circular footprint 102 in screen space 104 which projects to an ellipse 106 with arbitrary orientation in texture space 108. The texture filtering result is then calculated as the convolution of the texels inside the elliptical footprint with the projected weights of the pixel filter. If the centre of the pixel in texture space can be translated to (0,0) then the elliptical footprint of a pixel can be calculated according to equation (1), where $d^2$ is the distance squared from the centre of the pixel when the pixel is mapped back to screen space:
$$d^2(u,v) = Au^2 + Buv + Cv^2 \qquad (1)$$
where
$$A_{nn} = (\partial v/\partial x)^2 + (\partial v/\partial y)^2$$
$$B_{nn} = -2\,(\partial u/\partial x \cdot \partial v/\partial x + \partial u/\partial y \cdot \partial v/\partial y)$$
$$C_{nn} = (\partial u/\partial x)^2 + (\partial u/\partial y)^2$$
$$F = A_{nn}C_{nn} - B_{nn}^2/4$$
$$A = A_{nn}/F, \quad B = B_{nn}/F, \quad C = C_{nn}/F.$$
[0046] The partial derivatives ($\partial u/\partial x$, $\partial v/\partial x$, $\partial u/\partial y$, $\partial v/\partial y$) represent the rate of change of u and v in texture space relative to changes in x and y in screen space. Texels inside the elliptical footprint are then sampled, weighted (according to a filter profile), and accumulated. The result is then divided by the sum of the weights (which is the elliptical filter's volume in texture space).
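The calculation of equation (1) and the sample, weight, accumulate and normalise steps of paragraph [0046] can be sketched in Python as follows. This is a brute-force illustration under stated assumptions, not a hardware-oriented implementation: the function names are hypothetical, the Gaussian profile `exp(-alpha * d2)` with `alpha = 2.0` is an assumed filter profile, and texel centres are assumed to sit at integer coordinates.

```python
import math

def ellipse_coefficients(du_dx, dv_dx, du_dy, dv_dy):
    """Return (A, B, C) of equation (1) from the four partial derivatives."""
    a_nn = dv_dx ** 2 + dv_dy ** 2
    b_nn = -2.0 * (du_dx * dv_dx + du_dy * dv_dy)
    c_nn = du_dx ** 2 + du_dy ** 2
    f = a_nn * c_nn - b_nn ** 2 / 4.0
    if f == 0.0:
        raise ValueError("degenerate mapping: the footprint collapses to a line")
    return a_nn / f, b_nn / f, c_nn / f

def d2(coeffs, u, v):
    """Screen-space distance squared of texture-space offset (u, v) from the centre."""
    a, b, c = coeffs
    return a * u * u + b * u * v + c * v * v

def ewa_filter(texture, centre_u, centre_v, coeffs, alpha=2.0):
    """Weighted average of the texels whose centres fall inside the ellipse d2 <= 1."""
    total, weight_sum = 0.0, 0.0
    for y, row in enumerate(texture):
        for x, texel in enumerate(row):
            q = d2(coeffs, x - centre_u, y - centre_v)
            if q <= 1.0:
                w = math.exp(-alpha * q)  # assumed Gaussian filter profile
                total += w * texel
                weight_sum += w
    return total / weight_sum if weight_sum > 0.0 else None
```

With this normalisation, the texture-space offsets ($\partial u/\partial x$, $\partial v/\partial x$) and ($\partial u/\partial y$, $\partial v/\partial y$), which correspond to a one-pixel step in x and y respectively, both evaluate to $d^2 = 1$ and so lie on the boundary of the footprint.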
[0047] EWA, when used in conjunction with a Gaussian filter profile, which may be referred to herein as Gaussian EWA, is considered to be one of the highest quality texture filtering techniques and is often used as a benchmark to measure the quality of other filtering techniques. However, the EWA technique has proven difficult to implement in hardware. Specifically, it can be expensive, in terms of computing resources, to calculate the weights and ellipse parameters, and in some cases may require obtaining many texels.
[0048] Different methods known to the Applicant, which is not an admission that they are well-known, have been proposed to approximate Gaussian EWA which can be more easily implemented in hardware. Some of these techniques involve performing isotropic filtering, such as trilinear filtering, at several points along a line in the ellipse and combining the results of the isotropic filtering. For example, in one method which is described with reference to FIG. 2, and may be referred to as the TEXRAM method, the ellipse 202 is represented by a parallelogram 204, and trilinear filtering 206 is performed at several sample points 208 (which may be referred to as probes) along the major axis 210 of the parallelogram 204, and a weighted average of the results of the trilinear filtering is generated. Specifically, TEXRAM uses the four partial derivatives ($\partial u/\partial x$, $\partial v/\partial x$, $\partial u/\partial y$, $\partial v/\partial y$) to create two vectors in texture space, ($\partial u/\partial x$, $\partial v/\partial x$) and ($\partial u/\partial y$, $\partial v/\partial y$). The sample points 208 are placed along the line 210 that has the length and slope of the longer of the two vectors.
[0049] In another method which is illustrated with reference to FIG. 3, and may be referred to as the Feline method, the major axis 302 of the ellipse 304 is generated, trilinear filtering 306 is performed at several sample points 308 (which may be referred to as probes) along the major axis 302 of the ellipse, and the results of the trilinear filtering are combined with Gaussian weights. The length of the sampling line (lineLength) is calculated as set out in equation (2) below, where $\rho_+$ is the major radius of the ellipse and $\rho_-$ is the minor radius of the ellipse. The sampling points are distributed symmetrically about the midpoint $(u_m, v_m)$ 310 of the sampling line such that the location of the nth sampling point $(u_n, v_n)$ can be calculated as shown in equation (3) below, wherein n = ±1, ±3, ... if the number of probes is even; and n = 0, ±2, ±4, ... if the number of probes is odd. The distance between the sampling points is $\sqrt{\Delta u^2 + \Delta v^2}$, where $\Delta u$ and $\Delta v$ are calculated in accordance with equations (4) and (5) shown below, where theta is the angle of the minor axis and iProbes is the number of probes. A Gaussian weight is then applied to each probe n by computing the distance squared of the probe from the centre of the pixel filter in screen space, then exponentiating. The accumulated probe results are then divided by the sum of all the weights applied.
$$\mathrm{lineLength} = 2\,(\rho_+ - \rho_-) \qquad (2)$$
$$(u_n, v_n) = (u_m, v_m) + \frac{n}{2}\,(\Delta u, \Delta v) \qquad (3)$$
$$\Delta u = \cos(\mathrm{theta}) \cdot \mathrm{lineLength} / (\mathrm{iProbes} - 1) \qquad (4)$$
$$\Delta v = \sin(\mathrm{theta}) \cdot \mathrm{lineLength} / (\mathrm{iProbes} - 1) \qquad (5)$$
[0050] In some cases, instead of computing the stepping vector $(\Delta u, \Delta v)$ with trigonometric functions, it may be determined by scaling the longer vector directly.
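The probe placement of equations (2) to (5) and the weighted combination described in paragraph [0049] can be sketched in Python as follows. Several details here are assumptions made for illustration only: the function names are hypothetical, `theta` is treated simply as the direction of the sampling line, the probe offset uses a factor of n/2 so that adjacent probes are one step vector apart, and the Gaussian weight `exp(-alpha * d * d)` uses an assumed `alpha` and an assumed normalisation of the probe distance by the major radius.

```python
import math

def feline_probes(mid_u, mid_v, major_radius, minor_radius, theta, i_probes, alpha=2.0):
    """Return [((u, v), weight), ...] for probes placed along the sampling line."""
    if i_probes < 2:
        raise ValueError("equations (4) and (5) need at least two probes")
    line_length = 2.0 * (major_radius - minor_radius)      # equation (2)
    du = math.cos(theta) * line_length / (i_probes - 1)    # equation (4)
    dv = math.sin(theta) * line_length / (i_probes - 1)    # equation (5)
    step = math.hypot(du, dv)
    probes = []
    # n = ..., -3, -1, +1, +3, ... for an even count; ..., -2, 0, +2, ... for an odd count.
    for n in range(-(i_probes - 1), i_probes, 2):
        position = (mid_u + n / 2.0 * du, mid_v + n / 2.0 * dv)  # equation (3)
        d = (n / 2.0) * step / major_radius  # assumed screen-space distance estimate
        probes.append((position, math.exp(-alpha * d * d)))
    return probes

def combine(probe_results, weights):
    """Weighted sum of the probe results divided by the sum of the weights."""
    return sum(w * r for w, r in zip(weights, probe_results)) / sum(weights)
```

In use, each returned position would be passed to an isotropic (e.g. trilinear) filter, and the filter results would then be passed to `combine` together with the corresponding weights.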
[0051] The inventors have identified that Gaussian EWA can be more accurately estimated by performing isotropic filtering at several sample points along the major axis of the ellipse where the distance between sample points is proportional to $\sqrt{1 - \eta^{-2}}$ units, $\eta$ being the ratio of the major radius of the ellipse ($\rho_+$) and the minor radius of the ellipse ($\rho_-$), such that $\eta = \rho_+/\rho_-$. The ratio between the major radius of the ellipse and the minor radius of the ellipse (i.e. $\eta$) may be referred to as the anisotropic ratio. As described in more detail below, when the distance between sampling points is proportional to $\sqrt{1 - \eta^{-2}}$ units the maximum error in the estimation is not dependent on the anisotropic ratio.
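The spacing term of paragraph [0051] can be evaluated with a short Python sketch. The function names are hypothetical, and the constant of proportionality is not given in this part of the description, so the sketch returns only the factor $\sqrt{1 - \eta^{-2}}$ itself.

```python
import math

def anisotropic_ratio(major_radius, minor_radius):
    """eta: ratio of the major radius to the minor radius of the ellipse."""
    return major_radius / minor_radius

def spacing_factor(eta):
    """Factor sqrt(1 - eta**-2) to which the sample spacing is proportional."""
    if eta < 1.0:
        raise ValueError("the anisotropic ratio is at least 1 by definition")
    return math.sqrt(1.0 - eta ** -2)
```

The factor is 0 for a circular footprint (eta = 1), where a single sample suffices, and approaches 1 as the footprint becomes more elongated.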
[0052] The inventors have also identified that, independent of the spacing of the sample points along the major axis, Gaussian EWA can be more accurately estimated by combining the results of the isotropic filtering in a recursive manner. As described in more detail below, not only can this decrease the cumulative error that may occur with summing a plurality of small values, but it can also simplify the calculation of the weights.
[0053] In some cases, the two techniques may be combined to obtain an even more accurate estimate of Gaussian EWA.
[0054] Accordingly, described herein are methods and texture filtering units for performing anisotropic filtering of a texture using one or more of these techniques.
[0055] Reference is now made to FIG. 4 which illustrates an example method 400 for performing anisotropic filtering on a texture. The method 400 begins at block 402 where parameters defining an elliptical footprint in texture space (e.g. elliptical footprint 502 of FIG. 5) are generated based on information defining the relationship between screen space and texture space and information identifying a point or position of interest in the texture. The elliptical footprint represents the projection of a circular or elliptical footprint of a sampling kernel in screen space into texture space. The sampling kernel identifies, for a pixel or fragment of interest, the pixels or fragments near or surrounding the pixel or fragment of interest for which the corresponding texture colour is to be used to determine the texture colour for the pixel or fragment of interest.
[0056] The information identifying the position of interest in the texture may comprise a set of texture co-ordinates $(u, v)$ that identify a position in the texture that corresponds to a particular fragment or pixel in screen space. The set of texture co-ordinates may define the midpoint (e.g. midpoint 510 of FIG. 5) of the desired elliptical footprint.
[0057] In some cases, which may be referred to as explicit level of detail cases, the information defining the relationship between screen space and texture space may include the partial derivatives ($\partial u/\partial x$, $\partial v/\partial x$, $\partial u/\partial y$, $\partial v/\partial y$) that represent the rate of change of $u$ and $v$ in texture space relative to changes in $x$ and $y$ in screen space. In other cases, which may be referred to as implicit level of detail cases, the information defining the relationship between screen space and texture space may include information from which the partial derivatives can be determined, or at least estimated. For example, the information defining the relationship between screen space and texture space may comprise texture co-ordinates corresponding to neighbouring pixels/fragments to the relevant pixel/fragment (e.g. a 2x2 block of pixels/fragments). In some cases, the parameters of the elliptical footprint may also be based on the dimensions of the texture.
[0058] The parameters defining the elliptical footprint that may be generated may include, but are not limited to, the major axis of the elliptical footprint (e.g. the major axis 504 of the elliptical footprint 502 of FIG. 5), and the lengths of the major and minor axes ($\rho_+$ and $\rho_-$, which may also be referred to as the major radius and minor radius respectively).
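[0059] There are many known methods for generating the parameters of an elliptical footprint in texture space. The parameters of the elliptical footprint in texture space may be generated in any suitable manner. In some cases, the major axis may be identified by performing a singular value decomposition (SVD) of the total derivative matrix of partial derivatives (e.g. Jacobian). This involves taking a matrix $M$, squaring it $MM^T$ (or $M^TM$) and then diagonalising. This is illustrated by equation (6):
$$1 = \mathbf{x}^T\mathbf{x} = (J^{-1}\mathbf{u})^T J^{-1}\mathbf{u} = \mathbf{u}^T J^{-T} J^{-1}\mathbf{u} = \mathbf{u}^T (JJ^T)^{-1}\mathbf{u} \tag{6}$$
where $J$ is the Jacobian of the partial derivatives as shown in equation (7) and the Jacobian squared is as shown in equation (8).
$$J = \begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\ \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{pmatrix} \tag{7}$$
$$JJ^T = \begin{pmatrix} \left(\dfrac{\partial u}{\partial x}\right)^2 + \left(\dfrac{\partial u}{\partial y}\right)^2 & \dfrac{\partial u}{\partial x}\dfrac{\partial v}{\partial x} + \dfrac{\partial u}{\partial y}\dfrac{\partial v}{\partial y} \\ \dfrac{\partial u}{\partial x}\dfrac{\partial v}{\partial x} + \dfrac{\partial u}{\partial y}\dfrac{\partial v}{\partial y} & \left(\dfrac{\partial v}{\partial x}\right)^2 + \left(\dfrac{\partial v}{\partial y}\right)^2 \end{pmatrix} \tag{8}$$
[0060] The inverse of the Jacobian squared can then be expressed as shown in equation (9).
$$(JJ^T)^{-1} = \frac{1}{|\det J|^2}\begin{pmatrix} \left(\dfrac{\partial v}{\partial x}\right)^2 + \left(\dfrac{\partial v}{\partial y}\right)^2 & -\left(\dfrac{\partial u}{\partial x}\dfrac{\partial v}{\partial x} + \dfrac{\partial u}{\partial y}\dfrac{\partial v}{\partial y}\right) \\ -\left(\dfrac{\partial u}{\partial x}\dfrac{\partial v}{\partial x} + \dfrac{\partial u}{\partial y}\dfrac{\partial v}{\partial y}\right) & \left(\dfrac{\partial u}{\partial x}\right)^2 + \left(\dfrac{\partial u}{\partial y}\right)^2 \end{pmatrix} \tag{9}$$
[0061] It will be evident to a person of skill in the art that this results in equation (1). It will also be evident to a person of skill in the art that this is an example only. Another example method for generating the parameters of the elliptical footprint in texture space is described in GB Patent No. 2583154 which is hereby incorporated by reference in its entirety.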
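As a rough illustration of the decomposition described above, the radii and orientation of the footprint can be obtained from the eigenvalues of the squared Jacobian. The sketch below is one possible approach rather than the method of the referenced patent: the function name is hypothetical, the Jacobian is assumed to map screen-space offsets to texture-space offsets, and the closed-form eigenvalue formula for a symmetric 2x2 matrix is used in place of a general SVD routine.

```python
import math

def ellipse_from_jacobian(du_dx, du_dy, dv_dx, dv_dy):
    """Return (major_radius, minor_radius, phi) of the footprint u = J x, |x| = 1."""
    # Entries of the symmetric matrix J J^T.
    a = du_dx * du_dx + du_dy * du_dy
    b = du_dx * dv_dx + du_dy * dv_dy
    c = dv_dx * dv_dx + dv_dy * dv_dy
    # Closed-form eigenvalues of [[a, b], [b, c]].
    mean = 0.5 * (a + c)
    diff = math.hypot(0.5 * (a - c), b)
    major_radius = math.sqrt(mean + diff)
    minor_radius = math.sqrt(max(mean - diff, 0.0))
    # Angle of the major axis measured from the u axis in texture space.
    phi = 0.5 * math.atan2(2.0 * b, a - c)
    return major_radius, minor_radius, phi
```

For a Jacobian that stretches by 3 along u and by 1 along v, this returns radii of 3 and 1 with the major axis on the u axis.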
[0062] Once the parameters defining the elliptical footprint in texture space have been identified the method 400 proceeds to block 404.
[0063] At block 404, at least one set of equally spaced sampling points (which may also be referred to as sample points) along the major axis are identified (e.g. sampling points 506 on the major axis 504 of FIG. 5 may be identified). Each set of equally spaced sampling points corresponds to, is associated with, or is related to, an ellipse in texture space to be sampled. The ellipse to be sampled is based on, and/or related to, the elliptical footprint identified in block 402. In some cases, the ellipse to be sampled is the elliptical footprint identified in block 402. However, in other cases, described below, the ellipse to be sampled may be smaller than, larger than, more eccentric than, or less eccentric than the elliptical footprint identified in block 402. Identifying a set of equally spaced sampling points along the major axis may comprise (i) identifying a number of sampling points in the set; (ii) identifying a spacing of the sampling points in the set; and/or (iii) identifying a location of at least one sampling point in the set (from which the other sampling points can be identified).
[0064] The number of sampling points in a set, N, may be selected based on the ratio of the major radius ($\rho_+$) and the minor radius ($\rho_-$) of the associated ellipse to be sampled (i.e. the anisotropic ratio $\eta$). In some cases, described in more detail below, N may be proportional to the anisotropic ratio $\eta$, a sampling rate $\beta$, and the width of the Gaussian kernel in standard deviations (which is preferably a multiple of 2 standard deviations, i.e. $2\alpha$). In general, the sampling rate $\beta$ controls how closely spaced along the major axis kernel (i.e. the Gaussian kernel) the samples are. The higher $\beta$ is, the more closely spaced the samples are along the major axis. For example, if $\beta$ is two, samples are taken every one standard deviation of the minor axis, instead of every two standard deviations of the minor axis when $\beta$ is one. In some cases, N may be equal to $\lceil 2\alpha\beta\eta \rceil$. The parameters $\alpha$ and $\beta$ may be explicitly provided (e.g. based on a sampling budget), or may be dynamically selected based on information provided. For example, in some cases, the information provided may simply indicate a level of quality, e.g. high quality or low quality, and $\alpha$ and $\beta$ may be selected accordingly.
[0065] Preferably the distance $\sigma$ between sampling points in each set is proportional, by a proportionality factor $\kappa$, to $\sqrt{1-\eta^{-2}}$ units. As described in more detail below, it has been determined that when the distance or spacing $\sigma$ between sampling points is proportional to $\sqrt{1-\eta^{-2}}$ units, the upper bound on the error associated with the anisotropic filtering result is not dependent on the anisotropic ratio, thereby ensuring a balanced quality of approximation for a specified performance budget.
[0066] In some cases, the proportionality factor $\kappa$ may be equal to $\frac{\eta}{\beta\lceil\eta\rceil}$. As described in more detail below, when the proportionality factor $\kappa$ is equal to $\frac{\eta}{\beta\lceil\eta\rceil}$ the Gaussian weights that are applied in block 408 to the results of the isotropic filtering performed in block 406 may only be generated for integer anisotropic ratios. This can reduce the cost (e.g. tabulation of weights) of implementing the method. However, the disadvantage of this expression of the proportionality factor is that the spacing (and hence approximation quality) is discontinuous and specifically may jump discontinuously when there is a jump from one integer anisotropic ratio to another. This is especially true if the reconstruction quality is low (for example if the sample rate is low, e.g. $\beta = 1/2$) which may result in discontinuous jumps in a filtered image, which may be prominent.
[0067] In other cases, the proportionality factor $\kappa$ may be equal to $\frac{1}{\beta}$. As described in more detail below, when the proportionality factor is equal to $\frac{1}{\beta}$ the Gaussian weights that are applied to the results of the isotropic filtering are not limited to integer anisotropic ratios which makes calculation and storage of the weights more complicated, but the spacing is continuous for various ratios and gives a uniform error in the limit as the number of samples tends to infinity.
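The relationship between the anisotropic ratio, the sampling rate and the sample spacing can be sketched as below for the continuous choice of proportionality factor (one over the sampling rate). The function and parameter names are hypothetical, the radii are assumed to satisfy major radius >= minor radius > 0, and rounding the sample count up to an integer is an assumption made so that the count is usable.

```python
import math

def sample_layout(major_radius, minor_radius, alpha=1.0, beta=1.0):
    """Return (number of samples, spacing along the major axis)."""
    eta = major_radius / minor_radius                 # anisotropic ratio
    # Count proportional to eta, beta and the kernel width 2*alpha (rounding assumed).
    n_samples = max(1, math.ceil(2.0 * alpha * beta * eta))
    kappa = 1.0 / beta                                # continuous proportionality factor
    # Spacing proportional to sqrt(1 - eta^-2), in units of the minor radius.
    spacing = kappa * math.sqrt(1.0 - eta ** -2) * minor_radius
    return n_samples, spacing
```

For a ratio of 3 with unit minor radius and default parameters this gives six samples at a spacing of about 0.943, slightly less than the minor radius, and the spacing shrinks towards zero as the footprint becomes circular.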
[0068] In some cases, the first sample in a set may be offset from the middle point, midpoint or centre point of the major axis by a fraction $\psi$ of the spacing (the fraction $\psi$ may also be referred to as the offset). In these cases, the location of each sample point n can be expressed by $(n + \psi)\,\kappa\sqrt{1-\eta^{-2}}\,\rho_-\,\boldsymbol{\rho}_+/\rho_+$ where n is any integer in the half-open interval $\left[-\frac{N}{2} - \psi, +\frac{N}{2} - \psi\right)$ and $\boldsymbol{\rho}_+$ is the major axis radius vector. In some cases, if the number of sampling points N is even, the offset $\psi$ may be set to $\frac{1}{2}$ such that the sampling points are symmetrically positioned about the middle point of the major axis. For example, if $\psi = \frac{1}{2}$, $N = 2$ then $n \in \left[-1 - \frac{1}{2}, +1 - \frac{1}{2}\right) \Rightarrow n \in [-1, 0] \Rightarrow n + \psi = \pm\frac{1}{2}$. However, an offset of 1/2 can also be used for an odd number of samples. For example, if $\psi = \frac{1}{2}$, $N = 3$ then $n \in \left[-\frac{3}{2} - \frac{1}{2}, +\frac{3}{2} - \frac{1}{2}\right) \Rightarrow n \in [-2, 0] \Rightarrow n + \psi = \left\{\pm\frac{1}{2}, -\frac{3}{2}\right\}$.
[0069] It will be evident to a person of skill in the art that this is an example only and that the offset may be set to other suitable values. In particular, if the number of sampling points N is odd, a symmetric distribution of points about the middle or midpoint of the major axis may be attained when $\psi$ is set to 0. For example, if $\psi = 0$, $N = 3$ then $n \in \left[-\frac{3}{2}, +\frac{3}{2}\right) \Rightarrow n \in [-1, 1] \Rightarrow n + \psi = \{0, \pm 1\}$. However, an offset of 0 can also be used for an even number of samples. For example, if $\psi = 0$, $N = 4$, then $n \in [-2, +2) \Rightarrow n \in [-2, 1] \Rightarrow n + \psi = \{0, \pm 1, -2\}$.
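The index ranges in the worked examples above can be reproduced with a few lines of code. This is an illustrative sketch only; the function name is hypothetical, and it simply enumerates every integer n in the half-open interval from minus N/2 minus the offset (inclusive) to plus N/2 minus the offset (exclusive), returning n plus the offset, i.e. each sample position in units of the spacing.

```python
import math

def sample_offsets(n_samples, offset):
    """Return n + offset for every integer n in [-N/2 - offset, N/2 - offset)."""
    lower = -n_samples / 2.0 - offset
    upper = n_samples / 2.0 - offset
    positions = []
    n = math.ceil(lower)          # first integer inside the closed lower bound
    while n < upper:              # upper bound is excluded
        positions.append(n + offset)
        n += 1
    return positions
```

An offset of one half with two samples gives positions of minus and plus one half, while an offset of zero with four samples gives -2, -1, 0 and 1, which is not symmetric about the midpoint.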
[0070] In yet other examples, the distribution pattern may alternate between offsets of 0 and 1/2 for odd and even numbers of samples, respectively. However, for the sake of continuity (as the anisotropic ratio increases), it may be preferable to use a consistent offset, which tends to favour an even number of samples since fewer samples may be required when the anisotropic ratio is small (e.g. perhaps only two samples, as opposed to three, when the ratio is close to unity).
[0071] In some cases, a single set of equally spaced sampling points along the major axis is identified and the ellipse to be sampled is the elliptical footprint identified in block 402. In these cases, isotropic filtering is performed at each of the identified sampling points (see block 406), and the results of the isotropic filtering are combined using a Gaussian filter (see block 408). In some cases, the isotropic filtering performed at the sampling points may incorporate a mipmap interpolation technique. In such cases, performing isotropic filtering at a sampling point may comprise: performing a first isotropic filtering at the sampling point at a first mipmap level; performing a second isotropic filtering at the sampling point at a second mipmap level; and interpolating between a result of the first isotropic filtering and a result of the second isotropic filtering to generate a result of the isotropic filtering for the sampling point. Therefore, in these cases, the interpolation between mipmaps is performed before the combination (i.e. before block 408). To implement trilinear filtering, the first and second isotropic filtering may be bilinear filtering, the first and second mipmap levels may comprise one higher resolution mipmap level and one lower resolution mipmap level, and the result of the trilinear filtering at each sampling point is combined at block 408.
[0072] For example, let the length of the minor axis $\rho_-$ be equal to $\sqrt{2}$ base mipmap level texels, and the length of the major axis $\rho_+$ be equal to $3\sqrt{2}$ base mipmap level texels. As is known to those of skill in the art, the minor and major axis lengths have a fixed size in normalised texture co-ordinates so will have a smaller texel size for lower resolution mipmap levels. In this example, $\eta = 3$, and the level of detail (LOD) $\lambda = \log_2 \rho_- = \frac{1}{2}$ which indicates that it may be beneficial to sample from multiple mipmaps. Where only one set of equally spaced sampling points along the major axis are identified, N samples (where N is proportional to $\eta = 3$) at a spacing of $\kappa\sqrt{1-\frac{1}{9}}\,\rho_-$ are identified which produces an effective spacing of $\kappa\sqrt{1-\frac{1}{9}}\,\sqrt{2}$ texels from the base mipmap level and $\kappa\sqrt{1-\frac{1}{9}}\,\frac{1}{\sqrt{2}}$ texels from the second mipmap level. It will be evident to those of skill in the art that these locations are aligned in $(u, v)$ co-ordinates. Thus the relevant positions of each mipmap level can be identified by a single set of points along the major axis.
[0073] In other cases, where multiple mipmap levels are used, multiple sets of equally spaced sampling points along the major axis are identified -one for each mipmap level. Each set of equally spaced sampling points relates to, or is associated with, a different ellipse to be sampled, wherein each ellipse to be sampled is based on, and/or related to, the elliptical footprint identified in block 402. When multiple sets of sampling points are identified each set may have the same number of sampling points or different sets may have a different number of sampling points.
[0074] Where multiple sets of sampling points are identified, isotropic filtering may be performed at each of the identified points of each set (see block 406), the results of the isotropic filtering for each set may be combined separately (see block 408), and then interpolation may be performed on the two combination results (e.g. using a fractional level of detail (LOD) mipmap interpolation weight). For each mipmap level the steps are in units of texels. In some examples, described with reference to FIG. 6, the major axis may be rescaled to be 27j texels to preserve the eccentricity of the ellipse and so that the target ellipse 602 (i.e. the elliptical footprint identified in block 402) is a linear interpolation of an ellipse 604 that is smaller than the target ellipse 602 (from the higher resolution mipmap 606) and an ellipse 608 that is larger than the target ellipse 602 (from the lower resolution mipmap 610), each of which are concentric with respect to the target ellipse 602. Accordingly, the two ellipses to be sampled are the smaller ellipse 604 and the larger ellipse 608. This may be referred to herein as symmetric anisotropic filtering. Since the anisotropic ratio is the same for both ellipses 604, 608 to be sampled the number of samples in each set of sampling points is the same.
[0075] For example, let the length of the minor axis $\rho_-$ be equal to $\sqrt{2}$ base mipmap level texels, and the length of the major axis $\rho_+$ be equal to $3\sqrt{2}$ base mipmap level texels. In this example a spacing of $\kappa\sqrt{1-\frac{1}{9}}$ texels is used for each mipmap level. Since a width of a texel from the second mipmap level is twice that of the first (base) mipmap level, the kernel width of the second mipmap level is double that of the first. As the sample locations do not align, a separate set of sample points is identified for each mipmap level.
[0076] In other examples, described with reference to FIG. 7, the major axis is not rescaled (i.e. it is left unaltered) so that the target ellipse 702 is approximated by a linear interpolation of a higher eccentricity ellipse 704 with respect to the target ellipse 702 (from the higher resolution mipmap 706) and a lower eccentricity ellipse 708 with respect to the target ellipse 702 (from the lower resolution mipmap 710). Accordingly, the two ellipses to be sampled are the higher eccentricity ellipse 704 and the lower eccentricity ellipse 708. This may be referred to herein as asymmetric anisotropic filtering. Since the anisotropic ratio of the two ellipses 704, 708 to be sampled is different, the number of samples in each set of sampling points may be different. Generally, the set of sampling points for the higher resolution mipmap 706 will have twice as many sampling points as the set of sampling points for the lower resolution mipmap 710. In these cases, if there is a ratio less than 2 on the higher resolution mipmap this may be clamped to a minimum ratio of 1 on the lower resolution mipmap level (so it still overblurs in the lower ratio limit).
[0077] For example, let the length of the minor axis $\rho_-$ be equal to $\sqrt{2}$ base mipmap level texels, and the length of the major axis $\rho_+$ be equal to $3\sqrt{2}$ base mipmap level texels. In this example $\eta_{hi} = \frac{\rho_+}{\text{base mipmap level texel}} = 3\sqrt{2}$ and $\eta_{lo} = \frac{\rho_+}{\text{second mipmap level texel}} = \frac{3}{\sqrt{2}}$. The spacing for the higher resolution mipmap level is $\kappa\sqrt{1-\frac{1}{2\cdot 3^2}}$ texels and the spacing for the lower resolution mipmap level is $\kappa\sqrt{1-\frac{2}{3^2}}$ texels. As the sample locations do not align, a separate set of sample points is identified for each mipmap level.
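The arithmetic of the asymmetric example can be checked numerically. The sketch below assumes radii measured in base mipmap level texels (a minor radius of the square root of 2 and a major radius of three times that), a texel width that doubles with each mipmap level, and a minor radius of one texel on each sampled level; the function names are hypothetical.

```python
import math

def asymmetric_ratios(major_radius, minor_radius):
    """Per-level anisotropic ratios when the major axis is left unscaled."""
    lod = math.log2(minor_radius)
    level_hi = max(0, math.floor(lod))        # higher-resolution level index
    texel_hi = 2.0 ** level_hi                # its texel width in base texels
    texel_lo = 2.0 * texel_hi                 # next level has texels twice as wide
    eta_hi = major_radius / texel_hi
    eta_lo = max(1.0, major_radius / texel_lo)  # clamped to a minimum ratio of 1
    return eta_hi, eta_lo

def spacing_factor(eta):
    """Spacing in texels of the sampled level, per unit proportionality factor."""
    return math.sqrt(1.0 - eta ** -2)
```

For the example values the ratios come out at roughly 4.24 and 2.12, giving spacing factors of about 0.972 and 0.882 texels on the higher and lower resolution levels respectively.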
[0078] Once the sampling point set(s) have been identified the method 400 proceeds to block 406.
[0079] At block 406, isotropic filtering is performed at each of the sampling points identified in block 404 (e.g. isotropic filtering 508 may be performed at each of the sampling points 506 in FIG. 5). The isotropic filtering performed at a sampling point may be any type of isotropic filtering such as, but not limited to, point filtering, bilinear filtering (with or without mipmapping) and trilinear filtering. In general, the closer, or more similar, the isotropic filter is to a Gaussian filter, the better the results of the method 400 of FIG. 4. For example, a tent filter is more similar to a Gaussian filter than a box filter, thus performing the isotropic filtering using a tent filter may improve the result of the anisotropic filtering. Specifically, the box filter is the first order cardinal B-spline, and the tent filter is the second order cardinal B-spline, formed by convolving the box filter with itself. The central limit theorem indicates that if these filters (i.e. distributions) are repeatedly convolved, the result tends towards the normal distribution, so the cardinal B-splines can be seen as a series of increasing order approximations of the Gaussian. In general, higher order B-splines have better anti-aliasing properties and thus may produce better results. Once isotropic filtering has been performed at each of the sampling points, the method 400 proceeds to block 408.
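The progression from box filter to tent filter and onwards can be illustrated with a discrete analogue. The sketch below uses sampled two-tap kernels rather than the continuous B-splines themselves, which is a simplifying assumption; the helper name is hypothetical.

```python
def convolve(a, b):
    """Full discrete convolution of two 1D kernels given as lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

box = [0.5, 0.5]                  # discrete stand-in for the box filter
tent = convolve(box, box)         # box convolved with itself: a tent shape
smoother = convolve(tent, tent)   # further convolution: closer to a bell curve
```

Each kernel still sums to one, and each successive convolution produces a profile with a more pronounced central peak and softer tails, in line with the central limit theorem argument.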
[0080] At block 408, the isotropic filtering results for each set of sampling points are combined using a Gaussian kernel. For example, where there is a single set of sampling points, the isotropic filtering results for those sampling points are combined using a Gaussian kernel. Where, however, there are multiple sets of sampling points, one corresponding to each mipmap level, then the isotropic filtering results corresponding to each mipmap level may be combined separately using a Gaussian kernel. Specifically, the isotropic filtering results corresponding to the lower resolution mipmap level may be combined using a Gaussian kernel, and the isotropic filtering results corresponding to the higher resolution mipmap level may be combined using a Gaussian kernel.
[0081] In some cases, the results of the isotropic filtering for a set of sampling points may be combined by identifying the appropriate Gaussian weight for each isotropic filtering result based on the location of the related sampling point, calculating the product of each filtering result and the corresponding weight, and calculating the sum of the products. However, as described in more detail below with respect to FIG. 8, in other cases the results of the isotropic filtering for a set of sampling points may be combined via recursive linear interpolation.
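The direct (non-recursive) combination can be sketched as a weighted sum. The example assumes a weight of the form exp(-2 d^2 / (major radius squared minus minor radius squared)) for a sample at distance d from the midpoint, with the major radius strictly greater than the minor radius, and it normalises the weights so that they sum to one; the normalisation step and all names are assumptions for illustration.

```python
import math

def gaussian_weights(positions, spacing, major_radius, minor_radius):
    """Normalised weights for samples at positions * spacing from the midpoint."""
    variance = major_radius ** 2 - minor_radius ** 2   # assumes major > minor
    raw = [math.exp(-2.0 * (spacing * p) ** 2 / variance) for p in positions]
    total = sum(raw)
    return [w / total for w in raw]

def combine(results, weights):
    """Sum of products of each isotropic filtering result and its weight."""
    return sum(w * r for w, r in zip(weights, results))
```

With positions symmetric about the midpoint the weights are symmetric too, and combining a constant set of filtering results returns that constant.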
[0082] Where only one set of sampling points were identified, then the method 400 may end. Where, however, multiple sets of sampling points were identified then the method 400 may proceed to block 410.
[0083] At block 410, interpolation is performed between the combination results generated in block 408 for the different sets of sampling points. For example, block 410 may comprise interpolating between the Gaussian combination result generated in block 408 for the set of sampling points for a higher resolution mipmap and the Gaussian combination result generated in block 408 for the set of sampling points for a lower resolution mipmap. In some cases, the interpolation may be performed using a fractional level of detail (LOD) mipmap interpolation weight. However, it will be evident to a person of skill in the art that this is an example only. Once the interpolation has been performed, the method 400 may end. The result of the method (block 408 or block 410), which may be referred to as a filter result, may be output for further processing. For example, the output of the method (block 408 or block 410) may be output to a shader (or another component of a graphics processing unit or a graphics processing system) for use in generating a rendering output (e.g. an image).
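One simple form of the interpolation at block 410 is a linear blend driven by the fractional part of the level of detail. The sketch below is an assumption about how such a weight might be applied, with a hypothetical function name, and is not the only possibility.

```python
import math

def mip_interpolate(result_hi, result_lo, lod):
    """Blend per-level combination results using the fractional LOD."""
    frac = lod - math.floor(lod)      # 0 selects the higher-resolution result
    return (1.0 - frac) * result_hi + frac * result_lo
```

A level of detail of 0.5 weights the two combination results equally, while an integer level of detail returns the higher resolution result unchanged.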
[0084] The different approaches described above for using multiple mipmap levels have different advantages and disadvantages. Where the interpolation between isotropic filtering results for different mipmap levels is performed prior to combining the filtering results (e.g. where a single set of sampling points are identified), the sampling points along the major axis may be spaced too sparsely on the high resolution mipmap and too densely on the low resolution mipmap, which may result in a relatively poor and high quality approximation respectively, in terms of the minor axis scale. In contrast, if the interpolation between isotropic filtering results for different mipmap levels is performed after Gaussian combination of the filtering results (e.g. where a set of sampling points are identified for each mipmap level) there will be a consistent spacing on each mipmap, so the quality in terms of the kernel sample density (i.e. minor axis spacing) will be consistent. However, if the symmetric anisotropic filtering approach is used where a pair of concentric ellipses are used to approximate the desired ellipse (one smaller than the desired ellipse and one larger than the desired ellipse), the kernel major axis length may be too sparsely separated (between adjacent fragments in the frame buffer) on the high resolution mipmap level and too densely separated on the lower resolution level. This is because using a pair of different sized ellipses affects the degree of overlap with neighbouring fragments. In contrast, if the asymmetric anisotropic filtering is used where a pair of ellipses with different levels of eccentricity are used to approximate the desired ellipse, then for the major axes the desired filter width is selected for each mipmap level and therefore the antialiasing issue associated with linearly interpolating two different sized kernels is avoided.
Spacing of the sampling points along the major axis
[0085] In Gaussian EWA a circular Gaussian filter in screen space is mapped to an elliptical filter in texture space. In the methods described herein the continuous Gaussian filter of Gaussian EWA is approximated as a discrete Gaussian weighted sum of smaller filters (i.e. the isotropic filters which are preferably Gaussian filters) which can be represented as a convolution between a Gaussian filter and an isotropic filter.
[0086] In determining a preferred spacing of the sampling points one must first identify the preferred covariance of the convolution and the filters that make up the convolution. As will be described in more detail below, the inventors have identified that the preferred variance of the convolution is $\rho_+^2$, and the preferred variance of the isotropic filter is $\rho_-^2$, and since the variances of functions are additive under convolution the preferred variance of the Gaussian filter is $\rho_+^2 - \rho_-^2$. Methods known to the Applicant for estimating forms of EWA (e.g. Gaussian EWA), which is not an admission that they are well-known, do not generally account for the contribution of the isotropic filter to the target variance, leading to inexact resultant kernel profiles.
[0087] Specifically, in anisotropic filtering preferably there is a symmetric covariance matrix in screen space which maps to an anisotropic covariance matrix in texture space. In some examples, an asymmetric covariance matrix in screen space may be defined, but this generates additional complexity in the determination of the elliptical footprint in texture space. Equation (10) shows a 2D mapping of a covariance matrix $\langle\mathbf{x}\mathbf{x}^T\rangle$ in screen space to an anisotropic covariance matrix $J\langle\mathbf{x}\mathbf{x}^T\rangle J^T$ using the total derivative or Jacobian ($J$) (typically formed from the screen-space coordinate derivatives, as described above) as an affine mapping approximation. In particular, if the covariance matrix is symmetric (e.g. $\langle\mathbf{x}\mathbf{x}^T\rangle = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}$) the determination of the mapping simplifies to the earlier equations (e.g. $J\langle\mathbf{x}\mathbf{x}^T\rangle J^T = JJ^T$). It can be seen that this describes an ellipse with a major axis $\rho_+\begin{pmatrix}\cos\phi\\ \sin\phi\end{pmatrix}$ and a minor axis $\rho_-\begin{pmatrix}-\sin\phi\\ \cos\phi\end{pmatrix}$, where for texture space coordinates $(u, v)$, $\phi$ describes the angular displacement of the major axis from the $u$ axis in texture space.
$$\begin{pmatrix}\cos\phi & -\sin\phi\\ \sin\phi & \cos\phi\end{pmatrix}\begin{pmatrix}\rho_+^2 & 0\\ 0 & \rho_-^2\end{pmatrix}\begin{pmatrix}\cos\phi & \sin\phi\\ -\sin\phi & \cos\phi\end{pmatrix} \tag{10}$$
[0088] This can be divided into an isotropic portion and anisotropic portion. Specifically, let the anisotropic filter be represented by $A$ and the isotropic filter be represented as $T$ then, using the additive property of covariance where the covariance of a filter $F$ is written as $\langle\mathbf{x}\mathbf{x}^T\rangle(F)$, equation (10) can be re-written as equation (11) which can be re-written as equation (12) where $\rho_A^2$ is proportional to the variance of the anisotropic filter $A$ along the axis $\begin{pmatrix}\cos\theta\\ \sin\theta\end{pmatrix}$ and $\rho_T^2$ is proportional to the covariance of the isotropic filter $T$ (which is along both the major axis and the minor axis). It can be seen from equation (12) that the covariances along the major axis are additive to produce a final variance for the convolution of $\rho_A^2 + \rho_T^2$.
$$\langle\mathbf{x}\mathbf{x}^T\rangle(A * T) = \langle\mathbf{x}\mathbf{x}^T\rangle(A) + \langle\mathbf{x}\mathbf{x}^T\rangle(T) = \begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix}\begin{pmatrix}\rho_A^2 & 0\\ 0 & 0\end{pmatrix}\begin{pmatrix}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\end{pmatrix} + \begin{pmatrix}\rho_T^2 & 0\\ 0 & \rho_T^2\end{pmatrix} \tag{11}$$
$$= \begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix}\begin{pmatrix}\rho_A^2 + \rho_T^2 & 0\\ 0 & \rho_T^2\end{pmatrix}\begin{pmatrix}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\end{pmatrix} \tag{12}$$
[0089] Preferably equation (12) is equal to equation (10). In other words, preferably the covariance of the convolution is equal to the preferred covariance expressed in equation (10). For equation (10) and (12) to be equal, then $\phi = \theta$, $\rho_T^2 = \rho_-^2$, and $\rho_A^2 = \rho_+^2 - \rho_-^2$.
Therefore the preferred covariance for the anisotropic filter is $\rho_+^2 - \rho_-^2$. Accordingly, in determining the covariance of the anisotropic filter, the covariance of the isotropic filter is to be taken into account.
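The additivity argument can be checked numerically: a rank-one covariance of size (major radius squared minus minor radius squared) along the major axis, plus an isotropic covariance of size minor radius squared, should reproduce the target elliptical covariance. The sketch below assumes both filters share the same axis angle and uses hypothetical helper names.

```python
import math

def rotated(diag_major, diag_minor, angle):
    """Return R diag(diag_major, diag_minor) R^T as a 2x2 nested list."""
    c, s = math.cos(angle), math.sin(angle)
    off = c * s * (diag_major - diag_minor)
    return [
        [c * c * diag_major + s * s * diag_minor, off],
        [off, s * s * diag_major + c * c * diag_minor],
    ]

def convolution_covariance(major_radius, minor_radius, angle):
    """Covariance of the anisotropic filter plus that of the isotropic filter."""
    aniso = rotated(major_radius ** 2 - minor_radius ** 2, 0.0, angle)
    iso = rotated(minor_radius ** 2, minor_radius ** 2, angle)
    return [[aniso[i][j] + iso[i][j] for j in range(2)] for i in range(2)]
```

Comparing `convolution_covariance(3.0, 1.0, 0.7)` with `rotated(9.0, 1.0, 0.7)` shows the two matrices agree to within rounding error.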
[0090] Now that the preferred covariance of the Gaussian filter (i.e. the anisotropic filter) has been determined to be $\rho_+^2 - \rho_-^2$, an analysis of the error is used to identify a preferred spacing of the sample points along the major axis. Specifically, as the discrete weighted sum is only an approximation of a continuous Gaussian there will be an error between the continuous Gaussian and the discrete approximation thereof. The inventors have identified that the upper bound on this error is not dependent on the anisotropic ratio if the spacing between the samples is proportional to $\sqrt{1-\eta^{-2}}$ units. This is advantageous because, when the error is not dependent on the anisotropic ratio, a consistent quality can be achieved for a given (kernel) sampling rate budget. Specifically, when the error is not dependent on the anisotropic ratio it can be seen that if the number of samples is reduced for lower ratios this will result in a greater error. In other words, one cannot cut corners for lower anisotropic ratios.
[0091] Specifically, let a discrete Gaussian weighted sum $\Gamma(x)$ with the preferred covariance of $\rho_+^2 - \rho_-^2$ be represented in 1 dimension as shown in equation (13), wherein $\psi$ is the offset from the centre of the major axis of the ellipse at which the first sample is situated, $\sigma$ is the distance between samples along the major axis, and $\delta$ is the Dirac delta function, used to reduce a convolution defined inside an integral to a discrete sum. As described above, in some cases the offset $\psi$ is equal to $\frac{1}{2}$ such that when there are an even number of samples the samples are evenly distributed on either side of the middle or centre point of the major axis (i.e. the samples are symmetric about the middle or centre of the major axis). However, in other cases, the offset may be other values such as, but not limited to, zero.
$$\Gamma(x) = \sqrt{\frac{2\sigma^2}{\pi(\rho_+^2 - \rho_-^2)}}\sum_{n\in\mathbb{Z}} \exp\left(-\frac{2\sigma^2(n+\psi)^2}{\rho_+^2 - \rho_-^2}\right)\delta\big(x - (n+\psi)\sigma\big) \tag{13}$$
[0092] The Gaussian weighted sum is convolved with the isotropic filter. Let the isotropic filter be a Gaussian filter with a covariance of $\rho_-^2$ as shown in equation (14), which is the preferred covariance for the isotropic filter.
$$\Gamma_{\rho_-^2}(x) = \sqrt{\frac{2}{\pi\rho_-^2}}\,e^{-\frac{2x^2}{\rho_-^2}} \tag{14}$$
[0093] The general form of the convolution can then be expressed by equation (15) which can be re-written as equation (16) by using the Dirac delta function $\delta$ to eliminate the integral such that the convolution can be written as a sum of smaller Gaussians at each sample location weighted by the discrete Gaussian filter $\Gamma$. Specifically, the result of the application of the Dirac delta function $\delta$ in the integral is only non-zero when $x' = x - (n+\psi)\sigma$, therefore $x'$ is replaced with $x - (n+\psi)\sigma$ and the integral is removed.
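$$(\Gamma * \Gamma_{\rho_-^2})(x) = \sqrt{\frac{2\sigma^2}{\pi(1-\eta^{-2})\rho_+^2}}\sqrt{\frac{2}{\pi\rho_-^2}}\int_{-\infty}^{\infty} dx' \sum_{n\in\mathbb{Z}} \exp\left(-\frac{2\sigma^2(n+\psi)^2}{(1-\eta^{-2})\rho_+^2}\right)\delta\big(x - x' - (n+\psi)\sigma\big)\exp\left(-\frac{2x'^2}{\rho_-^2}\right) \tag{15}$$
$$= \sqrt{\frac{2\sigma^2}{\pi(1-\eta^{-2})\rho_+^2}}\sqrt{\frac{2}{\pi\rho_-^2}}\sum_{n\in\mathbb{Z}} \exp\left(-2\left(\frac{(n+\psi)\sigma}{\sqrt{1-\eta^{-2}}\,\rho_+}\right)^2\right)\exp\left(-2\left(\frac{x - (n+\psi)\sigma}{\rho_-}\right)^2\right) \tag{16}$$
[0094] The exponents in equation (16) can then be arranged as shown below to result in a representation of the exponents shown in equation (17) wherein the exponents are expressed as the combination of two terms: a first term which is dependent on n and a second term which is not dependent on n. It can be seen that this rearranging has been accomplished by completing the square.
$$\begin{aligned} -2\left[\left(\frac{(n+\psi)\sigma}{\sqrt{1-\eta^{-2}}\,\rho_+}\right)^2 + \left(\frac{x - (n+\psi)\sigma}{\rho_-}\right)^2\right] &= -\frac{2}{(1-\eta^{-2})\rho_-^2}\left[\eta^{-2}\sigma^2(n+\psi)^2 + (1-\eta^{-2})\big(x^2 - 2x\sigma(n+\psi) + \sigma^2(n+\psi)^2\big)\right] \\ &= -\frac{2}{(1-\eta^{-2})\rho_-^2}\left[\sigma^2(n+\psi)^2 - 2(1-\eta^{-2})x\sigma(n+\psi) + (1-\eta^{-2})^2x^2 + (1-\eta^{-2})\eta^{-2}x^2\right] \\ &= -2\left(\frac{\sigma(n+\psi) - (1-\eta^{-2})x}{\sqrt{1-\eta^{-2}}\,\rho_-}\right)^2 - 2\left(\frac{x}{\rho_+}\right)^2 \end{aligned} \tag{17}$$
[0095] The exponents in equation (16) can then be replaced with the expression thereof in equation (17) which results in equation (18). Since the second term of equation (17) is not dependent on n it can be removed from the summation which results in equation (19). It can be seen that the terms outside of the summation represent a Gaussian filter with a variance of $\rho_+^2$. Thus equation (19) can be re-written as a function of a Gaussian filter with a variance of $\rho_+^2$ (i.e. $\Gamma_{\rho_+^2}(x)$) as shown in equation (20).
$$(\Gamma * \Gamma_{\rho_-^2})(x) = \sqrt{\frac{2\sigma^2}{\pi(1-\eta^{-2})\rho_+^2}}\sqrt{\frac{2}{\pi\rho_-^2}}\sum_{n\in\mathbb{Z}} \exp\left(-2\left(\frac{\sigma(n+\psi) - (1-\eta^{-2})x}{\sqrt{1-\eta^{-2}}\,\rho_-}\right)^2\right)\exp\left(-2\left(\frac{x}{\rho_+}\right)^2\right) \tag{18}$$
$$= \left[\sqrt{\frac{2}{\pi\rho_+^2}}\exp\left(-\frac{2x^2}{\rho_+^2}\right)\right]\left[\sqrt{\frac{2\sigma^2}{\pi(1-\eta^{-2})\rho_-^2}}\sum_{n\in\mathbb{Z}} \exp\left(-2\left(\frac{(n+\psi) - \sigma^{-1}(1-\eta^{-2})x}{\sigma^{-1}\sqrt{1-\eta^{-2}}\,\rho_-}\right)^2\right)\right] \tag{19}$$
$$= \Gamma_{\rho_+^2}(x)\left[\sqrt{\frac{2\sigma^2}{\pi(1-\eta^{-2})\rho_-^2}}\sum_{n\in\mathbb{Z}} \exp\left(-2\left(\frac{(n+\psi) - \sigma^{-1}(1-\eta^{-2})x}{\sigma^{-1}\sqrt{1-\eta^{-2}}\,\rho_-}\right)^2\right)\right] \tag{20}$$
[0096] A Gaussian filter with a variance of $\rho_+^2$ (i.e. $\Gamma_{\rho_+^2}(x)$) is the desired result of the convolution thus it is desirable for the terms within the bracket of equation (20) to be as close to one as possible. Equation (20) can be further simplified by using a Fourier transform of a Gaussian to itself as expressed in equation (21). It can be seen from equation (21) that the Fourier transform of a Gaussian is a Gaussian. It can be seen that equation (21) is generated by completing the square and the fact that $\int_{-\infty}^{\infty} dx\, e^{-ax^2} = \sqrt{\pi/a}$.
$$\int_{-\infty}^{\infty} dk\, e^{-\frac{(\pi\rho k)^2}{2}} e^{2\pi i k x} = \int_{-\infty}^{\infty} dk \exp\left(-\frac{(\pi\rho)^2}{2}\left(k - \frac{2ix}{\pi\rho^2}\right)^2 - \frac{2x^2}{\rho^2}\right) = e^{-\frac{2x^2}{\rho^2}}\int_{-\infty}^{\infty} dk \exp\left(-\frac{(\pi\rho)^2}{2}k^2\right) = \sqrt{\frac{2}{\pi\rho^2}}\,e^{-\frac{2x^2}{\rho^2}} \tag{21}$$
[0097] If $\rho = \sigma^{-1}\sqrt{1-\eta^{-2}}\,\rho_-$ then equation (21) can be re-written as equation (22).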
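$$\sqrt{\frac{2\sigma^2}{\pi(1-\eta^{-2})\rho_-^2}}\,e^{-\frac{2\sigma^2x^2}{(1-\eta^{-2})\rho_-^2}} = \int_{-\infty}^{\infty} dk\, e^{-\frac{1}{2}(1-\eta^{-2})(\pi\sigma^{-1}\rho_-k)^2} e^{2\pi i k x} \tag{22}$$
[0098] Equation (20) can then be re-written as equation (23) using equation (22).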
$$(\Gamma * \Gamma_{\rho_-^2})(x) = \Gamma_{\rho_+^2}(x)\sum_{n\in\mathbb{Z}}\int_{-\infty}^{\infty} dk \exp\left(-\frac{1}{2}(1-\eta^{-2})(\pi\sigma^{-1}\rho_-k)^2\right)\exp\Big(2\pi i k\big((n+\psi) - \sigma^{-1}(1-\eta^{-2})x\big)\Big) \tag{23}$$
[0099] Using the identity $\sum_{n\in\mathbb{Z}} \exp(2\pi i k n) = \sum_{n\in\mathbb{Z}} \delta(n - k)$, which specifies that a discrete series of integer frequency sinusoidal functions is equivalent to a train of Dirac delta functions with integer spacing, equation (23) can be re-written as equation (24) which can be simplified to equation (25) because the integral with respect to the Dirac delta function selects the integrand at integer locations.
(F * Fp 2)(X) = Fp+2 (x)Entz r:dk 5(n -k) exp (-2+2 (1 -r7-2)(ffp_02)exp(2ffik(ip -77-2)x)) (24) = Fg(x)Enci exp (-.2+ (1 -77-2)(rp_n)2) exp(2n-in(tp - -17-2)x)) (25) [00100] One way to determine the worst case error on an operator like a kernel is to use the supremum norm when the kernel K acts on some function f (which is a texture in the anisotropic filtering case). The supremum norm can be expressed by equation (26), and the Li norm can thus be expressed by equation (27).
$$\|K * f\|_\infty = \sup_{x\in\mathbb{R}}\left|\int_{-\infty}^{\infty} dx'\, K(x-x') f(x')\right| \le \sup_{x\in\mathbb{R}}|f(x)| \int_{-\infty}^{\infty} dx\, |K(x)| = \|K\|_1 \|f\|_\infty \qquad (26)$$
$$\|K\|_1 = \int_{-\infty}^{\infty} dx\, |K(x)| \qquad (27)$$
[00101] The error between the continuous desired Gaussian $F_{\rho_+^2}$ and the convolution can thus be expressed as the L1 norm as shown in equation (28). Equation (28) can be rewritten as equation (29) using equation (25). It can be seen that the summation has been reduced to positive integers n only. The upper bound on the error can thus be expressed by equation (30) because the integral of a Gaussian ($\int_{-\infty}^{\infty} dx\, F_{\rho_+^2}(x)$) is one and the upper bound on the cosine is 1.
$$\left\|\Gamma * F_{\rho_-^2} - F_{\rho_+^2}\right\|_1 \qquad (28)$$
$$= \int_{-\infty}^{\infty} dx\, F_{\rho_+^2}(x)\left|\sum_{n\in\mathbb{N}} 2\exp\left(-2\pi^2(1-\eta^{-2})(\sigma^{-1}\rho_- n)^2\right)\cos\left(2\pi n\left(\psi-\sigma^{-1}(1-\eta^{-2})x\right)\right)\right| \qquad (29)$$
$$\le \sum_{n\in\mathbb{N}} 2\exp\left(-2\pi^2(1-\eta^{-2})(\sigma^{-1}\rho_- n)^2\right) \qquad (30)$$
[00102] It can be seen from equation (30) that if the spacing $\sigma$ of the samples along the major axis is proportional to $\sqrt{1-\eta^{-2}}$ units (i.e. $\rho_-$ in this example) as shown in equation (31), where $\kappa$ is some constant, then the upper bound on the error shown in equation (30) can be expressed as shown in equation (32).
$$\sigma = \kappa\sqrt{1-\eta^{-2}}\,\rho_- \qquad (31)$$
$$\left\|\Gamma * F_{\rho_-^2} - F_{\rho_+^2}\right\|_1 \le \sum_{n\in\mathbb{N}} 2\exp\left(-\frac{2(\pi n)^2}{\kappa^2}\right) \qquad (32)$$
[00103] It can thus be seen from equation (32) that when the spacing $\sigma$ of the samples along the major axis is proportional to $\sqrt{1-\eta^{-2}}$ units, the upper bound on the error does not depend on the ratio $\eta$ or on the units (e.g. $\rho_-$ in this example), which is beneficial. Specifically, it allows there to be a uniform bound for a set of parameters.
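The bound of equation (32) can be evaluated numerically to see how quickly it falls as the proportionality constant κ shrinks. The following is a minimal sketch, assuming the bound has the form of a sum over positive integers n of 2·exp(−2(πn/κ)²); the function name `aliasing_error_bound` and the truncation parameter `terms` are illustrative and do not come from the description.

```python
import math

def aliasing_error_bound(kappa, terms=50):
    """Evaluate the assumed form of equation (32): sum_{n>=1} 2*exp(-2*(pi*n/kappa)**2).

    The series decays extremely fast, so a modest number of terms is enough.
    """
    return sum(2.0 * math.exp(-2.0 * (math.pi * n / kappa) ** 2)
               for n in range(1, terms + 1))

# The bound depends only on kappa, not on the anisotropic ratio or the units.
for kappa in (0.5, 1.0, 2.0):
    print(f"kappa = {kappa}: bound = {aliasing_error_bound(kappa):.3e}")
```

Under this reading, halving κ (i.e. halving the sample spacing) reduces the bound by many orders of magnitude, which is why modest sampling rates may already be sufficient.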
[00104] It has been assumed in the above analysis that an infinite series of weights are applied in the discrete convolution, which is clearly not feasible. This error is treated herein as an independent parameter that, as described in more detail below, can be controlled by the extent of the kernel support.
[00105] Two different example implementations where the spacing $\sigma$ of the samples along the major axis is proportional to $\sqrt{1-\eta^{-2}}$ units will now be described. As is known to those of skill in the art, for a non-negative function such as a Gaussian function, the variance (or rather the standard deviation, i.e. the square root of the variance) gives an indication of the scale of the kernel and the degree of filtering. Accordingly, if the anisotropic filter has the preferred variance of $\rho_+^2-\rho_-^2$ then the standard deviation of the filter is $\sqrt{\rho_+^2-\rho_-^2}$. Since the Gaussian (discrete or continuous) has infinite support, the series is truncated to generate a finite sequence of filtering operations. While higher quality windowing functions can be selected, the examples here assume a simple termination of the series after a finite number of terms.
[00106] It is desirable that the support of the Gaussian kernel along the major axis (i.e. the length of the major axis over which the sample points extend) be proportional, by a proportionality factor $\alpha$, to two standard deviations (i.e. $2\alpha\sqrt{\rho_+^2-\rho_-^2}$). If there are N evenly spaced samples along the major axis then the spacing or distance $\sigma$ between the samples on the major axis can be expressed as shown in equation (33), where $\rho_-$ is the minor radius vector and $\rho_+$ is the major radius vector. Accordingly, $2\alpha$ represents the number of standard deviations of the major axis that is covered by the samples. Equation (33) can then be expressed in terms of $\sqrt{1-\eta^{-2}}$ as shown in equations (34) and (35).
$$\sigma = \frac{2\alpha}{N}\sqrt{\rho_+^2-\rho_-^2} \qquad (33)$$
$$\sigma = \left(\frac{2\alpha}{N}\right)\sqrt{1-\eta^{-2}}\,\rho_-\left(\frac{\rho_+}{\rho_-}\right) \qquad (34)$$
$$\sigma = \left(\frac{2\alpha}{N}\right)\sqrt{1-\eta^{-2}}\,\rho_+ \qquad (35)$$
[00107] In a first example implementation, N (which has to be an integer) is proportional to the anisotropic ratio $\eta$ (rounded up to an integer), a sampling rate $\beta$, and the width of the kernel in standard deviations (i.e. $2\alpha$) as shown in equation (36). In general, the sampling rate $\beta$ controls how closely spaced the samples are along the major axis of the kernel. The higher $\beta$ is, the more closely spaced the samples are along the major axis. For example, if $\beta$ is two, samples are taken every one standard deviation of the minor axis, instead of every two standard deviations of the minor axis when $\beta$ is one. In this example, the spacing $\sigma$ between the sample points can be expressed by equation (37) by replacing N in equation (34) with the right-hand side of equation (36).
$$N = 2\alpha\beta\lceil\eta\rceil \qquad (36)$$
$$\sigma = \frac{1}{\beta}\left(\frac{\eta}{\lceil\eta\rceil}\right)\sqrt{1-\eta^{-2}}\,\rho_- \qquad (37)$$
[00108] Accordingly, the proportionality factor $\kappa$ is equal to $\frac{1}{\beta}\frac{\eta}{\lceil\eta\rceil}$ for this first example implementation. It can be seen from the inequality in equation (32) that the cap on the error in the approximation decreases as $\lceil\eta\rceil/\eta$ increases (i.e. as $\kappa$ decreases). This indicates that the approximation quality, in the limit as $\alpha$ tends toward infinity, is highest when $\eta$ is just larger than an integer. Specifically, it can be seen from equation (37) that when the anisotropic ratio $\eta$ is an integer (i.e. $\eta = \lceil\eta\rceil$) the spacing $\sigma$ of the samples along the major axis (in units of $\rho_-$) is proportional, by a constant or fixed proportionality factor of $\frac{1}{\beta}$, to $\sqrt{1-\eta^{-2}}$. When the anisotropic ratio $\eta$ is not an integer (i.e. $\eta \ne \lceil\eta\rceil$) the spacing $\sigma$ of the samples along the major axis (in units of $\rho_-$) is proportional, by the variable factor $\frac{1}{\beta}\frac{\eta}{\lceil\eta\rceil}$, to $\sqrt{1-\eta^{-2}}$. This means that the sample spacing for non-integer anisotropic ratios is at least as dense as for the corresponding integer anisotropic ratio.
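The relationship between the sample count and the sample spacing in the first example implementation can be illustrated with a short calculation. This is a sketch only, assuming equation (36) reads N = 2αβ⌈η⌉ and equation (37) reads σ = (1/β)(η/⌈η⌉)√(1 − η⁻²)ρ₋; the function name `first_example_sampling` and the parameter `rho_minor` (standing for ρ₋) are hypothetical.

```python
import math

def first_example_sampling(eta, alpha, beta, rho_minor=1.0):
    """Return (N, kappa, spacing) for the first example implementation.

    Assumed forms: N = 2*alpha*beta*ceil(eta) and
    spacing = kappa * sqrt(1 - eta**-2) * rho_minor with kappa = eta / (beta*ceil(eta)).
    """
    ceil_eta = math.ceil(eta)
    n_samples = round(2 * alpha * beta * ceil_eta)   # must come out as an integer
    kappa = eta / (beta * ceil_eta)
    spacing = kappa * math.sqrt(1.0 - eta ** -2) * rho_minor
    return n_samples, kappa, spacing

# Integer ratio: kappa is exactly 1/beta. Non-integer ratio: kappa is smaller,
# so the samples are packed at least as densely.
print(first_example_sampling(4.0, alpha=1, beta=1))
print(first_example_sampling(3.5, alpha=1, beta=1))
```

Both calls use the same number of samples, because ⌈3.5⌉ = 4; only the spacing differs.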
[00109] As described above, in the methods described herein, isotropic filtering is performed at the sampling points along the major axis, and the results of the isotropic filtering are combined via a weighted sum where the weights are determined in accordance with a Gaussian kernel. This can be represented as a convolution of an anisotropic filter (a discrete Gaussian weighted sum) and an isotropic filter. The anisotropic filter can be represented by equation (13) shown above. From equation (13) it can be determined that the weight for the result of the isotropic filtering performed for the nth sample point can be determined in accordance with equation (38), where $\psi$ is an offset from the midpoint of the major axis at which the first sample is situated, and $\sigma$ is the spacing between samples. In some cases, the offset may be set to 1/2. However, in other cases, the offset may be set to other values, such as, but not limited to, zero. $(n+\psi)\sigma$ can be expressed as shown in equation (39). If $(n+\psi)\sigma$ in equation (38) is replaced with the right-hand side of equation (39), equation (38) can be written as equation (40). Equation (40) can be rearranged to produce equation (41).
$$\Gamma(n) \propto \exp\left(-\frac{1}{2}\left(\frac{(n+\psi)\sigma}{\sqrt{\rho_+^2-\rho_-^2}}\right)^2\right) \qquad (38)$$
$$(n+\psi)\sigma = (n+\psi)\left(\frac{2\alpha}{N}\right)\sqrt{1-\eta^{-2}}\,\rho_-\left(\frac{\rho_+}{\rho_-}\right) \qquad (39)$$
$$\Gamma(n) \propto \exp\left(-\frac{1}{2}\left(\frac{(n+\psi)\left(\frac{2\alpha}{N}\right)\sqrt{1-\eta^{-2}}\,\rho_+}{\sqrt{1-\eta^{-2}}\,\rho_+}\right)^2\right) = \exp\left(-2\alpha^2\left(\frac{n+\psi}{N}\right)^2\right) \qquad (40)$$
$$\Gamma(n) = \frac{\exp\left(-\frac{\alpha^2}{2}\left(\frac{2n+2\psi}{N}\right)^2\right)}{\sum_m \exp\left(-\frac{\alpha^2}{2}\left(\frac{2m+2\psi}{N}\right)^2\right)} \qquad (41)$$
[00110] In this first example, N is as set forth in equation (36), which means equation (41) can be rewritten as weights equation (42) (shown here for an offset $\psi$ of 1/2).
$$\Gamma(n) = \frac{\exp\left(-\frac{1}{8}\left(\frac{2n+1}{\beta\lceil\eta\rceil}\right)^2\right)}{\sum_m \exp\left(-\frac{1}{8}\left(\frac{2m+1}{\beta\lceil\eta\rceil}\right)^2\right)} \qquad (42)$$
[00111] It can be seen from equation (42) that in this first example the weights only need to be calculated for integer values of the anisotropic ratio. This can reduce the cost (e.g. tabulation of weights) of implementing this first example. However, the disadvantage of this first example is that the spacing is discontinuous and specifically may jump discontinuously when there is a jump from one integer anisotropic ratio to another. This is especially true if the reconstruction quality is low (for example if the sample rate is low, e.g. $\beta = 1/2$), which may result in discontinuous jumps in a filtered image, which may be prominent.
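The weights of the first example can be tabulated with a few lines of code. The sketch below assumes equation (42) has the form exp(−(1/8)((2n+1)/(β⌈η⌉))²) normalised over the N samples, with an offset ψ of 1/2 and N = 2αβ⌈η⌉ even; the function name `first_example_weights` is illustrative.

```python
import math

def first_example_weights(eta, alpha, beta):
    """Normalised Gaussian weights for the first example (assumed form of equation (42)).

    Samples are indexed n = -N/2 .. N/2 - 1, which with psi = 1/2 places them
    symmetrically about the midpoint of the major axis.
    """
    ceil_eta = math.ceil(eta)
    n_samples = round(2 * alpha * beta * ceil_eta)
    if n_samples % 2:
        raise ValueError("this sketch assumes an even number of samples")
    half = n_samples // 2
    raw = [math.exp(-0.125 * ((2 * n + 1) / (beta * ceil_eta)) ** 2)
           for n in range(-half, half)]
    total = sum(raw)
    return [w / total for w in raw]

# Any ratio in (3, 4] shares the table computed for the integer ratio 4.
assert first_example_weights(3.2, 1, 1) == first_example_weights(4, 1, 1)
```

Because only ⌈η⌉ enters the expression, one table per integer ratio may be enough, which is the cost saving noted above.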
[00112] Accordingly, in a second example implementation, to avoid the discontinuous spacing, the anisotropic filter support (as indicated by $\alpha$) can be widened for non-integer ratios. Specifically, in this second example $\alpha$ is expressed as a function of the anisotropic ratio $\eta$ as shown in equation (43), where $\alpha_0$ is an integer representing the base proportionality factor of the kernel support (i.e. multiples of two standard deviations). N can then be written as shown in equation (44), and the spacing $\sigma$ between the sample points can be expressed by equation (45) by replacing $\alpha$ in equation (34) with the right-hand side of equation (43), and replacing N in equation (34) with the right-hand side of equation (44).
$$\alpha(\eta) = \alpha_0\frac{\lceil\eta\rceil}{\eta} \qquad (43)$$
$$N = 2\alpha_0\beta\lceil\eta\rceil \qquad (44)$$
$$\sigma = \left(\frac{2\alpha(\eta)}{N}\right)\sqrt{1-\eta^{-2}}\,\rho_-\left(\frac{\rho_+}{\rho_-}\right) = \left(\frac{2\alpha_0\lceil\eta\rceil}{\eta N}\right)\sqrt{1-\eta^{-2}}\,\rho_-\,\eta = \frac{1}{\beta}\sqrt{1-\eta^{-2}}\,\rho_- \qquad (45)$$
[00113] In this second example, the spacing of the samples (in units of $\rho_-$) is always proportional, by a fixed or constant proportionality constant $\frac{1}{\beta}$, to $\sqrt{1-\eta^{-2}}$ regardless of whether the anisotropic ratio is an integer or not.
[00114] In this second example N is as set forth in equation (44), which means equation (41) can be rewritten as equation (46).
$$\Gamma(n) = \frac{\exp\left(-\frac{1}{8}\left(\frac{2n+1}{\beta\eta}\right)^2\right)}{\sum_m \exp\left(-\frac{1}{8}\left(\frac{2m+1}{\beta\eta}\right)^2\right)} \qquad (46)$$
[00115] It can be seen from equation (46) that in this second example the weights are not limited to integer anisotropic ratios, which makes calculation and storage of the weights more complicated, but this second example has the advantage that the spacing is continuous for various ratios and gives a uniform error in the limit as the number of samples tends to infinity.
[00116] It is noted that the first and second example formulations of the spacing of the sample points and the corresponding weights end up being the same for integer anisotropic ratios. Furthermore, if $\alpha$ in the first example equals $\alpha_0$ in the second example, an identical number of samples is used in each example, irrespective of the anisotropic ratio. In the former case, the sample density is increased and the kernel support (in terms of standard deviations) is held constant, which improves the quality of the approximation while the error associated with the truncation of the series remains roughly constant. In the latter case, the sample density is held constant, but the kernel support is increased, reducing the error associated with the truncation of the series, but, as noted above, this effect diminishes as the base kernel support increases and vanishes in the limit.
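The comparison between the two example implementations can be made concrete by computing both side by side. This is a minimal sketch under the assumptions that the first example uses N = 2αβ⌈η⌉ with weights based on β⌈η⌉, and the second uses N = 2α₀β⌈η⌉ with weights based on βη and a spacing constant of 1/β; all function names are hypothetical.

```python
import math

def gaussian_weights(n_samples, scale):
    """Normalised weights exp(-(1/8)*((2n+1)/scale)**2) for n = -N/2 .. N/2 - 1."""
    half = n_samples // 2
    raw = [math.exp(-0.125 * ((2 * n + 1) / scale) ** 2) for n in range(-half, half)]
    total = sum(raw)
    return [w / total for w in raw]

def first_example(eta, alpha, beta):
    """(N, kappa, weights) with constant support and denser samples for non-integer eta."""
    ceil_eta = math.ceil(eta)
    n_samples = round(2 * alpha * beta * ceil_eta)
    return n_samples, eta / (beta * ceil_eta), gaussian_weights(n_samples, beta * ceil_eta)

def second_example(eta, alpha0, beta):
    """(N, kappa, weights) with constant sample density and wider support for non-integer eta."""
    ceil_eta = math.ceil(eta)
    n_samples = round(2 * alpha0 * beta * ceil_eta)
    return n_samples, 1.0 / beta, gaussian_weights(n_samples, beta * eta)

# Identical for an integer ratio, same sample count but different spacing otherwise.
print(first_example(4, 1, 1)[:2], second_example(4, 1, 1)[:2])
print(first_example(3.5, 1, 1)[:2], second_example(3.5, 1, 1)[:2])
```

For the non-integer ratio the second formulation gives the outermost samples less weight, reflecting the wider kernel support measured in standard deviations.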
Recursive Linear Interpolation
[00117] The methods of performing anisotropic filtering described above comprise performing isotropic filtering (e.g. bilinear filtering or trilinear filtering) at a plurality of sampling points along a major axis of an ellipse in texture space and combining the results of the isotropic filtering using a Gaussian kernel. In other words, a Gaussian-weighted sum of the results of the isotropic filtering is generated. As described above, one way of combining the results of the isotropic filtering using a Gaussian kernel is to identify a Gaussian weight for each isotropic filtering result based on the location of the sampling point, calculate the product of each weight and the corresponding isotropic filtering result, and calculate the sum of the products. However, calculating a Gaussian-weighted sum in this manner has a number of disadvantages. First, because some of the weights, and thus the associated products, can get quite small, this method comprises summing a number of small values, which can result in a large accumulated error. Secondly, this method of calculating a Gaussian-weighted sum comprises calculating and storing complicated Gaussian weights at a high level of precision.
[00118] Accordingly, described herein is an improved method of calculating the Gaussian-weighted sum of isotropic filtering results which can reduce the accumulated error from small weights, and it can also save time and resources. More particularly, in the methods described herein the Gaussian-weighted sum is calculated over a sequence of linear interpolations starting with the isotropic filtering results that correspond to the sampling points furthest from the centre of the major axis and moving towards the centre of the major axis.
[00119] For example, a Gaussian-weighted sum can be expressed as a sequence of linear interpolations as shown in equation (47), where $F_k$ is the result after the kth iteration, $\gamma_k$ is a linear interpolation factor for the kth iteration, and $K = N/2$ where N is the even number of sampling points along the major axis.
$$F_k = (1-\gamma_k)F_{k-1} + \gamma_k\,\frac{L_{K-(k-1)} + R_{K-(k-1)}}{2}, \quad 1 \le k \le K \qquad (47)$$
[00120] In each iteration a current total ($F_{k-1}$) is blended with the isotropic filtering result $L_{K-(k-1)}$ corresponding to the (K − (k − 1))th sampling point to the left of the middle point of the major axis, and the isotropic filtering result $R_{K-(k-1)}$ corresponding to the (K − (k − 1))th sampling point to the right of the middle point of the major axis, using a linear interpolation factor $\gamma_k$. Accordingly, L is used to denote an isotropic filtering result that corresponds to a sampling point that is to the left of the middle point of the major axis and R is used to denote an isotropic filtering result that corresponds to a sampling point that is to the right of the middle point of the major axis. The final result $F_K$ is obtained after K iterations.
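The recursion of equation (47) can be sketched in code and compared against a direct weighted sum. The derivation of the interpolation factors used here (each factor is the weight of the current sample pair divided by the accumulated weight of that pair and all pairs further out) is a standard way of turning a normalised weighted sum into a chain of linear interpolations and is an assumption of this sketch, not the factors defined in the description; the function names are hypothetical.

```python
def recursive_factors(pair_weights):
    """Interpolation factors gamma_1..gamma_K for normalised pair weights.

    pair_weights[j-1] is the combined weight of the j-th left/right pair,
    with j = 1 nearest the middle of the major axis.
    """
    count = len(pair_weights)
    factors = []
    tail = 0.0                      # accumulated weight of pairs processed so far
    for k in range(1, count + 1):
        j = count - (k - 1)         # iteration k handles the (K-(k-1))th pair
        tail += pair_weights[j - 1]
        factors.append(pair_weights[j - 1] / tail)
    return factors

def recursive_blend(left, right, factors, start=0.0):
    """Apply F_k = (1-gamma_k)*F_{k-1} + gamma_k*(L+R)/2 from the outside in."""
    count = len(factors)
    total = start
    for k in range(1, count + 1):
        j = count - (k - 1)
        pair = 0.5 * (left[j - 1] + right[j - 1])
        total = (1.0 - factors[k - 1]) * total + factors[k - 1] * pair
    return total

weights = [0.4, 0.3, 0.2, 0.1]      # illustrative pair weights, inside out
left = [1.0, 2.0, 3.0, 4.0]
right = [5.0, 6.0, 7.0, 8.0]
factors = recursive_factors(weights)
direct = sum(w * 0.5 * (l + r) for w, l, r in zip(weights, left, right))
print(recursive_blend(left, right, factors), direct)
```

With these factors the first factor is 1, so the starting value does not influence the result, and every blend combines values of comparable magnitude rather than adding a very small product to a large running sum.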
[00121] In another example, instead of combining corresponding L and R isotropic filtering results in the same iteration, the L isotropic filtering results and the R isotropic filtering results may be interpolated separately and the interpolation of the L isotropic filtering results may be averaged with the interpolation of the R isotropic filtering results. Specifically, in this example, in each iteration a current total of L isotropic filtering results, $F^L_{k-1}$, is blended with the isotropic filtering result $L_{K-(k-1)}$ using a linear interpolation factor $\gamma_k$ as shown in equation (48); and a current total of R isotropic filtering results, $F^R_{k-1}$, is blended with the isotropic filtering result $R_{K-(k-1)}$ using a linear interpolation factor $\gamma_k$ as shown in equation (49). After K iterations the final result $F_K$ is calculated as the average of the final total of L isotropic filtering results $F^L_K$ and the final total of R isotropic filtering results $F^R_K$ as shown in equation (50).
$$F^L_k = (1-\gamma_k)F^L_{k-1} + \gamma_k L_{K-(k-1)}, \quad 1 \le k \le K \qquad (48)$$
$$F^R_k = (1-\gamma_k)F^R_{k-1} + \gamma_k R_{K-(k-1)}, \quad 1 \le k \le K \qquad (49)$$
$$F_K = \frac{F^L_K + F^R_K}{2} \qquad (50)$$
[00122] Although in the second example two separate partial results are generated and stored, this approach can have the advantage of increasing cache coherency as the inner loop is performed over adjacent samples (as opposed to opposite ends of the kernel), and the outer loop can potentially be interleaved with, or otherwise take account of, neighbouring fragment filter operations. For example, an anisotropic filter aligned with the horizontal screen space axis may choose to process a pair of vertically aligned fragments in parallel, first recursively interpolating the left points (which should be spatially coherent on account of the horizontal direction of anisotropy), followed by the right points, before moving on to the next vertical fragment pair to the right. Since the right points of the fragment pair to the left will move from the right to the left, and the left points of the fragment pair to the right will move from the left to the right, this can mean that the final samples from the left vertical fragment pair are close to the first samples from the right vertical fragment pair.
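Because each blend is linear, interpolating the L and R results separately and averaging at the end gives the same value as blending the averaged pairs in a single chain, provided both chains use the same factors and starting value. The sketch below illustrates this; the function names and the sample values are illustrative only.

```python
def lerp_chain(results, factors, start=0.0):
    """Blend results from the outermost (index K-1) to the innermost (index 0)."""
    count = len(factors)
    total = start
    for k in range(1, count + 1):
        gamma = factors[k - 1]
        total = (1.0 - gamma) * total + gamma * results[count - k]
    return total

def separate_then_average(left, right, factors, start=0.0):
    """Equations (48) to (50): two independent chains followed by one average."""
    return 0.5 * (lerp_chain(left, factors, start) + lerp_chain(right, factors, start))

def joint_chain(left, right, factors, start=0.0):
    """Equation (47): one chain over the averaged left/right pairs."""
    pairs = [0.5 * (l + r) for l, r in zip(left, right)]
    return lerp_chain(pairs, factors, start)

left = [1.0, 2.0, 3.0, 4.0]
right = [5.0, 6.0, 7.0, 8.0]
factors = [1.0, 0.6, 0.5, 0.4]      # arbitrary illustrative factors
print(separate_then_average(left, right, factors), joint_chain(left, right, factors))
```

The separate form visits the left samples in order and then the right samples in order, which is the access pattern that the cache coherency argument above relies on.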
[00123] Reference is now made to FIG. 8 which illustrates a method 800 of combining the results of isotropic filtering performed at a plurality of sampling points using a Gaussian filter via recursive linear interpolation. In other words, method 800 is an example implementation of block 408 of the method 400 of FIG. 4. The method 800 may be performed by a texture filtering unit of a graphics processing system, such as the texture filtering unit 1506 of FIG. 15.
[00124] The method 800 begins at block 802 where an iteration counter k is initialised (e.g. to 1) and the current total (i.e. $F_0$) is initialised to a starting value. If the initial interpolation factor $\gamma_1$ is 1, then $F_0$ does not contribute to the final result and the current total can be initialised to any value. This would define a normalised weighted sum purely in terms of a set of recursive weights. However, as described below, the weights for a Gaussian sum may be simplified when written directly in terms of the infinite series.
[00125] Since this series is truncated to achieve a realisable filter, the starting value F0 may represent the partial result of all the truncated terms. The advantage of initialising the starting value to the partial result of all the truncated terms is that a single set of weights y (working from the inside out) may be defined, irrespective of the degree of truncation. In some examples, the starting value F0 may be an estimate of the sum of the truncated terms in the form of a texture sample (e.g. the central value) or a known average.
[00126] In other cases, the current total (i.e. $F_0$) may be initialised to zero. In such cases, the final result may be normalised (e.g. by tabulating the missing weights for the degree of truncation for a particular ratio). Accordingly, where the current total is initialised to zero, the method 800 may further comprise, prior to block 810, a final normalisation step to rescale the result by $\frac{1}{1-\text{missingWeights}}$. Once the iteration counter k and the current total have been initialised the method 800 proceeds to block 804.
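The zero-initialised variant with a final normalisation step can be sketched as follows. This sketch assumes Gaussian pair weights of the form exp(−(1/8)((2j−1)/(βη))²) for the jth pair counted from the centre, approximates the infinite series by a finite number of terms, and derives each factor as the weight of a pair divided by the weight of that pair and everything further out; these choices and all names are assumptions made for illustration.

```python
import math

def inside_out_factors(eta, beta, terms=64):
    """Return (factors, shares) indexed by pair j = 1.. (index j-1), nearest the centre first.

    shares are the pair weights normalised over the (approximated) infinite series,
    so a single set of factors serves any degree of truncation.
    """
    weights = [math.exp(-0.125 * ((2 * j - 1) / (beta * eta)) ** 2)
               for j in range(1, terms + 1)]
    total = sum(weights)
    shares = [w / total for w in weights]
    factors = [1.0] * terms
    tail = 0.0
    for j in range(terms, 0, -1):
        tail += shares[j - 1]
        if tail > 0.0:              # guard against underflow far from the centre
            factors[j - 1] = shares[j - 1] / tail
    return factors, shares

def truncated_sum(pairs, factors, shares):
    """Blend K pair averages starting from F_0 = 0, then rescale by 1/(1 - missing)."""
    count = len(pairs)
    total = 0.0
    for k in range(1, count + 1):
        j = count - (k - 1)
        total = (1.0 - factors[j - 1]) * total + factors[j - 1] * pairs[j - 1]
    missing = sum(shares[count:])   # weight of the truncated terms
    return total / (1.0 - missing)

factors, shares = inside_out_factors(eta=4.0, beta=1.0)
print(truncated_sum([3.0, 3.0, 3.0, 3.0], factors, shares))
```

A constant input is returned unchanged after the rescale, which is the property the normalisation step is there to restore.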
[00127] At block 804, the current total is blended with the isotropic filtering result corresponding to the (K − (k − 1))th sample point to the left of the middle point of the major axis of the ellipse (i.e. $L_{K-(k-1)}$) and/or the isotropic filtering result corresponding to the (K − (k − 1))th sample point to the right of the middle point of the major axis of the ellipse (i.e. $R_{K-(k-1)}$) using a linear interpolation factor of $\gamma_k$ to generate a new current total. In some cases, the method 800 may be used to linearly interpolate between only the L isotropic filtering results, between only the R isotropic filtering results, or between both the L and R isotropic filtering results.
[00128] Where the method 800 is used to linearly interpolate between only the L isotropic filtering results, then at block 804 $(1-\gamma_k)F_{k-1} + \gamma_k L_{K-(k-1)}$ may be calculated, where $F_{k-1}$ is the current total. In these cases, the method may further comprise repeating blocks 802 to 810 for the R isotropic filtering results and then determining the average of the interpolation of the L isotropic filtering results and the interpolation of the R isotropic filtering results to generate a final result. Similarly, where the method 800 is used to linearly interpolate between only the R isotropic filtering results, then at block 804 $(1-\gamma_k)F_{k-1} + \gamma_k R_{K-(k-1)}$ may be calculated. In these cases, the method may further comprise repeating blocks 802 to 810 for the L isotropic filtering results and then determining the average of the interpolation of the R isotropic filtering results and the interpolation of the L isotropic filtering results to generate a final result. Where the method 800 is used to linearly interpolate between both the L and R isotropic filtering results, then at block 804 $(1-\gamma_k)F_{k-1} + \gamma_k\frac{L_{K-(k-1)} + R_{K-(k-1)}}{2}$ may be calculated.
[00129] For example, where the method 800 is used to interpolate between the L and R isotropic filtering results corresponding to a set of evenly spaced sample points along a line shown in FIG. 9, in the first iteration (i.e. iteration 1, k = 1) the average of $L_K$ and $R_K$ is blended with $F_0$ to generate $F_1$ using linear interpolation factor $\gamma_1$. In the second iteration (i.e. iteration 2, k = 2) the average of $L_{K-1}$ and $R_{K-1}$ is blended with $F_1$ using linear interpolation factor $\gamma_2$ to generate $F_2$. Then, in the third iteration (i.e. iteration 3, k = 3) the average of $L_{K-2}$ and $R_{K-2}$ is blended with $F_2$ using linear interpolation factor $\gamma_3$ to generate $F_3$.
[00130] In some cases, the linear interpolation factors $\gamma_k$ may be approximated with a linear function of k. One example linear function for calculating the linear interpolation factors $\gamma_k$ is shown in equation (51) where c and m may be calculated in accordance with equations (52) and (53) respectively.
$$\gamma_k = m(K-k) + c \qquad (51)$$
4e 2,1 C = (1-c)8 172. - (52) (53)
[00131] In another example, the linear interpolation factors $\gamma_k$ may be calculated in accordance with equation (54). In this case the results may be tabulated (e.g. in a hardware lookup table) directly for the finite (e.g. fixed point) set of anisotropic ratios for m, and lookup table optimisation techniques may then be applied.
_(2m_1)2/2,72 (54)
[00132] The fact that the factors $\gamma_k$ are near linear means that they are likely to be amenable to lookup table optimisation. For example, FIG. 10 shows graphs 1002, 1004, 1006, 1008, 1010 of the factors $\gamma_k$ of equation (54) for anisotropic ratios of 2, 3, 4, 6 and 8, respectively. It can be seen that they are generally linear. FIG. 11 shows graphs 1102, 1104, 1106, 1108 and 1110 of the factors $\gamma_k$ of equation (54) for anisotropic ratios of 2, 3, 4, 6 and 8, respectively, when the samples cover two standard deviations. It can be seen that in these cases the factors are almost perfectly linear.
[00133] It will be evident to a person of skill in the art that these are examples only, and that other functions of k may be used to generate the linear interpolation factors. For example, in either of the examples described above $\eta$ may be replaced with $\beta\eta$ or $\beta\lceil\eta\rceil$.
[00134] Once the blending is complete the method 800 proceeds to block 806.
[00135] At block 806, it is determined whether this is the last iteration. If the iteration counter is initialised to 1, it may be determined that this is the last iteration if the iteration counter k is equal to K. As described above, K is equal to N/2, where N is the even number of sample points along the major axis of the ellipse. If it is determined that this is not the last iteration, then the method 800 proceeds to block 808. If, however, it is determined that this is the last iteration then the method 800 ends at block 810.
[00136] At block 808, the iteration counter k is incremented (i.e. k = k + 1). Once the iteration counter has been incremented the method 800 proceeds back to block 804.
[00137] Calculating the Gaussian-weighted sum in accordance with the method 800 of FIG. 8 can reduce the accumulated error from the small weights because the method 800 of FIG. 8 combines isotropic filtering results for samples that are close together and thus will have similar weights. Calculating the Gaussian-weighted sum of isotropic filtering results in accordance with the method 800 of FIG. 8 can also save time and resources because, as described above, the linear interpolation factors yk can be approximated with a linear function of k that can use relatively few bits of precision (e.g. 8 bits) and/or a look-up table in which to store the function parameters.
[00138] Although the method 800 of FIG. 8 is described as being used in conjunction with the method 400 of FIG. 4, wherein the spacing of the sample points along the major axis is proportional to $\sqrt{1-\eta^{-2}}$, the method 800 of FIG. 8 can be used independently of whether the spacing of the sample points along the major axis is proportional to $\sqrt{1-\eta^{-2}}$.
[00139] Although the method 800 of FIG. 8 has been described as being used to combine the results of isotropic filtering using a Gaussian filter, the method 800 may also be used to combine the results of isotropic filtering using a non-Gaussian filter or a filter that is not strictly a Gaussian filter. For example, as described in more detail below, in some cases, it may be advantageous to combine the results of isotropic filtering using weights that are not strictly Gaussian.
Selecting Anisotropic Filter Weights
[00140] In the examples described above, the weights used to combine the results of the isotropic filtering (i.e. the anisotropic filter weights) are Gaussian. In other words, the anisotropic filter is Gaussian. Using a Gaussian anisotropic filter has proven to provide good results in many cases, partially because of the frequency response of a Gaussian filter. Specifically, a Gaussian filter, which has a Gaussian response in the frequency or spectral domain, acts as a low pass frequency filter which has a minimum spectral width or variance (in the modulus squared sense as described below) for a given spatial width or variance.
[00141] This can be illustrated mathematically. Specifically, let $F: \mathbb{R} \to \mathbb{R}$ be a continuous filter defined by equation (55) where $\varphi: \mathbb{R} \to \mathbb{R}$ is a real function (which may be referred to as the first function) defined such that the weights of the filter are non-negative. The filter F identifies, or specifies, the weights associated with different positions along an x axis (i.e. for different x values). In the anisotropic texture filtering context x represents the position of a sampling point with respect to the midpoint of the major axis. The integral of the filter from negative infinity to positive infinity, denoted $|F|$, can be expressed by equation (56); the mean of the filter, denoted $\bar{x}(F)$, can be expressed by equation (57); and the variance of the filter, denoted $\overline{x^2}(F)$, can be expressed by equation (58).
$$F(x) = |\varphi(x)|^2 \qquad (55)$$
$$|F| = \int_{-\infty}^{+\infty} dx\, |\varphi(x)|^2 \qquad (56)$$
$$\bar{x}(F) = \frac{1}{|F|}\int_{-\infty}^{+\infty} dx\, x\,|\varphi(x)|^2 \qquad (57)$$
$$\overline{x^2}(F) = \frac{1}{|F|}\int_{-\infty}^{+\infty} dx\, (x-\bar{x}(F))^2 F(x) = \frac{1}{|F|}\int_{-\infty}^{+\infty} dx\, \left|(x-\bar{x}(F))\,\varphi(x)\right|^2 \qquad (58)$$
[00142] The frequency response of $\varphi$ is denoted $\hat{\varphi}$ with $\hat{\varphi}: \mathbb{R} \to \mathbb{C}$. The relationship between $\varphi$ and its frequency response $\hat{\varphi}$ is described by equations (59) and (60).
$$\hat{\varphi}(f) = \int_{-\infty}^{+\infty} dx\, \varphi(x)\, e^{-i2\pi f x} \qquad (59)$$
$$\varphi(x) = \int_{-\infty}^{+\infty} df\, \hat{\varphi}(f)\, e^{+i2\pi f x} \qquad (60)$$
[00143] The modulus squared spectral variance of $\varphi$ can then be written as shown in equation (61) where $\hat{F}(f) = |\hat{\varphi}(f)|^2$ and $\varphi'(x)$ is the first derivative of $\varphi$ with respect to x (i.e. $\varphi'(x) = \frac{d}{dx}\varphi(x)$).
$$\overline{f^2}(\hat{F}) = \frac{\int_{-\infty}^{+\infty} df\, (f-\bar{f}(\hat{F}))^2\, |\hat{\varphi}(f)|^2}{\int_{-\infty}^{+\infty} df\, |\hat{\varphi}(f)|^2} = \frac{1}{4\pi^2}\,\frac{\int_{-\infty}^{+\infty} dx\, |\varphi'(x)|^2}{\int_{-\infty}^{+\infty} dx\, |\varphi(x)|^2} \qquad (61)$$
[00144] The expression of the modulus squared spectral variance of $\varphi$ shown in equation (61) can be used to express the product of the modulus squared spatial and spectral variances purely in terms of integral expressions involving $\varphi(x)$ and its derivative, as shown in equation (62).
$$\left(\frac{\int_{-\infty}^{+\infty} dx\, |x\varphi(x)|^2}{\int_{-\infty}^{+\infty} dx\, |\varphi(x)|^2}\right)\left(\frac{1}{4\pi^2}\,\frac{\int_{-\infty}^{+\infty} dx\, |\varphi'(x)|^2}{\int_{-\infty}^{+\infty} dx\, |\varphi(x)|^2}\right) \qquad (62)$$
[00145] The derivative of the product (equation (62)) can then be expressed by equation (63) where $\varphi''(x)$ is the second derivative of $\varphi$ with respect to x (i.e. $\varphi''(x) = \frac{d^2}{dx^2}\varphi(x)$).
$$\frac{x^2\varphi(x)}{\int_{-\infty}^{+\infty} dx\, |x\varphi(x)|^2} - \frac{\varphi''(x)}{\int_{-\infty}^{+\infty} dx\, |\varphi'(x)|^2} - \frac{2\varphi(x)}{\int_{-\infty}^{+\infty} dx\, |\varphi(x)|^2} \qquad (63)$$
[00146] It can be verified that when the filter F is a Gaussian filter as shown in equation (64), the product of equation (62) is minimized (i.e. equation (63) = 0). Specifically, this can be seen by direct substitution of equation (64) into equation (63), where $\int_{-\infty}^{+\infty} dx\, |\varphi'(x)|^2 = \frac{1}{4\eta^2}$ and $\varphi''(x) = \left(\frac{x^2}{4\eta^4} - \frac{1}{2\eta^2}\right)\varphi(x)$ for the Gaussian filter F of equation (64).
$$F(x) = |\varphi(x)|^2 = \frac{1}{\sqrt{2\pi\eta^2}}\, e^{-\frac{x^2}{2\eta^2}} \qquad (64)$$
[00147] However, when a continuous Gaussian is approximated by a truncated Gaussian (such that only a portion of the Gaussian curve is represented), and particularly when a discrete Gaussian is truncated to a finite number of sample points, the frequency response becomes less ideal (i.e. it looks less Gaussian). Specifically, more unwanted, or higher, frequencies are allowed to pass (i.e. they are not sufficiently attenuated). This becomes more pronounced as the number of sample points decreases. Accordingly, using an anisotropic filter with Gaussian weights may not provide the best filtering results in all cases. The inventors have identified that instead of automatically using Gaussian weights, the best weights for the anisotropic filter may be determined by selecting the weights that minimize a cost function which penalizes high frequencies in the frequency response under certain constraints. For example, the weights of the filter F may be selected so as to minimize the product of the modulus squared spatial and spectral variances of $\varphi$ (i.e. equation (65)) so as to achieve the most Gaussian-like frequency response, i.e. a frequency response with a minimum spectral variance for $\varphi$ for a given spatial variance. It is noted that it is the product of the modulus squared spatial and spectral variances of $\varphi$ that is minimized, rather than simply the spectral variance of $\varphi$, because minimizing the spectral variance alone would not provide meaningful results: it would push the weights to values that result in no spectral variance, i.e. no non-zero frequencies.
$$\frac{\int_{-\infty}^{+\infty} dx\, |x\varphi(x)|^2 \int_{-\infty}^{+\infty} dx\, |\varphi'(x)|^2}{4\pi^2\left(\int_{-\infty}^{+\infty} dx\, |\varphi(x)|^2\right)^2} \qquad (65)$$
[00148] Reference is now made to FIG. 12 which illustrates an example method 1200 of performing anisotropic texture filtering in which the weights of the anisotropic filter are selected to minimize the spectral width or variance thereof. The method 1200 begins at block 1202 where the parameters defining an elliptical footprint in texture space are determined. Block 1202 generally corresponds to block 402 of the method 400 of FIG. 4, thus the parameters that may be generated and the methods of generating the parameters described with respect to block 402 equally apply to block 1202. Once the parameters defining an elliptical footprint in texture space have been determined the method 1200 proceeds to block 1204.
[00149] At block 1204, one or more sets of equally spaced sampling points (which may also be referred to as sample points) in the texture space, based on the elliptical footprint defined in block 1202, are identified. In some cases, the equally spaced sample points may lie along the major axis of the ellipse. All of the methods and techniques described above for (i) determining the number of sample points per set, (ii) identifying the number of sets, and (iii) identifying the position and/or spacing of the sample points (e.g. the methods and techniques described in relation to block 404 of the method 400 of FIG. 4) equally apply to block 1204. Once one or more sets of equally spaced sample points in texture space have been identified, the method 1200 proceeds to block 1206.
[00150] At block 1206, isotropic filtering is performed at each sampling point identified in block 1204. Block 1206 generally corresponds to block 406 of the method 400 of FIG. 4 so any of the methods and techniques for performing isotropic filtering on the sample points described in relation to block 406 equally apply to block 1206. Once the isotropic filtering at the sample points has been performed, the method 1200 proceeds to block 1208.
[00151] At block 1208, an anisotropic filter is selected for each set of equally spaced sample points identified in block 1204, based on one or more parameters of the set of samples and one or more parameters of the elliptical footprint. The one or more parameters of the set of samples may comprise the number of samples N in the set, the offset $\psi$ indicating a location of a first sample in the set, and/or the spacing $\sigma$ between adjacent samples in the set. The one or more parameters of the elliptical footprint may comprise the anisotropic ratio or parameters from which the anisotropic ratio can be determined.
[00152] The weights of each anisotropic filter are selected, based on the parameters, to be the weights that minimize a cost function that penalizes high frequencies in the frequency response of the anisotropic filter, under one or more constraints to ensure that the filter satisfies one or more features of a texture filter.
[00153] For example, in some cases it may be desirable for a texture filter to be normalised to remove a global brightness factor (i.e. a DC component). Accordingly, the anisotropic filter may be constrained to be normalised. Such a constraint may be referred to as the normalization constraint, and for the example filter F defined in equation (55) may be expressed by equation (66).
$$|F| = \int_{-\infty}^{+\infty} dx\,|\phi(x)|^2 = 1 \qquad (66)$$
[00154] In some cases, it may be desirable that the weights of a texture filter be centred around the origin of the co-ordinate system. Accordingly, the mean of the anisotropic filter may be constrained to be zero. Such a constraint may be referred to as the mean constraint, and for the example filter F defined in equation (55) may be expressed by equation (67).
$$\bar{x}(F) = \frac{1}{|F|}\int_{-\infty}^{+\infty} dx\,x\,|\phi(x)|^2 = 0 \qquad (67)$$
[00155] As described above, it has been determined that it is desirable that the anisotropic filter have a variance of $\eta^2$, where $\eta$ is the anisotropic ratio, when expressed in units of standard deviations of the corresponding isotropic filter. Accordingly, the anisotropic filter may be constrained to have a variance of $\eta^2$. Such a constraint may be referred to as the variance constraint, and for the example filter F defined in equation (55) may be expressed by equation (68).
$$\frac{1}{|F|}\int_{-\infty}^{+\infty} dx\,\left(x-\bar{x}(F)\right)^2 F(x) = \frac{1}{|F|}\int_{-\infty}^{+\infty} dx\,|x\phi(x)|^2 = \eta^2 \qquad (68)$$
[00156] The cost function that is minimized may be any cost function that penalizes high frequencies in the frequency response or spectrum response of the anisotropic filter. As described above, when a continuous Gaussian filter is approximated by a truncated Gaussian, the frequency response deviates from the ideal continuous Gaussian filter frequency response and may allow unwanted frequencies. Accordingly, the quality of the filtering result produced by the anisotropic filter may be improved by selecting filter weights that result in a desirable frequency response (e.g. more Gaussian-like frequency response).
[00157] As described above, a more Gaussian-like frequency response may be achieved by selecting weights that minimize the product of the modulus squared spatial and spectral variances of $\phi$ where $F(x) = |\phi(x)|^2$. So, if the spatial variance of $\phi$ is fixed or known, then minimizing equation (65) minimizes the spectral variance of $\phi$. Equation (65) may be referred to as the Gaussian cost function.
[00158] In other cases, instead of selecting weights that minimize a cost function (e.g. equation (65)) that pushes the anisotropic filter to have a frequency response as close as possible to a Gaussian frequency response, another cost function may be used which pushes the anisotropic filter to have a frequency response that is not Gaussian but is similar. For example, instead of selecting weights that minimize equation (65), the weights may be selected to minimize the L2 norm of the anisotropic filter F. As is known to those of skill in the art, the L2 norm, also referred to as the Euclidean norm, is the length of a vector. For the anisotropic filter F of equation (55) the L2 norm can be expressed by equation (69). This may be referred to as the norm cost function.
$$\|F\|_2 = \left(\int_{-\infty}^{+\infty} dx\,|F(x)|^2\right)^{\frac{1}{2}} = \left(\int_{-\infty}^{+\infty} dx\,|\phi(x)|^4\right)^{\frac{1}{2}} \qquad (69)$$
[00159] However, equation (69) itself cannot be minimized, as selecting weights that minimize equation (69) would select weights that will tend towards no (non-zero) frequencies in the frequency domain. Specifically, equation (69) does not show a preference for desirable passband frequencies over undesirable stop band frequencies. Accordingly, Lagrange multipliers $\lambda$ and $\mu$ may be used to enforce the constraints described above. Specifically, $\lambda$ can be used to enforce the variance constraint, and $\mu$ can be used to enforce the normalization constraint. Equation (69) can then be written with the Lagrange multipliers as shown in equation (70). It is noted that the inventors have determined that the solution will satisfy the mean condition. Therefore the mean condition is assumed to be true and is not explicitly enforced using a Lagrange multiplier.
$$\int_{-\infty}^{+\infty} dx\,|\phi(x)|^4 + \lambda\left(\int_{-\infty}^{+\infty} dx\,|x\phi(x)|^2 - \eta^2\right) + \mu\left(\int_{-\infty}^{+\infty} dx\,|\phi(x)|^2 - 1\right) \qquad (70)$$
[00160] The variational derivative of equation (70) is shown (with common factors removed) in equation (71). The filter F that minimizes equation (70) can thus be identified by setting equation (71) to zero. It can be shown that equation (71) is equal to zero, and thus equation (70) is minimized, when the filter F is as shown in equation (72). In other words, equation (69) is minimized under the constraints when the filter F is as shown in equation (72).
$$\phi(x)\left(2|\phi(x)|^2 + \lambda x^2 + \mu\right) \qquad (71)$$
$$F(x) = |\phi(x)|^2 = \begin{cases} \dfrac{3}{4\sqrt{5\eta^2}}\left(1 - \dfrac{x^2}{5\eta^2}\right) & |x| \le \sqrt{5\eta^2} \\ 0 & |x| > \sqrt{5\eta^2} \end{cases} \qquad (72)$$
[00161] The impulse response of the filter as set out in equation (72) is shown at 1302 in FIG. 13 compared to the impulse response of a Gaussian filter shown at 1304 for $\eta = 2$. It can be seen that such a filter has finite support (i.e. width) with radius $\sqrt{5\eta^2}$.
[00162] The frequency response of the filter F of equation (72), denoted $\hat{F}(f)$, can be expressed by equation (73) and is shown at 1402 in FIG. 14 compared to the frequency response of a Gaussian filter shown at 1404 for $\eta = 2$. It can be seen that the frequency response is only mildly inferior compared to the Gaussian which indicates that this is a suitable method of identifying a good set of weights for the anisotropic filter.
$$\hat{F}(f) = \int_{-\sqrt{5\eta^2}}^{+\sqrt{5\eta^2}} dx\,\frac{3}{4\sqrt{5\eta^2}}\left(1 - \frac{x^2}{5\eta^2}\right)e^{-2\pi i f x} = 3\left[\frac{\sin\left(2\pi\sqrt{5\eta^2}f\right)}{\left(2\pi\sqrt{5\eta^2}f\right)^3} - \frac{\cos\left(2\pi\sqrt{5\eta^2}f\right)}{\left(2\pi\sqrt{5\eta^2}f\right)^2}\right] \qquad (73)$$
[00163] Accordingly, selecting weights for the anisotropic filter that minimize equation (69), under the constraints, and specifically under the variance constraint, will select weights that have a spatial response as close to 1302 of FIG. 13 as possible and a frequency response as close to 1402 of FIG. 14 as possible.
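The properties claimed for the filter of equation (72) can be checked numerically. The following is a minimal sketch, assuming equation (72) is the parabolic profile $F(x) = \frac{3}{4a}(1 - x^2/a^2)$ on $|x| \le a$ with $a = \sqrt{5}\,\eta$; the function names and the midpoint-rule integration are illustrative choices, not part of the described method.

```python
import math

def parabolic_filter(x, eta):
    """Equation (72) as read here: a parabola with support radius a = sqrt(5) * eta."""
    a = math.sqrt(5.0) * eta
    if abs(x) > a:
        return 0.0
    return 3.0 / (4.0 * a) * (1.0 - (x / a) ** 2)

def moments(eta, steps=20000):
    """Midpoint-rule estimates of the integral, mean and second moment of F."""
    a = math.sqrt(5.0) * eta
    dx = 2.0 * a / steps
    total = mean = second = 0.0
    for i in range(steps):
        x = -a + (i + 0.5) * dx
        mass = parabolic_filter(x, eta) * dx
        total += mass          # normalization constraint, equation (66)
        mean += x * mass       # mean constraint, equation (67)
        second += x * x * mass # variance constraint, equation (68)
    return total, mean, second

# For an anisotropic ratio of 2 the variance should come out as eta^2 = 4.
total, mean, variance = moments(2.0)
```

Under these assumptions the three values come out close to 1, 0 and $\eta^2$ respectively, which is what the normalization, mean and variance constraints require.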
[00164] In other cases, instead of selecting weights that minimize equation (65) or equation (69), the weights that minimize the spectral spread of F2 as set forth in equation (74) may be selected. This may be referred to as the spectral cost function.
$$\int_{-\infty}^{+\infty} dx\,|\phi'(x)\phi(x)|^2 \qquad (74)$$
[00165] Like the L2 norm equation (i.e. equation (69)), equation (74) itself cannot be minimized, as selecting weights that minimize equation (74) would select weights that tend to produce no (non-zero) frequencies in the frequency domain. Specifically, equation (74) does not show a preference for desirable passband frequencies over undesirable stop band frequencies. Accordingly, Lagrange multipliers $\lambda$ and $\mu$ may be used to enforce the constraints described above. Specifically, $\lambda$ can be used to enforce the variance constraint, and $\mu$ can be used to enforce the normalization constraint. Equation (74) can then be written with the Lagrange multipliers as shown in equation (75).
$$\int_{-\infty}^{+\infty} dx\,|\phi'(x)\phi(x)|^2 + \lambda\left(\int_{-\infty}^{+\infty} dx\,|x\phi(x)|^2 - \eta^2\right) + \mu\left(\int_{-\infty}^{+\infty} dx\,|\phi(x)|^2 - 1\right) \qquad (75)$$
[00166] The variational derivative of equation (75) is shown in equation (76). The filter F that minimizes equation (75) can thus be identified by setting equation (76) to zero.
$$\phi(x)\left(-\phi'^2(x) - \phi(x)\phi''(x) + \lambda x^2 + \mu\right) \qquad (76)$$
[00167] It will be evident to a person of skill in the art that the filter set out in equation (72) will also set equation (76) to zero, and thus will minimize equation (75). So, selecting the weights using equation (69) or (74), under the constraints, appears to produce similar results, i.e. both will select weights that have a spatial response as close to 1302 of FIG. 13 as possible and a frequency response as close to 1402 of FIG. 14 as possible.
[00168] In some cases, the weights that minimize one of the cost functions, under the constraints, may be dynamically determined (i.e. on the fly). However, this can be a fairly time and resource intensive process. Accordingly, in other cases, the weights that minimize one or more of the cost functions, under the constraints, may be determined off-line for expected ranges of the parameters (e.g. for ranges of sets of sample points S (which is described above and below), anisotropic ratios $\eta$, and sample spacings $\sigma$). In such cases, the results may be stored in a look up table which is indexed by the parameters (e.g. set of sample points S, anisotropic ratio $\eta$, and sample spacing $\sigma$).
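The off-line tabulation can be sketched as follows. This is only an illustration under stated assumptions: the grid values, the nearest-neighbour indexing and all function names are hypothetical, the table is built for a single sample set S (three samples, zero offset), and the stand-in solver is the closed-form three-sample solution derived later in this description rather than a general cost-function minimizer.

```python
def build_weight_table(solve_weights, etas, sigmas):
    """Off-line step: tabulate weights on a grid of (anisotropic ratio, spacing)."""
    return {
        (i, j): solve_weights(eta, sigma)
        for i, eta in enumerate(etas)
        for j, sigma in enumerate(sigmas)
    }

def nearest_index(grid, value):
    """Index of the grid entry closest to value (hypothetical quantization rule)."""
    return min(range(len(grid)), key=lambda k: abs(grid[k] - value))

def lookup_weights(table, etas, sigmas, eta, sigma):
    """Run-time step: fetch the precomputed weights for the nearest grid point."""
    return table[(nearest_index(etas, eta), nearest_index(sigmas, sigma))]

def symmetric_three_tap(eta, sigma):
    """Stand-in solver: three samples with zero offset, weights (A_-1, A_0, A_1)."""
    a1 = (eta * eta - 1.0) / (2.0 * sigma * sigma)
    return (a1, 1.0 - 2.0 * a1, a1)

# Hypothetical parameter grids; a real table would also be indexed by the sample set S.
ETAS = [1.0, 1.25, 1.5, 1.75, 2.0]
SIGMAS = [1.0, 1.5, 2.0]
TABLE = build_weight_table(symmetric_three_tap, ETAS, SIGMAS)

weights = lookup_weights(TABLE, ETAS, SIGMAS, eta=1.52, sigma=1.9)
```

The trade-off is the usual one: a finer grid gives weights closer to the true minimizer at the cost of a larger table.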
[00169] Once the weights for an anisotropic filter for each set of sampling points have been selected, the method 1200 proceeds to block 1210.
[00170] At block 1210, the isotropic filter results for each set of sampling points are combined using the corresponding filter weights identified in block 1208. Block 1210 generally corresponds to block 408 of the method 400 of FIG. 4, except that instead of using Gaussian weights to combine the results of the isotropic filtering, the weights selected in block 1208 are used to combine the results. Any of the methods and techniques described above for combining isotropic filter results using an anisotropic filter can be used in block 1210. For example, any of the methods and techniques described in relation to block 408 or any of the methods and techniques described in relation to method 800 of FIG. 8 for combining isotropic filter results may be used in block 1210. Where only one set of sampling points was identified then the method 1200 may end. However, where more than one set of sampling points was identified in block 1204 then the method 1200 may proceed to block 1212.
[00171] At block 1212, interpolation is performed between the combination results generated in block 1210 for each set of sampling points. Block 1212 generally corresponds to block 410 of the method 400 of FIG. 4 and thus any of the methods and techniques for interpolating between combination results for different sets of sampling points equally apply to block 1212. Once the interpolation has been performed the method 1200 may end. The result of the method (block 1210 or block 1212), which may be referred to as a filter result or a filtered texture value, may be output for further processing. For example, the output of the method (block 1210 or block 1212) may be output to a shader (or another component) of a graphics processing system or graphics processing unit for use in generating a rendering output (e.g. an image).
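Blocks 1210 and 1212 can be illustrated with a short sketch. It assumes that each isotropic filter result is a tuple of colour channels, that combination is a plain weighted sum, and that the interpolation between two sets is linear with a blend factor t; the linear form and all names are assumptions made for illustration, since the description defers to block 410 for the interpolation method.

```python
def combine(iso_results, weights):
    """Block 1210 sketch: weighted sum of the isotropic filter results of one set."""
    if len(iso_results) != len(weights):
        raise ValueError("one weight is needed per sampling point")
    channels = len(iso_results[0])
    return tuple(
        sum(w * result[c] for w, result in zip(weights, iso_results))
        for c in range(channels)
    )

def interpolate(result_a, result_b, t):
    """Block 1212 sketch: linear blend of two combination results (assumed form)."""
    return tuple((1.0 - t) * a + t * b for a, b in zip(result_a, result_b))

# Illustrative RGB results of isotropic filtering at three sampling points.
set_a = combine([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
                (0.25, 0.5, 0.25))
# A second set whose samples all returned the same colour.
set_b = combine([(1.0, 1.0, 1.0)] * 4, (0.1875, 0.3125, 0.3125, 0.1875))
filtered = interpolate(set_a, set_b, 0.5)
```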
[00172] Examples of how the weights for a discrete anisotropic filter for a set of sampling points may be selected in accordance with block 1208 of the method 1200 of FIG. 12 for a different number of samples will now be described. Let the discrete anisotropic filter $A(x)$ with weights $A_n$ be defined by equation (77) where, as described above, $\delta$ is the Dirac delta function, $\psi$ is the offset from the centre of the major axis at which the first sample point in the set is placed and $\sigma$ is the distance between sampling points. It is noted that, in contrast with the earlier description, $A(x)$ is given in units of standard deviations of the isotropic filter, so the spacings given by equations (45) and (37) must be rescaled into these units (for example, the Gaussian spacing described with respect to FIG. 4 becomes $\sigma = 2\sqrt{1 - \eta^{-2}}$), which should be borne in mind when comparing expressions. The set of sample points S comprises N sampling points and can be expressed as shown in equation (78).
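$$A(x) = \sum_{n \in S} A_n\,\delta\left(x - (n + \psi)\sigma\right) \qquad (77)$$
$$S = \mathbb{Z} \cap \left[-\tfrac{N}{2} - \psi,\ \tfrac{N}{2} - \psi\right) \qquad (78)$$
[00173] The normalization, mean, and variance constraints for the discrete anisotropic filter, which now take into account the variance of the isotropic filter such that $\eta^2 - 1$ is used (the variance of the isotropic filter is 1 by definition of the units) in place of $\eta^2$, can be expressed by equations (79), (80) and (81) respectively.
$$\sum_{n \in S} A_n = \int_{-\infty}^{+\infty} dx \sum_{n \in S} \delta\left(x - (n + \psi)\sigma\right) A_n = |A| = 1 \qquad (79)$$
$$\sum_{n \in S} (n + \psi) A_n = \frac{1}{\sigma |A|}\int_{-\infty}^{+\infty} dx \sum_{n \in S} \delta\left(x - (n + \psi)\sigma\right) x A_n = \frac{1}{\sigma}\bar{x}(A) = 0 \qquad (80)$$
$$\sum_{n \in S} (n + \psi)^2 A_n = \frac{1}{\sigma^2 |A|}\int_{-\infty}^{+\infty} dx \sum_{n \in S} \delta\left(x - (n + \psi)\sigma\right)\left(x - \bar{x}(A)\right)^2 A_n = \frac{\eta^2 - 1}{\sigma^2} \qquad (81)$$
[00174] If $\psi = 0$ and N is odd, or if $\psi = \tfrac{1}{2}$ and N is even, then the mean constraint can be satisfied by setting $A_{-n} = A_n$ when N is odd and $A_{-n-1} = A_n$ when N is even. This also ensures that the spectrum of the anisotropic filter is real so the effects of phase do not have to be taken into account.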
[00175] The mean and variance constraints can then be further simplified to equations (82) and (83) respectively.
$$\sum_{n \in S} n A_n = -\psi + \sum_{n \in S} (n + \psi) A_n = -\psi \qquad (82)$$
$$\sum_{n \in S} n^2 A_n = \psi^2 + \sum_{n \in S} (n + \psi)^2 A_n = \psi^2 + \frac{\eta^2 - 1}{\sigma^2} \qquad (83)$$
[00176] As is known to those of skill in the art, each constraint reduces the degrees of freedom of the solutions. As there are three equations and only one unknown (i.e. weight $A_0$) the problem is over-constrained if there is only one sample point (i.e. N = 1) unless $\eta^2 = 1$, and as described in more detail below, the weights are fully constrained for 2, 3 or 4 samples.
[00177] For example, if there are two sampling points (i.e. N = 2) and $0 \le \psi < 1$ then there are two weights ($A_{-1}$ and $A_0$), the normalisation constraint gives $A_{-1} + A_0 = 1$ and the mean constraint gives $A_{-1} = \psi$ and $A_0 = 1 - \psi$. If $\psi = 0$ then there is only one non-zero weight and thus the problem becomes over-constrained. Even if $\psi$ is greater than 0 then it can be seen from the variance constraint that the problem is still over-constrained (more equations than unknowns) unless the spacing between sampling points is as set out in equation (84). Therefore when there are only two sampling points, the spacing between sampling points is the only thing that can be controlled to satisfy the constraints. Accordingly the ideal Gaussian spacing described above with respect to FIG. 4 would not be used in the method 1200 of FIG. 12 for only two samples.
$$\sigma = \sqrt{\frac{\eta^2 - 1}{\psi(1 - \psi)}} \qquad (84)$$
[00178] As described above, the width of the filter is preferably an integer multiple of two standard deviations. When $\psi = \tfrac{1}{2}$ the spacing set out in equation (84) may reach two standard deviations when $\eta = \sqrt{2}$, so two samples may be sufficient when the anisotropic ratio is less than $\sqrt{2}$.
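The two-sample case can be sketched in a few lines. The sketch assumes the reading of the constraints given above, namely weights $A_{-1} = \psi$ and $A_0 = 1 - \psi$ and the spacing of equation (84) as $\sigma = \sqrt{(\eta^2 - 1)/(\psi(1 - \psi))}$; the function name and the error handling are illustrative.

```python
import math

def two_sample_filter(eta, psi):
    """Weights (A_-1, A_0) and the one spacing that satisfies all three constraints."""
    if not 0.0 < psi < 1.0:
        raise ValueError("psi must lie strictly between 0 and 1 for two non-zero weights")
    if eta < 1.0:
        raise ValueError("the anisotropic ratio is at least 1")
    weights = (psi, 1.0 - psi)
    sigma = math.sqrt((eta * eta - 1.0) / (psi * (1.0 - psi)))
    return weights, sigma

# With psi = 1/2 the spacing reaches two standard deviations at eta = sqrt(2).
weights, sigma = two_sample_filter(math.sqrt(2.0), 0.5)

# Check the mean and variance constraints at the sample positions (n + psi) * sigma.
positions = [(-1 + 0.5) * sigma, (0 + 0.5) * sigma]
mean = sum(w * p for w, p in zip(weights, positions))
variance = sum(w * p * p for w, p in zip(weights, positions))
```

Here the variance evaluates to $\eta^2 - 1 = 1$, since the isotropic filter contributes the remaining unit of variance.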
[00179] If there are three sampling points (i.e. N = 3) and $-\tfrac{1}{2} \le \psi < \tfrac{1}{2}$ there will be three weights $A_{-1}$, $A_0$ and $A_1$. The normalisation constraint gives $A_{-1} + A_0 + A_1 = 1$ and the mean constraint gives $A_1 - A_{-1} = -\psi$, which gives the weights as shown at (85).
$$A_{-1} = A_1 + \psi; \quad A_0 = 1 - 2A_1 - \psi \qquad (85)$$
[00180] The variance constraint then gives $A_1 = \tfrac{1}{2}\left(\frac{\eta^2 - 1}{\sigma^2} - \psi(1 - \psi)\right)$. Unlike the two sample case, the weights will satisfy the constraints for all values of $\eta$ and $\sigma$. However, the variance may only be a meaningful indication of the degree of low-pass filtering when $A_n \ge 0$. The requirement that $A_0$ is non-negative gives equation (86), and the requirement that $A_1$ and $A_{-1}$ are non-negative gives the equations at (87).
$$A_0 \ge 0 \Rightarrow 2A_1 \le 1 - \psi \Rightarrow \frac{\eta^2 - 1}{\sigma^2} \le 1 - \psi^2 \qquad (86)$$
$$A_1 \ge 0 \Rightarrow \frac{\eta^2 - 1}{\sigma^2} \ge \psi(1 - \psi); \quad A_{-1} \ge 0 \Rightarrow \frac{\eta^2 - 1}{\sigma^2} \ge -\psi(1 + \psi) \qquad (87)$$
[00181] This then gives equation (88) which sets out, for a given spacing $\sigma$ and offset $\psi$, the values of $\eta$ for which 3 sampling points is sufficient to satisfy the constraints.
$$|\psi|(1 - |\psi|) \le \frac{\eta^2 - 1}{\sigma^2} \le 1 - \psi^2 \qquad (88)$$
[00182] For example, if $\sigma = 2\sqrt{1 - \eta^{-2}}$ expressed in units of one standard deviation (which corresponds to $\sigma = \sqrt{1 - \eta^{-2}}$ expressed in units of two standard deviations) and $\psi = 0$ are substituted into equation (88), then $1 \le \eta \le 2$.
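The three-sample solution can be sketched as below, assuming the closed forms implied by the constraints: $A_1 = \tfrac{1}{2}\left((\eta^2 - 1)/\sigma^2 - \psi(1 - \psi)\right)$, $A_{-1} = A_1 + \psi$, $A_0 = 1 - 2A_1 - \psi$, with the validity range of equation (88). The function name, the tolerance and the returned flag are illustrative additions.

```python
import math

def three_sample_weights(eta, sigma, psi=0.0, tol=1e-12):
    """Return ((A_-1, A_0, A_1), valid) for N = 3; valid means all weights are >= 0."""
    v = (eta * eta - 1.0) / (sigma * sigma)   # required second moment in sample units
    a1 = 0.5 * (v - psi * (1.0 - psi))
    a_minus1 = a1 + psi
    a0 = 1.0 - 2.0 * a1 - psi
    lower = abs(psi) * (1.0 - abs(psi))
    upper = 1.0 - psi * psi
    valid = lower - tol <= v <= upper + tol   # range of equation (88)
    return (a_minus1, a0, a1), valid

def gaussian_spacing(eta):
    """Gaussian spacing in units of one isotropic standard deviation."""
    return 2.0 * math.sqrt(1.0 - eta ** -2)

# Inside the range 1 <= eta <= 2 the weights are non-negative ...
weights_ok, ok = three_sample_weights(1.5, gaussian_spacing(1.5))
# ... outside it the centre weight goes negative and three samples are not enough.
weights_bad, bad = three_sample_weights(2.5, gaussian_spacing(2.5))
```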
[00183] If there are four sampling points (i.e. N = 4) the problem is under-constrained as there are four unknowns (i.e. four filter weights) and only three constraints. However, if $\psi = \tfrac{1}{2}$ and $A_{-n-1} = A_n$ the mean constraint is guaranteed and the weights are then fully constrained. The normalisation constraint then gives $1 = A_{-2} + A_{-1} + A_0 + A_1 = 2(A_0 + A_1)$ which gives the weights shown at (89).
$$A_{-2} = A_1; \quad A_{-1} = A_0; \quad A_0 = \tfrac{1}{2} - A_1 \qquad (89)$$
[00184] The variance constraint gives equation (90) and the constraint that the weights are non-negative gives the equations at (91).
$$\frac{\eta^2 - 1}{\sigma^2} + \frac{1}{4} = 4A_{-2} + A_{-1} + A_1 = 5A_1 + A_0 = 4A_1 + \frac{1}{2} \Rightarrow A_1 = \frac{1}{4}\left(\frac{\eta^2 - 1}{\sigma^2} - \frac{1}{4}\right) \qquad (90)$$
$$A_0 \ge 0 \Rightarrow \frac{\eta^2 - 1}{\sigma^2} \le 2 + \frac{1}{4}; \quad A_1 \ge 0 \Rightarrow \frac{\eta^2 - 1}{\sigma^2} \ge \frac{1}{4} \qquad (91)$$
[00185] If the spacing described above with respect to FIG. 4, which may be referred to as the Gaussian spacing, is used (i.e. $\sigma = 2\sqrt{1 - \eta^{-2}}$) then $1 \le \eta \le 3$. This produces the weights in equation (92) for $\eta = 2$.
$$A_{-2} = \tfrac{3}{16}; \quad A_{-1} = \tfrac{5}{16}; \quad A_0 = \tfrac{5}{16}; \quad A_1 = \tfrac{3}{16} \qquad (92)$$
[00186] However, a different spacing of the sampling points may be used to mimic different filters. For example, if a spacing proportional to the inverse of the standard deviation of a tent filter is used, the weights shown at (93) are produced for $\eta = 2$, which are the weights for a sampled tent filter.
$$A_{-2} = \tfrac{1}{8}; \quad A_{-1} = \tfrac{3}{8}; \quad A_0 = \tfrac{3}{8}; \quad A_1 = \tfrac{1}{8} \qquad (93)$$
[00187] If a spacing proportional to the inverse of the standard deviation for a box filter is used (i.e. $\sigma = \sqrt{12}$) then $2 \le \eta \le 2\sqrt{7}$. This produces the weights shown at (94) for $\eta = 2$ and the weights as shown at (95) for $\eta = 4$, which are the weights for a sampled box filter.
$$A_{-2} = 0; \quad A_{-1} = \tfrac{1}{2}; \quad A_0 = \tfrac{1}{2}; \quad A_1 = 0 \qquad (94)$$
$$A_{-2} = \tfrac{1}{4}; \quad A_{-1} = \tfrac{1}{4}; \quad A_0 = \tfrac{1}{4}; \quad A_1 = \tfrac{1}{4} \qquad (95)$$
[00188] Accordingly, it can be seen that the texture filtering constraints themselves may be sufficient to determine the weights for a small number of sampling points. For more sampling points (i.e. N > 4) the weights can be selected as those that minimize one of the cost functions described above (or, as described in more detail below, analogous discrete cost functions).
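The four-sample weights can be reproduced with a short sketch, assuming $\psi = \tfrac{1}{2}$, the symmetry $A_{-n-1} = A_n$ and the closed form $A_1 = \tfrac{1}{4}\left((\eta^2 - 1)/\sigma^2 - \tfrac{1}{4}\right)$ that follows from the variance constraint. The box spacing $\sigma = \sqrt{12}$ used below is inferred from the stated box-filter weights and should be treated as an assumption; the function name is illustrative.

```python
import math

def four_sample_weights(eta, sigma):
    """Return (A_-2, A_-1, A_0, A_1) for N = 4 with psi = 1/2 and symmetric weights."""
    v = (eta * eta - 1.0) / (sigma * sigma)
    a1 = 0.25 * (v - 0.25)   # from the variance constraint
    a0 = 0.5 - a1            # from the normalisation constraint
    return (a1, a0, a0, a1)

# Gaussian spacing at eta = 2: expected to give 3/16, 5/16, 5/16, 3/16.
gaussian = four_sample_weights(2.0, 2.0 * math.sqrt(1.0 - 2.0 ** -2))

# Assumed box spacing sqrt(12): eta = 2 gives (0, 1/2, 1/2, 0), eta = 4 gives all 1/4.
box_eta2 = four_sample_weights(2.0, math.sqrt(12.0))
box_eta4 = four_sample_weights(4.0, math.sqrt(12.0))

# Sample positions (n + psi) in units of the spacing, used to check the constraints.
positions = (-1.5, -0.5, 0.5, 1.5)
second_moment = sum(w * p * p for w, p in zip(gaussian, positions))
```

For the Gaussian spacing at $\eta = 2$ the second moment in sample units is $(\eta^2 - 1)/\sigma^2 = 1$, as the variance constraint requires.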
[00189] For example, for more sampling points (i.e. N > 4) the anisotropic filter A can be pre-convolved with a Gaussian to define $|\phi(x)|^2$ according to equation (96) where the goal of the minimization is to identify $\phi_n$ and thus $A_n$. The pre-convolution with a Gaussian is performed because it is ultimately the frequency response of the composite filter that is of interest.
$$|\phi(x)|^2 = F(x) = (\Gamma * A)(x) = \frac{1}{\sqrt{2\pi}} \sum_{n \in S} e^{-\frac{1}{2}\left(x - (n + \psi)\sigma\right)^2} |\phi_n|^2 \qquad (96)$$
[00190] For the discrete filter the normalization, mean and variance conditions as set out in equations (79), (80) and (81) can be written for the discrete function as shown in equations (97), (98) and (99).
$$\sum_{n \in S} |\phi_n|^2 = 1 \qquad (97)$$
$$\sum_{n \in S} n\,|\phi_n|^2 = -\psi \qquad (98)$$
$$\sum_{n \in S} n^2\,|\phi_n|^2 = \frac{\eta^2 - 1}{\sigma^2} + \psi^2 \qquad (99)$$
[00191] Then $A_n$ are selected that minimize one of the cost functions set out above (e.g. one of the cost functions set out in equations (65), (70) and (75)).
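For N > 4 the selection can be sketched for the norm cost function. With $F = \Gamma * A$, the integral of $F^2$ is a quadratic form in the weights with kernel $e^{-(\sigma(m - n)/2)^2}$ (up to a constant factor), so minimizing it under the three linear constraints is an equality-constrained quadratic problem. The sketch below solves the corresponding linear (KKT) system in plain Python. It is an illustration under stated assumptions: it works directly on $A_n$ and does not enforce $A_n \ge 0$ (which the substitution $A_n = |\phi_n|^2$ would), the index set follows the reading $S = \mathbb{Z} \cap [-N/2 - \psi, N/2 - \psi)$, and all function names are hypothetical.

```python
import math

def solve_linear(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def norm_cost_weights(n_samples, eta, sigma, psi=0.0):
    """Weights minimizing the quadratic norm cost under the three constraints."""
    start = math.ceil(-n_samples / 2 - psi)
    pos = [start + i + psi for i in range(n_samples)]          # n + psi
    kernel = [[math.exp(-((sigma * (a - b) / 2.0) ** 2)) for b in pos] for a in pos]
    constraints = [[1.0] * n_samples, pos[:], [p * p for p in pos]]
    targets = [1.0, 0.0, (eta * eta - 1.0) / (sigma * sigma)]
    size = n_samples + 3
    kkt = [[0.0] * size for _ in range(size)]
    for i in range(n_samples):
        for j in range(n_samples):
            kkt[i][j] = 2.0 * kernel[i][j]                     # gradient of the cost
        for k in range(3):
            kkt[i][n_samples + k] = constraints[k][i]          # Lagrange multipliers
            kkt[n_samples + k][i] = constraints[k][i]          # constraint rows
    solution = solve_linear(kkt, [0.0] * n_samples + targets)
    return solution[:n_samples]

eta = 3.0
sigma = 2.0 * math.sqrt(1.0 - eta ** -2)   # Gaussian spacing
weights5 = norm_cost_weights(5, eta, sigma)
```

Any negative weight in the result would indicate that the chosen number of samples and spacing are unsuitable for that anisotropic ratio, in which case a solver that enforces non-negativity would be needed.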
[00192] It can be seen that the constraints set out in equations (97), (98) and (99) can be used to simplify the cost functions set out in equations (65), (70) and (75) to equations (100), (101) and (102) respectively, as the constraints can be imposed directly (and hence substituted into the earlier equations).
$$\frac{\int_{-\infty}^{+\infty} dx\,|x\phi(x)|^2 \int_{-\infty}^{+\infty} dx\,|\phi'(x)|^2}{\int_{-\infty}^{+\infty} dx\,|\phi(x)|^2 \int_{-\infty}^{+\infty} dx\,|\phi(x)|^2} = \eta^2 \int_{-\infty}^{+\infty} dx\,|\phi'(x)|^2 \qquad (100)$$
$$\int_{-\infty}^{+\infty} dx\,|\phi(x)|^4 \qquad (101)$$
$$\int_{-\infty}^{+\infty} dx\,|\phi'(x)\phi(x)|^2 \qquad (102)$$
[00193] In some cases, the cost functions may be further simplified using general algebraic and/or numerical techniques. For example $|\phi(x)|^4$ may be expressed by equation (103) which allows the cost function set out in equation (101) to be re-written as equation (104).
$$|\phi(x)|^4 = \frac{1}{2\pi}\left[\sum_{n \in S} e^{-\frac{1}{2}\left(x - (n + \psi)\sigma\right)^2} |\phi_n|^2\right]^2 \qquad (103)$$
$$\int_{-\infty}^{+\infty} dx\,|\phi(x)|^4 = \frac{1}{2\sqrt{\pi}} \sum_{m,n \in S} e^{-\left(\frac{\sigma(m - n)}{2}\right)^2} |\phi_m \phi_n|^2 \qquad (104)$$
[00194] In some cases, instead of selecting weights that minimize one of the continuous cost functions described above, an equivalent discrete cost function may be minimized. Example discrete versions of the continuous Gaussian, norm and spectral cost functions described above are shown in equations (105), (106) and (107) respectively, where $\phi_n = 0$ for $n \notin S$. The discrete versions of the cost function may be easier to work with and/or solve. The discrete versions may be more suitable for on-the-fly calculations of the weights. However, minimizing a discrete version of a continuous cost function may still be fairly time and/or resource intensive.
$$\sum_{n \in S} |\phi_{n+1} - \phi_n|^2 + \sum_{n \in S} |\phi_n - \phi_{n-1}|^2 \qquad (105)$$
$$\sum_{n \in S} |\phi_n|^4 \qquad (106)$$
$$\sum_{n \in S} |(\phi_{n+1} - \phi_n)\phi_n|^2 + \sum_{n \in S} |(\phi_n - \phi_{n-1})\phi_n|^2 \qquad (107)$$
Texture Filtering Unit and Graphics Processing System
[00195] Reference is now made to FIG. 15 which illustrates an example graphics processing system 1500. The graphics processing system 1500 comprises a graphics processing unit (GPU) 1502 and a memory 1504. The GPU comprises a texture filtering unit 1506. The texture filtering unit 1506 is configured to perform anisotropic texture filtering in accordance with any of the methods described herein. The texture filtering unit 1506 may be configured to receive texture co-ordinates from, for example, a shader program, and perform anisotropic filtering based on the received texture co-ordinates. In general, the texture filtering unit 1506 may be implemented in hardware (e.g. fixed function circuitry), software, or a combination thereof. It may be preferable for it to be implemented in hardware because this tends to provide a lower latency texture filtering operation. The cost is inflexibility of operation, but since the desired operation of the texture filtering unit 1506 is known at design time, its operation does not need to be flexible. The memory 1504 comprises a portion of memory 1508 for storing a texture (e.g. storing a mipmap representing the texture) and a portion of memory 1510 (which may be referred to as a frame buffer) for storing image data (e.g. rendered pixel values) which are output from the GPU 1502.
[00196] FIG. 16 shows a computer system in which the texture filtering units and/or graphics processing systems described herein may be implemented. The computer system comprises a CPU 1602, a GPU 1604, a memory 1606 and other devices 1614, such as a display 1616, speakers 1618 and a camera 1620. A processing block 1610 (corresponding to the texture filtering unit 1506 of FIG. 15) is implemented on the GPU 1604 as well as a Neural Network Accelerator (NNA) 1611. In other examples, the processing block 1610 may be implemented on the CPU 1602 or within the NNA 1611. The components of the computer system can communicate with each other via a communications bus 1622.
[00197] While FIG. 16 illustrates one implementation of a graphics processing system, it will be understood that a similar block diagram could be drawn for an artificial intelligence accelerator system -for example, by replacing either the CPU 1602 or the GPU 1604 with a Neural Network Accelerator (NNA) 1611, or by adding the NNA as a separate unit. In such cases, again, the processing block 1610 can be implemented in the NNA.
[00198] The texture filtering unit and graphics processing system of FIG. 15 are shown as comprising a number of functional blocks. This is schematic only and is not intended to define a strict division between different logic elements of such entities. Each functional block may be provided in any suitable manner. It is to be understood that intermediate values described herein as being formed by a unit or system need not be physically generated by the unit or system at any point and may merely represent logical values which conveniently describe the processing performed by the unit or system between its input and output.
[00199] The texture filtering units and/or the graphics processing systems described herein may be embodied in hardware on an integrated circuit. The texture filtering units and/or the graphics processing system described herein may be configured to perform any of the methods described herein. Generally, any of the functions, methods, techniques or components described above can be implemented in software, firmware, hardware (e.g., fixed logic circuitry), or any combination thereof. The terms "module," "functionality," "component", "element", "unit", "block" and "logic" may be used herein to generally represent software, firmware, hardware, or any combination thereof. In the case of a software implementation, the module, functionality, component, element, unit, block or logic represents program code that performs the specified tasks when executed on a processor. The algorithms and methods described herein could be performed by one or more processors executing code that causes the processor(s) to perform the algorithms/methods. Examples of a computer-readable storage medium include a random-access memory (RAM), read-only memory (ROM), an optical disc, flash memory, hard disk memory, and other memory devices that may use magnetic, optical, and other techniques to store instructions or other data and that can be accessed by a machine.
[00200] The terms computer program code and computer readable instructions as used herein refer to any kind of executable code for processors, including code expressed in a machine language, an interpreted language or a scripting language. Executable code includes binary code, machine code, bytecode, code defining an integrated circuit (such as a hardware description language or netlist), and code expressed in a programming language such as C, Java or OpenCL. Executable code may be, for example, any kind of software, firmware, script, module or library which, when suitably executed, processed, interpreted, compiled, or executed at a virtual machine or other software environment, causes a processor of the computer system at which the executable code is supported to perform the tasks specified by the code.
[00201] A processor, computer, or computer system may be any kind of device, machine or dedicated circuit, or collection or portion thereof, with processing capability such that it can execute instructions. A processor may be any kind of general purpose or dedicated processor, such as a CPU, GPU, NNA, System-on-chip, state machine, media processor, an application-specific integrated circuit (ASIC), a programmable logic array, a field-programmable gate array (FPGA), or the like. A computer or computer system may comprise one or more processors.
[00202] It is also intended to encompass software which defines a configuration of hardware as described herein, such as HDL (hardware description language) software, as is used for designing integrated circuits, or for configuring programmable chips, to carry out desired functions. That is, there may be provided a computer readable storage medium having encoded thereon computer readable program code in the form of an integrated circuit definition dataset that when processed (i.e. run) in an integrated circuit manufacturing system configures the system to manufacture a texture filtering unit and/or a graphics processing system configured to perform any of the methods described herein, or to manufacture a texture filtering unit and/or a graphics processing system comprising any apparatus described herein. An integrated circuit definition dataset may be, for example, an integrated circuit description.
[00203] Therefore, there may be provided a method of manufacturing, at an integrated circuit manufacturing system, a texture filtering unit and/or a graphics processing system as described herein. Furthermore, there may be provided an integrated circuit definition dataset that, when processed in an integrated circuit manufacturing system, causes the method of manufacturing a texture filtering unit and/or a graphics processing system to be performed.
[00204] An integrated circuit definition dataset may be in the form of computer code, for example as a netlist, code for configuring a programmable chip, as a hardware description language defining hardware suitable for manufacture in an integrated circuit at any level, including as register transfer level (RTL) code, as high-level circuit representations such as Verilog or VHDL, and as low-level circuit representations such as OASIS (RTM) and GDSII. Higher level representations which logically define hardware suitable for manufacture in an integrated circuit (such as RTL) may be processed at a computer system configured for generating a manufacturing definition of an integrated circuit in the context of a software environment comprising definitions of circuit elements and rules for combining those elements in order to generate the manufacturing definition of an integrated circuit so defined by the representation. As is typically the case with software executing at a computer system so as to define a machine, one or more intermediate user steps (e.g. providing commands, variables etc.) may be required in order for a computer system configured for generating a manufacturing definition of an integrated circuit to execute code defining an integrated circuit so as to generate the manufacturing definition of that integrated circuit.
[00205] An example of processing an integrated circuit definition dataset at an integrated circuit manufacturing system so as to configure the system to manufacture a texture filtering unit or a graphics processing system as described herein will now be described with respect to FIG. 17.
[00206] FIG. 17 shows an example of an integrated circuit (IC) manufacturing system 1702 which is configured to manufacture a texture filtering unit and/or a graphics processing system as described in any of the examples herein. In particular, the IC manufacturing system 1702 comprises a layout processing system 1704 and an integrated circuit generation system 1706. The IC manufacturing system 1702 is configured to receive an IC definition dataset (e.g. defining a texture filtering unit or a graphics processing system as described in any of the examples herein), process the IC definition dataset, and generate an IC according to the IC definition dataset (e.g. which embodies a texture filtering unit, or a graphics processing system as described in any of the examples herein). The processing of the IC definition dataset configures the IC manufacturing system 1702 to manufacture an integrated circuit embodying a texture filtering unit or a graphics processing system as described in any of the examples herein.
[00207] The layout processing system 1704 is configured to receive and process the IC definition dataset to determine a circuit layout. Methods of determining a circuit layout from an IC definition dataset are known in the art, and for example may involve synthesising RTL code to determine a gate level representation of a circuit to be generated, e.g. in terms of logical components (e.g. NAND, NOR, AND, OR, MUX and FLIP-FLOP components). A circuit layout can be determined from the gate level representation of the circuit by determining positional information for the logical components. This may be done automatically or with user involvement in order to optimise the circuit layout. When the layout processing system 1704 has determined the circuit layout it may output a circuit layout definition to the IC generation system 1706. A circuit layout definition may be, for example, a circuit layout description.
[00208] The IC generation system 1706 generates an IC according to the circuit layout definition, as is known in the art. For example, the IC generation system 1706 may implement a semiconductor device fabrication process to generate the IC, which may involve a multiple-step sequence of photo lithographic and chemical processing steps during which electronic circuits are gradually created on a wafer made of semiconducting material. The circuit layout definition may be in the form of a mask which can be used in a lithographic process for generating an IC according to the circuit definition. Alternatively, the circuit layout definition provided to the IC generation system 1706 may be in the form of computer-readable code which the IC generation system 1706 can use to form a suitable mask for use in generating an IC.
[00209] The different processes performed by the IC manufacturing system 1702 may be implemented all in one location, e.g. by one party. Alternatively, the IC manufacturing system 1702 may be a distributed system such that some of the processes may be performed at different locations, and may be performed by different parties. For example, some of the stages of: (i) synthesising RTL code representing the IC definition dataset to form a gate level representation of a circuit to be generated, (ii) generating a circuit layout based on the gate level representation, (iii) forming a mask in accordance with the circuit layout, and (iv) fabricating an integrated circuit using the mask, may be performed in different locations and/or by different parties.
[00210] In other examples, processing of the integrated circuit definition dataset at an integrated circuit manufacturing system may configure the system to manufacture a texture filtering unit or a graphics processing system without the IC definition dataset being processed so as to determine a circuit layout. For instance, an integrated circuit definition dataset may define the configuration of a reconfigurable processor, such as an FPGA, and the processing of that dataset may configure an IC manufacturing system to generate a reconfigurable processor having that defined configuration (e.g. by loading configuration data to the FPGA).
[00211] In some embodiments, an integrated circuit manufacturing definition dataset, when processed in an integrated circuit manufacturing system, may cause an integrated circuit manufacturing system to generate a device as described herein. For example, the configuration of an integrated circuit manufacturing system in the manner described above with respect to FIG. 17 by an integrated circuit manufacturing definition dataset may cause a device as described herein to be manufactured.
[00212] In some examples, an integrated circuit definition dataset could include software which runs on hardware defined at the dataset or in combination with hardware defined at the dataset. In the example shown in FIG. 17, the IC generation system may further be configured by an integrated circuit definition dataset to, on manufacturing an integrated circuit, load firmware onto that integrated circuit in accordance with program code defined at the integrated circuit definition dataset or otherwise provide program code with the integrated circuit for use with the integrated circuit.
[00213] The implementation of concepts set forth in this application in devices, apparatus, modules, and/or systems (as well as in methods implemented herein) may give rise to performance improvements when compared with known implementations. The performance improvements may include one or more of increased computational performance, reduced latency, increased throughput, and/or reduced power consumption. During manufacture of such devices, apparatus, modules, and systems (e.g. in integrated circuits) performance improvements can be traded-off against the physical implementation, thereby improving the method of manufacture. For example, a performance improvement may be traded against layout area, thereby matching the performance of a known implementation but using less silicon. This may be done, for example, by reusing functional blocks in a serialised fashion or sharing functional blocks between elements of the devices, apparatus, modules and/or systems. Conversely, concepts set forth in this application that give rise to improvements in the physical implementation of the devices, apparatus, modules, and systems (such as reduced silicon area) may be traded for improved performance. This may be done, for example, by manufacturing multiple instances of a module within a predefined area budget.
[00214] The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in the light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention.
Claims (25)
- 1. A method (1200) of performing anisotropic texture filtering, the method (1200) comprising: generating one or more parameters describing an elliptical footprint in texture space (1202); performing isotropic filtering at each sampling point of a set of sampling points in an ellipse to be sampled to produce a plurality of isotropic filter results, the ellipse to be sampled based on the elliptical footprint (1204, 1206); selecting, based on one or more parameters of the set of sampling points and one or more parameters of the ellipse to be sampled, weights of an anisotropic filter that minimize a cost function that penalises high frequencies in the filter response of the anisotropic filter under a constraint that the variance of the anisotropic filter is related to an anisotropic ratio squared, the anisotropic ratio being the ratio of a major radius of the ellipse to be sampled and a minor axis of the ellipse to be sampled (1208); and combining the plurality of isotropic filter results using the selected weights of the anisotropic filter to generate at least a portion of a filter result (1210).
- 2. The method (1200) of claim 1, wherein the anisotropic filter is representable as the absolute value squared of a first function, and the cost function is a product of the modulus squared spatial and spectral variances of the first function.
- 3. The method (1200) of claim 2, wherein the cost function is $\eta^{2}\int_{-\infty}^{+\infty} dx\,|\phi'(x)|^{2}$, where $\eta$ is the anisotropic ratio, $\phi(x)$ is the first function, $\phi'(x)$ is the derivative of $\phi(x)$ with respect to $x$, and $x$ represents a position in texture space with respect to a midpoint of the major axis of the elliptical footprint.
- 4. The method (1200) of claim 2, wherein the cost function is $\sum_{n \in S} |\phi_{n+1} - \phi_{n}|^{2} + |\phi_{n} - \phi_{n-1}|^{2}$, $S$ is the set of sampling points, $\phi$ is the first function and $\phi_{n} = 0$ for all $n \notin S$.
- 5. The method (1200) of claim 1, wherein the cost function is a function representing the Euclidean norm of the anisotropic filter.
- 6. The method (1200) of claim 5, wherein the anisotropic filter is representable as the absolute value squared of a first function $\phi(x)$, the cost function is $\int_{-\infty}^{+\infty} dx\,|\phi(x)|^{4}$, and $x$ represents a position in texture space with respect to a midpoint of the major axis of the elliptical footprint.
- 7. The method (1200) of claim 5, wherein the anisotropic filter is representable as the absolute value squared of a first function $\phi$, and the cost function is $\sum_{n \in S} |\phi_{n}|^{4}$, wherein $S$ is the set of sampling points and $\phi_{n} = 0$ for all $n \notin S$.
- 8. The method (1200) of claim 1, wherein the cost function is a function representing a spectral spread of the anisotropic filter.
- 9. The method (1200) of claim 8, wherein the anisotropic filter is representable as the absolute value squared of a first function $\phi(x)$, the cost function is $\int_{-\infty}^{+\infty} dx\,|\phi'(x)\,\phi(x)|^{2}$, $\phi'(x)$ is the derivative of $\phi(x)$ with respect to $x$, and $x$ represents a position in texture space with respect to a midpoint of the major axis of the elliptical footprint.
- 10. The method (1200) of claim 8, wherein the anisotropic filter is representable as the absolute value squared of a first function $\phi$, the cost function is $\sum_{n \in S} |(\phi_{n+1} - \phi_{n})\,\phi_{n}|^{2} + |(\phi_{n} - \phi_{n-1})\,\phi_{n}|^{2}$, $S$ is the set of sampling points and $\phi_{n} = 0$ for all $n \notin S$.
- 11. The method (1200) of any preceding claim, wherein the sampling points of the set of sampling points lie along a major axis of the elliptical footprint.
- 12. The method (1200) of any preceding claim, wherein the one or more parameters of the ellipse to be sampled comprises the anisotropic ratio of the ellipse to be sampled.
- 13. The method (1200) of any preceding claim, wherein the one or more parameters of the set of sampling points comprises a number of sampling points in the set of sampling points, an offset of a first sampling point from a middle point of the major radius of the elliptical footprint, and a spacing between adjacent sampling points in the set of sampling points in the texture space.
- 14. The method (1200) of any preceding claim, wherein selecting the weights of the anisotropic filter that minimize a cost function that penalises high frequencies in the filter response of the anisotropic filter under a constraint that the variance of the anisotropic filter is related to an anisotropic ratio squared comprises selecting a set of weights from a lookup table based on the one or more parameters of the set of sampling points and the one or more parameters of the ellipse to be sampled, the lookup table comprising weights that minimize the cost function for a plurality of values for the parameters.
- 15. The method (1200) of any preceding claim, wherein a number of sampling points in the set of sampling points is greater than two, and a spacing between adjacent sampling points in the set of sampling points is proportional to $\sqrt{1 - \eta^{-2}}$ units, wherein $\eta$ is the anisotropic ratio and a unit corresponds to the minor radius of the ellipse to be sampled.
- 16. The method (1200) of any preceding claim, wherein the weights of the anisotropic filter are further selected under a constraint that the anisotropic filter has a mean of zero and/or the anisotropic filter is normalised to one.
- 17. The method (1200) of any preceding claim, wherein the set of sampling points comprises N sampling points and N is proportional to the anisotropic ratio.
- 18. A method of generating an image, the method comprising performing the method of any preceding claim, and generating an image based on the at least a portion of the filter result.
- 19. A texture filtering unit for use in a graphics processing system, the texture filtering unit configured to perform the method of any of claims 1 to 17.
- 20. The texture filtering unit of claim 19, wherein the texture filtering unit is embodied in hardware on an integrated circuit.
- 21. A graphics processing system comprising the texture filtering unit of claim 19 or claim 20.
- 22. Computer readable code configured to cause the method of any of claims 1 to 18 to be performed when the code is run.
- 23. A computer readable storage medium having encoded thereon the computer readable code of claim 22.
- 24. An integrated circuit definition dataset that, when processed in an integrated circuit manufacturing system, configures the integrated circuit manufacturing system to manufacture the texture filtering unit of claim 19 or claim 20.
- 25. A computer readable storage medium having stored thereon a computer readable description of the texture filtering unit of claim 19 or claim 20 or the graphics processing system of claim 21 that, when processed in an integrated circuit manufacturing system, causes the integrated circuit manufacturing system to manufacture an integrated circuit embodying the texture filtering unit or the graphics processing system.
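By way of illustration only, the weight selection and combining steps of claim 1 may be sketched as follows for the Euclidean-norm cost of claims 5 and 7, where the anisotropic filter weights are w_n = |φ_n|² and the cost is the sum of w_n². This is a minimal sketch under stated assumptions, not the claimed implementation: the function names (`sample_positions`, `solve_linear`, `min_norm_weights`, `combine`) are hypothetical, the sampling points are assumed to be evenly spaced and centred on the midpoint of the major axis, the target variance is supplied by the caller because the claims state only that it is related to the anisotropic ratio squared, and the non-negativity implied by w_n = |φ_n|² is not enforced.

```python
def sample_positions(num_points, spacing):
    """Evenly spaced positions along the major axis, centred on its midpoint.

    Assumption: a symmetric layout; the claims also allow an arbitrary offset.
    """
    first = -0.5 * (num_points - 1) * spacing
    return [first + i * spacing for i in range(num_points)]


def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: need at least three distinct positions")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def min_norm_weights(positions, target_variance):
    """Weights minimising sum(w_n^2) subject to three equality constraints:
    sum(w) = 1 (normalised), sum(w * x) = 0 (zero mean),
    sum(w * x^2) = target_variance.

    The minimiser is a linear combination of the constraint rows, with
    multipliers found from the 3x3 Gram system. Weights may come out
    negative because non-negativity is not enforced here.
    """
    rows = [[1.0] * len(positions), list(positions), [x * x for x in positions]]
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]
    multipliers = solve_linear(gram, [1.0, 0.0, target_variance])
    return [sum(m * row[i] for m, row in zip(multipliers, rows))
            for i in range(len(positions))]


def combine(isotropic_results, weights):
    """Weighted sum of the per-point isotropic filter results."""
    return sum(w * r for w, r in zip(weights, isotropic_results))


# Usage with illustrative values only: five points one unit apart and an
# arbitrarily chosen target variance of 1.0.
positions = sample_positions(5, 1.0)
weights = min_norm_weights(positions, target_variance=1.0)
filtered = combine([0.2, 0.4, 0.6, 0.4, 0.2], weights)
```

In a hardware texture filtering unit, weights of this kind would more plausibly be precomputed for a range of parameter values and stored in a lookup table, as contemplated in claim 14, rather than solved for each fragment.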
Priority Applications (12)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB2110744.6A GB2609246A (en) | 2021-07-26 | 2021-07-26 | Anisotropic texture filtering |
CN202210884582.9A CN115701870A (en) | 2021-07-26 | 2022-07-25 | Anisotropic texture filtering |
CN202210877264.XA CN115690296A (en) | 2021-07-26 | 2022-07-25 | Anisotropic texture filtering |
CN202210879140.5A CN115690297A (en) | 2021-07-26 | 2022-07-25 | Anisotropic texture filtering |
EP22186842.5A EP4125042A1 (en) | 2021-07-26 | 2022-07-26 | Anisotropic texture filtering using weights of an anisotropic filter |
US17/873,425 US12020366B2 (en) | 2021-07-26 | 2022-07-26 | Anisotropic texture filtering by combining isotropic filtering results at each of a plurality of sampling points |
US17/874,019 US11961175B2 (en) | 2021-07-26 | 2022-07-26 | Anisotropic texture filtering by combining results of isotropic filtering at a plurality of sampling points with a gaussian filter |
EP22186844.1A EP4125043A1 (en) | 2021-07-26 | 2022-07-26 | Anisotropic texture filtering with specific spacing of samples |
EP22186840.9A EP4125041A1 (en) | 2021-07-26 | 2022-07-26 | Anisotropic texture filtering by means of isotropic filtering at a plurality of sampling points |
US17/873,894 US12026819B2 (en) | 2021-07-26 | 2022-07-26 | Anisotropic texture filtering using weights of an anisotropic filter that minimize a cost function |
US18/635,206 US20240303903A1 (en) | 2021-07-26 | 2024-04-15 | Anisotropic texture filtering by combining results of isotropic filtering at a plurality of sampling points with a gaussian filter |
US18/670,442 US20240312114A1 (en) | 2021-07-26 | 2024-05-21 | Anisotropic texture filtering using weights of an anisotropic filter that minimize a cost function |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB2110744.6A GB2609246A (en) | 2021-07-26 | 2021-07-26 | Anisotropic texture filtering |
Publications (2)
Publication Number | Publication Date |
---|---|
GB202110744D0 (en) | 2021-09-08 |
GB2609246A (en) | 2023-02-01 |
Family
ID=77541131
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
GB2110744.6A | Pending | GB2609246A (en) | 2021-07-26 | 2021-07-26 | Anisotropic texture filtering |
Country Status (1)
Country | Link |
---|---|
GB (1) | GB2609246A (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2583154A (en) | 2019-10-17 | 2020-10-21 | Imagination Tech Ltd | Texture filtering |
Non-Patent Citations (2)
Title |
---|
JOEL MCCORMACK ET AL: "Simple and Table Feline: Fast Elliptical Lines for Anisotropic Texture Mapping", INTERNET CITATION, 1 October 1999 (1999-10-01), pages 16pp, XP007913416, Retrieved from the Internet <URL:http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-99-1.pdf> [retrieved on 20100610] * |
MANSON JOSIAH ET AL: "Bilinear Accelerated Filter Approximation", COMPUTER GRAPHICS FORUM : JOURNAL OF THE EUROPEAN ASSOCIATION FOR COMPUTER GRAPHICS, vol. 33, no. 4, 1 July 2014 (2014-07-01), Oxford, pages 33 - 40, XP055830795, ISSN: 0167-7055, DOI: 10.1111/cgf.12410 * |
Also Published As
Publication number | Publication date |
---|---|
GB202110744D0 (en) | 2021-09-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 732E | Amendments to the register in respect of changes of name or changes affecting rights (sect. 32/1977) | Free format text: REGISTERED BETWEEN 20240822 AND 20240828 |