WO2022106016A1 - High-order texture filtering - Google Patents

High-order texture filtering

Info

Publication number
WO2022106016A1
Authority
WO
WIPO (PCT)
Prior art keywords
texture
texels
order
filtering
interpolation
Prior art date
Application number
PCT/EP2020/082790
Other languages
French (fr)
Inventor
Baoquan Liu
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. filed Critical Huawei Technologies Co., Ltd.
Priority to PCT/EP2020/082790 priority Critical patent/WO2022106016A1/en
Priority to CN202080102689.2A priority patent/CN115917606A/en
Publication of WO2022106016A1 publication Critical patent/WO2022106016A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/04Texture mapping

Definitions

  • a graphics processing device for generating an image signal representing an image having a plurality of pixels, wherein the graphics processing device is configured to generate the image signal by performing the following operations for each of the plurality of pixels: determine a plurality of texels corresponding to the respective pixel, each texel comprising a sub-pixel data point indicating a texture sample value in a texture space; and apply a texture filtering function to the sub-pixel data points of the plurality of texels in dependence on a selected sampling location x, y to form a filtered value, where x and y are the fractional positions of a texture coordinate; wherein applying the texture filtering function comprises taking the difference between (i) an interpolation among the texels for the location x, y performed by dedicated hardware logic and at least (ii) a weighted sum of second-order derivative approximations among the texels in two orthogonal directions.
  • the device may allow for rendering of a high-quality image with texture at higher performance with fewer texture fetches and weighted filtering calculations than conventional methods. This may allow the application of a filtering function having at least third-order approximation accuracy.
  • Applying the texture filtering function may comprise taking the difference between (i) an interpolation among the texels for the location x, y and (ii) a weighted sum of second-order derivative approximations among the texels in the two orthogonal directions and a weighted sum of third-order derivative approximations among the texels in the two orthogonal directions. This may allow the application of a filtering function having fourth-order approximation accuracy.
  • the second-order and/or third-order derivative approximations among the texels in the two orthogonal directions may be weighted in dependence on the fractional position of the location x, y between two consecutive integer locations.
  • the derivative approximations may be weighted in dependence on parameters α and β, as described herein.
  • the second-order and/or third-order derivatives may be calculated using the dedicated hardware logic.
  • the second-order and/or third-order derivatives may be calculated in dependence on the interpolation among the texels.
  • the interpolation may be determined using a bilinear interpolation function.
  • the bilinear interpolation function f_bilin(x,y) is supported by GPU hardware, and so may be easily determined.
  • the device may comprise a texture cache configured to store the bilinear interpolation function at the location x, y. This may further reduce the processing required.
  • the two orthogonal directions may be directions in a texture space. This may allow rendering to a texture.
  • the filtering function may be determined according to a third-order or fourth-order approximation. This may produce high-quality images.
  • the device may be configured to implement the texture filtering function in one of the GLSL, HLSL and Spir-V languages.
  • the device is therefore compatible with filtering algorithms and languages used in many modern image processing systems and video games.
  • the device may be configured to implement the texture filtering function in a single instruction, or fixed function hardware unit. This may reduce the processing cost required to render the image.
  • For each pixel, the device may be configured to perform fewer than sixteen texture fetches. Thus, the device may perform fewer texture fetches than for traditional bicubic rendering.
  • the device may be configured to perform five texture fetches or nine texture fetches.
  • Five texture fetches may allow for third-order approximation accuracy (bicubic filtering) and nine texture fetches may allow for fourth-order approximation accuracy.
  • Fewer texture fetches for the GPU can result in longer battery life, reduced latency and improved frame-rate for complex and demanding game rendering.
  • At least some of the pixels may represent a shadow of an object in said image and the texture filtering function may be a shadow filtering function.
  • the filtered value may be a filtered shadow value. This may allow for faster shadow filtering in 2D image applications to produce a smooth image quality result for soft shadows.
  • the device may be implemented by a mobile graphics processing unit. This may allow a mobile device to efficiently perform texture rendering.
  • a method for generating an image signal representing an image having a plurality of pixels comprises, for each of the plurality of pixels: determining a plurality of texels corresponding to the respective pixel, each texel comprising a sub-pixel data point indicating a texture sample value in a texture space; and applying a texture filtering function to the sub-pixel data points in dependence on a selected sampling location x, y to form a filtered value, where x and y are the fractional positions of a texture coordinate; wherein applying the texture filtering function comprises taking the difference between (i) an interpolation among the texels for the location x, y performed by dedicated hardware logic and (ii) a weighted sum of second-order derivative approximations among the texels in two orthogonal directions.
  • the method may allow for rendering of a high-quality image with texture at higher performance with fewer texture fetches and weighted filtering calculations than conventional methods. This may allow the application of a filtering function having at least third-order approximation accuracy.
  • a computer program which, when executed by a computer, causes the computer to perform the method described above.
  • the computer program may be provided on a non-transitory computer readable storage medium.
  • Figure 1 illustrates the rendering result of nearest-neighbor filtering for a shadow texture.
  • Figure 2a shows the rendering result when 2 x 2 bilinear filtering is used to produce a soft shadow.
  • Figure 2b illustrates sampling a 2 x 2 grid of texels surrounding the target UV coordinate.
  • Figure 3 shows a typical cubic filter kernel function.
  • Figure 4 illustrates the weighted sum based filtering algorithm for cubic interpolation.
  • Figures 5(a)-5(c) illustrate the rendering result of the traditional filtering methods.
  • Figure 5(a) shows the result for nearest-neighbor
  • Figure 5(b) for bilinear filtering
  • Figure 5(c) for bicubic filtering.
  • Figures 6(a)-6(b) schematically illustrate 1D linear and cubic filtering at position x by calculating the weighted sum of the neighbouring texel-values (at integer locations) where the weights are polynomials based on the sub-texel offsets.
  • Figure 7 illustrates an example of a method for generating an image signal representing an image having a plurality of pixels.
  • Figure 8 shows an example of a device configured to implement the method described herein.
  • the graphics processing device may be a graphics processor, such as a mobile GPU.
  • the graphics processing device is configured to generate the image signal by performing the operations described herein for each of a plurality of pixels of the image.
  • the plurality of pixels may comprise a subset of all pixels of the image, or the method may be performed for every pixel in the image.
  • the image signal may be used, for example, for rendering high-quality soft shadows or other textures.
  • rendering refers to any form of generating a visible image, for example displaying the image on a computer screen, printing, or projecting.
  • the device and method can implement high-order texture filtering algorithms (for example, third-order and fourth-order) on a GPU to produce a high-quality rendered image, while requiring a much lower computation cost and texture-memory bandwidth than other prior art implementations.
  • the solution described herein uses fewer weighted sum calculations on the GPU than conventional methods.
  • the solution involves fewer texture sampling instructions and fewer ALU calculations than previous methods. This may allow for faster high-order texture filtering with third-order and fourth-order approximation accuracy.
  • the device applies a texture filtering operation at a sampling location x, y in a 2D image (where x and y are the fractional positions of a texture coordinate) comprising taking the difference between (i) an interpolation among the texels for the location x, y performed by dedicated hardware logic and at least (ii) a weighted sum of second-order derivative approximations among the texels in two orthogonal directions.
  • High-order texture filtering algorithms usually need to perform texture sampling in an N x N neighboring area of texture data in DDR memory to get N x N texel data samples at integer texel locations, and then perform a filtering operation on these N x N data samples to get a high-quality filtered value to shade the final pixel in the pixel shader.
  • In 1D, the cubic filtering algorithm can be expressed as the following weighted sum equation:

    f(x) = \sum_{i=-1}^{2} w_i(x) \, f_i

    where f_i denotes the indexed neighboring texel values at four taps of integer sampling locations, which are multiplied by the corresponding cubic polynomial weights w_i(x) from the convolution kernel.
  • the weighted sum is the final result of the filtering.
  • the third-order filtering equation used by the device and method described herein will first be discussed.
  • the third-order filtering method is derived first in 1D, and then extended to 2D.
  • the directions x and y are directions in a texture space.
  • the values f_i represent the samples of the original function f(t) at the integer location i.
  • the continuous function f(t) is not known except for the texel values f_i at the discrete integer grid locations.
  • f_i and f_{i+1} are the two discrete function values at integer grid locations i and i+1.
  • the 1D linear and cubic filtering function at position x is therefore determined by calculating the weighted sum of the neighbouring texel values (at integer locations), where the weights are polynomials based on the sub-texel offsets.
  • the third-order cubic interpolation/filtering function may be expressed as:

    f_{cubic}(x) = f_{lin}(x) - \frac{x(1-x)}{2} \left[ (1-x)\, f''(i) + x\, f''(i+1) \right]

    where f_{lin}(x) = (1-x) f_i + x f_{i+1} is the linear interpolation between the two enclosing texels, and the second-order central difference (see https://en.wikipedia.org/wiki/Finite_difference) can be used to calculate f''(x) as below:

    f''(i) \approx f_{i-1} - 2 f_i + f_{i+1}
  • f_bilin(x, y) is the hardware bilinear interpolation function at an arbitrary 2D sampling point (x, y), which is already supported by GPU hardware.
  • this fourth-order convolution kernel is composed of piecewise polynomials defined on the unit subintervals between [-3, 3], which means that it needs to fetch six taps of texels to perform the weighted-sum calculation of all six taps of texels in 1D.
  • This contrasts with the third-order approximation kernel defined on the unit subintervals between [-2, 2], which uses only four taps of texels, as in the third-order function approximation described above.
  • the fourth-order filtering algorithm can be expressed as the following weighted sum equation:

    f(x) = \sum_{i=-2}^{3} w_i(x) \, f_i

    where f_i denotes the indexed neighboring texel values at six taps of integer sampling locations, which are multiplied by the corresponding polynomial weights w_i(x) from the convolution kernel. The weighted sum is the final result of the filtering.
  • The derivation below introduces a simplified equation for texture filtering with fourth-order approximation accuracy, which is much cheaper than the original fourth-order equation introduced by Keys (i.e., Equation (16)).
  • f_bilin(x,y) is the hardware bilinear interpolation function at an arbitrary 2D sampling point (x,y), which is supported by GPU hardware.
  • Figure 7 summarises a method for generating an image signal representing an image having a plurality of pixels.
  • the method comprises the following steps for each of at least some of the plurality of pixels.
  • the method comprises determining a plurality of texels corresponding to the respective pixel, each texel comprising a sub-pixel data point indicating a texture sample value in a texture space.
  • the method comprises applying a texture filtering function to the sub-pixel data points of the plurality of texels in dependence on a selected sampling location x, y to form a filtered value, where x and y are the fractional positions of a texture coordinate; wherein applying the texture filtering function comprises taking the difference between (i) an interpolation among the texels for the location x, y performed by dedicated hardware logic and at least (ii) a weighted sum of second-order derivative approximations among the texels in two orthogonal directions.
  • applying the texture filtering function comprises taking the difference between (i) an interpolation among the texels for the location x, y and (ii) a weighted sum of second-order derivative approximations among the texels in the two orthogonal directions and a weighted sum of third-order derivative approximations among the texels in the two orthogonal directions.
  • the second-order and/or third-order derivatives are calculated using the dedicated hardware logic.
  • the second-order and/or third-order derivatives are calculated in dependence on the interpolation among the texels, which is determined using a bilinear interpolation function f bilin (x,y).
  • f bilin (x,y) is supported by GPU hardware, and so may be easily determined.
  • the device may comprise a texture cache configured to store the bilinear interpolation function at the location x, y. As this value is re-used in the equations, this may reduce the computation required further.
  • the texture cache may be exploited to store processed sampling results to be re-used by neighbouring pixels, in order to further accelerate the computation.
  • the texture filtering function may be a shadow filtering function.
  • a data sample’s value from a shadow map texture is either 1.0 or 0.0.
  • the rendered shadow in a final image shows strong aliasing. This is because if a data sample’s value is equal to 1.0, it means that the pixel is completely outside of the shadow, while if it is 0.0, it means that the pixel is completely in shadow.
  • a data sample’s value could be a floating point value, which lies between 0.0 and 1.0.
  • the device can be configured to perform fewer texture fetches than the original equations found in Keys (Keys, 1981). For example, for each pixel, the device can be configured to perform five texture fetches for the third-order (bicubic) approximation and nine texture fetches for the fourth-order approximation.
  • the solution may be implemented by the GPU software using shader code.
  • the simplified functions can be implemented in a few lines of GPU shader code using shading languages such as GLSL, HLSL, or Spir-V.
  • the filtering functions can be implemented using fixed function hardware via a single GPU instruction, instead of multiple lines of shader code.
  • one ISA intrinsics call can be used to complete 2D high-order texture filtering in a pixel shader.
  • the method described herein may therefore allow for faster and cheaper high-order texture filtering.
  • Figure 8 is a schematic representation of a device 800 configured to perform the methods described herein.
  • the device 800 may be implemented on a device, such as a laptop, tablet, smart phone, TV or any other device in which graphics data is to be processed.
  • the device 800 comprises a graphics processor 801 configured to process data.
  • the processor 801 may be a GPU.
  • the processor 801 may be implemented as a computer program running on a programmable device such as a GPU or a Central Processing Unit (CPU).
  • the device 800 comprises a memory 802 which is arranged to communicate with the graphics processor 801.
  • Memory 802 may be a non-volatile memory.
  • the graphics processor 801 may also comprise a cache (not shown in Figure 8), which may be used to temporarily store data from memory 802.
  • the device may comprise more than one processor and more than one memory.
  • the memory may store data that is executable by the processor.
  • the processor may be configured to operate in accordance with a computer program stored in non-transitory form on a machine readable storage medium.
  • the computer program may store instructions for causing the processor to perform its methods in the manner described herein.
  • the device may allow for rendering a high-quality image with texture at higher performance with fewer texture fetches and weighted filtering calculations on a mobile GPU than conventional methods.
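The third-order scheme described above can be illustrated in 1D with a short numerical sketch. The decomposition below (linear interpolation minus a position-weighted blend of second central differences, with weight x(1 - x)/2) is one standard way to rewrite the Keys a = -1/2 cubic, used here as an illustrative assumption; the exact weighting parameters used by the patent may differ:

```python
def keys_cubic(f, i, x):
    """Reference: direct four-tap Keys (a = -1/2) cubic filtering."""
    w = ((-x**3 + 2*x**2 - x) / 2,
         (3*x**3 - 5*x**2 + 2) / 2,
         (-3*x**3 + 4*x**2 + x) / 2,
         (x**3 - x**2) / 2)
    return sum(wk * fk for wk, fk in zip(w, f[i - 1:i + 3]))

def cubic_via_difference(f, i, x):
    """Difference form: hardware-style linear interpolation minus a
    weighted sum of second-order central-difference approximations."""
    f_lin = (1 - x) * f[i] + x * f[i + 1]    # what the GPU lerp unit returns
    d2_i = f[i - 1] - 2 * f[i] + f[i + 1]    # f''(i), central difference
    d2_i1 = f[i] - 2 * f[i + 1] + f[i + 2]   # f''(i+1), central difference
    return f_lin - 0.5 * x * (1 - x) * ((1 - x) * d2_i + x * d2_i1)
```

The two forms agree to machine precision on the same four taps. In 2D, the linear term becomes the hardware bilinear fetch and the difference terms are taken along both orthogonal directions, which is how the fetch count can drop below the sixteen taps of direct bicubic filtering.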


Abstract

A graphics processing device (800) for generating an image signal representing an image having a plurality of pixels, wherein the graphics processing device is configured to generate the image signal by performing the following operations for each of the plurality of pixels: determine (701) a plurality of texels corresponding to the respective pixel, each texel comprising a sub-pixel data point indicating a texture sample value in a texture space; and apply (702) a texture filtering function to the sub-pixel data points of the plurality of texels in dependence on a selected sampling location x, y to form a filtered value, where x and y are the fractional positions of a texture coordinate; wherein applying the texture filtering function comprises taking the difference between (i) an interpolation among the texels for the location x, y performed by dedicated hardware logic and at least (ii) a weighted sum of second-order derivative approximations among the texels in two orthogonal directions. The device may allow for rendering of a high-quality image with texture at higher performance with fewer texture fetches and weighted filtering calculations on a mobile GPU than conventional methods.

Description

HIGH-ORDER TEXTURE FILTERING
FIELD OF THE INVENTION
This invention relates to texture filtering in images, for example in video games.
BACKGROUND
The goal of image filtering is, given an input image A, to create a new image B. The transformation operation from source A to target B is via an image filter.
Rendering 3D models with texture is very important for video games. Texture can provide more details than geometry and can therefore increase the realism of a rendered image. Almost all 3D games need to access textures and perform texture filtering on the Graphics Processing Unit (GPU), for example via the texture unit hardware module or shader code, to determine the color for a texture mapped pixel, by filtering the colors of nearby texels.
GPU hardware can provide fast, filtered access to textures, but generally only for a few restrictive types of texture filtering methods. Both OpenGL and Direct3D provide two very simple types of texture filtering: nearest-neighbor sampling and linear filtering, corresponding to zeroth and first-order filter schemes respectively. Both types are natively supported by all GPUs. Other commonly used filtering methods on modern GPUs include bilinear filtering and bicubic filtering (supported by the Vulkan API: VK_IMG_filter_cubic, GL_IMG_texture_filter_cubic).
Nearest-neighbor filtering is the simplest filtering algorithm, which fetches the single nearest texel sample from a texture image for any target sampling location. This requires one texture fetch and no weighted calculations. However, as shown at 101 in Figure 1, the resulting image quality is not good, with jagged, unrealistic features which are not life-like.
Bilinear filtering can achieve a smoother transition across different texels and is a simple texture filter method which needs four taps of texel samples at integer texel locations. A 2 x 2 grid of neighboring texels has to be fetched, and a weighted sum is calculated using the sub-texel coordinate offsets as weights to generate a smooth transition. Figure 2(a) shows an example of the result of bilinear filtering. Compared to nearest-neighbor filtering, the image quality is improved. However, images may still look unrealistic with jagged edges and zigzag aliasing. Figure 2(b) shows the 2 x 2 sampling pattern (sampling a 2 x 2 grid of texels surrounding the target UV coordinate).
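As an illustrative sketch of this 2 x 2 scheme, a minimal bilinear filter in plain Python (the function name and list-of-lists texture layout are assumptions for illustration, not taken from the patent):

```python
import math

def bilinear(tex, u, v):
    """Bilinear filter: fetch the 2 x 2 texel grid around (u, v) and
    blend using the sub-texel coordinate offsets as weights."""
    x0, y0 = math.floor(u), math.floor(v)   # integer texel location
    fx, fy = u - x0, v - y0                 # fractional (sub-texel) offsets
    # Four taps at integer texel locations (edge clamping omitted).
    f00, f10 = tex[y0][x0], tex[y0][x0 + 1]
    f01, f11 = tex[y0 + 1][x0], tex[y0 + 1][x0 + 1]
    # Weighted sum: lerp along x, then along y.
    top = (1 - fx) * f00 + fx * f10
    bottom = (1 - fx) * f01 + fx * f11
    return (1 - fy) * top + fy * bottom
```

For example, sampling a 2 x 2 texture [[0, 1], [2, 3]] at (0.5, 0.5) averages the four texels, giving 1.5.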
A filter familiar to most users of imaging programs with "high-quality" resizing is commonly called the cubic filter. When applied in both x and y directions, the result is referred to as bicubic texture filtering. A typical cubic filter kernel function, which is used to calculate the polynomial weights, is shown in Figure 3. The y value of this function shows the relative weight that should be assigned to the texels that are distant from the center of a given texture sampling coordinate x. Texels more than two texels from that center are ignored due to their zero weight value, while texels at the center are given the largest weight. Note that for this particular filter, some weights may be negative.
In 1D, the cubic filtering algorithm can be expressed as:

f(x) = \sum_{i=-1}^{2} w_i(x) \, f_i

where f_i denotes the indexed neighboring texel values at four taps of integer sampling locations, which are multiplied by the corresponding cubic polynomial weights w_i(x) from the convolution kernel. The weighted sum is the final result of the filtering. This weighted-sum-based filtering algorithm is illustrated in Figure 4 for cubic interpolation.
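This four-tap weighted sum can be sketched in plain Python (a hypothetical `cubic1d` helper using the Keys a = -1/2 cubic polynomial weights; not code from the patent):

```python
def cubic_weights(x):
    """Keys (a = -1/2) cubic polynomial weights for the four taps at
    offsets -1, 0, +1, +2; x is the sub-texel fraction in [0, 1).
    Note some weights are negative, and they always sum to 1."""
    return ((-x**3 + 2*x**2 - x) / 2,
            (3*x**3 - 5*x**2 + 2) / 2,
            (-3*x**3 + 4*x**2 + x) / 2,
            (x**3 - x**2) / 2)

def cubic1d(f, i, x):
    """1D cubic filtering at position i + x: weighted sum of the four
    neighbouring texel values f[i-1], f[i], f[i+1], f[i+2]."""
    return sum(w * t for w, t in zip(cubic_weights(x), f[i - 1:i + 3]))
```

On linear data the filter reproduces the ramp exactly; for example, `cubic1d([0, 1, 2, 3], 1, 0.5)` returns 1.5.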
Bicubic filtering is a 2D extension of the 1D cubic filtering for interpolating data points on a two-dimensional regular grid. The 2D interpolation function is a separable extension of the 1D interpolation function, so the 2D interpolation can be accomplished by two 1D interpolations, one with respect to each coordinate direction. The bicubic interpolated result is smoother than the corresponding results obtained by bilinear interpolation or nearest-neighbor interpolation.
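The separability can be sketched as four 1D cubic passes along the rows followed by one pass along the resulting column (illustrative Python reusing the Keys a = -1/2 weights; the function names are assumptions):

```python
def keys_weights(x):
    # Keys (a = -1/2) cubic weights for taps at offsets -1, 0, +1, +2.
    return ((-x**3 + 2*x**2 - x) / 2,
            (3*x**3 - 5*x**2 + 2) / 2,
            (-3*x**3 + 4*x**2 + x) / 2,
            (x**3 - x**2) / 2)

def cubic1d_taps(taps, x):
    # Weighted sum of four consecutive samples at fraction x.
    return sum(w * t for w, t in zip(keys_weights(x), taps))

def bicubic(tex, j, i, x, y):
    """Bicubic filtering at (i + x, j + y) on tex[row][col]: one 1D
    cubic pass per row of the 4 x 4 neighbourhood, then one along y."""
    column = [cubic1d_taps(tex[j + dj][i - 1:i + 3], x) for dj in (-1, 0, 1, 2)]
    return cubic1d_taps(column, y)
```

Note that this direct separable form touches all 4 x 4 = 16 texels, which is exactly the fetch cost the simplified filtering described later aims to reduce.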
Keys, R.G., ‘Cubic convolution interpolation for digital image processing’, IEEE Trans. Acoust. Speech Signal Process., 1981, ASSP-29, (6), pp. 1153-1160 describes the cubic convolution filtering equation with third-order approximation accuracy. The equation involves a convolution kernel with cubic polynomials which has spatial support of four taps of texels. The third-order convolution kernel is as below:

w(s) = \begin{cases} \tfrac{3}{2}|s|^3 - \tfrac{5}{2}|s|^2 + 1, & 0 \le |s| < 1 \\ -\tfrac{1}{2}|s|^3 + \tfrac{5}{2}|s|^2 - 4|s| + 2, & 1 \le |s| < 2 \\ 0, & \text{otherwise} \end{cases}
Regarding the quality of the final rendering images, Keys’ cubic convolution interpolation method performs much better than linear interpolation and has become a standard in the image interpolation field. However, regarding rendering speed, it is complex and expensive to calculate at runtime for each rendering pixel.
Figures 5(a)-5(c) illustrate the rendering result of the above-described filtering methods. Figure 5(a) shows the result for nearest-neighbor filtering, Figure 5(b) for bilinear filtering and Figure 5(c) for bicubic filtering.
Higher-order filtering modes (such as third-order or fourth-order functions) often lead to superior image quality. Moreover, higher-order schemes are necessary to compute continuous derivatives of texture data. In some 3D applications, high-quality texture filtering is crucial. Images resampled with bicubic interpolation are smoother and have fewer interpolation artifacts, while bilinear filtering often produces diamond artifacts or aliasing, because the first-order derivative of the bilinear function is not continuous. Resampled images obtained by simple methods such as nearest-neighbor sampling and bilinear filtering typically incur common artifacts, such as blurring, blocking and ringing, especially in the edge regions. The bicubic algorithm is frequently used for scaling images or videos for display. It preserves fine detail better than the common bilinear algorithm, due to its high-order continuous derivatives.
However, third-order bicubic filtering is very demanding for mobile GPUs, due to its very high bandwidth and computing cost. In particular, it requires 16 texture fetch instructions, computation of the dynamic cubic weights via a third-order polynomial, and 4 x 4 weighted sum calculations along x and y. Altogether, it involves 32 multiplications (16+4+4+8) and many more arithmetic operations.
The bicubic filtering method described in US 10102181 B2 requires 4 x 4 data sampling and complex weighted calculations. The method described in US 7106326 B2 can perform filtering operations (such as linear, bilinear, trilinear, cubic or bicubic filtering) on the data values of a neighbourhood (Np x Np), where Np x Np is the size of the neighbourhood in texels. However, the method requires Np x Np data sampling taps and very complex weighted filtering calculations based on the multiple samples and dynamic weights. Therefore, such higher-order filtering methods generally require a much greater computational cost and texture-memory bandwidth than simpler filtering methods.
Thus, rendering high-quality texture-mapped 3D objects using high-order texture filtering algorithms is very demanding, especially for modern mobile GPUs, which have limited compute power and memory bandwidth. At the same time, mobile devices require real-time rendering performance (higher frame-rates, lower latency), long battery life (low power consumption) and low heat dissipation.
In 2005, Sigg and Hadwiger (see the book chapter 'Fast Third-Order Texture Filtering' in GPU Gems 2, https://developer.nvidia.com/gpugems/gpugems2/part-iii-high-quality-rendering/chapter-20-fast-third-order-texture-filtering) developed an efficient evaluation of bicubic B-spline filtering on the GPU, which reduces the number of texture fetches from sixteen to only four. As this approach drastically reduces the number of texture fetches, which is generally the bottleneck in GPU implementations, a significant speed-up can be achieved. However, this method can only support filter kernels with positive weights. It is not trivial to adapt this method to filter kernels that also take negative weight values, which is common for cubic filtering kernels, as defined by Khronos in the Vulkan API (VK_FILTER_CUBIC_EXT and VK_IMG_filter_cubic), https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/VkFilter.html. Moreover, this method requires a pre-processing pass at runtime, which is not suitable when filtering dynamic textures that are very common in game rendering, such as shadow textures or other FBO textures.
In summary, the key technical problems of high-order texture filtering algorithms for high-quality rendering are as follows. High-order algorithms require many texture data fetches, and the corresponding texture fetching instructions can be a memory bandwidth burden for the GPU. High-order filtering calculations also require many arithmetic operations (with corresponding ALU instructions) for each single texture filtering operation. This presents a dilemma for high-quality texture filtering on a mobile GPU.
It is desirable to develop a method with lower computational cost to produce high-quality texture filtering results with third-order or fourth-order approximation accuracy.

SUMMARY OF THE INVENTION
According to a first aspect there is provided a graphics processing device for generating an image signal representing an image having a plurality of pixels, wherein the graphics processing device is configured to generate the image signal by performing the following operations for each of the plurality of pixels: determine a plurality of texels corresponding to the respective pixel, each texel comprising a sub-pixel data point indicating a texture sample value in a texture space; and apply a texture filtering function to the sub-pixel data points of the plurality of texels in dependence on a selected sampling location x, y to form a filtered value, where x and y are the fractional positions of a texture coordinate; wherein applying the texture filtering function comprises taking the difference between (i) an interpolation among the texels for the location x, y performed by dedicated hardware logic and at least (ii) a weighted sum of second-order derivative approximations among the texels in two orthogonal directions.
The device may allow for rendering of a high-quality image with texture at higher performance with fewer texture fetches and weighted filtering calculations than conventional methods. This may allow the application of a filtering function having at least third-order approximation accuracy.
Applying the texture filtering function may comprise taking the difference between (i) an interpolation among the texels for the location x, y and (ii) a weighted sum of second-order derivative approximations among the texels in the two orthogonal directions and a weighted sum of third-order derivative approximations among the texels in the two orthogonal directions. This may allow the application of a filtering function having fourth-order approximation accuracy.
The second-order and/or third-order derivative approximations among the texels in the two orthogonal directions may be weighted in dependence on the fractional position of the location x, y between two consecutive integer locations. For example, the derivative approximations may be weighted in dependence on parameters α and β, as described herein.
The second-order and/or third-order derivatives may be calculated using the dedicated hardware logic. The second-order and/or third-order derivatives may be calculated in dependence on the interpolation among the texels. The interpolation may be determined using a bilinear interpolation function. The bilinear interpolation function fbilin(x,y) is supported by GPU hardware, and so may be easily determined.
The device may comprise a texture cache configured to store the bilinear interpolation function at the location x, y. This may further reduce the processing required.
The two orthogonal directions may be directions in a texture space. This may allow rendering to a texture.
The filtering function may be determined according to a third-order or fourth-order approximation. This may produce high-quality images.
The device may be configured to implement the texture filtering function in one of the GLSL, HLSL and Spir-V languages. The device is therefore compatible with filtering algorithms and languages used in many modern image processing systems and video games.
The device may be configured to implement the texture filtering function in a single instruction, or fixed function hardware unit. This may reduce the processing cost required to render the image.
For each pixel, the device may be configured to perform fewer than sixteen texture fetches. Thus, the device may perform fewer texture fetches than for traditional bicubic rendering.
For each pixel, the device may be configured to perform five texture fetches or nine texture fetches. Five texture fetches may allow for third-order approximation accuracy (bicubic filtering) and nine texture fetches may allow for fourth-order approximation accuracy. Fewer texture fetches for the GPU can result in longer battery life, reduced latency and improved frame-rate for complex and demanding game rendering.
At least some of the pixels may represent a shadow of an object in said image and the texture filtering function may be a shadow filtering function. The filtered value may be a filtered shadow value. This may allow for faster shadow filtering in 2D image applications to produce a smooth image quality result for soft shadows. The device may be implemented by a mobile graphics processing unit. This may allow a mobile device to efficiently perform texture rendering.
According to a second aspect there is provided a method for generating an image signal representing an image having a plurality of pixels, wherein the method comprises, for each of the plurality of pixels: determining a plurality of texels corresponding to the respective pixel, each texel comprising a sub-pixel data point indicating a texture sample value in a texture space; and applying a texture filtering function to the sub-pixel data points in dependence on a selected sampling location x, y to form a filtered value, where x and y are the fractional positions of a texture coordinate; wherein applying the texture filtering function comprises taking the difference between (i) an interpolation among the texels for the location x, y performed by dedicated hardware logic and (ii) a weighted sum of second-order derivative approximations among the texels in two orthogonal directions.
The method may allow for rendering of a high-quality image with texture at higher performance with fewer texture fetches and weighted filtering calculations than conventional methods. This may allow the application of a filtering function having at least third-order approximation accuracy.
According to a third aspect there is provided a computer program which, when executed by a computer, causes the computer to perform the method described above. The computer program may be provided on a non-transitory computer readable storage medium.
BRIEF DESCRIPTION OF THE FIGURES
The present invention will now be described by way of example with reference to the accompanying drawings.
In the drawings:
Figure 1 illustrates the rendering result of nearest-neighbor filtering for a shadow texture.
Figure 2a shows the rendering result when 2 x 2 bilinear filtering is used to produce a soft shadow.
Figure 2b illustrates sampling a 2 x 2 grid of texels surrounding the target UV coordinate.

Figure 3 shows a typical cubic filter kernel function.
Figure 4 illustrates the weighted sum based filtering algorithm for cubic interpolation.
Figures 5(a)-5(c) illustrate the rendering result of the traditional filtering methods. Figure 5(a) shows the result for nearest-neighbor, Figure 5(b) for bilinear filtering, and Figure 5(c) for bicubic filtering.
Figures 6(a)-6(b) schematically illustrate 1D linear and cubic filtering at position x by calculating the weighted sum of the neighbouring texel-values (at integer locations), where the weights are polynomials based on the sub-texel offsets.
Figure 7 illustrates an example of a method for generating an image signal representing an image having a plurality of pixels.
Figure 8 shows an example of a device configured to implement the method described herein.
DETAILED DESCRIPTION OF THE INVENTION
Described herein is a graphics processing device for generating an image signal representing an image having a plurality of pixels. The graphics processing device may be a graphics processor, such as a mobile GPU. The graphics processing device is configured to generate the image signal by performing the operations described herein for each of a plurality of pixels of the image. The plurality of pixels may comprise a subset of all pixels of the image, or the method may be performed for every pixel in the image.
The image signal may be used, for example, for rendering high-quality soft shadows or other textures. Herein, rendering refers to any form of generating a visible image, for example displaying the image on a computer screen, printing, or projecting.
The device and method can implement high-order texture filtering algorithms (for example, third-order and fourth-order) on a GPU to produce a high-quality rendered image, while requiring a much lower cost of computation and texture-memory bandwidth than prior art implementations. The solution described herein uses fewer weighted sum calculations on the GPU than conventional methods. The solution involves fewer texture sampling instructions and fewer ALU calculations than previous methods. This may allow for faster high-order texture filtering with third-order and fourth-order approximation accuracy.
As will be described in more detail below, the device applies a texture filtering operation at a sampling location x, y in a 2D image (where x and y are the fractional positions of a texture coordinate) comprising taking the difference between (i) an interpolation among the texels for the location x, y performed by dedicated hardware logic and at least (ii) a weighted sum of second-order derivative approximations among the texels in two orthogonal directions.
High-order texture filtering algorithms usually need to perform texture sampling in an N x N neighbouring area of texture data in DDR memory to get N x N texel data samples at integer texel locations, and then perform a filtering operation on these N x N data samples to get a high-quality filtered value to shade the final pixel in the pixel shader.
In 1D, the cubic filtering algorithm can be expressed as the following weighted sum equation:

f(x) = Σ_i w_i(x) · f_i

where f_i are the indexed neighbouring texel values at four taps of integer sampling locations, which are multiplied by the corresponding cubic polynomial weights w_i(x) from the convolution kernel. The weighted sum is the final result of the filtering.
The third-order filtering equation used by the device and method described herein will first be discussed. For the sake of clarity, the third-order filtering method is derived first in 1D, and then extended to 2D. In these implementations, the directions x and y are directions in a texture space.
Without a loss of generality, assume that the samples of a continuous function f(t) are known only at integer texel locations, and that at any other arbitrary sampling location x the function value f(x) needs to be approximately reconstructed from these discrete texel-values by calculating a weighted sum of the discrete samples. The analysis below is based on the Taylor series expansion of the continuous function f(t). f(t) denotes a continuous function (the signal) which is sampled into the discrete texels f(i) = f_i, where i is an integer. The values f_i represent the samples of the original function f(t) at the integer locations i. In computer imaging, the continuous function f(t) is not known except for the texel values f_i at the discrete integer grid locations.
The general 1D linear interpolation/filtering is a method for estimating a specific function value f(x) at an arbitrary continuous sampling point x by calculating the weighted sum of two taps of known function values f(i) and f(i+1) at two integer grid locations (i and i+1), where x = i + α, with i ∈ Z and α ∈ [0, 1) being the integer and fractional parts of x, respectively. Here f_i and f_{i+1} are the two discrete function values at the integer grid locations i and i+1.
The general 1D cubic interpolation/filtering is a method for estimating a specific function value f(x) at an arbitrary continuous sampling point x by calculating the weighted sum of four taps of known function values f(i−1), f(i), f(i+1) and f(i+2) at four integer grid locations (from i−1 to i+2), where x = i + α, with i ∈ Z and α ∈ [0, 1) being the integer and fractional parts of x, respectively.
Piecewise linear and cubic function reconstructions are shown in Figure 6(a) and Figure 6(b) respectively.
The 1D linear and cubic filtering function at position x is therefore determined by calculating the weighted sum of the neighbouring texel-values (at integer locations), where the weights are polynomials based on the sub-texel offsets.
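These two weighted-sum formulations can be sketched as follows. This is a hypothetical Python illustration of the 1D case only; the kernel parameter a = −0.5 (a common choice for the Keys kernel) and the function names are assumptions, not part of the original text.

```python
import math

def lin_filter(f, x):
    """1D linear filtering: weighted sum of the 2 neighbouring texels,
    with weights (1 - a) and a derived from the sub-texel offset."""
    i = math.floor(x)
    a = x - i                                  # a in [0, 1)
    return (1.0 - a) * f[i] + a * f[i + 1]

def cubic_filter(f, x, coef=-0.5):
    """1D cubic filtering: weighted sum of 4 neighbouring texels, with
    weights given by a piecewise-cubic polynomial of the offset."""
    def w(s, a=coef):                          # Keys convolution kernel
        s = abs(s)
        if s <= 1.0:
            return (a + 2.0) * s**3 - (a + 3.0) * s**2 + 1.0
        if s < 2.0:
            return a * s**3 - 5.0 * a * s**2 + 8.0 * a * s - 4.0 * a
        return 0.0
    i = math.floor(x)
    t = x - i
    return sum(f[i + m] * w(m - t) for m in range(-1, 3))
```

Both filters reproduce the texel values exactly at integer positions; the cubic filter additionally has a continuous first derivative across texel boundaries.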
In mathematics, the Taylor series of a function is an infinite sum of terms that are expressed in terms of the function's derivatives at a single point. Assuming that the first three derivatives of f(t) exist at t=x, f(t) can be expanded as a Taylor series at t=x. According to Keys (Keys, 1981), the cubic interpolation function agrees with the Taylor series expansion of the image function being interpolated, and all conditions can be satisfied for image filtering.
If f(t) has at least three continuous derivatives at t=x in the interval [i, i+1] then according to Taylor’s theorem (see Keys, 1981), the third-order Taylor series approximation of the real- valued function f(t) at a specific real-valued sampling location x is given by:
f(t) = f(x) + f′(x)·(t − x) + (f″(x)/2)·(t − x)² + O((t − x)³)

where O((t − x)³) is the error term (remainder, or residual), which goes to zero at a rate proportional to

(t − x)³

This is because, by Taylor's theorem, the remainder can be written as (f‴(ξ)/6)·(t − x)³ for some ξ between x and t.
So, this Taylor series expansion is a third-order approximation for function f(t).
Therefore, at the integer sampling location t = i (where i − x = −α), the third-order approximation for f_i is as below:

f_i = f(x) − α·f′(x) + (α²/2)·f″(x) + O(α³)
Similarly, at the next integer sampling location t = i + 1 (where i + 1 − x = 1 − α), the third-order approximation for f_{i+1} is as below:

f_{i+1} = f(x) + (1 − α)·f′(x) + ((1 − α)²/2)·f″(x) + O((1 − α)³)
f_lin(x) is defined as the GPU's hardware linear interpolation function at sampling point x in the interval [i, i+1], where x = i + α, α ∈ [0, 1), given by:

f_lin(x) = (1 − α)·f_i + α·f_{i+1}

where

(1 − α)·f_i = (1 − α)·f(x) − α(1 − α)·f′(x) + ((1 − α)·α²/2)·f″(x) + O(h³)

and where

α·f_{i+1} = α·f(x) + α(1 − α)·f′(x) + (α·(1 − α)²/2)·f″(x) + O(h³)
Adding these two terms together (and noting that α²(1 − α) + α(1 − α)² = α(1 − α)) gives:

f_lin(x) = f(x) + (α(1 − α)/2)·f″(x) + O(h³)
Therefore, in 1D, the third-order cubic interpolation/filtering function may be expressed as:

f_cubic(x) = f_lin(x) − (α(1 − α)/2)·f″(x)
where the second-order central difference (see https://en.wikipedia.org/wiki/Finite_difference) can be used to calculate f″(x) as below:

f″(x) ≈ (f(x + h) − 2·f(x) + f(x − h)) / h²
For h = 1, the second-order derivative of f(x) at sampling location x is given by:

f″(x) ≈ f(x + 1) − 2·f(x) + f(x − 1)

where the values at the non-integer locations x − 1, x and x + 1 are themselves obtained with the hardware linear interpolation function.
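The resulting 1D third-order filter can be sketched as follows. This is hypothetical Python, assuming the non-integer samples in the central difference are taken with linear fetches; `f_lin` stands in for the GPU's hardware linear fetch with clamp-to-edge addressing.

```python
import math

def f_lin(f, x):
    """Stand-in for the GPU hardware linear fetch (clamp-to-edge)."""
    x = min(max(x, 0.0), len(f) - 1.0)
    i = min(int(math.floor(x)), len(f) - 2)
    a = x - i
    return (1.0 - a) * f[i] + a * f[i + 1]

def cubic_third_order(f, x):
    """f(x) ~ f_lin(x) - (a(1-a)/2) * f''(x), with f'' estimated by a
    central difference of three linear fetches."""
    a = x - math.floor(x)                      # fractional part of x
    centre = f_lin(f, x)
    d2 = f_lin(f, x + 1.0) - 2.0 * centre + f_lin(f, x - 1.0)
    return centre - 0.5 * a * (1.0 - a) * d2
```

On quadratic data the scheme is exact in the interior: for samples of f(t) = t², the value at x = 2.5 is recovered as 6.25, as a third-order method should.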
Extending the above equations to 2D, the bicubic interpolation/filtering function (i.e., a third-order approximation of the original function) may be approximated as below:

f_bicubic(x, y) ≈ f_bilin(x, y) − (α(1 − α)/2)·f_xx(x, y) − (β(1 − β)/2)·f_yy(x, y)      (13)
where the second-order derivatives can be calculated using second-order central differences as below:

f_xx(x, y) ≈ f_bilin(x − 1, y) − 2·f_bilin(x, y) + f_bilin(x + 1, y)

f_yy(x, y) ≈ f_bilin(x, y − 1) − 2·f_bilin(x, y) + f_bilin(x, y + 1)
where x = i + α, y = j + β, for i, j ∈ Z being the integer parts, and α, β ∈ [0, 1) being the fractional parts of x, y between the two consecutive integer locations. f_bilin(x, y) is the hardware bilinear interpolation function at an arbitrary 2D sampling point (x, y), which is already supported by GPU hardware.
From this derivation, it can be seen that the derived 2D third-order filtering equation is much simpler than the original bicubic filtering equation (Keys, 1981), which involves cubic polynomial weighted sum calculations and 16 texture fetches, while Equation (13) above involves only five texture fetches and much simpler arithmetic, without any cubic polynomial weighting calculations. Only seven multiplications are needed, instead of the 32 required by the original method. This may result in increased rendering performance and reduced power consumption for mobile GPUs.
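A software sketch of Equation (13) may help make the fetch count concrete. This is hypothetical Python: `bilin` emulates the hardware bilinear fetch with clamp-to-edge addressing, and in a real shader the five calls would be texture instructions.

```python
import math

def bilin(tex, x, y):
    """Stand-in for the GPU hardware bilinear fetch (clamp-to-edge)."""
    h, w = len(tex), len(tex[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    i = min(int(math.floor(x)), w - 2)
    j = min(int(math.floor(y)), h - 2)
    a, b = x - i, y - j
    return ((1 - a) * (1 - b) * tex[j][i]     + a * (1 - b) * tex[j][i + 1]
          + (1 - a) * b       * tex[j + 1][i] + a * b       * tex[j + 1][i + 1])

def bicubic_5tap(tex, x, y):
    """Third-order filtering per Equation (13): one central bilinear
    fetch plus four offset fetches for the two second derivatives."""
    a = x - math.floor(x)
    b = y - math.floor(y)
    c = bilin(tex, x, y)
    fxx = bilin(tex, x - 1, y) - 2.0 * c + bilin(tex, x + 1, y)
    fyy = bilin(tex, x, y - 1) - 2.0 * c + bilin(tex, x, y + 1)
    return c - 0.5 * a * (1.0 - a) * fxx - 0.5 * b * (1.0 - b) * fyy
```

Only five bilinear fetches and a handful of multiplications are needed per output value, in line with the cost analysis above.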
Experimental results have shown that the method can produce the same good-quality texture filtering results as the original bicubic filtering equation (Keys, 1981), but with a rendering frame-rate approximately 91% higher when tested on exactly the same platform.
The filtering equation for fourth-order approximation accuracy of texture filtering will now be described.
The convergence rate of the cubic convolution interpolation function (see Keys, 1981) is O(h³), which yields a third-order approximation of the original function f(t). Therefore, any interpolation function whose interpolation kernel satisfies the conditions outlined by Keys (first part of Keys, R.G., 'Cubic convolution interpolation for digital image processing', IEEE Trans. Acoust. Speech Signal Process., 1981, ASSP-29, (6), pp. 1153-1160) will have at most a third-order convergence rate. Nevertheless, interpolation functions with fourth-order convergence rates are possible and can be derived. Keys also derived a fourth-order approximation function (with remainder term O(h⁴)), at the cost of a much larger spatial filter kernel support: this fourth-order convolution kernel is composed of piecewise polynomials defined on the unit subintervals between [−3, 3], which means that it needs to fetch six taps of texels and perform a weighted-sum calculation over all six taps in 1D. This is in contrast to the third-order approximation kernel, defined on the unit subintervals between [−2, 2] with only four taps of texels, as in the third-order function approximation described above.
In 1D, the fourth-order filtering algorithm can be expressed as the following weighted sum equation:
f(x) = Σ_i w_i(x) · f_i      (16)

where f_i are the indexed neighbouring texel values at six taps of integer sampling locations, which are multiplied by the corresponding polynomial weights w_i(x) from the convolution kernel. The weighted sum is the final result of the filtering.
In 2D, this convolution kernel will involve 6x6=36 texture fetches, and also involve the complex weighted-sum calculation using polynomials on all of the 36 taps of texels, which will introduce a very high cost of memory bandwidth and many weighted sum calculations for a mobile GPU.
The derivation below introduces a simplified equation for texture filtering with fourth-order approximation accuracy, which is much cheaper than the original fourth-order equation introduced by Keys (i.e., Equation (16)).
If the function f(t) has at least four continuous derivatives at location x, then according to Taylor's theorem, the fourth-order approximation of the real-valued function f(t) at a real-valued sampling location x in the interval [i, i+1] is given by:

f(t) = f(x) + f′(x)·(t − x) + (f″(x)/2)·(t − x)² + (f‴(x)/6)·(t − x)³ + O((t − x)⁴)
where O((t − x)⁴) is the fourth-order error term (remainder, or residual), which goes to zero at a rate proportional to (t − x)⁴.
Now, the fourth-order approximation will be derived at the two integer sampling locations t = i and t = i + 1.
Firstly, at the integer sampling location t = i, the fourth-order approximation for f_i is as below. Note that since x = i + α, we have i − x = −α:

f_i = f(x) − α·f′(x) + (α²/2)·f″(x) − (α³/6)·f‴(x) + O(α⁴)
Secondly, at the next integer sampling location t = i + 1, the fourth-order approximation for f_{i+1} is given as below. Note that (a − b)³ = a³ − 3a²b + 3ab² − b³, and that since x = i + α with 0 ≤ α < 1, we have i + 1 − x = 1 − α:

f_{i+1} = f(x) + (1 − α)·f′(x) + ((1 − α)²/2)·f″(x) + ((1 − α)³/6)·f‴(x) + O((1 − α)⁴)
Since f_lin(x) is the hardware linear interpolation function at sampling point x between f_i and f_{i+1}, we have:

f_lin(x) = (1 − α)·f_i + α·f_{i+1}

where

(1 − α)·f_i = (1 − α)·f(x) − α(1 − α)·f′(x) + ((1 − α)·α²/2)·f″(x) − ((1 − α)·α³/6)·f‴(x) + O(h⁴)

α·f_{i+1} = α·f(x) + α(1 − α)·f′(x) + (α·(1 − α)²/2)·f″(x) + (α·(1 − α)³/6)·f‴(x) + O(h⁴)
Adding these two terms together (and noting that α(1 − α)³ − (1 − α)·α³ = α(1 − α)(1 − 2α)) gives:

f_lin(x) = f(x) + (α(1 − α)/2)·f″(x) + (α(1 − α)(1 − 2α)/6)·f‴(x) + O(h⁴)
Therefore, the fourth-order interpolation equation in 1D is given by:

f(x) ≈ f_lin(x) − (α(1 − α)/2)·f″(x) − (α(1 − α)(1 − 2α)/6)·f‴(x)
where the second-order central difference (see https://en.wikipedia.org/wiki/Finite_difference) can be used to calculate f″(x) as:

f″(x) ≈ (f(x + h) − 2·f(x) + f(x − h)) / h²
Using h = 1, the second-order derivative of f(x) at sampling location x is given by:

f″(x) ≈ f(x + 1) − 2·f(x) + f(x − 1)
Similarly, the third-order derivative (see https://en.wikipedia.org/wiki/Finite_difference_coefficient) can be calculated as below:

f‴(x) ≈ (f(x + 2h) − 2·f(x + h) + 2·f(x − h) − f(x − 2h)) / (2h³)

where h represents a uniform grid spacing between each integer finite difference interval. Using h = 1, the third-order derivative of f(t) at sampling location x is given by:

f‴(x) ≈ (f(x + 2) − 2·f(x + 1) + 2·f(x − 1) − f(x − 2)) / 2
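The 1D fourth-order scheme can be sketched as follows. This is hypothetical Python, again assuming the non-integer samples in the differences are taken with linear fetches (five linear fetches per output value); `f_lin` stands in for the hardware linear fetch.

```python
import math

def f_lin(f, x):
    """Stand-in for the GPU hardware linear fetch (clamp-to-edge)."""
    x = min(max(x, 0.0), len(f) - 1.0)
    i = min(int(math.floor(x)), len(f) - 2)
    a = x - i
    return (1.0 - a) * f[i] + a * f[i + 1]

def fourth_order_1d(f, x):
    """f(x) ~ f_lin(x) - (a(1-a)/2) f''(x) - (a(1-a)(1-2a)/6) f'''(x)."""
    a = x - math.floor(x)                       # fractional part of x
    c  = f_lin(f, x)
    p1, m1 = f_lin(f, x + 1.0), f_lin(f, x - 1.0)
    p2, m2 = f_lin(f, x + 2.0), f_lin(f, x - 2.0)
    d2 = p1 - 2.0 * c + m1                      # second-order central difference
    d3 = (p2 - 2.0 * p1 + 2.0 * m1 - m2) / 2.0  # third-order central difference
    return (c - a * (1.0 - a) / 2.0 * d2
              - a * (1.0 - a) * (1.0 - 2.0 * a) / 6.0 * d3)
```

For cubic data the scheme is exact in the interior: with samples of f(t) = t³, the value at x = 3.25 is recovered as 3.25³ = 34.328125.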
Extending the above 1D equation to 2D, the fourth-order interpolation equation is given by:

f(x, y) ≈ f_bilin(x, y) − (α(1 − α)/2)·f_xx(x, y) − (β(1 − β)/2)·f_yy(x, y) − (α(1 − α)(1 − 2α)/6)·f_xxx(x, y) − (β(1 − β)(1 − 2β)/6)·f_yyy(x, y)      (29)
where

f_xx(x, y) ≈ f_bilin(x − 1, y) − 2·f_bilin(x, y) + f_bilin(x + 1, y)

f_yy(x, y) ≈ f_bilin(x, y − 1) − 2·f_bilin(x, y) + f_bilin(x, y + 1)

f_xxx(x, y) ≈ (f_bilin(x + 2, y) − 2·f_bilin(x + 1, y) + 2·f_bilin(x − 1, y) − f_bilin(x − 2, y)) / 2

f_yyy(x, y) ≈ (f_bilin(x, y + 2) − 2·f_bilin(x, y + 1) + 2·f_bilin(x, y − 1) − f_bilin(x, y − 2)) / 2
and x = i + α, y = j + β, for i, j ∈ Z being the integer parts, and α, β ∈ [0, 1) being the fractional parts of x, y. f_bilin(x, y) is the hardware bilinear interpolation function at an arbitrary 2D sampling point (x, y), which is supported by GPU hardware.
From this derivation, it can be seen that this 2D fourth-order filtering equation is much simpler than the original fourth-order filtering equation (Keys, 1981), which involves a complex polynomial weighted sum calculation with 36 texture fetches, while Equation (29) above involves only nine texture fetches and much simpler arithmetic. This can result in increased rendering performance and reduced power consumption for a mobile GPU.
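Equation (29) can be sketched in the same style. This is hypothetical Python with nine distinct bilinear fetch positions, the ±1 fetches being shared between the second- and third-derivative estimates; `bilin` emulates the hardware bilinear fetch with clamp-to-edge addressing.

```python
import math

def bilin(tex, x, y):
    """Stand-in for the GPU hardware bilinear fetch (clamp-to-edge)."""
    h, w = len(tex), len(tex[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    i = min(int(math.floor(x)), w - 2)
    j = min(int(math.floor(y)), h - 2)
    a, b = x - i, y - j
    return ((1 - a) * (1 - b) * tex[j][i]     + a * (1 - b) * tex[j][i + 1]
          + (1 - a) * b       * tex[j + 1][i] + a * b       * tex[j + 1][i + 1])

def fourth_order_9tap(tex, x, y):
    """Fourth-order filtering per Equation (29): nine bilinear fetches."""
    a = x - math.floor(x)
    b = y - math.floor(y)
    c = bilin(tex, x, y)
    xp1, xm1 = bilin(tex, x + 1, y), bilin(tex, x - 1, y)
    yp1, ym1 = bilin(tex, x, y + 1), bilin(tex, x, y - 1)
    fxx = xm1 - 2.0 * c + xp1
    fyy = ym1 - 2.0 * c + yp1
    fxxx = (bilin(tex, x + 2, y) - 2.0 * xp1 + 2.0 * xm1 - bilin(tex, x - 2, y)) / 2.0
    fyyy = (bilin(tex, x, y + 2) - 2.0 * yp1 + 2.0 * ym1 - bilin(tex, x, y - 2)) / 2.0
    return (c - a * (1 - a) / 2.0 * fxx - b * (1 - b) / 2.0 * fyy
              - a * (1 - a) * (1 - 2 * a) / 6.0 * fxxx
              - b * (1 - b) * (1 - 2 * b) / 6.0 * fyyy)
```

The ±1 fetch results are reused for both derivative orders, so the total stays at nine bilinear fetches per filtered value.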
Experimental results have shown that the fourth-order filtering function above can produce the same good-quality texture filtering results as the original fourth-order filtering equation (Keys, 1981), but with a much faster rendering frame-rate when tested on exactly the same platform.
Figure 7 summarises a method for generating an image having a plurality of pixels. The method comprises the following steps for each of at least some of the plurality of pixels. At step 701, the method comprises determining a plurality of texels corresponding to the respective pixel, each texel comprising a sub-pixel data point indicating a texture sample value in a texture space. At step 702, the method comprises applying a texture filtering function to the sub-pixel data points of the plurality of texels in dependence on a selected sampling location x, y to form a filtered value, where x and y are the fractional positions of a texture coordinate; wherein applying the texture filtering function comprises taking the difference between (i) an interpolation among the texels for the location x, y performed by dedicated hardware logic and at least (ii) a weighted sum of second-order derivative approximations among the texels in two orthogonal directions.
For fourth-order approximation accuracy, applying the texture filtering function comprises taking the difference between (i) an interpolation among the texels for the location x, y and (ii) a weighted sum of second-order derivative approximations among the texels in the two orthogonal directions and a weighted sum of third-order derivative approximations among the texels in the two orthogonal directions.
As described above, the second-order and/or third-order derivative approximations among the texels in two orthogonal directions are weighted in dependence on the fractional position of the location x, y between two consecutive integer locations (i.e., in dependence on α and β, where x = i + α, y = j + β). The second-order and/or third-order derivatives are calculated using the dedicated hardware logic. The second-order and/or third-order derivatives are calculated in dependence on the interpolation among the texels, which is determined using a bilinear interpolation function f_bilin(x, y). f_bilin(x, y) is supported by GPU hardware, and so may be easily determined.
In some embodiments, the device may comprise a texture cache configured to store the bilinear interpolation function at the location x, y. As this value is re-used in the equations, this may reduce the computation required further. In some implementations, the texture cache may be exploited to store processed sampling results to be re-used by neighbouring pixels, in order to further accelerate the computation.
For shadow filtering applications, at least some of the pixels in the image can represent a shadow of an object in said image. In these cases, the texture filtering function may be a shadow filtering function. Without any filtering, a data sample's value from a shadow map texture is either 1.0 or 0.0. As a result, the rendered shadow in the final image shows strong aliasing. This is because if a data sample's value is equal to 1.0, the pixel is completely outside of the shadow, while if it is 0.0, the pixel is completely in shadow. After filtering, a data sample's value can be a floating-point value between 0.0 and 1.0. This achieves continuous transitions, so that the pixel can be rendered as a soft shadow in the final image. The functions for high-order texture filtering (both third- and fourth-order) described herein (with preferred 2D examples as defined in Equations (13) and (29)) involve fewer texture fetches and a much simpler weighted sum calculation, especially for 2D texture filtering. These functions require fewer GPU instructions, in terms of both texture fetching instructions and ALU instructions, than conventional methods.
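As a sketch of the shadow use case, the five-fetch filter of Equation (13) can be applied to a binary shadow mask to obtain values strictly between 0.0 and 1.0 near the shadow edge. This is hypothetical Python: `bilin` again stands in for the hardware bilinear fetch, and the mask layout is an illustrative assumption.

```python
import math

def bilin(tex, x, y):
    """Stand-in for the GPU hardware bilinear fetch (clamp-to-edge)."""
    h, w = len(tex), len(tex[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    i = min(int(math.floor(x)), w - 2)
    j = min(int(math.floor(y)), h - 2)
    a, b = x - i, y - j
    return ((1 - a) * (1 - b) * tex[j][i]     + a * (1 - b) * tex[j][i + 1]
          + (1 - a) * b       * tex[j + 1][i] + a * b       * tex[j + 1][i + 1])

def bicubic_5tap(tex, x, y):
    """Third-order filtering per Equation (13), five bilinear fetches."""
    a = x - math.floor(x)
    b = y - math.floor(y)
    c = bilin(tex, x, y)
    fxx = bilin(tex, x - 1, y) - 2.0 * c + bilin(tex, x + 1, y)
    fyy = bilin(tex, x, y - 1) - 2.0 * c + bilin(tex, x, y + 1)
    return c - 0.5 * a * (1.0 - a) * fxx - 0.5 * b * (1.0 - b) * fyy

# A binary shadow mask: columns 0-2 lit (1.0), columns 3-5 in shadow (0.0).
shadow = [[1.0, 1.0, 1.0, 0.0, 0.0, 0.0] for _ in range(6)]

# Near the lit/shadow boundary the filtered value is a smooth fraction.
soft = bicubic_5tap(shadow, 2.25, 2.5)
```

Away from the boundary the filter returns the unfiltered 0.0 or 1.0, so only the transition region is softened.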
Using the method described herein, the device can be configured to perform fewer texture fetches than the original equations found in Keys (Keys, 1981). For example, for each pixel, the device can be configured to perform five texture fetches for the third-order (bicubic) approximation and nine texture fetches for the fourth-order approximation.
Fewer GPU instructions can result in longer mobile battery life, reduced latency and improved frame-rate for complex and demanding game rendering.
The solution may be implemented in GPU software using shader code. The simplified functions can be implemented in a few lines of GPU shader code using shading languages such as GLSL, HLSL or Spir-V.
Alternatively, by modifying the texture unit module of the GPU hardware, the filtering functions can be implemented using fixed function hardware via a single GPU instruction, instead of multiple lines of shader code. For example, one ISA intrinsics call can be used to complete 2D high-order texture filtering in a pixel shader.
The method described herein may therefore allow for faster and cheaper high-order texture filtering.
Figure 8 is a schematic representation of a device 800 configured to perform the methods described herein. The device 800 may be incorporated in, for example, a laptop, tablet, smart phone, TV or any other device in which graphics data is to be processed.
The device 800 comprises a graphics processor 801 configured to process data. For example, the processor 801 may be a GPU. Alternatively, the processor 801 may be implemented as a computer program running on a programmable device such as a GPU or a Central Processing Unit (CPU). The device 800 comprises a memory 802 which is arranged to communicate with the graphics processor 801. Memory 802 may be a non-volatile memory. The graphics processor 801 may also comprise a cache (not shown in Figure 8), which may be used to temporarily store data from memory 802. The device may comprise more than one processor and more than one memory. The memory may store data that is executable by the processor. The processor may be configured to operate in accordance with a computer program stored in non-transitory form on a machine-readable storage medium. The computer program may store instructions for causing the processor to perform its methods in the manner described herein.
The device may allow for rendering a high-quality image with texture at higher performance with fewer texture fetches and weighted filtering calculations on a mobile GPU than conventional methods.
The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in the light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein, and without limitation to the scope of the claims. The applicant indicates that aspects of the present invention may consist of any such individual feature or combination of features. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention.

Claims

1. A graphics processing device (800) for generating an image signal representing an image having a plurality of pixels, wherein the graphics processing device is configured to generate the image signal by performing the following operations for each of the plurality of pixels: determine (701) a plurality of texels corresponding to the respective pixel, each texel comprising a sub-pixel data point indicating a texture sample value in a texture space; and apply (702) a texture filtering function to the sub-pixel data points of the plurality of texels in dependence on a selected sampling location x, y to form a filtered value, where x and y are the fractional positions of a texture coordinate; wherein applying the texture filtering function comprises taking the difference between (i) an interpolation among the texels for the location x, y performed by dedicated hardware logic and at least (ii) a weighted sum of second-order derivative approximations among the texels in two orthogonal directions.
2. The device (800) of claim 1 , wherein applying the texture filtering function comprises taking the difference between (i) an interpolation among the texels for the location x, y and (ii) a weighted sum of second-order derivative approximations among the texels in the two orthogonal directions and a weighted sum of third-order derivative approximations among the texels in the two orthogonal directions.
3. The device (800) of claim 1 or claim 2, wherein the second-order and/or third-order derivative approximations among the texels in the two orthogonal directions are weighted in dependence on the fractional position of the location x, y between two consecutive integer locations.
4. The device (800) of any preceding claim, wherein the second-order and/or third-order derivatives are calculated using the dedicated hardware logic.
5. The device (800) of any preceding claim, wherein the second-order and/or third-order derivatives are calculated in dependence on the interpolation among the texels.
6. The device (800) of any preceding claim, wherein the interpolation is determined using a bilinear interpolation function.
7. The device (800) of claim 6, wherein the device comprises a texture cache configured to store the bilinear interpolation function at the location x, y.
8. The device (800) of any preceding claim, wherein the two orthogonal directions are directions in a texture space.
9. The device (800) of any preceding claim, wherein the filtering function is determined according to a third-order or fourth-order approximation.
10. The device (800) of any preceding claim, wherein the device is configured to implement the texture filtering function in one of the GLSL, HLSL and Spir-V languages.
11. The device (800) of any preceding claim, wherein the device is configured to implement the texture filtering function in a single instruction, or fixed function hardware unit.
12. The device (800) of any preceding claim, wherein, for each pixel, the device is configured to perform fewer than sixteen texture fetches.
13. The device (800) of claim 12, wherein, for each pixel, the device is configured to perform five texture fetches or nine texture fetches.
14. The device (800) of any preceding claim, wherein at least some of the pixels represent a shadow of an object in said image and wherein the texture filtering function is a shadow filtering function.
15. The device (800) of any preceding claim, wherein the device is implemented by a mobile graphics processing unit.
16. A method (700) for generating an image signal representing an image having a plurality of pixels, wherein the method comprises, for each of the plurality of pixels: determining (701) a plurality of texels corresponding to the respective pixel, each texel comprising a sub-pixel data point indicating a texture sample value in a texture space; and applying (702) a texture filtering function to the sub-pixel data points in dependence on a selected sampling location x, y to form a filtered value, where x and y are the fractional positions of a texture coordinate; wherein applying the texture filtering function comprises taking the difference between (i) an interpolation among the texels for the location x, y performed by dedicated hardware logic and (ii) a weighted sum of second-order derivative approximations among the texels in two orthogonal directions.
17. A computer program which, when executed by a computer, causes the computer to perform the method of claim 16.
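The structure of the claimed filtering function — a hardware interpolation minus a fractionally weighted sum of second-order difference approximations (claims 1 and 3) — is easiest to see in one dimension, where the hardware bilinear lookup reduces to linear interpolation between two texels. The sketch below is illustrative only and is not taken from the specification; the function names are invented for the example. With the particular weights shown, the "interpolation minus correction" decomposition reproduces Catmull-Rom cubic interpolation exactly, consistent with the cited Keys (1981) and Csébfalvi (2018) references.

```python
def catmull_rom_direct(f0, f1, f2, f3, t):
    """Direct Catmull-Rom cubic between texels f1 and f2, with 0 <= t <= 1."""
    return (f1
            + 0.5 * t * (f2 - f0)
            + t * t * (f0 - 2.5 * f1 + 2.0 * f2 - 0.5 * f3)
            + t * t * t * (1.5 * (f1 - f2) + 0.5 * (f3 - f0)))


def lerp_minus_second_differences(f0, f1, f2, f3, t):
    """Same value, computed in the shape of claim 1 restricted to 1D:
    an interpolation minus a weighted sum of second-order difference
    approximations, with weights depending on the fractional position t
    between two consecutive integer locations (cf. claim 3)."""
    lerp = (1.0 - t) * f1 + t * f2   # what the bilinear hardware returns in 1D
    d2_left = f0 - 2.0 * f1 + f2     # second-order difference centred on f1
    d2_right = f1 - 2.0 * f2 + f3    # second-order difference centred on f2
    correction = 0.5 * t * (1.0 - t) * ((1.0 - t) * d2_left + t * d2_right)
    return lerp - correction
    # Example: for texels 0, 1, 0, 1 at t = 0.5 both functions return 0.5.
```

Subtracting the weighted second differences from the cheap linear lookup yields the cubic result without evaluating a cubic polynomial directly; in two dimensions the analogous corrections are taken along the two orthogonal texture-space directions, which is what allows the arithmetic to be folded into a small number of hardware-filtered fetches (cf. claims 12 and 13).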

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/EP2020/082790 WO2022106016A1 (en) 2020-11-20 2020-11-20 High-order texture filtering
CN202080102689.2A CN115917606A (en) 2020-11-20 2020-11-20 High order texture filtering


Publications (1)

Publication Number Publication Date
WO2022106016A1 true WO2022106016A1 (en) 2022-05-27

Family

ID=73544159

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2020/082790 WO2022106016A1 (en) 2020-11-20 2020-11-20 High-order texture filtering

Country Status (2)

Country Link
CN (1) CN115917606A (en)
WO (1) WO2022106016A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117541463B (en) * 2023-11-29 2024-08-20 沐曦集成电路(上海)有限公司 Sky box angular grain element loss processing system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7106326B2 (en) 2003-03-03 2006-09-12 Sun Microsystems, Inc. System and method for computing filtered shadow estimates using reduced bandwidth
US10102181B2 (en) 2014-08-27 2018-10-16 Imagination Technologies Limited Efficient Catmull-Rom interpolation


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHRISTIAN SIGG ET AL: "GPU Gems - Chapter 20. Fast Third-Order Texture Filtering", April 2005 (2005-04-01), XP055220422, Retrieved from the Internet <URL:http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter20.html> [retrieved on 20151013] *
CSÉBFALVI BALÁZS: "Fast Catmull-Rom Spline Interpolation for High-Quality Texture Sampling", COMPUTER GRAPHICS FORUM : JOURNAL OF THE EUROPEAN ASSOCIATION FOR COMPUTER GRAPHICS, vol. 37, no. 2, May 2018 (2018-05-01), Oxford, pages 455 - 462, XP055830796, ISSN: 0167-7055, DOI: 10.1111/cgf.13375 *
KEYS, R.G.: "Cubic convolution interpolation for digital image processing", IEEE TRANS. ACOUST. SPEECH SIGNAL PROCESS., vol. 29, no. 6, 1981, pages 1153 - 1160, XP009012170, DOI: 10.1109/TASSP.1981.1163711

Also Published As

Publication number Publication date
CN115917606A (en) 2023-04-04

Similar Documents

Publication Publication Date Title
US9202258B2 (en) Video retargeting using content-dependent scaling vectors
US10262454B2 (en) Image processing apparatus and method
JP5734475B2 (en) Method for fast and memory efficient implementation of conversion
CN107133914B (en) Apparatus for generating three-dimensional color image and method for generating three-dimensional color image
EP1161745B1 (en) Graphics system having a super-sampled sample buffer with generation of output pixels using selective adjustment of filtering for reduced artifacts
US20170358132A1 (en) System And Method For Tessellation In An Improved Graphics Pipeline
US10008023B2 (en) Method and device for texture filtering
JP2012515982A (en) Smoothed local histogram filter for computer graphics
EP1161744B1 (en) Graphics system having a super-sampled sample buffer with generation of output pixels using selective adjustment of filtering for implementation of display effects
JP2006526834A (en) Adaptive image interpolation for volume rendering
JP2010092478A (en) Graphics processing system
WO2013005366A1 (en) Anti-aliasing image generation device and anti-aliasing image generation method
WO2021213664A1 (en) Filtering for rendering
WO2022106016A1 (en) High-order texture filtering
JP4801088B2 (en) Pixel sampling method and apparatus
Zhao et al. Real-time edge-aware weighted median filtering on the GPU
US20230298212A1 (en) Locking mechanism for image classification
JP2023054783A (en) Method and system for visualization and simulation of flow phenomenon
Bernasconi et al. Kernel aware resampler
Ruijters et al. Accuracy of GPU-based B-spline evaluation
US20230298133A1 (en) Super resolution upscaling
Safonov et al. Image Upscaling
US20240161236A1 (en) Method and apparatus with adaptive super sampling
JP5712385B2 (en) Image display processing apparatus and image display processing method
WO2022252080A1 (en) Apparatus and method for generating a bloom effect

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20811567

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20811567

Country of ref document: EP

Kind code of ref document: A1