WO2019147929A1 - Localized adaptive coarse pixel shading - Google Patents

Localized adaptive coarse pixel shading

Info

Publication number
WO2019147929A1
Authority
WO
WIPO (PCT)
Prior art keywords
pixel
shading
rate
processing system
graphics processing
Prior art date
2018-01-26
Application number
PCT/US2019/015135
Other languages
French (fr)
Inventor
Yubo ZHANG
Eric Lum
Yury Uralsky
John Spitzer
Original Assignee
Nvidia Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nvidia Corporation filed Critical Nvidia Corporation
Publication of WO2019147929A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/005 General purpose rendering architectures

Definitions

  • This technology relates to computer graphics, and more particularly to rendering using coarse pixel shading. Still more particularly, the example technology relates to adaptable shaders that offer controllable shading rates for screen and/or object space localities based on complexity or other characteristics within those localities.
  • a Pixel Shader graphics function can calculate effects on a per-pixel basis. Imagine thousands of independent computers operating in parallel— one for each pixel of your display— whose purpose is to figure out, for each pixel independently, what color that particular pixel should be. Pixel shading can be used to bring out an extraordinary level of surface detail— allowing the viewer to see effects beyond the triangle level. For example, Pixel Shaders can be used to create ambience with materials and surfaces that mimic reality. By allowing artists to alter lighting and surface effects on a per pixel basis, Pixel Shaders enable artists to manipulate colors, textures, and/or shapes and to generate complex, realistic scenes. A limitless number of material effects can replace artificial, computerized looks with high-impact organic surfaces. Characters can have facial hair and blemishes, golf balls can have dimples, a red chair can gain a subtle leather look, and wood can exhibit vibrant texture and grain.
  • Nvidia's graphics processing units (GPUs) such as the GeForce3 use programmable Pixel Shaders to bring movie-style effects to personal computers and other platforms.
  • Programmable Pixel Shaders provide developers with unprecedented control for determining the lighting, shading, and color of each individual pixel, allowing developers to create a myriad of unique surface effects.
  • Such programmable Pixel Shaders give artists and developers the ability to create per-pixel effects that mirror their creative vision.
  • Coarse pixel shading is an architectural feature that allows pixel shaders to run at a rate lower than once per pixel. Coarse pixel shading has traditionally been a way to reduce shading rate based on exterior factors such as gaze direction (foveation) or lens parameters, rather than based on the content of the rendered pixels.
  • It is also known to choose a coarse pixel parameter depending on the material type to save computations where visual impact is minimal. For instance, a particle system for rendering smoke may be shaded at a low rate, while a sign with text may warrant high resolution shading. Similarly, objects in full shadow may possibly be shaded at a lower rate than objects in bright sunlight. Similarly, motion or defocus blur can be shaded at a lower rate than other parts of the frame. See e.g., Vaidyanathan et al., “Coarse Pixel Shading”, High Performance Graphics (Eurographics 2014).
  • Nvidia Multi-Res Shading introduced with the Maxwell architecture splits the image up into multiple viewports in preparation for a later warp distortion pass that distorts an image before it is output to a headset or other virtual reality display device. Because the warp distortion pass compresses the edges of the image, many pixels are potentially generated that are discarded before display. Multi-resolution shading divides the image into multiple viewports. For example, a center viewport in the center of the image and further viewports around the image periphery provide for a 3x3 grid. The center viewport (which is typically not distorted much in the warping pass) is shaded at high resolution (e.g., 1-to-1), whereas the peripheral viewports are shaded at lower rates.
  • For example, corner viewports (which are lost almost altogether during the warping pass) can be shaded at a much lower rate such as 1/4, and the side viewports can be shaded at an intermediate rate such as 1/2. Since fewer pixels are shaded, rendering is faster. This can reduce image latency, which is important for virtual reality.
  • Figure 1 shows an example non-limiting system for implementing this technology.
  • Figure 2 shows an example non-limiting graphics processing unit.
  • Figures 3 and 3A show example non-limiting data structures.
  • Figure 4 shows example non-limiting analysis software that may execute for example on the CPU.
  • Figure 5 shows an example baseline image rendered using per-pixel shading.
  • Figure 6 shows an example image that uses uniform shading reduction (50% shading rate versus the baseline).
  • Figure 9 shows example non-limiting adaptive coarse pixel shading (50% shades versus baseline).
  • Figure 11 shows example non-limiting comparison between per-pixel shading and adaptive coarse pixel shading.
  • the cropped views show that in adaptive coarse pixel shading, high frequency features are preserved using per-pixel shading while low frequency features are coarsened using lower (e.g., 1/4 or 1/16) shading rates.
  • Figures 1 and 2 show an example non-limiting graphics system 50 for rendering images using localized adaptive coarse pixel shading.
  • the system includes input devices/sensors 100, at least one central processing unit (CPU) or other processor 200, at least one graphics processing unit (GPU) 300, at least one memory 400, and at least one rendering target such as a display 500. These various components may communicate with each other and with external systems via one or more buses 600.
  • the input devices/sensors 100 may comprise any kind of device capable of generating an input signal, including but not limited to a keyboard, a touch screen, a microphone, a motion or attitude sensor, a geolocation sensor, a camera, etc.
  • the CPU 200 may comprise a single processor, a multithreaded, multi-core distributed processing system, or other conventional processing arrangements.
  • the GPU 300 may comprise one or more graphics processing units including a conventional graphics pipeline having geometry and pixel shading capabilities.
  • Memory 400 may comprise one or more DRAM or other conventional non-transitory memory.
  • CPU 200 executes instructions stored in memory 400.
  • CPU 200 communicates with GPU 300 via a conventional graphics API such as OPENGL, Vulkan, D3D or the like.
  • the CPU 200 uses this graphics API to send the GPU 300 commands, display lists and the like to control the GPU to render images.
  • the GPU 300 includes a conventional graphics pipeline that generates graphics for output to a render target such as one or more displays 500.
  • Figure 2 shows a conventional graphics pipeline including a vertex fetch unit 302, a vertex shader unit 304, a geometry shader unit 306, a rasterizer 308, a pixel shader unit 310, and a render output unit (ROP) 312.
  • the vertex fetch 302 fetches vertices from CPU 200 and/or a display list stored in memory 400, these vertices defining the geometry to be rendered.
  • a vertex shader 304 and a geometry shader 306 provide conventional shading values and processes (e.g., illumination/lighting, texture mapping, etc.) for input to rasterizer 308.
  • Rasterizer 308 rasterizes an image to generate pixels that are shaded by pixel shader 310.
  • the ROP 312 outputs the resulting shaded pixel values to a frame buffer (FB) allocated in memory 400 or to other render target(s).
  • the graphics pipeline may use level 1 and level 2 caches 314 and also has access to memory 400.
  • a direct memory access (DMA) controller or process may read the image (pixel) data from the frame buffer in memory 400 and send it to a render target such as display 500.
  • pixel shading performed by pixel shader 310 is run once per pixel.
  • the computations are regular and do not take into account the content of the scene and/or the particular content of the pixels themselves. Shading operations thus generally take the same amount of time per pixel no matter what the characteristics are of the objects or scenes being rendered.
  • One way to reduce the amount of time and associated computational complexity of rendering is to determine which parts of the scene need more detailed shading and which parts do not.
  • a foveated rendering technique may involve tracking the viewer’s gaze, which may not be practical in some implementations such as for conventional desktop or other display or render target contexts.
  • Example Non-Limiting Localized Adaptive Coarse Pixel Shading provides an analyzer that analyzes a previously rendered frame(s) for certain characteristics (e.g., spatial, temporal and/or local contrast differences) associated with a rendered tile or other portion to determine whether to access or activate the coarse pixel shading capability with respect to any particular locality.
  • Such analyzer can determine localized opportunities to reduce pixel shading rate without significantly adversely affecting human perception of the resulting image, and can control the pixel shading hardware in accordance with the determined localized opportunities.
  • Example non-limiting techniques adapt shading rate based on a local metric (less than the entire image) however it can be computed, in screen space and/or object space.
  • By “local” we mean a subset of pixels in a given scene or generated by less than all of the geometry to be rendered for the scene (e.g., the pixels of a given object or a given triangle).
  • the local metric can be computed from (a) the frame currently being rendered, (b) a previously-rendered frame, (c) the triangle currently being rendered and/or (d) previously rendered geometry.
  • These analysis capabilities can be used to change the shading rate based on content (e.g., complexity and/or detail).
  • the hardware-based or other coarse pixel shader 310 capabilities are used to shade more often (higher shading rate) in areas which have more detail and less often (lower shading rate) in areas that have less detail.
  • the example non-limiting embodiments provide techniques that can apply coarse pixel shading to existing graphics programs based on the characteristics of the scene.
  • the example techniques do not require analyzing the contents of a mesh, pixel shader, or textures.
  • the desired shading rate for different parts of a scene is computed in a post-processing step at the end of the current or a previous frame.
  • the estimation for the desired shading rate may for example consider factors such as temporal color coherence and spatial coherence of the rendered image from the current and previous frames. For example, based on color coherence, the spatial shading rate distribution for the or a next frame can be decided.
  • This technique can be achieved without modifying traditional rendering pipelines and shaders such as shown in Figure 2. Therefore, integrating this technique into existing programs requires much less effort than existing content-dependent techniques.
  • the proposed technique also maintains the image quality well compared to the per-pixel shading baseline while reducing the overall shading by a large amount.
  • the example non-limiting embodiments thus are adaptive based upon where there is more detail in the scene, and increase the shading in those more detailed areas. Based on the content of a scene, the system shades more or less.
  • the hardware supports specifying what the shading rate is per tile on the screen and/or object in the scene.
  • the disclosed non-limiting embodiment divides the screen space and/or object space into subdivisions (e.g., 16x16 pixel tiles in one embodiment), and provides the ability per tile to choose what the shading rate is.
  • the system can provide multiple modes.
  • the system adjusts shading rate in screen space based upon for example tile-by-tile subdivisions.
  • Shading rate can be set based upon material or other properties of the rendered image (or to-be rendered image) in screen space, based on screen space location. This for example can involve deciding what the shading rate should be for any particular tile, macroblock or other subdivision on the screen.
  • shading rate decision can be determined based on the characteristics of a particular piece of geometry currently being rendered, regardless of where it ends up on the screen.
  • In this geometry mode, based on which triangle is currently being rendered, it is possible to associate similar material properties with that triangle. By examining the material properties of that particular triangle, it is possible to set the shading rate (e.g., rate of execution or invocation of pixel shaders) resulting from that triangle.
  • a geometry shader 306 that shades per triangle can include code that controls shading based upon an analysis of each triangle to determine what the pixel shading rate should be for that triangle. For example, shading rate can be controlled based on material or other properties of that triangle or other fragment in object space. Adaptive pixel shading rate can be controlled based on such metrics.
  • Still other implementations can benefit from a combination of both screen-space based and geometry-based adaption of pixel shading rate. It is possible to combine both object and screen space metrics. For example, it is possible to set an object to have a finer shading rate, and then define the shading rate to use based on the localization information from screen space rendering.
  • If the shading rate is controlled independently for each triangle, the shading rate will generally be carried with each triangle.
  • every triangle is assigned to an associated viewport, and different pixel shading rates can be used for each different viewport.
  • Figures 3 and 3A show one example non-limiting implementation in which CPU 200 stores a shading rate table 280 of possible shading rates in memory.
  • the table 280 provides a list of some or all possible shading rates the pixel shader 310 is capable of.
  • Each subdivision can carry or be associated with an index that selects which shading rate value in the shading rate table 280 to use to pixel-shade that subdivision.
  • “Shading rate” in some example non-limiting implementations is the ratio between the number of pixel shaders (i.e., the number of invocations of pixel shading) for a given area in screen space or piece of geometry in object space and the number of pixels in that area or piece of geometry. Shading rate in this example non-limiting context is thus for example the ratio between the number of pixel shaders for a given area on the screen and the number of pixels in that area. In conventional graphics pipelines, the shading rate is typically 1:1 (i.e., shading is performed on a pixel-by-pixel basis). Thus, in many implementations, every pixel will launch a pixel shader. In the example non-limiting embodiments, it is possible to drive the ratio up or down.
  • the pixel shading rate can be a single numerical value specifying a simple ratio between pixel shader invocations and pixels.
  • the pixel shading rate can also specify a shape, grouping or other configuration of a plurality of pixels.
  • a given shading rate value could specify a group of 4x2 pixels, meaning a subdivision that is 4 pixels in width and 2 pixels high, is to be shaded by a single pixel shader invocation.
  • Another shading rate value could specify a differently sized or configured subdivision such as 2x2, 2x1, 4x4, etc.
  • the pixel shading rate specifier stored in shading rate table 280 can be a byte of memory (e.g., one number per tile), the byte value specifying what shading rate to use. For example, the byte value “0” might indicate one shader per pixel, the byte value “1” might indicate one shader per array of 2x1 pixels, the byte value “2” might indicate one shader per array of 2x2 pixels, and so on, with the byte value “7” indicating shade 4x per pixel (i.e., supersample).
  • screen space shading rate is controlled by a cascade of a pair of control structures: (a) a shading rate distribution table, map or surface 270 which indexes (b) the shading rate look-up table 280.
  • the shading rate distribution table, map or surface 270 stored in memory 400 has plural values, each value specifying the shading rate of a particular subdivision in screen space.
  • the entries in the shading rate distribution table, map or surface 270 are spatially correlated, i.e. each entry corresponds to a spatial subdivision in screen space.
  • One example non-limiting embodiment divides the screen up into tiles, and stores in memory a map of the tiles, each entry in the map corresponding to a different tile and specifying the shading rate for that tile.
  • a stencil-like structure, i.e., a 2D surface in screen space where each texel location in the surface corresponds to a tile on the screen.
  • Each slot in the stencil may thus correspond to a tile on the screen.
  • the stencil may store information that instructs the shading hardware how many pixel shaders to invoke for that tile.
  • a surface/stencil-like structure can thus be used to control shading rate.
  • This enabling technology may be part of a standard API such as D3D.
  • the example non-limiting embodiments may thus generate data structure 270 in the form of a map, e.g., a two-dimensional array of numbers or values, with each number or value specifying (directly or indirectly) how many pixel shaders to invoke for a particular corresponding tile or other screen space subdivision.
  • This shading rate surface 270 can be updated using heuristics to calculate the shading rates to be indicated for each element in the surface.
  • the shading rate surface 270, just like any other surface, is stored in memory 400 but happens to have a certain specific interpretation by the pixel shader 310 hardware. Based on the resulting values, the pixel shader 310 will launch the appropriate number of pixel shaders for that given surface subdivision.
  • the system 50 develops a number for each tile indicating how many pixel shader invocations to allow for that tile.
  • coarse pixel shading involves using fewer (or more) shaders than there are pixels, and in effect shading multiple pixels with the same pixel shader for reduced shading rate.
  • the result of a single pixel shader invocation will be replicated across a number of pixels in that case (e.g., across the entire tile or some portion thereof).
  • the shading rate lookup table 280 is used to map the value stored in the pixel shading distribution table, map or surface 270 (or a value provided by the geometry shader 306/rasterizer 308 in the case of object space adaptive shading) to whatever final shading rate is to be used by the pixel shading hardware.
  • the shading rate lookup table 280 decouples the contents of the shading rate distribution table 270 (and/or the triangle/fragment parameters as shown in Figure 3A) from specific graphic API bit patterns used to control the pixel shading hardware to provide a desired shading rate.
  • the shading rate lookup table 280 is programmed for whatever shading rates are desired.
  • the areas of the pixel shading surface 270 that correspond to high levels of detail may be programmed to retrieve corresponding higher-rate pixel shading values in the shading rate lookup table 280.
  • the shading rate lookup table could be implemented as a programmable or non-programmable gate array or other circuit in hardware that transforms a control byte into control signals needed to control the pixel shading hardware to provide a desired coarse pixel shading rate.
  • the shading rate table 280 may be eliminated and the screen space distribution table, map or surface 270 (see Figure 3) and/or the triangle/fragment parameter (see Fig. 3A) could provide the bit patterns that directly control the pixel shading rate the pixel shader 310 performs.
  • the shading rate lookup table 280 is accessed based on a parameter(s) that triangles, fragments or other geometry carry through at least part of the rendering pipeline rather than via a spatially-organized shading rate distribution table 270.
  • the triangle/fragment parameter can be injected into the graphics pipeline by CPU 200 at time of vertex fetch 302, or in some embodiments it could be generated by the graphics pipeline itself prior to pixel shading.
  • the triangle/fragment index in some example embodiments can comprise a viewport index, with different viewports having different corresponding shading rate mappings.
  • the viewports may be set up ahead of time by an application executing on CPU 200 and preprogrammed ahead of time, or they may be changed dynamically depending on image processing results.
  • the application sets up N (e.g., 16) different viewports. Each triangle can select which one of these different viewports to use. By selecting a viewport, the triangle is also selecting a pixel shading rate associated with the viewport.
  • Figure 4 shows an example non-limiting analyzer that CPU 200 executes to adapt localized pixel shading rate to image characteristics.
  • CPU 200 analyzes a previously rendered image to determine which shading rate to use in the current image.
  • Other techniques are also possible.
  • other embodiments can analyze based on the current frame, or based on a combination of the current frame and a previous frame or frames.
  • the shading rate distribution table 270 specifies the shading rate per screen space tile.
  • the tile granularity can vary based on the design (e.g., 8x8 or 16x16). This table 270 will be used by the graphics hardware or software that supports coarse pixel shading.
  • Step 204 performs the original rendering work using the rasterizer 308 and other graphics pipeline components.
  • CPU 200 executes an algorithm that analyzes the rendered image from Step 204 and decides the optimal shading rate per screen space tile.
  • Step 204 and Step 206 can be treated as a “black box” and the graphics driver can automatically insert Step 202 and Step 208 at the driver side to apply the technique. Otherwise, Step 208 can be merged into Step 206 for better performance.
  • One way to choose the shading rate is to examine the (or a) previously-rendered frame and use heuristics to determine if that particular tile has interesting details or not (temporal). Another example heuristic could be used to perform an efficient analysis in the current frame to determine whether a tile is interesting or not. The shading rate of particular tiles determined to be interesting can be adjusted based on the results of the analysis.
  • the heuristics can be as simple and straightforward as looking at surface properties. For example, it is possible to initially render into a G-buffer (Geometry buffer, which typically enables a render-to-texture function) which would contain material properties for every surface of every pixel on the screen. Looking at those properties (e.g., which could contain metrics such as how specular the particular surface is, or whether it is flat color or not), it is possible to calculate the shading rate.
  • the heuristics can be arbitrarily complex. In one example non-limiting embodiment, it is possible to look at previous frames or just look at the current frame material properties and derive shading rate from that.
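As an illustration of the G-buffer heuristic described above, the following is a minimal C++ sketch (not the patent's algorithm): per-surface material properties are read back and mapped to a shading rate index. The GBufferSample fields and the thresholds are assumptions introduced here for illustration.

```cpp
#include <cstdint>

// Hypothetical material properties rendered into the G-buffer.
struct GBufferSample {
    float specularity;  // how specular the surface is
    bool  flatColor;    // true if the surface shades to an (almost) flat color
};

// Returns an index into shading rate table 280 (0 = full 1:1 rate).
uint8_t rateIndexFromMaterial(const GBufferSample& s) {
    if (s.flatColor)          return 4;  // flat color: very coarse (e.g., 1/16) shading
    if (s.specularity > 0.5f) return 0;  // sharp highlights: keep full per-pixel shading
    return 2;                            // otherwise: a moderate (e.g., 1/4) rate
}
```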
  • the example non-limiting embodiment performs such analysis on a subdivision-by-subdivision basis (e.g., tile by tile, surface by surface, triangle by triangle, etc.), and is able to adjust the amount of shading in each individual subdivision. It is possible to do more shading in more complicated subdivisions and less shading in other subdivisions.
  • shading rate may be computed based on some or all (or any combination) of the following factors, each elaborated below: the spatial difference between adjacent pixels, the temporal difference between frames, and the local contrast within a window.
  • the shading rate estimation algorithm of block 206 can be composed as the following metric:
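The metric expression itself did not survive extraction into this text. Based on the definitions of the three terms in the bullets that follow, a plausible reconstruction (an assumption, not necessarily the patent's exact formula) is:

$$
E(x,y) = A\big(|I(x{+}1,y,t)-I(x,y,t)| + |I(x,y{+}1,t)-I(x,y,t)|\big) + B\,|I(x,y,t)-I(x,y,t{-}1)| + C\Big(\max_{|i|,|j|\le N} I(x{+}i,y{+}j,t) - \min_{|i|,|j|\le N} I(x{+}i,y{+}j,t)\Big)
$$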
  • I(x,y,t) is the rendered image defined with the spatial coordinates x,y and the temporal coordinate t.
  • A, B and C are constants that control the contribution from the spatial difference term (the first term in the expression above), the temporal difference term (the second term in the expression above), and the local contrast term (the third term in the expression above).
  • the first (A constant) term in the expression above indicates the spatial difference. This indicates whether the adjacent pixels are quite different, which means there is an edge at or near the location. If there is an edge, the shading rate should be increased to provide additional detail to properly render the edge.
  • the second (B constant) term indicates the temporal difference (how the color differs from the previous frame to the current frame or from one frame to another). If the color changes temporally, the shading rate should be increased because there might be some object movement around this location.
  • the third (C constant) term is local contrast, which means the color varies (i.e., the dynamic color range within a local window is large or not, where the window size can be larger or smaller than the subdivision size for controlling the sampling rate). If the dynamic range is large, the shading rate should be increased because this means the local image has high contrast. The eyes are more sensitive to high contrast content, so more detailed shading may be desirable in such areas.
  • N is a small positive integer that controls the window size for calculating the local contrast.
  • the shading rate can be decided based on the metric E(x,y).
  • a mapping from E to discrete shading rates can be defined by a set of threshold values.
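A minimal C++ sketch of such a threshold mapping follows: the per-tile aggregate of E is quantized to a discrete rate index. The threshold values and the specific rate encoding (indices into shading rate table 280, matching the byte scheme described later) are assumptions for illustration.

```cpp
#include <cstdint>

// Returns an index into shading rate table 280 (0 = full 1:1 shading).
uint8_t rateIndexFromMetric(float tileE) {
    if (tileE > 0.10f) return 0;  // edges/motion/high contrast: shade every pixel
    if (tileE > 0.02f) return 2;  // moderate detail: one shader per 2x2 (1/4 rate)
    return 4;                     // smooth content: one shader per 4x4 (1/16 rate)
}
```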
  • the RGB values of the input image I(x,y,t) may in some example non-limiting implementations be normalized by relative luminance such that the high frequency signals in shadow can also be captured by the proposed metric.
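A sketch of that relative-luminance normalization, so that high frequency detail in dark (shadowed) regions still registers in the metric. The Rec. 709 luma weights are standard; the epsilon guard is an assumption.

```cpp
#include <algorithm>

struct RGB { float r, g, b; };

RGB normalizeByLuminance(const RGB& c) {
    float y = 0.2126f * c.r + 0.7152f * c.g + 0.0722f * c.b;  // relative luminance
    float inv = 1.0f / std::max(y, 1e-4f);  // avoid division by zero in black regions
    return {c.r * inv, c.g * inv, c.b * inv};
}
```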
  • the proposed technique maintains the image quality well compared to the per-pixel shading baseline while reducing the overall shading by a large amount.
  • Figure 5 shows an example baseline image rendered using per-pixel shading (i.e., each pixel was shaded individually).
  • Figure 6 shows the same image rendered using a uniform shading reduction of 50% pixel shading as compared to the baseline.
  • Figure 7 shows an error map resulting from a comparison of the images of Figures 5 and 6.
  • most of the area is blue indicating minor differences (<0.01) between the two images.
  • the red lines (which appear to be light in grey scale versions of the Figure 7 image) show major differences.
  • the red lines indicative of major differences occur at the edges of objects in the scene.
  • the edges of all of the leaves of the plant exhibit major differences, as do the edges of the couch, the edges of the windows, etc.
  • Using Figure 7 as a guide, one can see the reduced image definition in portions of Figure 6 as compared to Figure 5. It is also apparent that most of the image can be rendered with half the pixel shading rate with no appreciable adverse effects at the resolution of these prints.
  • Figure 8 shows an adaptive coarse pixel shading rate visualization.
  • the visualization includes the addition of coloration of tiles indicating reduced (less than 1:1) pixel shading rate.
  • the red and green tiles on the color print are examples of screen space subdivisions for which the analyzer described above has heuristically determined can use a reduced pixel shading rate without noticeable image quality degradation.
  • the red tiles on the color print (e.g., most of the colored blocks on the couch - these appear to be medium gray on the grey scale image) indicate localized areas of low (e.g., 1/4) pixel shading rate.
  • the green tiles on the color print (which appear light grey in the grey scale image) have adaptively been pixel shaded using a still lower (e.g., 1/16) pixel shading rate. The remainder of the image was pixel shaded using full (1:1) pixel shading rate.
  • Figure 9 shows the resulting image when adaptive coarse pixel shading as indicated in Figure 8 is implemented.
  • Figure 10 shows an error map resulting when the Figure 9 adaptive coarse pixel shaded image is compared to the original Figure 5 image rendered using a uniform 1:1 pixel shading rate. Of the errors Figure 10 indicates, most are shown in blue indicating a minor difference (<0.01).
  • Figure 11 shows example non-limiting comparisons between per- pixel shading and adaptive coarse pixel shading.
  • the cropped views show that in adaptive coarse pixel shading, high frequency features are preserved using per-pixel shading while low frequency features are coarsened using 1/4 or 1/16 reduced pixel shading rates.
  • In the two “No TAA” (no temporal anti-aliasing) crop images on the bottom of Figure 11 (the left-hand crop image being full per-pixel shading, the right-hand crop image being adaptive pixel shading), one can see no difference because in each case full pixel shading rate was used to render the region of interest, namely the leaves of the house plant.
  • the adaptive coarse pixel shading determination determined that full 1:1 pixel shading should be used for shading the pixels of this part of the image.
  • the full adaptive pixel shaded image (see upper right-hand pane of Figure 11) has an overall effective pixel shading rate of 43% of the 100% per-pixel shading rate of the left-hand pane. Note further the adaptive rate visualization in the center pane indicating red tiles using a 1/4 pixel shading rate (one shader per 2x2 pixels) and green tiles using a 1/16 pixel shading rate (one shader per 4x4 pixels).
  • the techniques disclosed herein provide a major use case for the coarse pixel shading hardware feature introduced in NVIDIA’s Turing architecture. It can be applied to desktop rendering applications in general and has the potential of reducing the pixel shading computation by more than 50% without introducing perceivable image quality loss.
  • Example uses for this technology include pixel-shading-limited contexts, where pixel shading is the rendering bottleneck, for example when substantial per-pixel computation is run for high quality lighting. Other contexts provide benefits with long pixel shaders that produce soft results. For example, soft shadows or soft lighting is often the most expensive per pixel to compute. High frequency content may have less opportunity for benefit because the higher shading rate will be needed. Lower screen resolutions may allow coarser shading without being visually detectable. While the feature is useful for eye tracking contexts (more detail where the gaze is), it is also useful in other contexts such as desktop contexts.

Abstract

To accomplish localized adaptive coarse pixel shading, an analyzer analyzes a currently or previously rendered frame(s) for certain characteristics (e.g., spatial, temporal and/or local contrast differences) associated with a rendered tile or other portion to determine whether to access or activate coarse pixel shading capability with respect to any particular locality.

Description

TITLE
LOCALIZED ADAPTIVE COARSE PIXEL SHADING
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims benefit of U.S. Provisional Patent
Application No. 62/622,623 filed January 26, 2018, incorporated herein by reference.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
[0002] None.
FIELD
[0003] This technology relates to computer graphics, and more particularly to rendering using coarse pixel shading. Still more particularly, the example technology relates to adaptable shaders that offer controllable shading rates for screen and/or object space localities based on complexity or other characteristics within those localities.
BACKGROUND
[0004] A Pixel Shader graphics function can calculate effects on a per-pixel basis. Imagine thousands of independent computers operating in parallel— one for each pixel of your display— whose purpose is to figure out, for each pixel independently, what color that particular pixel should be.
[0005] Pixel shading can be used to bring out an extraordinary level of surface detail— allowing the viewer to see effects beyond the triangle level. For example, Pixel Shaders can be used to create ambiance with materials and surfaces that mimic reality. By allowing artists to alter lighting and surface effects on a per pixel basis, Pixel Shaders enable artists to manipulate colors, textures, and/or shapes and to generate complex, realistic scenes. A limitless number of material effects can replace artificial, computerized looks with high-impact organic surfaces. Characters can have facial hair and blemishes, golf balls can have dimples, a red chair can gain a subtle leather look, and wood can exhibit exquisite texture and grain.
[0006] Nvidia’s graphics processing units (GPUs) such as the GeForce3 use programmable Pixel Shaders to bring movie-style effects to personal computers and other platforms. Programmable Pixel Shaders provide developers with unprecedented control for determining the lighting, shading, and color of each individual pixel, allowing developers to create a myriad of unique surface effects. Such programmable Pixel Shaders give artists and developers the ability to create per-pixel effects that mirror their creative vision.
[0007] Although programmable Pixel Shaders offer many advantages, they introduce substantial computation complexity that takes time to perform. Display resolution has increased substantially over the last few years.
Depending on resolution, in excess of 2 million pixels may need to be rendered, lit, shaded, and colored for each frame, at 60 frames per second. This creates a tremendous computational load that can slow down the rendering process.
[0008] Typically, pixel shaders are run at a rate of once per pixel. This can be inefficient since objects in a scene get shaded uniformly regardless of their surface appearance. Shading workload could be reduced uniformly by reducing the rendered resolution and upscaling, but this can produce soft image quality or aliasing at the high frequency content of the scene.
Instead, what is desired is a method for varying the shading rate depending on the rendered image content, reducing shading rate where the surface appearance varies smoothly and increasing the shading rate for high frequency regions such as bumps, shadow edges and specular highlights.
[0009] Coarse pixel shading is an architectural feature that allows pixel shaders to run at a rate lower than once per pixel. Coarse pixel shading has traditionally been a way to reduce shading rate based on exterior factors such as gaze direction (foveation) or lens parameters, rather than based on the content of the rendered pixels.
[0010] It is also known to choose a coarse pixel parameter depending on the material type to save computations where visual impact is minimal. For instance, a particle system for rendering smoke may be shaded at a low rate, while a sign with text may warrant high resolution shading. Similarly, objects in full shadow may possibly be shaded at a lower rate than objects in bright sunlight. Similarly, motion or defocus blur can be shaded at a lower rate than other parts of the frame. See e.g., Vaidyanathan et al., “Coarse Pixel Shading”, High Performance Graphics (Eurographics 2014).
[0011] Techniques are also known for using adaptive, multi-rate shading to perform shading calculations more efficiently. See e.g., He et al., Extending the graphics pipeline with adaptive, multi-rate shading, ACM Trans. Graph. 33, 4 (July 2014); Mavridis et al., “MSAA-Based Coarse Shading for Power-Efficient Rendering on High Pixel-Density Displays” (2014); Wang et al., Real-time rendering on a power budget, ACM Transactions on Graphics (TOG), Proceedings of ACM SIGGRAPH (Volume 35, Issue 4, July 2016).
[0012] Nvidia Multi-Res Shading introduced with the Maxwell architecture splits the image up into multiple viewports in preparation for a later warp distortion pass that distorts an image before it is output to a headset or other virtual reality display device. Because the warp distortion pass compresses the edges of the image, many pixels are potentially generated that are discarded before display. Multi-resolution shading divides the image into multiple viewports. For example, a center viewport in the center of the image and further viewports around the image periphery provide for a 3x3 grid. The center viewport (which is typically not distorted much in the warping pass) is shaded at high resolution (e.g., 1-to-1), whereas the peripheral viewports are shaded at lower rates. For example, corner viewports (which are lost almost altogether during the warping pass) can be shaded at a much lower rate such as ¼, and the side viewports can be shaded at an intermediate rate such as ½. Since fewer pixels are shaded, rendering is faster. This can reduce image latency, which is important for virtual reality.
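A minimal sketch, assuming a row-major 3x3 layout, of the Multi-Res rate assignment just described: corners 1/4, sides 1/2, center full rate. The array name and the plain float encoding are illustrative, not an actual API.

```cpp
#include <array>

// Row-major 3x3 grid of per-viewport shading rates (fraction of 1:1 shading).
constexpr std::array<float, 9> kMultiResRates = {
    0.25f, 0.50f, 0.25f,  // top-left corner, top side, top-right corner
    0.50f, 1.00f, 0.50f,  // left side, center (shaded 1-to-1), right side
    0.25f, 0.50f, 0.25f,  // bottom-left corner, bottom side, bottom-right corner
};
```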
[0013] While much work has been done in the past, there is a need for further improvements in control of adaptive coarse pixel shading.
BRIEF DESCRIPTION OF THE DRAWINGS
[0014] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
[0015] The following detailed description of exemplary non-limiting illustrative embodiments is to be read in conjunction with the drawings of which:
[0016] Figure 1 shows an example non-limiting system for implementing this technology.
[0017] Figure 2 shows an example non-limiting graphics processing unit.
[0018] Figures 3 and 3A show example non-limiting data structures.
[0019] Figure 4 shows example non-limiting analysis software that may execute for example on the CPU.
[0020] Figure 5 shows an example baseline image rendered using per-pixel shading.
[0021] Figure 6 shows an example image that uses uniform shading reduction (50% shading rate versus the baseline).
[0022] Figure 7 shows an example 50% uniform shading reduction error map (blue = minor difference < 0.01, red = major difference).
[0023] Figure 8 shows example non-limiting adaptive coarse pixel shading rate visualization (red = ¼ rate, green = 1/16 rate, other = full rate).
[0024] Figure 9 shows example non-limiting adaptive coarse pixel shading (50% shades versus baseline).
[0025] Figure 10 shows example non-limiting 50% adaptive coarse pixel shading error map (blue = minor difference <0.01, red = major difference).
[0026] Figure 11 shows example non-limiting comparison between per-pixel shading and adaptive coarse pixel shading. The cropped views show that in adaptive coarse pixel shading, high frequency features are preserved using per-pixel shading while low frequency features are coarsened using lower (e.g., 1/4 or 1/16) shading rates.
DETAILED DESCRIPTION OF NON-LIMITING EMBODIMENTS
[0027] Example Non-Limiting Rendering System
[0028] Figures 1 and 2 show an example non-limiting graphics system 50 for rendering images using localized adaptive coarse pixel shading. The system includes input devices/sensors 100, at least one central processing unit (CPU) or other processor 200, at least one graphics processing unit (GPU) 300, at least one memory 400, and at least one rendering target such as a display 500. These various components may communicate with each other and with external systems via one or more buses 600.
[0029] The input devices/sensors 100 may comprise any kind of device capable of generating an input signal, including but not limited to a keyboard, a touch screen, a microphone, a motion or attitude sensor, a geolocation sensor, a camera, etc. The CPU 200 may comprise a single processor, a multithreaded, multi-core distributed processing system, or other conventional processing arrangements. The GPU 300 may comprise one or more graphics processing units including a conventional graphics pipeline having geometry and pixel shading capabilities. Memory 400 may comprise one or more DRAM or other conventional non-transitory memory.
[0030] In operation, CPU 200 executes instructions stored in memory 400. CPU 200 communicates with GPU 300 via a conventional graphics API such as OPENGL, Vulkan, D3D or the like. The CPU 200 uses this graphics API to send the GPU 300 commands, display lists and the like to control the GPU to render images. The GPU 300 includes a conventional graphics pipeline that generates graphics for output to a render target such as one or more displays 500.
[0031] Figure 2 shows a conventional graphics pipeline including a vertex fetch unit 302, a vertex shader unit 304, a geometry shader unit 306, a rasterizer 308, a pixel shader unit 310, and a render output unit (ROP) 312. The vertex fetch 302 fetches vertices from CPU 200 and/or a display list stored in memory 400, these vertices defining the geometry to be rendered. A vertex shader 304 and a geometry shader 306 provide conventional shading values and processes (e.g., illumination/lighting, texture mapping, etc.) for input to rasterizer 308. Rasterizer 308 rasterizes an image to generate pixels that are shaded by pixel shader 310. The ROP 312 outputs the resulting shaded pixel values to a frame buffer (FB) allocated in memory 400 or to other render target(s). The graphics pipeline may use level 1 and level 2 caches 314 and also has access to memory 400. A direct memory access (DMA) controller or process may read the image (pixel) data from the frame buffer in memory 400 and send it to a render target such as display 500.
[0032] Often, the functions shown in Figure 2 will be implemented primarily in hardware due to lower latency. However, other implementations such as software or hybrid hardware/software implementations are also possible.
[0033] Example Pixel Shading
[0034] Traditionally, the pixel shading performed by pixel shader 310 is run once per pixel. The computations are regular and do not take into account the content of the scene and/or the particular content of the pixels themselves. Shading operations thus generally take the same amount of time per pixel no matter what the characteristics are of the objects or scenes being rendered.
[0035] One way to reduce the amount of time and associated computational complexity of rendering is to determine which parts of the scene need more detailed shading and which parts do not. As an example, it might be possible to determine a lens or gaze of the user and provide detailed shading on those portions of the image the user is looking at at a given time, while devoting less pixel shading computation resources to portions of the image that the user is not currently looking at. See e.g., Vaidyanathan et al., above. However, such a foveated rendering technique may involve tracking the viewer’s gaze, which may not be practical in some implementations such as for conventional desktop or other display or render target contexts.
[0036] Even when multirate pixel shading has been used in the past, some implementations have not necessarily efficiently taken into account factors in the scene itself such as surface lighting (e.g., shadows, specular highlights) or texture content (e.g., albedo texture, bump maps, etc.).
[0037] It would also be desirable to take advantage of new coarse pixel shading hardware features using legacy or other software that hasn’t been specifically designed for coarse pixel shading hardware features.
[0038] Example Non-Limiting Localized Adaptive Coarse Pixel Shading
[0039] One example non-limiting implementation provides an analyzer that analyzes a previously rendered frame(s) for certain characteristics (e.g., spatial, temporal and/or local contrast differences) associated with a rendered tile or other portion to determine whether to access or activate the coarse pixel shading capability with respect to any particular locality. Such analyzer can determine localized opportunities to reduce pixel shading rate without significantly adversely affecting human perception of the resulting image, and can control the pixel shading hardware in accordance with the determined localized opportunities.
[0040] Example non-limiting techniques adapt shading rate based on a local metric (less than the entire image) however it can be computed, in screen space and/or object space. By “local” we mean a subset of pixels in a given scene or generated by less than all of the geometry to be rendered for the scene (e.g., the pixels of a given object or a given triangle). The local metric can be computed from (a) the frame currently being rendered, (b) a previously-rendered frame, (c) the triangle currently being rendered and/or (d) previously rendered geometry.
[0041] These analysis capabilities can be used to change the shading rate based on content (e.g., complexity and/or detail). The hardware-based or other coarse pixel shader 310 capabilities are used to shade more often (higher shading rate) in areas which have more detail and less often (lower shading rate) in areas that have less detail.
[0042] The example non-limiting embodiments provide techniques that can apply coarse pixel shading to existing graphics programs based on the characteristics of the scene. The example techniques do not require analyzing the contents of a mesh, pixel shader, or textures. Instead, in one example non-limiting embodiment, the desired shading rate for different parts of a scene is computed in a post-processing step at the end of the current or a previous frame. The estimation for the desired shading rate may for example consider factors such as temporal color coherence and spatial coherence of the rendered image from the current and previous frames. For example, based on color coherence, the spatial shading rate distribution for the or a next frame can be decided. This technique can be achieved without modifying traditional rendering pipelines and shaders such as shown in Figure 2. Therefore, integrating this technique into existing programs requires much less effort than existing content-dependent techniques.
[0043] The proposed technique also maintains the image quality well compared to the per-pixel shading baseline while reducing the overall shading by a large amount.
[0044] Once a pixel shader with programmable resolution is available, it is possible to base the programming criteria on factors such as the content of the scene. For example, it is possible to determine where there are edges or other detail in the scene, and use the pixel shading hardware to shade more in those locations.
[0045] The example non-limiting embodiments thus are adaptive based upon where there is more detail in the scene, and increase the shading in those more detailed areas. Based on the content of a scene, the system shades more or less. In terms of implementation strategy, the hardware supports specifying what the shading rate is per tile on the screen and/or object in the scene. The disclosed non-limiting embodiment divides the screen space and/or object space into subdivisions (e.g., 16x16 pixel tiles in one embodiment), and provides the ability per tile to choose what the shading rate is.
[0046] Example Non-Limiting MultiMode Operation
[0047] In example non-limiting embodiments, the system can provide multiple modes. In one mode, the system adjusts shading rate in screen space based upon for example tile-by-tile subdivisions. Shading rate can be set based upon material or other properties of the rendered image (or to-be rendered image) in screen space, based on screen space location. This for example can involve deciding what the shading rate should be for any particular tile, macroblock or other subdivision on the screen.
[0048] In another mode, it is possible to make a decision about shading rate on a per-triangle or other primitive in object space. For example, shading rate decision can be determined based on the characteristics of a particular piece of geometry currently being rendered, regardless of where it ends up on the screen. In this geometry mode, based on which triangle is currently being rendered, it is possible to associate similar material properties with that triangle. By examining the material properties of that particular triangle, it is possible to set the shading rate (e.g., rate of execution or invocation of pixel shaders) resulting from that triangle.
[0049] For example, a geometry shader 306 that shades per triangle can include code that controls shading based upon an analysis of each triangle to determine what the pixel shading rate should be for that triangle. For example, shading rate can be controlled based on material or other properties of that triangle or other fragment in object space. Adaptive pixel shading rate can be controlled based on such metrics.
[0050] Still other implementations can benefit from a combination of both screen-space based and geometry-based adaption of pixel shading rate. It is possible to combine both object and screen space metrics. For example, it is possible to set an object to have a finer shading rate, and then define the shading rate to use based on the localization information from screen space rendering.
[0051] Common to all such modes is that there is some metric (e.g., how specular a surface is or how simply shaded that surface is) for a locality, and based on that localized metric, the system decides at what rate to shade the pixels in that locality. This decision can be made in screen space, object space or both, based on the current tile and/or triangle being rendered.
[0052] If the shading rate is controlled independently for each triangle, the shading rate will generally be carried with each triangle. In one example implementation, every triangle is assigned to an associated viewport, and different pixel shading rates can be used for each different viewport.
[0053] Example Shading Rate Data Structures
[0054] Figures 3 and 3A show one example non-limiting implementation in which CPU 200 stores a shading rate table 280 of possible shading rates in memory. The table 280 provides a list of some or all possible shading rates the pixel shader 310 is capable of. Each subdivision (triangle, tile, etc.) can carry or be associated with an index that selects which shading rate value in the shading rate table 280 to use to pixel-shade that subdivision.
[0055] “Shading rate” in some example non-limiting implementations is the ratio between the number of pixel shaders (i.e., the number of invocations of pixel shading) for a given area in screen space or piece of geometry in object space and the number of pixels in that area or piece of geometry. Shading rate in this example non-limiting context is thus for example the ratio between the number of pixel shaders for a given area on the screen and the number of pixels in that area. In conventional graphics pipelines, the shading rate is typically 1:1 (i.e., shading is performed on a pixel-by-pixel basis). Thus, in many implementations, every pixel will launch a pixel shader. In the example non-limiting embodiments, it is possible to drive the ratio up or down. For example, to drive the ratio down, it is possible to launch only 64 pixel shaders for a 16x16 pixel tile (i.e., 256 pixels) so that each pixel shader shades a set of 4 pixels rather than a single pixel. To drive the ratio up, it is possible to launch more pixel shaders than there are pixels to provide supersampling.
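As a worked instance of the 16x16 tile example just given:

$$
\text{shading rate} = \frac{\text{pixel shader invocations}}{\text{pixels}} = \frac{64}{16 \times 16} = \frac{64}{256} = \frac{1}{4},
$$

so each invocation shades a set of 4 pixels.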
[0056] The term “coarse” as used herein should not be construed as limiting the disclosed technology to reduced pixel shading rates. For example, as explained above, it is possible to increase the pixel rate beyond (higher than) a 1:1 ratio to provide supersampling sub-pixel shading capabilities for certain areas of screen space and/or object space. Thus, it is possible to launch more pixel shaders than there are pixels on the screen.
[0057]
[0058] Conceptually, the pixel shading rate can be a single numerical value specifying a simple ratio between pixel shader invocations and pixels. However, in some non-limiting implementations, the pixel shading rate can also specify a shape, grouping or other configuration of a plurality of pixels. For example, a given shading rate value could specify a group of 4x2 pixels, meaning a subdivision that is 4 pixels in width and 2 pixels high, is to be shaded by a single pixel shader invocation. Another shading rate value could specify a differently sized or configured subdivision such as 2x2, 2x1, 4x4, etc.
[0059] In one example non-limiting implementation, the pixel shading rate specifier stored in shading rate table 280 can be a byte of memory (e.g., one number per tile), the byte value specifying what shading rate to use. For example, the byte value “0” might indicate one shader per pixel, the byte value “1” might indicate one shader per array of 2x1 pixels, the byte value “2” might indicate one shader per array of 2x2 pixels, and so on, with the byte value “7” indicating shade 4x per pixel (i.e., supersample).
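A minimal C++ sketch of one possible encoding of such a table follows. Byte values 0, 1, 2 and 7 follow the text above; the intermediate codes 3 through 6 and all names are illustrative assumptions, not an actual hardware or API definition.

```cpp
#include <cstdint>

// One shading rate: a pixel-block footprint plus invocations per block.
struct ShadingRate {
    uint8_t pixelsWide;   // width of the block one shader invocation covers
    uint8_t pixelsHigh;   // height of that block
    uint8_t invocations;  // invocations launched per block (>1 = supersampling)
};

// Indexed by the byte value stored per tile (shading rate table 280).
constexpr ShadingRate kShadingRateTable[8] = {
    {1, 1, 1},  // "0": one shader per pixel (1:1)
    {2, 1, 1},  // "1": one shader per 2x1 pixel array
    {2, 2, 1},  // "2": one shader per 2x2 pixel array
    {4, 2, 1},  // "3"-"6": progressively coarser blocks (illustrative)
    {4, 4, 1},
    {8, 4, 1},
    {8, 8, 1},
    {1, 1, 4},  // "7": 4 invocations per pixel (supersample)
};
```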
[0060] In one example non-limiting implementation, screen space shading rate is controlled by a cascade of a pair of control structures: (a) a shading rate distribution table, map or surface 270 which indexes (b) the shading rate look-up table 280. The shading rate distribution table, map or surface 270 stored in memory 400 has plural values, each value specifying the shading rate of a particular subdivision in screen space. The entries in the shading rate distribution table, map or surface 270 are spatially correlated, i.e. each entry corresponds to a spatial subdivision in screen space. One example non-limiting embodiment divides the screen up into tiles, and stores in memory a map of the tiles, each entry in the map corresponding to a different tile and specifying the shading rate for that tile.
[0061] In some non-limiting embodiments, it is possible to base shading on a stencil-like structure, i.e., a 2D surface in screen space where each texel location in the surface corresponds to a tile on the screen. Each slot in the stencil may thus correspond to a tile on the screen. The stencil may store information that instructs the shading hardware how many pixel shaders to invoke for that tile. A surface/stencil-like structure can thus be used to control shading rate. This enabling technology may be part of a standard API such as D3D.
[0062] The example non-limiting embodiments may thus generate data structure 270 in the form of a map, e.g., a two-dimensional array of numbers or values, with each number or value specifying (directly or indirectly) how many pixel shaders to invoke for a particular corresponding tile or other screen space subdivision. This shading rate surface 270 can be updated using heuristics to calculate the shading rates to be indicated for each element in the surface. The shading rate surface 270, just like any other surface, is stored in memory 400 but happens to have a certain specific interpretation by the pixel shader 310 hardware. Based on the resulting values, the pixel shader 310 will launch the appropriate number of pixel shaders for that given surface subdivision.
[0063] To specify the pixel rate, the system 50 develops a number for each tile indicating how many pixel shader invocations to allow for that tile. In effect, coarse pixel shading involves using fewer (or more) shaders than there are pixels, and in effect shading multiple pixels with the same pixel shader for reduced shading rate. The result of a single pixel shader invocation will be replicated across a number of pixels in that case (e.g., across the entire tile or some portion thereof).
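A minimal C++ sketch of the two-level cascade from [0060]-[0063], under the same assumptions as the table sketch above (kShadingRateTable standing in for table 280): the distribution surface 270 holds one index byte per screen tile, and that byte selects an entry in table 280. The names and the 16x16 tile size are assumptions for illustration.

```cpp
#include <cstdint>
#include <vector>

constexpr int kTileSize = 16;  // one embodiment uses 16x16 pixel tiles

// Stand-in for shading rate distribution surface 270.
struct ShadingRateSurface {
    int tilesX, tilesY;
    std::vector<uint8_t> index;  // one byte per tile, indexing table 280

    ShadingRateSurface(int screenW, int screenH)
        : tilesX((screenW + kTileSize - 1) / kTileSize),
          tilesY((screenH + kTileSize - 1) / kTileSize),
          index(static_cast<size_t>(tilesX) * tilesY, 0) {}  // 0 = full 1:1 rate

    // Conceptually what the pixel shading hardware does: look up the rate
    // that applies to a given pixel via its tile's index into table 280.
    ShadingRate rateForPixel(int px, int py) const {
        uint8_t i = index[static_cast<size_t>(py / kTileSize) * tilesX + px / kTileSize];
        return kShadingRateTable[i];  // table 280 (see previous sketch)
    }
};
```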
[0064] In example non-limiting embodiments, the shading rate lookup table 280 is used to map the value stored in the pixel shading distribution table, map or surface 270 (or a value provided by the geometry shader 306/rasterizer 308 in the case of object space adaptive shading) to whatever final shading rate is to be used by the pixel shading hardware. In some example non-limiting embodiments, the shading rate lookup table 280 decouples the contents of the shading rate distribution table 270 (and/or the triangle/fragment parameters as shown in Figure 3A) from specific graphic API bit patterns used to control the pixel shading hardware to provide a desired shading rate. The shading rate lookup table 280 is programmed for whatever shading rates are desired. The areas of the pixel shading surface 270 that correspond to high levels of detail may be programmed to retrieve corresponding higher-rate pixel shading values in the shading rate lookup table 280.
[0065] Of course, other implementations are possible. For example, the shading rate lookup table could be implemented as a programmable or non-programmable gate array or other circuit in hardware that transforms a control byte into control signals needed to control the pixel shading hardware to provide a desired coarse pixel shading rate. In other example implementations, the shading rate table 280 may be eliminated and the screen space distribution table, map or surface 270 (see Figure 3) and/or the triangle/fragment parameter (see Figure 3A) could provide the bit patterns that directly control the pixel shading rate the pixel shader 310 performs.
[0066] As described above in connection with Figure 3A, in example non-limiting embodiments that use adaptable object space pixel shading, the shading rate lookup table 280 is accessed based on a parameter(s) that triangles, fragments or other geometry carry through at least part of the rendering pipeline rather than via a spatially-organized shading rate distribution table 270. The triangle/fragment parameter can be injected into the graphics pipeline by CPU 200 at time of vertex fetch 302, or in some embodiments it could be generated by the graphics pipeline itself prior to pixel shading.
[0067] For object space coarse shading, the triangle/fragment index in some example embodiments can comprise a viewport index, with different viewports having different corresponding shading rate mappings. The viewports may be set up ahead of time by an application executing on CPU 200 and preprogrammed ahead of time, or they may be changed
dynamically depending on image processing results. In one example non-limiting embodiment, the application sets up N (e.g., 16) different viewports. Each triangle can select which one of these different viewports to use. By selecting a viewport, the triangle is also selecting a pixel shading rate associated with the viewport.
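A sketch of this selection mechanism, under the assumption of 16 viewports each carrying a preconfigured coarse footprint (all names and the Triangle structure are illustrative only, not the disclosed pipeline), might read:

```python
from dataclasses import dataclass

@dataclass
class Viewport:
    coarse_footprint: tuple = (1, 1)   # (width, height) of the coarse pixel group

@dataclass
class Triangle:
    vertices: tuple
    viewport_index: int = 0            # carried with the triangle through the pipeline

# The application sets up N (e.g., 16) viewports ahead of time ...
viewports = [Viewport() for _ in range(16)]
viewports[1] = Viewport(coarse_footprint=(2, 2))   # a reduced-rate viewport
viewports[2] = Viewport(coarse_footprint=(4, 4))   # a still coarser viewport

# ... and by selecting a viewport, a triangle also selects its shading rate.
def shading_footprint(tri: Triangle) -> tuple:
    return viewports[tri.viewport_index].coarse_footprint

tri = Triangle(vertices=((0, 0), (1, 0), (0, 1)), viewport_index=2)
print(shading_footprint(tri))  # (4, 4) -> one invocation per 4x4 pixels
```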
[0068] Example Analyzer
[0069] Figure 4 shows an example non-limiting analyzer that CPU 200 executes to adapt localized pixel shading rate to image characteristics. In some embodiments, CPU 200 analyzes a previously rendered image to determine which shading rate to use in the current image. Other techniques are also possible. For example, other embodiments can analyze based on the current frame, or based on a combination of the current frame and a previous frame or frames.
[0070] To apply the adaptive coarse pixel shading technique, the major steps related to the rendering loop shown in Figure 4 include the following (a schematic sketch appears after the list):
1. Turn on the coarse pixel shading feature and initialize shading rate
distribution table 270 (block 202)
2. Perform rendering work (block 204)
3. Analyze the rendered image from Step 204 and update the shading rate distribution table (block 206)
4. Advance to the next frame and go to Step 204 (block 208)
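The following schematic sketch restates the four steps above; the callables stand in for the real pipeline work and driver calls, and every name is a hypothetical placeholder:

```python
import numpy as np

FULL_RATE = 0   # byte code for 1:1 shading, per the encoding sketched earlier

def adaptive_rendering_loop(enable_feature, render_frame, analyze_frame,
                            tiles_y: int, tiles_x: int, num_frames: int):
    # Step 1 (block 202): turn the feature on, initialize table 270 to full rate.
    enable_feature()
    table_270 = np.full((tiles_y, tiles_x), FULL_RATE, dtype=np.uint8)
    for _ in range(num_frames):
        # Step 2 (block 204): render using the current per-tile shading rates.
        image = render_frame(table_270)
        # Step 3 (block 206): analyze the result and update the per-tile rates.
        table_270 = analyze_frame(image)
        # Step 4 (block 208): advance to the next frame and return to step 2.
```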
[0071] In Step 202, the shading rate distribution table 270 specifies the shading rate per screen space tile. The tile granularity can vary based on the design (e.g., 8x8 or 16x16). This table 270 will be used by the graphics hardware or software that supports coarse pixel shading. Step 204 performs the original rendering work using the rasterizer 308 and other graphics pipeline components. In Step 206, CPU 200 (either alone or assisted by hardware or software within GPU 300) executes an algorithm that analyzes the rendered image from Step 204 and decides the optimal shading rate per screen space tile.
[0072] If modifying the application program is not allowed or is not possible, Step 204 and Step 206 can be treated as a “black box” and the graphics driver can automatically insert Step 202 and Step 208 at the driver side to apply the technique. Otherwise, Step 208 can be merged into Step 206 for better performance.

[0073] Shading Rate Estimation 206
[0074] One way to choose the shading rate is to examine the (or a) previously-rendered frame and use heuristics to determine whether that particular tile has interesting details or not (a temporal approach). Another example heuristic could perform an efficient analysis in the current frame to determine whether a tile is interesting or not. The shading rate of particular tiles determined to be interesting can be adjusted based on the results of the analysis.
[0075] As an example, if a scene to be rendered is based on flat shaded colored surfaces, those surfaces will not have to be shaded at a high rate. Because the color is only slowly changing in screen space from pixel to pixel, it is not necessary to compute shading for every pixel. In contrast, if the scene to be rendered includes a highly specular surface (e.g., glitter on the water’s surface), it would be desirable to increase the shading rate in those parts of the scene. Shading rate can thus be controlled depending on the content of the scene in screen space to provide a variable localized shading rate.
[0076] The heuristics can be as simple and straightforward as looking at surface properties. For example, it is possible to initially render into a G-buffer (Geometry buffer, which typically enables a render-to-texture function) which would contain material properties for every surface of every pixel on the screen. Looking at those properties (e.g., which could contain metrics such as how specular the particular surface is, or whether it is flat color or not), it is possible to calculate the shading rate. The heuristics can be arbitrarily complex. In one example non-limiting embodiment, it is possible to look at previous frames or just look at the current frame material properties and derive shading rate from that.

[0077] The example non-limiting embodiment performs such analysis on a subdivision-by-subdivision basis (e.g., tile by tile, surface by surface, triangle by triangle, etc.), and is able to adjust the amount of shading in each individual subdivision. It is possible to do more shading in more complicated subdivisions and less shading in other subdivisions.
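As a hedged sketch of such a G-buffer heuristic, the following assumes a per-pixel specularity channel and a flat-color flag (both channel names, the thresholds, and the function name are hypothetical) and reduces them tile by tile to a rate code:

```python
import numpy as np

def rates_from_gbuffer(specularity: np.ndarray, is_flat: np.ndarray,
                       tile: int = 8) -> np.ndarray:
    """Per-tile rate codes (0 = per-pixel, 1 = 2x1, 2 = 2x2) from material data."""
    ty, tx = specularity.shape[0] // tile, specularity.shape[1] // tile
    rates = np.empty((ty, tx), dtype=np.uint8)
    for j in range(ty):
        for i in range(tx):
            window = (slice(j * tile, (j + 1) * tile),
                      slice(i * tile, (i + 1) * tile))
            if specularity[window].max() > 0.8:
                rates[j, i] = 0    # highly specular tile: full per-pixel shading
            elif is_flat[window].all():
                rates[j, i] = 2    # flat color tile: coarse 2x2 shading suffices
            else:
                rates[j, i] = 1    # otherwise: a modest 2x1 reduction
    return rates
```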
[0078] In one example embodiment, shading rate may be computed based on some or all (or any combination) of the following factors:
• Spatial difference (is there an edge?)
• Temporal difference (did this part of the scene change from the previous frame?)
• Local difference (is the pixel in a high contrast region?)
[0079] The shading rate estimation algorithm of block 206 can be composed as the following metric:
$$E(x, y) = A\left(\left|\frac{\partial I(x, y, t)}{\partial x}\right| + \left|\frac{\partial I(x, y, t)}{\partial y}\right|\right) + B\left|\frac{\partial I(x, y, t)}{\partial t}\right| + C\left(\max_{|i| \le N,\, |j| \le N} I(x+i, y+j, t) - \min_{|i| \le N,\, |j| \le N} I(x+i, y+j, t)\right)$$
[0080] where I(x,y,t) is the rendered image defined with the spatial coordinates x,y and the temporal coordinate t. A, B and C are constants that control the contribution from the spatial difference term (the first term in the expression above), the temporal difference term (the second term in the expression above), and the local contrast term (the third term in the expression above).
[0081] In more detail, the first (A constant) term in the expression above indicates the spatial difference. This indicates whether the adjacent pixels are quite different, which means there is an edge at or near the location. If there is an edge, the shading rate should be increased to provide additional detail to properly render the edge.
[0082] The second (B constant) term indicates the temporal difference (how the color differs from the previous frame to the current frame or from one frame to another). If the color changes temporally, the shading rate should be increased because there might be some object movement around this location.
[0083] The third (C constant) term is local contrast, i.e., whether the dynamic color range within a local window is large or not (where the window size can be larger or smaller than the subdivision size for controlling the sampling rate). If the dynamic range is large, the shading rate should be increased because this means the local image has high contrast. The eyes are more sensitive to high contrast content, so more detailed shading may be desirable in such areas.
[0084] In the expression above, N is a small positive integer that controls the window size for calculating the local contrast. The shading rate can be decided based on the metric E(x, y). A mapping from E to discrete shading rates can be defined by a set of threshold values. The RGB values of the input image I(x,y,t) may in some example non-limiting implementations be normalized by relative luminance such that the high frequency signals in shadow can also be captured by the proposed metric.
[0085] It is possible to look at any one of these terms or any combination of them; a more minimal embodiment would use only one of the above terms. Many of the terms are computed per pixel, but other terms are computed per tile or other subdivision. The analysis can be performed on a per pixel basis, and in one embodiment it is possible to take the highest calculated value for the entire tile. This ensures that adequate shading is provided for each pixel in the tile or other subdivision. Thus, in some example non-limiting embodiments, for each screen space tile, the finest shading rate among multiple estimated shading rates may be used as the per-tile shading rate for the next frame.
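As a minimal sketch of this estimation, assuming luminance-normalized grayscale frames, a SciPy dependency for the windowed max/min, and illustrative values for A, B, C, N and the thresholds (none of which are specified above):

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

A, B, C, N = 1.0, 1.0, 0.5, 2          # illustrative constants only

def estimate_tile_rates(cur: np.ndarray, prev: np.ndarray,
                        tile: int = 8) -> np.ndarray:
    # Spatial difference term: |dI/dx| + |dI/dy| flags edges.
    gy, gx = np.gradient(cur)
    spatial = np.abs(gx) + np.abs(gy)
    # Temporal difference term: |dI/dt| flags movement between frames.
    temporal = np.abs(cur - prev)
    # Local contrast term: max - min over a (2N+1)x(2N+1) window.
    w = 2 * N + 1
    contrast = maximum_filter(cur, size=w) - minimum_filter(cur, size=w)
    e = A * spatial + B * temporal + C * contrast
    # Per tile, keep the finest (largest) per-pixel estimate ...
    ty, tx = cur.shape[0] // tile, cur.shape[1] // tile
    e_tiles = e[:ty * tile, :tx * tile].reshape(ty, tile, tx, tile).max(axis=(1, 3))
    # ... and map E to discrete rate codes via illustrative thresholds.
    return np.select([e_tiles > 0.20, e_tiles > 0.05],
                     [0, 1], default=2).astype(np.uint8)
```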
[0086] Example Imaging Results
[0087] The proposed technique maintains the image quality well compared to the per-pixel shading baseline while reducing the overall shading by a large amount.
[0088] Figure 5 shows an example baseline image rendered using per-pixel shading (i.e., each pixel was shaded individually). For purposes of comparison, Figure 6 shows the same image rendered using a uniform 50% reduction in pixel shading as compared to the baseline.
[0089] Figure 7 shows an error map resulting from a comparison of the images of Figures 5 and 6. In Figure 7, most of the area is blue indicating minor differences (<0.01) between the two images. The red lines (which appear to be light in grey scale versions of the Figure 7 image) show major differences. As can be seen, the red lines indicative of major differences occur at the edges of objects in the scene. For example, the edges of all of the leaves of the plant exhibit major differences, as do the edges of the couch, the edges of the windows, etc. Using Figure 7 as a guide, one can see the reduced image definition in portions of Figure 6 as compared to Figure 5. It is also apparent that most of the image can be rendered with half the pixel shading rate with no appreciable adverse effects at the resolution of these prints.
[0090] Figure 8 shows an adaptive coarse pixel shading rate visualization. The visualization includes the addition of coloration of tiles indicating reduced (less than 1 : 1) pixel shading rate. The red and green tiles on the color print (and grey boxes on the grey scale version) are examples of screen space subdivisions for which the analyzer described above has heuristically determined may use reduced pixel shading rate without noticeable image quality degradation.
[0091] The red tiles on the color print (e.g., most of the colored blocks on the couch - these appear to be medium gray on the grey scale image) indicate localized areas of low (e.g., ¼) pixel shading rate. The green tiles on the color print (which appear light grey in the grey scale image) have adaptively been pixel shaded using a still lower (e.g., 1/16) pixel shading rate. The remainder of the image was pixel shaded using full (1:1) pixel shading rate.
[0092] Figure 9 shows the resulting image when adaptive coarse pixel shading as indicated in Figure 8 is implemented.
[0093] Figure 10 shows an error map resulting when the Figure 9 adaptive coarse pixel shaded image is compared to the original Figure 5 image rendered using a uniform 1:1 pixel shading rate. Of the errors Figure 10 indicates, most are shown in blue, indicating a minor difference (<0.01).
Only a few major differences (>0.01) in red are indicated. For example, a few red areas of major difference are indicated where the adaptive pixel shading shown in Figure 8 used a lower (“green”) pixel shading rate beneath the couch. Otherwise, most or all of the Figure 10 error map shows very low errors even though the adaptive pixel shading is realizing substantial computation savings through localized reductions in pixel shading rate.
[0094] Comparing the error rate maps of Figures 7 and 10, one can see that adaptive coarse pixel shading can save 50% shading work while maintaining image quality. In contrast, uniformly reducing the shading work by 50% across the entire frame can lead to greater image quality loss.
[0095] Figure 11 shows example non-limiting comparisons between per-pixel shading and adaptive coarse pixel shading. The cropped views show that in adaptive coarse pixel shading, high frequency features are preserved using per-pixel shading while low frequency features are coarsened using ¼ or 1/16 reduced pixel shading rates. In particular, by comparing the two “No TAA” (no temporal anti-aliasing) crop images on the bottom of Figure 11 (the left-hand crop image being full per-pixel shading, the right-hand crop image being adaptive pixel shading), one can see no difference, because in each case the full pixel shading rate was used to render the region of interest, namely the leaves of the house plant. Because these narrow leaves contain many edges, the adaptive coarse pixel shading determination concluded that full 1:1 pixel shading should be used for shading the pixels of this part of the image. On the other hand, the full adaptive pixel shaded image (see the upper right-hand pane of Figure 11) has an overall effective pixel shading rate of 43% of the 100% per-pixel shading rate of the left-hand pane. Note further the adaptive rate visualization in the center pane, indicating red tiles using a ¼ pixel shading rate (one shader invocation per 2x2 pixels) and green tiles using a 1/16 pixel shading rate (one invocation per 4x4 pixels).
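As a back-of-the-envelope check, the effective rate quoted above is simply the invocation-to-pixel ratio averaged over all tiles; the rate-code-to-ratio mapping in the sketch below is an assumption consistent with the ¼ and 1/16 examples, not the disclosed encoding:

```python
import numpy as np

RATIO = {0: 1.0, 1: 1.0 / 4.0, 2: 1.0 / 16.0}   # assumed code-to-ratio mapping

def effective_shading_rate(rate_tiles: np.ndarray) -> float:
    """Mean pixel-shader-invocations-per-pixel over all screen-space tiles."""
    return float(np.vectorize(RATIO.get)(rate_tiles).mean())

# e.g., a frame whose tiles are 60% full rate, 25% quarter, 15% sixteenth:
tiles = np.array([0] * 60 + [1] * 25 + [2] * 15)
print(f"{effective_shading_rate(tiles):.0%}")   # ~67% here; the Figure 11 scene hit 43%
```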
[0096] Example Non-Limiting Applications
[0097] The techniques disclosed herein provide a major use case for the coarse pixel shading hardware feature introduced in NVIDIA’s Turing architecture. They can be applied to desktop rendering applications in general and have the potential of reducing the pixel shading computation by more than 50% without introducing perceivable image quality loss. Example uses for this technology include contexts that are pixel shading limited, i.e., where pixel shading is the bottleneck, such as when substantial computation is run per pixel for high quality lighting. Other contexts that benefit involve long pixel shaders that produce soft results; for example, soft shadows or soft lighting are often the most expensive per pixel to compute. High frequency content may present less opportunity for benefit because a higher shading rate will be needed. Lower screen resolutions may allow coarser shading without being visually detectable. While the feature is useful for eye tracking contexts (more detail where the gaze is), it is also useful in other contexts such as desktop contexts.
[0098] While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiments, it is to be understood that the invention is not to be limited to the disclosed embodiments, but on the contrary, is intended to cover various
modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims

1. A graphics processing system comprising:
a graphics pipeline including a programmable pixel shader; and
a controller that analyzes results the graphics pipeline produces to determine localized opportunities to change pixel shading rate and controls the programmable pixel shader in accordance with the determined localized opportunities.
2. The graphics processing system of claim 1 wherein the controller is configured to control the programmable pixel shader to reduce pixel shading rate in determined localities.
3. The graphics processing system of claim 1 wherein the controller analyzes images the graphics pipeline produces to determine the localized opportunities.
4. The graphics processing system of claim 1 wherein the controller analyzes geometry the graphics pipeline processes to determine the localized opportunities.
5. The graphics processing system of claim 1 wherein the controller updates a screen space distribution table in response to the determined localized opportunities.
6. The graphics processing system of claim 1 wherein the controller stores in memory a shading rate table specifying different pixel shading rates selectable by items to be rendered.
7. The graphics processing system of claim 6 wherein the items comprise triangles.
8. The graphics processing system of claim 1 wherein the controller performs the analysis and control by executing instructions stored in non-transitory memory.
9. The graphics processing system of claim 1 wherein the controller determines localized opportunities on a tile-by-tile basis in screen space.
10. The graphics processing system of claim 1 wherein the controller analyzes based on spatial, temporal and/or local contrast differences.
11. The graphics processing system of claim 1 wherein the controller analyzes based on geometry characteristics indicative of resulting image complexity.
12. A method comprising:
analyzing aspects of an image rendering process to determine the likelihood of the occurrence of edges, object movement and/or high contrast in image subdivisions; and
automatically controlling a programmable pixel shader to change pixel shading rates in response to the analysis.
13. The method of claim 12 wherein the analyzing comprises determining spatial differences, temporal color differences and dynamic range for portions of screen and/or object space.
14. The method of claim 12 wherein the analyzing comprises analyzing portions of screen space.
15. The method of claim 12 wherein the analyzing comprises analyzing portions of object space and the controlling comprises assigning said object space portions to viewports and associated pixel shading rates in response at least in part to the determined likelihoods.
16. The method of claim 12 wherein the analyzing comprises evaluating the following expression:
$$E(x, y) = A\left(\left|\frac{\partial I(x, y, t)}{\partial x}\right| + \left|\frac{\partial I(x, y, t)}{\partial y}\right|\right) + B\left|\frac{\partial I(x, y, t)}{\partial t}\right| + C\left(\max_{|i| \le N,\, |j| \le N} I(x+i, y+j, t) - \min_{|i| \le N,\, |j| \le N} I(x+i, y+j, t)\right)$$
17. The method of claim 12 wherein the controlling selectively increases pixel shading rate for selected portions of geometry and/or a screen.
18. Pixel shading control comprising:
a map for storage in non-transitory memory, the map having a plurality of entries, each entry having one-to-one spatial correspondence to a subdivision in screen space, each entry storing a pixel shading rate for application to the corresponding screen space subdivision.
PCT/US2019/015135 2018-01-26 2019-01-25 Localized adaptive coarse pixel shading WO2019147929A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862622623P 2018-01-26 2018-01-26
US62/622,623 2018-01-26

Publications (1)

Publication Number Publication Date
WO2019147929A1 true WO2019147929A1 (en) 2019-08-01

Family

ID=67396196

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2019/015135 WO2019147929A1 (en) 2018-01-26 2019-01-25 Localized adaptive coarse pixel shading

Country Status (1)

Country Link
WO (1) WO2019147929A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070024639A1 (en) * 2005-08-01 2007-02-01 Luxology, Llc Method of rendering pixel images from abstract datasets
US20150379688A1 (en) * 2014-06-27 2015-12-31 Samsung Electronics Co., Ltd. Adaptive desampling in a graphics system with composited level of detail map
US9704270B1 (en) * 2015-07-30 2017-07-11 Teradici Corporation Method and apparatus for rasterizing and encoding vector graphics

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2602527A (en) * 2021-06-30 2022-07-06 Imagination Tech Ltd Graphics processing system and method of rendering
CN115546386A (en) * 2021-06-30 2022-12-30 想象技术有限公司 Graphics processing system and rendering method
EP4116932A1 (en) * 2021-06-30 2023-01-11 Imagination Technologies Limited Graphics processing system and method of rendering
GB2602527B (en) * 2021-06-30 2023-02-08 Imagination Tech Ltd Graphics processing system and method of rendering
US11875443B2 (en) 2021-06-30 2024-01-16 Imagination Technologies Limited Graphics processing system and method of rendering
CN115546386B (en) * 2021-06-30 2024-02-06 想象技术有限公司 Graphics processing system, graphics processing method, graphics processing program, graphics processing method rendering method, graphics processing medium, and graphics processing system
US11972520B2 (en) 2021-06-30 2024-04-30 Imagination Technologies Limited Graphics processing system and method of rendering


Legal Events

Code Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19744207; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 19744207; Country of ref document: EP; Kind code of ref document: A1)
Kind code of ref document: A1