US20230102620A1 - Variable rate rendering based on motion estimation - Google Patents
- Publication number
- US20230102620A1 (application US17/843,532)
- Authority
- US
- United States
- Prior art keywords
- tile
- frame
- motion
- motion vector
- processor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/137—Motion inside a coding unit, e.g. average field, frame or block difference
- H04N19/139—Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/40—Filling a planar surface by adding surface attributes, e.g. colour or texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/215—Motion-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/182—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/523—Motion estimation or motion compensation with sub-pixel accuracy
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/28—Indexing scheme for image data processing or generation, in general involving image processing hardware
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20016—Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20021—Dividing image into blocks, subimages or windows
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
Definitions
- Computer graphics or image rendering is the process by which a computing system displays an image based on a computer program.
- a scene file containing information regarding objects in a scene is passed to one or more processing units that render an image (also referred to herein as a “frame” or “image frame”) for display based on the scene file.
- a display contains an array of pixels, each of which is the smallest addressable element in the display device.
- three-dimensional rendered animations often contain extraneous detail that a viewer cannot perceive. Further, rendering each pixel of a display to generate a high-resolution image is computationally intensive.
- FIG. 1 is a block diagram of a processing system that includes a motion estimator engine to generate a motion vector field for rendering regions of an image frame at variable resolutions based on the presence of objects and at least one of a magnitude and direction of motion according to some embodiments.
- FIG. 2 is a block diagram of the motion estimator engine and rendering processor of FIG. 1 according to some embodiments.
- FIG. 3 is a block diagram of a motion vector field generator of the motion estimator engine of FIG. 2 according to some embodiments.
- FIG. 4 is a block diagram of a logical pixel dimension identifier of the rendering processor of FIG. 2 according to some embodiments.
- FIG. 5 is a flow diagram illustrating a method for rendering regions of an image frame at variable resolutions based on a motion vector field and the presence of objects according to some embodiments.
- Variable resolution rendering can be used to reduce the computational workload on a processing system rendering relatively complex graphics (e.g., real-time 3D graphics animation) by assigning varying logical pixel dimensions to regions of an image frame and rendering pixels of the image frame based on the logical pixel dimensions.
- By identifying which regions of an image are of interest to a viewer it is possible to render in highest resolution (i.e., with smaller logical pixel dimensions) those areas of the image on which the viewer is expected to focus (the “foveal region”), and to render in lower resolution (i.e., with larger logical pixel dimensions) those areas of the image outside the region of interest so that they will be less noticeable to the viewer.
- larger logical pixel dimensions reduce the computational workload without affecting the quality of the displayed graphics as perceived by a
- a processing system uses a motion estimator engine to divide a previously rendered image into regions, referred to herein as “tiles”, of one or more pixels, sub-pixels, or fragments, and generate a motion vector field or other motion data identifying those tiles having moving areas.
- the processing system receives geometrical data from an application executing at a central processing unit, wherein the geometrical data identify those tiles having objects.
- the processing system uses the rendering processor to identify those tiles having little to no motion, based on the motion vector field, and having objects, and to assign smaller logical pixel dimensions in these regions.
- the rendering processor assigns logical pixel dimensions for each tile based on at least one of a magnitude and direction of motion within the tile.
- the rendering processor will assign logical pixel dimensions in that tile that are larger along the horizontal axis than along the vertical axis, reducing the effective rendering resolution within that tile to less than the nominal rate along the horizontal axis.
- the presence of an object or portion of an object in a tile overrides the presence of motion in a tile for purposes of assigning logical pixel dimensions, such that the rendering processor assigns smaller logical pixel dimensions to tiles containing objects or portions of objects, even if the tiles also have motion.
- whether the presence of an object overrides the presence of motion, or vice versa, is configurable. In this way, the rendering processor avoids a perceptible reduction in visual quality.
- the motion estimator engine generates motion information for a frame based on user input data, the geometrical buildup of the frame (i.e., the data which will define the frame), and other inputs.
- the motion estimator engine examines the two most-recently fully-rendered image frames (referred to as the “N” and “N−1” image frames) to create a motion vector field that measures the motion of the previous two image frames.
- the motion estimator engine compares each tile, or block, of the N and N−1 image frames to determine motion vectors for each block.
- a block is a uniformly sized group of pixels used for block-based motion estimation.
- a motion estimator engine that is not block-based generates motion vectors per pixel, or per group of multiple pixels. Based on an analysis of the N and N−1 image frames, the motion estimator engine generates a motion vector field, which indicates areas, magnitude, and direction of motion between the previous two image frames.
- the rendering processor assumes that the motion vector field is valid (e.g., there was no scene change) and that the units of the next image frame (referred to as the “N+1” image frame) will continue along the same trajectory.
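The trajectory assumption can be illustrated with a minimal sketch. It assumes a hypothetical data layout in which the motion vector field is a dictionary mapping unit coordinates in frame N to the (dx, dy) displacement measured between frames N−1 and N. The function name and the layout are illustrative only and do not come from the source.

```python
def extrapolate_positions(field):
    """Predict where each moving unit lands in frame N+1.

    `field` maps (x, y) unit coordinates in frame N to the (dx, dy)
    displacement measured between frames N-1 and N (hypothetical layout).
    The prediction is only meaningful if the field is valid, i.e. no
    scene change occurred between the frames.
    """
    predicted = {}
    for (x, y), (dx, dy) in field.items():
        if (dx, dy) == (0, 0):
            continue  # stationary units are not expected to move
        # Continue along the same trajectory for one more frame.
        predicted[(x + dx, y + dy)] = (dx, dy)
    return predicted
```

For example, a unit at (4, 2) that moved by (2, 0) between frames N−1 and N is expected at (6, 2) in frame N+1.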
- the rendering processor receives geometric and color data from the application and identifies which groups of pixels of the N image frame contain objects or skin color, and it may calculate other relevant metrics. Based on one or more of the motion vector fields, the objects, color, and the other relevant metrics, the rendering processor assigns logical pixel dimensions for each tile. In some embodiments, the tiles form a uniform or non-uniform grid of blocks of multiple pixels.
- the sizes of the tiles need not be correlated to sizes of units used by the underlying metrics.
- the rendering processor assigns smaller logical pixel dimensions (e.g., 1×1 pixel or 0.5×0.5 pixels, in which case the logical pixel dimensions contain multiple fragments or sub-pixels) for tiles having little or no motion, objects, and/or skin color, or in which both stationary and in-motion objects are present.
- the rendering processor assigns larger logical pixel dimensions (e.g., 2×1, 1×2, 2×2, 4×4, 2×4, 4×2) for tiles having a larger magnitude of motion vector, or no objects or skin color.
- the rendering processor also assigns larger logical pixel dimensions for tiles having a large degree of change between corresponding tiles of the N and N−1 image frames (such as for an explosion) and for tiles having a gradient of an image (such as a clear sky).
- the rendering processor balances the variable resolution rates (i.e. logical pixel sizes) of tiles of an image frame based on a frame rate requirement for an application executing at the CPU or other performance requirements.
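The relationship between logical pixel dimensions and workload can be made concrete with a short sketch. The source does not describe how the balancing is performed. The greedy single-pass policy below, the `protected` set, the `budget` parameter, and all function names are assumptions chosen for illustration, and logical pixel dimensions are restricted to whole pixels.

```python
def shaded_samples(tile_w, tile_h, logical_w, logical_h):
    """Number of values shaded for one tile: one per logical pixel."""
    # Ceiling division covers tiles not evenly divisible by the logical size.
    cols = -(-tile_w // logical_w)
    rows = -(-tile_h // logical_h)
    return cols * rows


def frame_workload(tiles, tile_w, tile_h):
    """Total shaded values for a frame; `tiles` is a list of (lw, lh)."""
    return sum(shaded_samples(tile_w, tile_h, lw, lh) for lw, lh in tiles)


def fit_to_budget(tiles, protected, tile_w, tile_h, budget, max_dim=4):
    """Coarsen unprotected tiles, one at a time, until the budget is met.

    `protected` holds indices of tiles that must keep their resolution
    (for example, tiles containing objects). Each tile is doubled at most
    once, so the budget may still be exceeded after the pass.
    """
    tiles = list(tiles)
    for i, (lw, lh) in enumerate(tiles):
        if frame_workload(tiles, tile_w, tile_h) <= budget:
            break
        if i in protected:
            continue
        tiles[i] = (min(lw * 2, max_dim), min(lh * 2, max_dim))
    return tiles
```

With 8×8-pixel tiles, a 1×1 tile costs 64 shaded values, a 2×1 tile costs 32, and a 2×2 tile costs 16, so a budget derived from a frame rate requirement can be compared directly against the sum over all tiles.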
- the rendering processor renders the pixels of the N+1 frame based on the logical pixel dimensions. If a logical pixel dimension is larger than one pixel, the rendering processor renders pixels of the logical pixel with the pixel value of the geometric center of the logical pixel. The rendering processor then assigns logical pixel dimensions for the next frame, such that the logical pixel dimensions of each tile are dynamically re-assigned for each frame. In this way, the variable resolution of each tile of each frame is adapted in real time, so that, e.g., regions with reduced rendering detail will return to full resolution when movement in those regions slows or stops.
- FIG. 1 is a block diagram of a processing system 100 that includes a motion estimator engine 120 to generate a motion vector field 125 and a rendering processor 130 for rendering a variable resolution image frame 135 based on logical pixel dimensions that are assigned based on a direction and magnitude of motion and the presence of objects in each tile according to some embodiments.
- the processing system 100 can be incorporated in any of a variety of electronic devices, such as a server, personal computer, tablet, set top box, gaming system, and the like.
- the motion estimator engine 120 is coupled to a memory 110 and the rendering processor 130 , which provides the variable resolution rendered image frame based on logical pixel dimensions 135 to a display 140 .
- the rendering processor 130 executes instructions and stores information in the memory 110 such as the results of the executed instructions.
- the memory 110 stores a plurality of previously-rendered images (not shown) that it receives from the rendering processor 130 .
- the memory 110 is implemented as a dynamic random access memory (DRAM), and in some embodiments, the memory 110 is implemented using other types of memory including static random access memory (SRAM), non-volatile RAM, and the like.
- Some embodiments of the processing system 100 include an input/output (I/O) engine (not shown) for handling input or output operations associated with the display 140 , as well as other elements of the processing system 100 such as keyboards, mice, printers, external disks, and the like.
- the motion estimator engine 120 is configured to receive memory locations for a most recently rendered image (the N image) 105 and a second-most recently rendered image (the N−1 image) 107 from a central processing unit (CPU) (not shown). The motion estimator engine 120 compares the sequential previously-rendered images 105 and 107 stored at the memory 110 to generate the motion vector field 125.
- the rendering processor 130 receives commands generated by a central processing unit (CPU) 131 based on an application 132 instructing the rendering processor 130 to render a current (N+1) image (not shown). Some embodiments of the rendering processor 130 include multiple processor cores (not shown in the interest of clarity) that independently execute instructions concurrently or in parallel. Some embodiments of a command generated by the CPU include information defining textures, states, shaders, rendering objects, buffers, and the like that are used by the rendering processor 130 to render the objects or portions thereof in the N+1 image. The rendering processor 130 renders the objects to produce values of pixels that are provided to the display 140 , which uses the pixel values to display an image that represents the rendered objects.
- the rendering processor 130 identifies objects in the N+1 image frame based on geometric data 134 provided by the application 132 and assigns logical pixel dimensions for each tile of the N+1 image frame based on the identified objects and the motion vector field 125 .
- the rendering processor 130 renders the N+1 image based on the logical pixel dimensions such that tiles having logical pixel dimensions greater than 1×1 pixel are rendered at lower resolution than tiles having logical pixel dimensions of 1×1 pixel or smaller. Further, because the logical pixel dimensions are larger along an axis corresponding to a direction of motion for each tile, tiles are rendered at lower resolution along a direction of motion. The reduction in resolution conserves rendering processor resources without degrading the user-perceivable image quality.
- the motion estimator engine 120 receives the memory locations of the N and N−1 images 105 and 107 from the CPU 131.
- the motion estimator engine 120 analyzes at least the N and N−1 images 105 and 107 stored at the memory 110 to generate a motion vector field 125 that estimates moving areas of the N+1 image (not shown).
- the motion vector field 125 indicates the direction and/or magnitude of motion for each unit of the image.
- the motion estimator engine 120 provides the motion vector field 125 to the rendering processor 130 .
- the rendering processor 130 receives the motion vector field 125 from the motion estimator engine 120 , and also receives geometric data 134 from the application 132 to identify objects in the N+1 image. In some embodiments, the rendering processor monitors WorldViewMatrix or ObjectTransformMatrix on a per-object basis to know the motion per object. Based on the identified objects and movement indicated by the motion vector field 125 , the rendering processor 130 assigns logical pixel dimensions for each tile of the N+1 image. In some embodiments, the logical pixel dimensions are greater than one pixel along an axis corresponding to a direction of motion indicated by the motion vector field 125 if the magnitude of the motion vector is greater than a threshold value.
- the logical pixel dimensions are greater than one pixel for tiles having a degree of change in pixel values between the N and N−1 images greater than a threshold value. In some embodiments, logical pixel dimensions are greater than one pixel for tiles forming a gradient of the image (e.g., a clear sky). In some embodiments, the logical pixel dimensions for a tile are smaller than one pixel for tiles containing objects, a magnitude of motion vector smaller than a threshold value, skin color, or a high degree of detail.
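The "degree of change" test can be sketched as follows. The source does not specify which metric is used. Mean absolute difference of pixel values is assumed here purely for illustration, frames are modeled as lists of rows of single-channel values, and the function names and `threshold` parameter are hypothetical.

```python
def tile_change(prev, curr, x0, y0, size):
    """Mean absolute difference of pixel values within one square tile.

    `prev` and `curr` are the N-1 and N frames as lists of rows;
    (x0, y0) is the top-left corner of the tile.
    """
    total = 0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            total += abs(curr[y][x] - prev[y][x])
    return total / (size * size)


def is_high_change(prev, curr, x0, y0, size, threshold):
    """True when the tile changed by more than `threshold` on average."""
    return tile_change(prev, curr, x0, y0, size) > threshold
```

A tile flagged by `is_high_change` would be a candidate for logical pixel dimensions greater than one pixel.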
- the rendering processor 130 renders the N+1 image based on the logical pixel dimensions.
- the rendering processor 130 renders the pixels of each of the logical pixels for a tile having logical pixel dimensions that are greater than 1×1 pixel using the same value. For example, if a logical pixel size for a tile is 2×2, the rendering processor 130 renders all four of the pixels of each logical pixel for that tile using the value of the geometric center of the tile. Similarly, if the logical pixel dimensions of a tile are 2×1, the rendering processor 130 renders both of the pixels of each logical pixel for that tile using the value of the geometric center of the tile.
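A minimal sketch of rendering one tile with logical pixels larger than one pixel is given here. It follows the description of sampling at the geometric center of each logical pixel and replicating that value. The `shade` callable, which stands in for whatever computes a pixel value at continuous coordinates, and the function name `render_tile` are hypothetical.

```python
def render_tile(shade, x0, y0, tile_w, tile_h, logical_w, logical_h):
    """Render one tile, shading once per logical pixel.

    `shade(x, y)` is a hypothetical callable returning the value at
    continuous frame coordinates. Returns tile_h rows of tile_w values.
    Dimensions are given as width x height, in whole pixels.
    """
    out = [[None] * tile_w for _ in range(tile_h)]
    for ly in range(0, tile_h, logical_h):
        for lx in range(0, tile_w, logical_w):
            # Shade once at the geometric center of the logical pixel.
            value = shade(x0 + lx + logical_w / 2, y0 + ly + logical_h / 2)
            # Replicate that value to every physical pixel it covers.
            for py in range(ly, min(ly + logical_h, tile_h)):
                for px in range(lx, min(lx + logical_w, tile_w)):
                    out[py][px] = value
    return out
```

For a 4×4-pixel tile with 2×2 logical pixels, `shade` is called 4 times instead of 16.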
- the rendering processor 130 provides the variable resolution rendered image frame 135 based on the logical pixel dimensions to the display 140 , which uses the pixel values of the variable resolution rendered image 135 to display an image that represents the N+1 image.
- the rendering processor 130 also provides a copy of the variable resolution rendered image 135 to the memory 110 , where it is stored for subsequent generation of a variable resolution version of the next image or for additional, intermediate rendering stages.
- FIG. 2 is a block diagram of a motion estimator engine 220 and rendering processor 230 of the processing system 100 of FIG. 1 according to some embodiments.
- the motion estimator engine 220 outputs a motion vector field 225 , which is used by the rendering processor 230 to generate a variable resolution rendered N+1 image 235 based on logical pixel dimensions.
- the motion estimator engine 220 is configured to generate a motion vector field 225 based on estimates of motion derived from a comparison of a previously-rendered N image 205 and N−1 image 207 stored at a memory 210.
- the motion estimator engine 220 includes a motion vector field generator 255 .
- the motion vector field generator 255 is configured to estimate movement of objects in consecutive images. Motion estimation assumes that in most cases consecutive images will be similar except for changes caused by objects moving within the images. To estimate motion, the motion vector field generator 255 determines motion vectors that describe the transformation from one two-dimensional image to another from adjacent images of an image sequence.
- a motion vector is a two-dimensional vector that provides an offset from the coordinates in one image to the coordinates in another image.
- the motion vector field generator 255 compares corresponding pixels of the N image 205 (the most recently rendered image) and the N−1 image 207 (the image rendered immediately prior to the N image) to create a motion vector field 225 that models the movement of objects between the two images.
- the motion vector field generator 255 employs a block matching algorithm such as exhaustive search, three step search, simple and efficient search, four step search, diamond search, or other algorithms used in block matching.
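Of the block matching algorithms listed, exhaustive search is the simplest to illustrate. The sketch below models frames as lists of rows of single-channel values and uses the sum of absolute differences (SAD) as the matching cost. The cost function, the search radius, and all names are assumptions for illustration, not details taken from the source.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def block_motion_vector(prev, curr, bx, by, size, search):
    """Exhaustive search for where the block at (bx, by) in `prev` moved.

    Tries every offset within +/- `search` pixels and returns the
    (dx, dy) with the lowest SAD. Ties keep the earliest candidate,
    and zero motion is preferred when nothing beats it.
    """
    height, width = len(curr), len(curr[0])
    best = (0, 0)
    best_cost = sad(prev, curr, bx, by, bx, by, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + size > width or y + size > height:
                continue  # candidate block falls outside the frame
            cost = sad(prev, curr, bx, by, x, y, size)
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best


def motion_vector_field(prev, curr, size=2, search=2):
    """Map each block's top-left corner (x, y) to its motion vector."""
    height, width = len(curr), len(curr[0])
    return {
        (bx, by): block_motion_vector(prev, curr, bx, by, size, search)
        for by in range(0, height - size + 1, size)
        for bx in range(0, width - size + 1, size)
    }
```

Exhaustive search evaluates every candidate offset, which is why faster heuristics such as three step search or diamond search are often preferred. Blocks that merely uncover background can produce spurious vectors under this simple cost.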
- the motion vector field generator 255 uses a neural network to estimate the motion vector field for the current frame based on the motion vector field for the previous frame.
- the rendering processor 230 includes a skin color detector 270 and a logical pixel dimension identifier 275 .
- the skin color detector 270 and logical pixel dimension identifier 275 are implemented as shader programs on the rendering processor 230 .
- one or more of the skin color detector 270 and logical pixel dimension identifier 275 are implemented as fixed function hardware in the motion estimator engine 220 .
- the rendering processor 230 is configured to receive geometrical data 265 from an application 232 executing at a CPU 231 and the motion vector field 225 generated by the motion vector field generator 255 of the motion estimator engine 220 .
- the skin color detector 270 is configured to detect human skin colors in tiles of the image frame. Because human skin colors are likely to correspond to regions of interest, those tiles of the image that the skin color detector 270 identifies as containing human skin colors are assigned smaller logical pixel dimensions such that they will be rendered at higher resolution, as described herein.
- the logical pixel dimension identifier 275 is configured to assign logical pixel dimensions for each tile of the N+1 image frame.
- the logical pixel dimension identifier 275 assigns the logical pixel dimensions based on the motion vector field 225 , objects identified based on the geometrical data 265 , and human skin colors detected by the skin color detector 270 .
- the logical pixel dimension identifier 275 assigns smaller logical pixel dimensions to those tiles of the N+1 image frame identified as containing objects, little to no motion, and/or human skin color.
- the logical pixel dimension identifier 275 assigns larger logical pixel dimensions to those tiles of the N+1 image frame identified as containing a greater magnitude of motion and/or a gradient of an image, such as a clear sky.
- the logical pixel dimension identifier 275 assigns a larger logical pixel dimension to an axis of the logical pixel corresponding to the direction of motion.
- the logical pixel dimension identifier 275 assigns a larger logical pixel dimension for logical pixels of that tile along the vertical axis (e.g., 1×2 or 1×4 or 2×4 pixels).
- the rendering processor 230 is configured to render the N+1 image at a variable resolution based on the logical pixel dimensions of each tile.
- the rendering processor 230 includes a plurality of shaders (not shown), each of which is a processing element configured to perform specialized calculations and execute certain instructions for rendering computer graphics.
- the shaders compute color and other attributes for the pixels included in each logical pixel of a display.
- the shaders of the rendering processor 230 are two-dimensional (2D) shaders such as pixel shaders, or three-dimensional shaders such as vertex shaders, geometry shaders, or tessellation shaders, or any combination thereof.
- the shaders work in parallel to execute the operations required to render the N+1 image.
- FIG. 3 is a block diagram of a motion vector field generator 355 of the motion estimator engine 220 of FIG. 2 according to some embodiments.
- the motion vector field generator 355 compares corresponding units, or groups of pixels, of the N image 315 and the N−1 image 317 to create a motion vector field 325 of vectors that model the movement of an object from one unit to another across consecutive images.
- the motion vector field generator 355 may employ a block matching algorithm such as exhaustive search, three step search, simple and efficient search, four step search, diamond search, or other algorithms used in block matching. In the example illustrated in FIG.
- the motion vector field generator 355 generates a motion vector field 325 containing motion vectors indicating motion, e.g., from unit C4 to unit C6, from unit D4 to unit D6, from unit E4 to unit E6, from unit C8 to unit C10, from unit D8 to unit D10, and from unit E8 to unit E10.
- FIG. 4 is a block diagram of a logical pixel dimension identifier 475 of the rendering processor 230 of FIG. 2 according to some embodiments.
- the logical pixel dimension identifier 475 assigns logical pixel dimensions to each tile (tile A 401 , tile B 402 , tile C 403 , and tile D 404 ) of a frame N+1 400 .
- the tiles analyzed by the logical pixel dimension identifier 475 are the same groups of pixels that are analyzed by the motion vector field generator 355 .
- the tiles analyzed by the logical pixel dimension identifier 475 are larger or smaller groups of pixels than the units that are analyzed by the motion vector field generator 355 .
- the logical pixel dimension identifier 475 analyzes the motion vector field (not shown), geometric data received from the application executing at the CPU (not shown), and human skin colors detected in the N+1 image frame 400 to assign logical pixel dimensions to each tile of the N+1 image frame 400 .
- the logical pixel dimension identifier 475 identifies the following properties of each of the tiles of the N+1 image frame 400 : tile A 401 has motion in a horizontal direction, tile B 402 has motion in a vertical direction, tile C 403 has an object or human skin color, and tile D 404 has non-directional motion (e.g. explosion or expansion/contraction) or diagonal motion and/or is a gradient of the N+1 image frame such as a clear sky.
- non-directional motion e.g. explosion or expansion/contraction
- diagonal motion e.g. explosion or expansion/contraction
- the logical pixel dimension identifier 475 assigns a logical pixel size of 2 ⁇ 1 pixels to tile A 401 , a logical pixel size of 1 ⁇ 2 pixels to tile B 402 , a logical pixel size of 1 ⁇ 1 pixels to tile C 403 , and a logical pixel size of 2 ⁇ 2 pixels to tile D 404 .
- the rendering processor (not shown) will render tile A 401 with higher resolution along the vertical axis and lower resolution along the horizontal axis, tile B 402 with a lower resolution along the vertical axis and a higher resolution along the horizontal axis, tile C 403 with a higher resolution along both the vertical and horizontal axes, and tile D 404 with a lower resolution along both the vertical and horizontal axes.
- FIG. 5 is a flow diagram illustrating a method 500 for rendering regions of an image frame at variable resolutions based on a motion vector field and the presence of objects according to some embodiments.
- the method 500 is implemented in some embodiments of the processing system 100 shown in FIG. 1 and the motion estimator engine 220 and rendering processor 230 shown in FIG. 2 .
- the motion estimator engine 220 receives the two most recently rendered (N and N ⁇ 1) images.
- the motion vector field generator 255 of the motion estimator engine 220 compares the N and N ⁇ 1 images to generate a motion vector field 225 .
- the motion vector field generator 255 uses a neural network or other predictive algorithm to extrapolate the motion of the motion vector field of the N image to the motion vector field of the N+1 image frame.
- the rendering processor 230 receives geometrical data 265 from the application 232 executing at the CPU 231 , from which the rendering processor 230 detects the presence of objects in tiles of the N+1 image frame.
- the logical pixel dimension identifier 275 assigns logical pixel dimensions for each tile of the N+1 image frame, based on the motion vector field 225 , the geometrical data 265 , and the presence of human skin colors.
- the logical pixel dimension identifier 275 assigns logical pixel dimensions such that tiles of the N+1 image that are estimated to contain objects or human skin colors or areas of little or no motion have smaller logical pixel dimensions than tiles that are estimated to contain greater magnitudes of motion or no objects.
- the rendering processor 230 renders the N+1 image based on the dimensions of the logical pixels of each tile, rendering the pixels of each logical pixel using the pixel value of the geometrical center of the logical pixel. The method flow then continues back to block 502 for the next image frame.
- a computer readable storage medium includes any non-transitory storage medium, or combination of non-transitory storage media, accessible by a computer system during use to provide instructions and/or data to the computer system.
- Such storage media include, but are not limited to, optical media (e.g., compact disc (CD), digital versatile disc (DVD), Blu-Ray disc), magnetic media (e.g., floppy disc, magnetic tape, or magnetic hard drive), volatile memory (e.g., random access memory (RAM) or cache), non-volatile memory (e.g., read-only memory (ROM) or Flash memory), or microelectromechanical systems (MEMS)-based storage media.
- optical media e.g., compact disc (CD), digital versatile disc (DVD), Blu-Ray disc
- magnetic media e.g., floppy disc, magnetic tape, or magnetic hard drive
- volatile memory e.g., random access memory (RAM) or cache
- non-volatile memory e.g., read-only memory (ROM) or Flash memory
- the computer readable storage medium in one embodiment, is embedded in the computing system (e.g., system RAM or ROM), fixedly attached to the computing system (e.g., a magnetic hard drive), removably attached to the computing system (e.g., an optical disc or Universal Serial Bus (USB)-based Flash memory), or coupled to the computer system via a wired or wireless network (e.g., network accessible storage (NAS)).
- the computing system e.g., system RAM or ROM
- fixedly attached to the computing system e.g., a magnetic hard drive
- removably attached to the computing system e.g., an optical disc or Universal Serial Bus (USB)-based Flash memory
- USB Universal Serial Bus
- certain aspects of the techniques described above are implemented by one or more processors of a processing system executing software.
- the software includes one or more sets of executable instructions stored or otherwise tangibly embodied on a non-transitory computer readable storage medium.
- the software includes the instructions and certain data that, when executed by the one or more processors, manipulate the one or more processors to perform one or more aspects of the techniques described above.
- the non-transitory computer readable storage medium includes, for example, a magnetic or optical disk storage device, solid state storage devices such as Flash memory, a cache, random access memory (RAM) or other non-volatile memory device or devices, and the like.
- the executable instructions stored on the non-transitory computer readable storage medium are implemented, for example, in source code, assembly language code, object code, or other instruction format that is interpreted or otherwise executable by one or more processors.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Generation (AREA)
Abstract
Description
- Computer graphics or image rendering is the process by which a computing system displays an image based on a computer program. A scene file containing information regarding objects in a scene is passed to one or more processing units that render an image (also referred to herein as a “frame” or “image frame”) for display based on the scene file. A display contains an array of pixels, each of which is the smallest addressable element in the display device. However, three-dimensional rendered animations often contain extraneous detail that a viewer cannot perceive. Further, rendering each pixel of a display to generate a high-resolution image is computationally intensive.
- The present disclosure may be better understood, and its numerous features and advantages made apparent to those skilled in the art by referencing the accompanying drawings. The use of the same reference symbols in different drawings indicates similar or identical items.
- FIG. 1 is a block diagram of a processing system that includes a motion estimator engine to generate a motion vector field for rendering regions of an image frame at variable resolutions based on the presence of objects and at least one of a magnitude and direction of motion according to some embodiments.
- FIG. 2 is a block diagram of the motion estimator engine and rendering processor of FIG. 1 according to some embodiments.
- FIG. 3 is a block diagram of a motion vector field generator of the motion estimator engine of FIG. 2 according to some embodiments.
- FIG. 4 is a block diagram of a logical pixel dimension identifier of the rendering processor of FIG. 2 according to some embodiments.
- FIG. 5 is a flow diagram illustrating a method for rendering regions of an image frame at variable resolutions based on a motion vector field and the presence of objects according to some embodiments.
- Variable resolution rendering can be used to reduce the computational workload on a processing system rendering relatively complex graphics (e.g., real-time 3D graphics animation) by assigning varying logical pixel dimensions to regions of an image frame and rendering pixels of the image frame based on the logical pixel dimensions. By identifying which regions of an image are of interest to a viewer, it is possible to render in highest resolution (i.e., with smaller logical pixel dimensions) those areas of the image on which the viewer is expected to focus (the “foveal region”), and to render in lower resolution (i.e., with larger logical pixel dimensions) those areas of the image outside the region of interest so that they will be less noticeable to the viewer. For regions with less detail or greater magnitude of motion, larger logical pixel dimensions reduce the computational workload without affecting the quality of the displayed graphics as perceived by a user.
- For example, in some embodiments a processing system uses a motion estimator engine to divide a previously rendered image into regions, referred to herein as “tiles”, of one or more pixels, sub-pixels, or fragments, and generate a motion vector field or other motion data identifying those tiles having moving areas. The processing system receives geometrical data from an application executing at a central processing unit, wherein the geometrical data identify those tiles having objects. The processing system uses the rendering processor to identify those tiles having little to no motion, based on the motion vector field, and having objects, and to assign smaller logical pixel dimensions in these regions. The rendering processor assigns logical pixel dimensions for each tile based on at least one of a magnitude and direction of motion within the tile. For example, if the motion estimator engine identifies that a tile has a motion vector indicating left to right motion, the rendering processor will assign logical pixel dimensions in that tile that are larger along the horizontal axis than along the vertical axis, reducing the effective rendering resolution within that tile to less than the nominal rate along the horizontal axis. In some embodiments, the presence of an object or portion of an object in a tile overrides the presence of motion in a tile for purposes of assigning logical pixel dimensions, such that the rendering processor assigns smaller logical pixel dimensions to tiles containing objects or portions of objects, even if the tiles also have motion. In some embodiments, whether the presence of an object overrides the presence of motion, or vice versa, is configurable. In this way, the rendering processor avoids a perceptible reduction in visual quality.
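- By way of illustration only, the per-tile assignment described above might be sketched as follows. The function name, the threshold value, the "twice as large" test for a dominant axis, and the restriction to sizes 1×1, 2×1, 1×2, and 2×2 are all assumptions made for this sketch and are not taken from the disclosure.

```python
import math


def assign_logical_pixel_dims(motion, has_object, threshold=1.0,
                              object_overrides_motion=True):
    """Return a hypothetical (width, height) logical pixel size for one tile.

    motion is the tile's (dx, dy) motion vector in pixels per frame.
    """
    dx, dy = motion
    # Configurable policy: an object in the tile can override motion.
    if has_object and object_overrides_motion:
        return (1, 1)
    # Little or no motion: keep full resolution.
    if math.hypot(dx, dy) < threshold:
        return (1, 1)
    # Mostly horizontal motion: coarser along the horizontal axis only.
    if abs(dx) >= 2 * abs(dy):
        return (2, 1)
    # Mostly vertical motion: coarser along the vertical axis only.
    if abs(dy) >= 2 * abs(dx):
        return (1, 2)
    # Diagonal motion: coarser along both axes.
    return (2, 2)
```

With these assumed rules, a tile moving left to right receives a 2×1 logical pixel unless it contains an object and the override is enabled, in which case it stays at 1×1.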
- In some embodiments, the motion estimator engine generates motion information for a frame based on user input data, the geometrical buildup of the frame (i.e., the data which will define the frame), and other inputs. In some embodiments, the motion estimator engine examines the two most-recently fully-rendered image frames (referred to as the “N” and “N−1” image frames) to create a motion vector field that measures the motion between the previous two image frames. The motion estimator engine compares each tile, or block, of the N and N−1 image frames to determine motion vectors for each block. A block is a uniformly sized group of pixels used for block-based motion estimation. In some embodiments, a motion estimator engine that is not block-based generates motion vectors per pixel, or per a group of multiple pixels. Based on an analysis of the N and N−1 image frames, the motion estimator engine generates a motion vector field, which indicates areas, magnitude, and direction of motion between the previous two image frames.
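- As a minimal sketch of block-based motion estimation between the N−1 and N image frames, the following uses an exhaustive search that minimizes the sum of absolute differences (SAD). The use of SAD as the matching cost, the representation of frames as lists of rows of grayscale values, and all function and parameter names are assumptions made for illustration.

```python
def sad(prev, curr, px, py, cx, cy, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(prev[py + j][px + i] - curr[cy + j][cx + i])
        for j in range(size)
        for i in range(size)
    )


def block_motion_vector(prev, curr, bx, by, size, search):
    """Exhaustively search curr for the block located at (bx, by) in prev.

    Returns the (dx, dy) offset with the lowest SAD within +/- search pixels.
    """
    height, width = len(curr), len(curr[0])
    best_cost, best_vec = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cx, cy = bx + dx, by + dy
            # Skip candidate blocks that fall outside the frame.
            if cx < 0 or cy < 0 or cx + size > width or cy + size > height:
                continue
            cost = sad(prev, curr, bx, by, cx, cy, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_vec = cost, (dx, dy)
    return best_vec
```

Calling `block_motion_vector` once per block yields one vector per block, which together form a motion vector field in the sense used above. The faster searches (three step, four step, diamond) visit fewer candidate offsets than this exhaustive loop.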
- If the N and N−1 image frames correlate sufficiently to conclude that the N image frame is a continuation of the N−1 image frame that immediately preceded it, the rendering processor assumes that the motion vector field is valid (e.g., there was no scene change) and that the units of the next image frame (referred to as the “N+1” image frame) will continue along the same trajectory. The rendering processor receives geometric and color data from the application and identifies which groups of pixels of the N image frame contain objects or skin color, and it may calculate other relevant metrics. Based on one or more of the motion vector fields, the objects, color, and the other relevant metrics, the rendering processor assigns logical pixel dimensions for each tile. In some embodiments, the tiles form a uniform or non-uniform grid of blocks of multiple pixels. The sizes of the tiles need not be correlated to sizes of units used by the underlying metrics. The rendering processor assigns smaller logical pixel dimensions (e.g., 1×1 pixel or 0.5×0.5 pixels, in which case the logical pixel dimensions contain multiple fragments or sub-pixels) for tiles having little or no motion, objects, and/or skin color, or in which both stationary and in-motion objects are present. The rendering processor assigns larger logical pixel dimensions (e.g., 2×1, 1×2, 2×2, 4×4, 2×4, 4×2) for tiles having a larger magnitude of motion vector, or no objects or skin color. The rendering processor also assigns larger logical pixel dimensions for tiles having a large degree of change between corresponding tiles of the N and N−1 image frames (such as for an explosion) and for tiles having a gradient of an image (such as a clear sky). In some embodiments, the rendering processor balances the variable resolution rates (i.e. logical pixel sizes) of tiles of an image frame based on a frame rate requirement for an application executing at the CPU or other performance requirements.
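- The validity check and the trajectory assumption described above can be sketched as follows. The disclosure does not specify how correlation between the N and N−1 image frames is measured, so this sketch substitutes a mean absolute difference with an arbitrary limit; the function names and the limit value are hypothetical.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute per-pixel difference between two equal-sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def motion_field_is_valid(frame_n, frame_prev, max_diff=32.0):
    """Treat the motion vector field as valid if the frames are similar.

    A large difference is taken to indicate a scene change.
    """
    return mean_abs_diff(frame_n, frame_prev) <= max_diff


def extrapolate_position(pos_prev, pos_n):
    """Assume a unit continues along the same trajectory into frame N+1."""
    return (2 * pos_n[0] - pos_prev[0], 2 * pos_n[1] - pos_prev[1])
```

Under this linear assumption, a unit seen at (4, 2) in the N−1 frame and at (6, 2) in the N frame is expected at (8, 2) in the N+1 frame.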
- Once the rendering processor has assigned logical pixel dimensions for each of the tiles of the N+1 frame, the rendering processor renders the pixels of the N+1 frame based on the logical pixel dimensions. If a logical pixel dimension is larger than one pixel, the rendering processor renders pixels of the logical pixel with the pixel value of the geometric center of the logical pixel. The rendering processor then assigns logical pixel dimensions for the next frame, such that the logical pixel dimensions of each tile are dynamically re-assigned for each frame. In this way, the variable resolution of each tile of each frame is adapted in real time, so that, e.g., regions with reduced rendering detail will return to full resolution when movement in those regions slows or stops.
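- The rendering step described above can be illustrated with a short sketch in which each logical pixel is shaded once, at its geometric center, and the result is copied to every pixel it covers. The `shade` callable, the tile coordinates, and the return of a call count are hypothetical conveniences for this sketch.

```python
def render_tile(shade, x0, y0, tile_w, tile_h, lp_w, lp_h):
    """Render one tile, shading once per logical pixel.

    shade(x, y) is a hypothetical callable that returns a pixel value at
    continuous frame coordinates. Returns (pixels, shade_calls).
    """
    pixels = [[None] * tile_w for _ in range(tile_h)]
    calls = 0
    for ly in range(0, tile_h, lp_h):
        for lx in range(0, tile_w, lp_w):
            # Clip logical pixels that would extend past the tile edge.
            w = min(lp_w, tile_w - lx)
            h = min(lp_h, tile_h - ly)
            # Sample once at the geometric center of the logical pixel.
            value = shade(x0 + lx + w / 2, y0 + ly + h / 2)
            calls += 1
            # Reuse that value for every pixel in the logical pixel.
            for j in range(h):
                for i in range(w):
                    pixels[ly + j][lx + i] = value
    return pixels, calls
```

For a 4×4 tile, a 2×2 logical pixel needs 4 shading calls instead of 16, and a 2×1 logical pixel needs 8. Because the sizes are passed in per call, they can be re-assigned for every frame.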
- FIG. 1 is a block diagram of a processing system 100 that includes a motion estimator engine 120 to generate a motion vector field 125 and a rendering processor 130 for rendering a variable resolution image frame 135 based on logical pixel dimensions that are assigned based on a direction and magnitude of motion and the presence of objects in each tile according to some embodiments. The processing system 100 can be incorporated in any of a variety of electronic devices, such as a server, personal computer, tablet, set top box, gaming system, and the like. The motion estimator engine 120 is coupled to a memory 110 and the rendering processor 130, which provides the variable resolution rendered image frame 135 based on logical pixel dimensions to a display 140. The rendering processor 130 executes instructions and stores information in the memory 110 such as the results of the executed instructions. For example, the memory 110 stores a plurality of previously-rendered images (not shown) that it receives from the rendering processor 130. In some embodiments, the memory 110 is implemented as a dynamic random access memory (DRAM), and in some embodiments, the memory 110 is implemented using other types of memory including static random access memory (SRAM), non-volatile RAM, and the like. Some embodiments of the processing system 100 include an input/output (I/O) engine (not shown) for handling input or output operations associated with the display 140, as well as other elements of the processing system 100 such as keyboards, mice, printers, external disks, and the like.
- The motion estimator engine 120 is configured to receive memory locations for a most recently rendered image (the N image) 105 and a second-most recently rendered image (the N−1 image) 107 from a central processing unit (CPU) (not shown). The motion estimator engine 120 compares the sequential previously-rendered images 105 and 107 stored at the memory 110 to generate the motion vector field 125.
- The rendering processor 130 receives commands generated by a central processing unit (CPU) 131 based on an application 132 instructing the rendering processor 130 to render a current (N+1) image (not shown). Some embodiments of the rendering processor 130 include multiple processor cores (not shown in the interest of clarity) that independently execute instructions concurrently or in parallel. Some embodiments of a command generated by the CPU include information defining textures, states, shaders, rendering objects, buffers, and the like that are used by the rendering processor 130 to render the objects or portions thereof in the N+1 image. The rendering processor 130 renders the objects to produce values of pixels that are provided to the display 140, which uses the pixel values to display an image that represents the rendered objects.
- To facilitate more efficient rendering of images, the rendering processor 130 identifies objects in the N+1 image frame based on geometric data 134 provided by the application 132 and assigns logical pixel dimensions for each tile of the N+1 image frame based on the identified objects and the motion vector field 125. The rendering processor 130 renders the N+1 image based on the logical pixel dimensions such that tiles having logical pixel dimensions greater than 1×1 pixel are rendered at lower resolution than tiles having logical pixel dimensions of 1×1 pixel or smaller. Further, because the logical pixel dimensions are larger along an axis corresponding to a direction of motion for each tile, tiles are rendered at lower resolution along a direction of motion. The reduction in resolution conserves rendering processor resources without degrading the user-perceivable image quality.
- In operation, the motion estimator engine 120 receives the memory locations of the N and N−1 images 105 and 107 from the CPU 131. The motion estimator engine 120 analyzes at least the N and N−1 images 105 and 107 stored at the memory 110 to generate a motion vector field 125 that estimates moving areas of the N+1 image (not shown). The motion vector field 125 indicates the direction and/or magnitude of motion for each unit of the image. The motion estimator engine 120 provides the motion vector field 125 to the rendering processor 130.
- The rendering processor 130 receives the motion vector field 125 from the motion estimator engine 120, and also receives geometric data 134 from the application 132 to identify objects in the N+1 image. In some embodiments, the rendering processor monitors WorldViewMatrix or ObjectTransformMatrix on a per-object basis to know the motion per object. Based on the identified objects and movement indicated by the motion vector field 125, the rendering processor 130 assigns logical pixel dimensions for each tile of the N+1 image. In some embodiments, the logical pixel dimensions are greater than one pixel along an axis corresponding to a direction of motion indicated by the motion vector field 125 if the magnitude of the motion vector is greater than a threshold value. In some embodiments, the logical pixel dimensions are greater than one pixel for tiles having a degree of change in pixel values between the N and N−1 images greater than a threshold value. In some embodiments, logical pixel dimensions are greater than one pixel for tiles forming a gradient of the image (e.g. a clear sky). In some embodiments, the logical pixel dimensions for a tile are smaller than one pixel for tiles containing objects, a magnitude of motion vector smaller than a threshold value, skin color, or a high degree of detail.
- The rendering processor 130 renders the N+1 image based on the logical pixel dimensions. Thus, the rendering processor 130 renders the pixels of each of the logical pixels for a tile having logical pixel dimensions that are greater than 1×1 pixel using the same value. For example, if a logical pixel size for a tile is 2×2, the rendering processor 130 renders all four of the pixels of each logical pixel for that tile using the value of the geometric center of the logical pixel. Similarly, if the logical pixel dimensions of a tile are 2×1, the rendering processor 130 renders both of the pixels of each logical pixel for that tile using the value of the geometric center of the logical pixel. The rendering processor 130 provides the variable resolution rendered image frame 135 based on the logical pixel dimensions to the display 140, which uses the pixel values of the variable resolution rendered image 135 to display an image that represents the N+1 image. The rendering processor 130 also provides a copy of the variable resolution rendered image 135 to the memory 110, where it is stored for subsequent generation of a variable resolution version of the next image or for additional, intermediate rendering stages.
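- The role of the memory 110 in FIG. 1, which holds the previously-rendered images and receives a copy of each newly rendered image, can be pictured with a small two-entry history. The class and method names are hypothetical, and keeping only the two most recent frames is an assumption of this sketch; the disclosure states that the memory stores a plurality of previously-rendered images.

```python
from collections import deque


class FrameHistory:
    """Keeps the two most recently rendered frames (N and N-1)."""

    def __init__(self):
        # A bounded deque drops the oldest frame automatically.
        self._frames = deque(maxlen=2)

    def store(self, frame):
        """Record a newly rendered frame; it becomes the new N frame."""
        self._frames.append(frame)

    def latest_pair(self):
        """Return (N, N-1), or None until two frames have been rendered."""
        if len(self._frames) < 2:
            return None
        return self._frames[1], self._frames[0]
```

After each N+1 image is stored, the pair returned for the next iteration shifts by one frame, so the image just rendered becomes the new N image.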
- FIG. 2 is a block diagram of a motion estimator engine 220 and rendering processor 230 of the processing system 100 of FIG. 1 according to some embodiments. The motion estimator engine 220 outputs a motion vector field 225, which is used by the rendering processor 230 to generate a variable resolution rendered N+1 image 235 based on logical pixel dimensions.
- The motion estimator engine 220 is configured to generate a motion vector field 225 based on estimates of motion derived from a comparison of a previously-rendered N image 205 and N−1 image 207 stored at a memory 210. The motion estimator engine 220 includes a motion vector field generator 255. The motion vector field generator 255 is configured to estimate movement of objects in consecutive images. Motion estimation assumes that in most cases consecutive images will be similar except for changes caused by objects moving within the images. To estimate motion, the motion vector field generator 255 determines motion vectors that describe the transformation from one two-dimensional image to another from adjacent images of an image sequence. A motion vector is a two-dimensional vector that provides an offset from the coordinates in one image to the coordinates in another image.
- The motion vector field generator 255 compares corresponding pixels of the N image 205 (the most recently rendered image) and the N−1 image 207 (the image rendered immediately prior to the N image) to create a motion vector field 225 that models the movement of objects between the two images. In some embodiments, the motion vector field generator 255 employs a block matching algorithm such as exhaustive search, three step search, simple and efficient search, four step search, diamond search, or other algorithms used in block matching. In some embodiments, the motion vector field generator 255 uses a neural network to estimate the motion vector field for the current frame based on the motion vector field for the previous frame.
- The rendering processor 230 includes a skin color detector 270 and a logical pixel dimension identifier 275. In some embodiments, the skin color detector 270 and logical pixel dimension identifier 275 are implemented as shader programs on the rendering processor 230. In some embodiments, one or more of the skin color detector 270 and logical pixel dimension identifier 275 are implemented as fixed function hardware in the motion estimator engine 220. The rendering processor 230 is configured to receive geometrical data 265 from an application 232 executing at a CPU 231 and the motion vector field 225 generated by the motion vector field generator 255 of the motion estimator engine 220.
- The skin color detector 270 is configured to detect human skin colors in tiles of the image frame. Because human skin colors are likely to correspond to regions of interest, those tiles of the image that the skin color detector 270 identifies as containing human skin colors are assigned smaller logical pixel dimensions such that they will be rendered at higher resolution, as described herein.
- The logical pixel dimension identifier 275 is configured to assign logical pixel dimensions for each tile of the N+1 image frame. The logical pixel dimension identifier 275 assigns the logical pixel dimensions based on the motion vector field 225, objects identified based on the geometrical data 265, and human skin colors detected by the skin color detector 270. The logical pixel dimension identifier 275 assigns smaller logical pixel dimensions to those tiles of the N+1 image frame identified as containing objects, little to no motion, and/or human skin color. The logical pixel dimension identifier 275 assigns larger logical pixel dimensions to those tiles of the N+1 image frame identified as containing a greater magnitude of motion and/or a gradient of an image, such as a clear sky. For tiles of the N+1 image frame identified as containing a greater magnitude of motion, the logical pixel dimension identifier 275 assigns a larger logical pixel dimension to an axis of the logical pixel corresponding to the direction of motion. Thus, for example, if the motion vector field 225 indicates downward motion for a tile of the N+1 image frame, the logical pixel dimension identifier 275 assigns a larger logical pixel dimension for logical pixels of that tile along the vertical axis (e.g., 1×2 or 1×4 or 2×4 pixels).
- The rendering processor 230 is configured to render the N+1 image at a variable resolution based on the logical pixel dimensions of each tile. In some embodiments, the rendering processor 230 includes a plurality of shaders (not shown), each of which is a processing element configured to perform specialized calculations and execute certain instructions for rendering computer graphics. For example, in some embodiments, the shaders compute color and other attributes for the pixels included in each logical pixel of a display. In some embodiments, the shaders of the rendering processor 230 are two-dimensional (2D) shaders such as pixel shaders, or three-dimensional shaders such as vertex shaders, geometry shaders, or tessellation shaders, or any combination thereof. In some embodiments, the shaders work in parallel to execute the operations required to render the N+1 image.
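- The disclosure does not state how the skin color detector 270 of FIG. 2 decides that a tile contains human skin colors. Purely as an illustration, the sketch below classifies 8-bit RGB pixels with a simple fixed-threshold rule and flags a tile when a minimum fraction of its pixels match. Every threshold, the minimum fraction, and both function names are assumptions, and a practical detector would likely need a more robust color model.

```python
def is_skin_rgb(r, g, b):
    """Illustrative fixed-threshold test on one 8-bit RGB pixel."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def tile_has_skin(tile_pixels, min_fraction=0.1):
    """Flag a tile if enough of its (r, g, b) pixels pass the test."""
    pixels = [p for row in tile_pixels for p in row]
    skin = sum(1 for (r, g, b) in pixels if is_skin_rgb(r, g, b))
    return skin / len(pixels) >= min_fraction
```

A tile flagged in this way would then be given smaller logical pixel dimensions by the logical pixel dimension identifier 275.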
- FIG. 3 is a block diagram of a motion vector field generator 355 of the motion estimator engine 220 of FIG. 2 according to some embodiments. The motion vector field generator 355 compares corresponding units, or groups of pixels, of the N image 315 and the N−1 image 317 to create a motion vector field 325 of vectors that model the movement of an object from one unit to another across consecutive images. The motion vector field generator 355 may employ a block matching algorithm such as exhaustive search, three step search, simple and efficient search, four step search, diamond search, or other algorithms used in block matching. In the example illustrated in FIG. 3, the motion vector field generator 355 generates a motion vector field 325 containing motion vectors indicating motion, e.g., from unit C4 to unit C6, from unit D4 to unit D6, from unit E4 to unit E6, from unit C8 to unit C10, from unit D8 to unit D10, and from unit E8 to unit E10.
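- The unit-to-unit motion in the FIG. 3 example can be written out as a small motion vector field. The sketch assumes that the letter and the number in a unit label index the two axes of a grid; which of them is horizontal is not stated here, so each vector is reported as (change in number, change in letter), measured in units. The helper names are hypothetical.

```python
def parse_unit(label):
    """Split a label such as 'C10' into (letter index, number)."""
    return (ord(label[0].upper()) - ord("A"), int(label[1:]))


def unit_motion_vector(src, dst):
    """Offset from src to dst as (change in number, change in letter)."""
    s_letter, s_num = parse_unit(src)
    d_letter, d_num = parse_unit(dst)
    return (d_num - s_num, d_letter - s_letter)


# Source and destination units listed for the FIG. 3 example.
moves = [("C4", "C6"), ("D4", "D6"), ("E4", "E6"),
         ("C8", "C10"), ("D8", "D10"), ("E8", "E10")]

# Motion vector field keyed by source unit.
motion_field = {src: unit_motion_vector(src, dst) for src, dst in moves}
```

Under that labeling assumption, all six vectors are (2, 0): each unit moves two positions along the numbered axis and stays on the same lettered line.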
- FIG. 4 is a block diagram of a logical pixel dimension identifier 475 of the rendering processor 230 of FIG. 2 according to some embodiments. The logical pixel dimension identifier 475 assigns logical pixel dimensions to each tile (tile A 401, tile B 402, tile C 403, and tile D 404) of a frame N+1 400. In some embodiments, the tiles analyzed by the logical pixel dimension identifier 475 are the same groups of pixels that are analyzed by the motion vector field generator 355. In some embodiments, the tiles analyzed by the logical pixel dimension identifier 475 are larger or smaller groups of pixels than the units that are analyzed by the motion vector field generator 355. The logical pixel dimension identifier 475 analyzes the motion vector field (not shown), geometric data received from the application executing at the CPU (not shown), and human skin colors detected in the N+1 image frame 400 to assign logical pixel dimensions to each tile of the N+1 image frame 400.
- In the depicted example, the logical pixel dimension identifier 475 identifies the following properties of each of the tiles of the N+1 image frame 400: tile A 401 has motion in a horizontal direction, tile B 402 has motion in a vertical direction, tile C 403 has an object or human skin color, and tile D 404 has non-directional motion (e.g. explosion or expansion/contraction) or diagonal motion and/or is a gradient of the N+1 image frame such as a clear sky. Based on the identified areas of interest, the logical pixel dimension identifier 475 assigns a logical pixel size of 2×1 pixels to tile A 401, a logical pixel size of 1×2 pixels to tile B 402, a logical pixel size of 1×1 pixels to tile C 403, and a logical pixel size of 2×2 pixels to tile D 404. Based on the logical pixel dimensions, the rendering processor (not shown) will render tile A 401 with higher resolution along the vertical axis and lower resolution along the horizontal axis, tile B 402 with a lower resolution along the vertical axis and a higher resolution along the horizontal axis, tile C 403 with a higher resolution along both the vertical and horizontal axes, and tile D 404 with a lower resolution along both the vertical and horizontal axes.
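- To make the per-axis effect of the FIG. 4 assignments concrete, the sketch below counts how many logical pixels are shaded along each axis of a tile. The 8×8 tile size is an assumption chosen only for the arithmetic, logical pixel sizes are read as width×height, and the function name is hypothetical.

```python
# Logical pixel sizes (width, height) assigned in the FIG. 4 example.
TILE_DIMS = {"A": (2, 1), "B": (1, 2), "C": (1, 1), "D": (2, 2)}


def shaded_grid(tile_w, tile_h, lp_w, lp_h):
    """Return (columns, rows) of logical pixels covering the tile."""
    cols = -(-tile_w // lp_w)  # ceiling division
    rows = -(-tile_h // lp_h)
    return cols, rows


# Assumed 8x8-pixel tiles: print the shaded grid and the sample count.
for name, (lp_w, lp_h) in TILE_DIMS.items():
    cols, rows = shaded_grid(8, 8, lp_w, lp_h)
    print(name, cols, rows, cols * rows)
```

For an 8×8 tile this gives 4×8 shaded samples for tile A, 8×4 for tile B, 8×8 for tile C, and 4×4 for tile D, which matches the description of lower resolution along the axis of motion.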
- FIG. 5 is a flow diagram illustrating a method 500 for rendering regions of an image frame at variable resolutions based on a motion vector field and the presence of objects according to some embodiments. The method 500 is implemented in some embodiments of the processing system 100 shown in FIG. 1 and the motion estimator engine 220 and rendering processor 230 shown in FIG. 2.
- At block 502, the motion estimator engine 220 receives the two most recently rendered (N and N−1) images. At block 504, the motion vector field generator 255 of the motion estimator engine 220 compares the N and N−1 images to generate a motion vector field 225. In some embodiments, the motion vector field generator 255 uses a neural network or other predictive algorithm to extrapolate the motion of the motion vector field of the N image to the motion vector field of the N+1 image frame. At block 506, the rendering processor 230 receives geometrical data 265 from the application 232 executing at the CPU 231, from which the rendering processor 230 detects the presence of objects in tiles of the N+1 image frame. At block 508, the logical pixel dimension identifier 275 assigns logical pixel dimensions for each tile of the N+1 image frame, based on the motion vector field 225, the geometrical data 265, and the presence of human skin colors. The logical pixel dimension identifier 275 assigns logical pixel dimensions such that tiles of the N+1 image that are estimated to contain objects or human skin colors or areas of little or no motion have smaller logical pixel dimensions than tiles that are estimated to contain greater magnitudes of motion or no objects. At block 510, the rendering processor 230 renders the N+1 image based on the dimensions of the logical pixels of each tile, rendering the pixels of each logical pixel using the pixel value of the geometrical center of the logical pixel. The method flow then continues back to block 502 for the next image frame.
- A computer readable storage medium includes any non-transitory storage medium, or combination of non-transitory storage media, accessible by a computer system during use to provide instructions and/or data to the computer system. Such storage media include, but are not limited to, optical media (e.g., compact disc (CD), digital versatile disc (DVD), Blu-Ray disc), magnetic media (e.g., floppy disc, magnetic tape, or magnetic hard drive), volatile memory (e.g., random access memory (RAM) or cache), non-volatile memory (e.g., read-only memory (ROM) or Flash memory), or microelectromechanical systems (MEMS)-based storage media. The computer readable storage medium, in one embodiment, is embedded in the computing system (e.g., system RAM or ROM), fixedly attached to the computing system (e.g., a magnetic hard drive), removably attached to the computing system (e.g., an optical disc or Universal Serial Bus (USB)-based Flash memory), or coupled to the computer system via a wired or wireless network (e.g., network accessible storage (NAS)).
- In some embodiments, certain aspects of the techniques described above are implemented by one or more processors of a processing system executing software. The software includes one or more sets of executable instructions stored or otherwise tangibly embodied on a non-transitory computer readable storage medium. The software includes the instructions and certain data that, when executed by the one or more processors, manipulate the one or more processors to perform one or more aspects of the techniques described above. The non-transitory computer readable storage medium includes, for example, a magnetic or optical disk storage device, solid state storage devices such as Flash memory, a cache, random access memory (RAM) or other non-volatile memory device or devices, and the like. The executable instructions stored on the non-transitory computer readable storage medium are implemented, for example, in source code, assembly language code, object code, or other instruction format that is interpreted or otherwise executable by one or more processors.
- Note that not all the activities or elements described above in the general description are required, that a portion of a specific activity or device may not be required, and that one or more further activities may be performed, or elements included, in addition to those described. Still further, the order in which activities are listed is not necessarily the order in which they are performed. Also, the concepts have been described with reference to specific embodiments. However, one of ordinary skill in the art appreciates that various modifications and changes can be made without departing from the scope of the present disclosure as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present disclosure.
- Benefits, other advantages, and solutions to problems have been described above about specific embodiments. However, the benefits, advantages, solutions to problems, and any feature(s) that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as a critical, required, or essential feature of any or all the claims. Moreover, the particular embodiments disclosed above are illustrative only, as the disclosed subject matter may be modified and practiced in different but equivalent manners apparent to those skilled in the art having the benefit of the teachings herein. No limitations are intended to the details of construction or design herein shown, other than as described in the claims below. It is therefore evident that the particular embodiments disclosed above may be altered or modified and all such variations are considered within the scope of the disclosed subject matter. Accordingly, the protection sought herein is as set forth in the claims below.
Claims (21)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/843,532 US20230102620A1 (en) | 2018-11-27 | 2022-06-17 | Variable rate rendering based on motion estimation |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/201,918 US11381825B2 (en) | 2018-11-27 | 2018-11-27 | Variable rate rendering based on motion estimation |
US17/843,532 US20230102620A1 (en) | 2018-11-27 | 2022-06-17 | Variable rate rendering based on motion estimation |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/201,918 Continuation US11381825B2 (en) | 2018-11-27 | 2018-11-27 | Variable rate rendering based on motion estimation |
Publications (1)
Publication Number | Publication Date |
---|---|
US20230102620A1 (en) | 2023-03-30 |
Family
ID=70771051
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/201,918 Active US11381825B2 (en) | 2018-11-27 | 2018-11-27 | Variable rate rendering based on motion estimation |
US17/843,532 Pending US20230102620A1 (en) | 2018-11-27 | 2022-06-17 | Variable rate rendering based on motion estimation |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/201,918 Active US11381825B2 (en) | 2018-11-27 | 2018-11-27 | Variable rate rendering based on motion estimation |
Country Status (1)
Country | Link |
---|---|
US (2) | US11381825B2 (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6385248B1 (en) * | 1998-05-12 | 2002-05-07 | Hitachi America Ltd. | Methods and apparatus for processing luminance and chrominance image data |
US20020141501A1 (en) * | 1998-11-20 | 2002-10-03 | Philips Electronics North America Corporation | System for performing resolution upscaling on frames of digital video |
US20060256855A1 (en) * | 2005-05-16 | 2006-11-16 | Stephen Gordon | Method and system for video classification |
US20150264390A1 (en) * | 2014-03-14 | 2015-09-17 | Canon Kabushiki Kaisha | Method, device, and computer program for optimizing transmission of motion vector related information when transmitting a video stream from an encoder to a decoder |
US20170353672A1 (en) * | 2016-06-07 | 2017-12-07 | Panasonic Intellectual Property Management Co., Ltd. | Imaging device provided with light source, image sensor including first accumulator and second accumulator, and controller |
US20180098089A1 (en) * | 2016-10-04 | 2018-04-05 | Qualcomm Incorporated | Adaptive motion vector precision for video coding |
US20180343448A1 (en) * | 2017-05-23 | 2018-11-29 | Intel Corporation | Content adaptive motion compensated temporal filtering for denoising of noisy video for efficient coding |
US20190281303A1 (en) * | 2018-03-07 | 2019-09-12 | Tencent America LLC | Method and apparatus for video coding |
US20200058152A1 (en) * | 2017-04-28 | 2020-02-20 | Apple Inc. | Video pipeline |
US20200111195A1 (en) * | 2018-10-09 | 2020-04-09 | Valve Corporation | Motion smoothing for re-projected frames |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6512537B1 (en) * | 1998-06-03 | 2003-01-28 | Matsushita Electric Industrial Co., Ltd. | Motion detecting apparatus, motion detecting method, and storage medium storing motion detecting program for avoiding incorrect detection |
US20080247465A1 (en) * | 2007-04-05 | 2008-10-09 | Jun Xin | Method and System for Mapping Motion Vectors between Different Size Blocks |
WO2013089700A1 (en) * | 2011-12-14 | 2013-06-20 | Intel Corporation | Methods, systems, and computer program products for assessing a macroblock candidate for conversion to a skipped macroblock |
US9813721B2 (en) * | 2014-11-20 | 2017-11-07 | Getgo, Inc. | Layer-based video encoding |
US10218975B2 (en) * | 2015-09-29 | 2019-02-26 | Qualcomm Incorporated | Transform precision manipulation in video coding |
US10694187B2 (en) * | 2016-03-18 | 2020-06-23 | Lg Electronics Inc. | Method and device for deriving block structure in video coding system |
TWI610558B (en) * | 2016-05-26 | 2018-01-01 | 晨星半導體股份有限公司 | Bit Allocation Method and Video Encoding Device |
US10169843B1 (en) | 2017-11-20 | 2019-01-01 | Advanced Micro Devices, Inc. | Temporal foveated rendering using motion estimation |
US10609440B1 (en) * | 2018-06-08 | 2020-03-31 | Amazon Technologies, Inc. | Timing data anomaly detection and correction |
Also Published As
Publication number | Publication date |
---|---|
US20200169734A1 (en) | 2020-05-28 |
US11381825B2 (en) | 2022-07-05 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US11270506B2 (en) | Foveated geometry tessellation | |
US8970580B2 (en) | Method, apparatus and computer-readable medium rendering three-dimensional (3D) graphics | |
EP3714435B1 (en) | Temporal foveated rendering using motion estimation | |
US8576225B2 (en) | Seamless fracture in a production pipeline | |
US10937195B2 (en) | Label based approach for video encoding | |
US20180005039A1 (en) | Method and apparatus for generating an initial superpixel label map for an image | |
US8723864B2 (en) | Pre-culling processing method, system and computer readable medium for hidden surface removal of image objects | |
KR20170031480A (en) | Apparatus and Method of rendering | |
US20230102620A1 (en) | Variable rate rendering based on motion estimation | |
EP2908289B1 (en) | Information processing apparatus, generation method, program, and storage medium | |
Arranz et al. | Multiresolution energy minimisation framework for stereo matching | |
US9036089B2 (en) | Practical temporal consistency for video applications | |
US20230134779A1 (en) | Adaptive Mesh Reprojection for Low Latency 6DOF Rendering | |
US20230177763A1 (en) | Method for Adapting the Rendering of a Scene | |
US20200068214A1 (en) | Motion estimation using pixel activity metrics | |
KR20230081098A (en) | 3D human reconstruction method based on orthographic image prediction | |
Shegeda et al. | A gpu-based real-time algorithm for virtual viewpoint rendering from multi-video | |
KR20080031106A (en) | Method for controlling voltage used to processing 3 dimensional graphics data and apparatus using it | |
US20120075288A1 (en) | Apparatus and method for back-face culling using frame coherence | |
JPH11144082A (en) | Image generating device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: ADVANCED MICRO DEVICES, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:FAINSTAIN, EVGENE;WASSON, SCOTT A;SIGNING DATES FROM 20181124 TO 20181207;REEL/FRAME:061088/0487 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: ADVISORY ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |