WO2024118077A1 - Method and apparatus for generating super-resolution image based on sample test geometry information and coverage mask - Google Patents
- Publication number: WO2024118077A1 (application PCT/US2022/051459)
- Authority: WIPO (PCT)
- Prior art keywords: coverage mask, pixel, GPU, sample test, covered
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Definitions
- Embodiments of the present disclosure relate to apparatuses and methods for operating an image signal processor (ISP).
- An image/video capturing device, such as a camera or camera array, can be used to capture an image/video or a picture of a scene.
- Cameras or camera arrays have been included in many handheld devices, especially since the advent of social media that allows users to upload pictures and videos of themselves, friends, family, pets, or landscapes on the Internet with ease and in real-time.
- Examples of camera components that operate together to capture an image/video include lens(es), image sensor(s), ISP(s), and/or encoders, just to name a few components thereof.
- the lens, for example, may receive and focus light onto one or more image sensors that are configured to detect photons. When photons impinge on the image sensor, an image signal corresponding to the scene is generated and sent to the ISP.
- the ISP performs various operations associated with the image signal to generate one or more processed images of the scene that can then be output to a user, stored in memory, or output to the cloud.
- a system for performing image super-resolution may include a graphical processing unit (GPU).
- the GPU may be configured to perform, for each of a plurality of pixels, a sample test to identify whether one or more samples in each pixel quadrant is covered by a coverage mask.
- the GPU may be configured to maintain geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel.
- the GPU may be configured to maintain information associated with the coverage mask used for each pixel during the sample test.
- the system may include a data processing unit (DPU) with an SR engine and an AI model.
- the SR engine may be configured to obtain the geometric information and the information associated with the coverage mask from the GPU.
- the SR engine may be configured to perform an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and the AI model.
- the SR engine may be configured to generate SR image data based on the upscale procedure performed for each of the plurality of pixels.
- an apparatus for wireless communication may include a cellular communication component configured to perform operations associated with cellular communication.
- the apparatus may include a system for performing image SR.
- the system may include a GPU.
- the GPU may be configured to perform, for each of a plurality of pixels, a sample test to identify whether one or more samples in each pixel quadrant are covered by a coverage mask.
- the GPU may be configured to maintain geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel.
- the GPU may be configured to maintain information associated with the coverage mask used for each pixel during the sample test.
- the system may include a DPU with an SR engine and an AI model.
- the SR engine may be configured to obtain the geometric information and the information associated with the coverage mask from the GPU.
- the SR engine may be configured to perform an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and the AI model.
- the SR engine may be configured to generate SR image data based on the upscale procedure performed for each of the plurality of pixels.
- a method of performing image SR may include performing, by a GPU, a sample test for each of a plurality of pixels to identify whether one or more samples in each pixel quadrant is covered by a coverage mask.
- the method may include maintaining, by the GPU, geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel.
- the method may include maintaining, by the GPU, information associated with the coverage mask used for each pixel during the sample test.
- the method may include obtaining, by an SR engine of a DPU, the geometric information and the information associated with the coverage mask from the GPU.
- the method may include performing, by the SR engine of the DPU, an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and an AI model of the DPU.
- the method may include generating, by the SR engine of the DPU, SR image data based on the upscale procedure performed for each of the plurality of pixels.
- FIG. 1 illustrates a block diagram of an exemplary image enhancement system, according to some embodiments of the present disclosure.
- FIG. 2A illustrates a first tile-based rendering pipeline that may be implemented by an image processing system, according to some embodiments of the present disclosure.
- FIG. 2B illustrates a second tile-based rendering pipeline that may be implemented by an image processing system, according to some embodiments of the present disclosure.
- FIG. 3A illustrates a diagram of a single pixel in a low-resolution image, according to some embodiments of the present disclosure.
- FIG. 3B illustrates a diagram of the exemplary 2x2 up-sampled pixels in a high-resolution image, according to some embodiments of the present disclosure.
- FIG. 4 illustrates a diagram of an exemplary degeneration of virtual pixel quadrants for 2X MSAA, according to some embodiments of the present disclosure.
- FIG. 5 illustrates a diagram of an exemplary coverage mask used for 8X MSAA, according to some embodiments of the disclosure.
- FIG. 6 illustrates a block diagram of an exemplary apparatus that generates SR images according to some embodiments of the present disclosure.
- FIG. 7 illustrates a flowchart of an exemplary method of generating an SR image, according to some embodiments of the present disclosure.
- FIG. 8 is a block diagram illustrating an example of a computer system useful for implementing various embodiments set forth in the disclosure.
- terminology may be understood at least in part from usage in context.
- the term “one or more” as used herein, depending at least in part upon context may be used to describe any feature, structure, or characteristic in a singular sense or may be used to describe combinations of features, structures or characteristics in a plural sense.
- terms, such as “a,” “an,” or “the,” again, may be understood to convey a singular usage or to convey a plural usage, depending at least in part upon context.
- the term “based on” may be understood as not necessarily intended to convey an exclusive set of factors and may, instead, allow for existence of additional factors not necessarily expressly described, again, depending at least in part on context.
- the term “camera” is used herein to refer to an image capture device or other data acquisition device.
- a data acquisition device can be any device or system for acquiring, recording, measuring, estimating, determining and/or computing data representative of a scene, including but not limited to two-dimensional image data, three-dimensional image data, and/or light field data.
- Such a data acquisition device may include optics, sensors, and image processing electronics for acquiring data representative of a scene, using techniques that are well known in the art.
- One skilled in the art will recognize that many types of data acquisition devices can be used in connection with the present disclosure, and that the present disclosure is not limited to cameras.
- the use of the term “camera” herein is intended to be illustrative and exemplary, but should not be considered to limit the scope of the present disclosure. Specifically, any use of such term herein should be considered to refer to any suitable data acquisition device.
- Image enhancement is one of the most important computer vision applications.
- Image enhancement units are generally deployed in all cameras, such as mobile phones or digital cameras.
- Image enhancement is a challenging problem since it consists of multiple units that perform various operations.
- These conventional units may include, e.g., a super resolution (SR) unit, a denoising unit, and/or a high dynamic range (HDR) unit.
- SR is a type of image enhancement that produces high-resolution frames from lower-resolution rendering inputs.
- SR can boost framerate without a major loss of image quality. In fact, in some cases, the image quality can be visually enhanced.
- a first existing technique uses an artificial intelligence (AI) algorithm to generate high-resolution images via a tensor core.
- the first existing technique uses a deep learning neural network that leverages the power of the tensor core to increase frame rates and produce image sharpness during super resolution upscaling.
- for a video game to obtain DLSS support, it first uses a supercomputer for training its respective AI networks. This is accomplished by providing the network with tens of thousands of frame image captures in high-resolution with super-sample anti-aliasing. The network then compares the images to learn how to upscale lower quality source frames to approximate the quality of high-resolution images.
- the first existing technique requires a very large AI engine, e.g., such as tensor cores.
- in a mobile GPU, applying this first existing technique is not feasible, and large AI cores may not be integrated into a mobile SoC any time soon due to the constraints of power consumption on mobile GPU architecture. For example, on mobile devices with only a 5 watts (W) or even 1 W power budget, a large AI processor is impossible to implement.
- a second existing technique uses certain spatial upscaling algorithms to provide similar results.
- the second existing technique uses an edge detection method, which is called Edge Adaptive Spatial Up-Sampling, plus a collection of other image sharpness and post-processing algorithms, to perform super resolution.
- Video games that implement this technique plug special filter programs into the rendering pipeline after the anti-aliased image. To achieve better results, the image is in perceptual color space.
- Some other image processing, e.g., chromatic aberration and heads-up display (HUD), may be performed after FSR on the already-up-scaled frame buffer.
- the second existing technique uses several image filtering steps using kernels.
- both the first and second existing techniques can provide applications with an image up-scaling solution while keeping decent image quality, they come with a higher cost either in hardware design or computational capacity, which may not be available on mobile devices due to their low-power SoC design.
- the present disclosure provides an exemplary SoC that includes a GPU and a data processing unit (DPU).
- the DPU may be configured to process image data, audio data, video data, etc.
- the DPU may be configured to upscale the GPU rendered image data via a special pre-trained AI module.
- the upscaling is applied to GPU rendered frame buffer content read from the GPU.
- the upscaled image is sent to the display device directly for presentation.
- the actual rendered image, which is a quarter of the display resolution, is resolved to a frame buffer in a DDR memory.
- the GPU of the exemplary SoC may include a tile-based rendering architecture, and the upscaling by the DPU may be performed at the tile-level.
- any upscaling latency may be made up for at the system-level.
- the exemplary SoC of the present disclosure generates a 2x2 sub-pixel sampling coverage mask for each screen pixel during GPU rendering. These coverage masks may be one of the inputs used for training and inference in the DPU.
- the coverage mask can be saved into local GPU memory (e.g., a sub-pixel coverage masks buffer).
- the coverage masks may be sent to an SR engine in the DPU for upscaling. It only costs a small amount of memory to hold these bits.
- the on-screen geometry information obtained by the GPU using the coverage masks may be maintained and provided to the SR engine.
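To put the memory cost of these coverage bits in perspective, the short calculation below assumes 4 coverage bits per rendered pixel and uses an illustrative quarter-resolution frame and tile size; neither figure is specified by the disclosure.

```python
# Illustrative memory-cost estimate for the 2x2 sub-pixel coverage masks.
# The frame and tile sizes below are assumptions, not values from the disclosure.
def coverage_mask_bytes(width: int, height: int, bits_per_pixel: int = 4) -> int:
    """Bytes needed to hold the per-pixel coverage bits for a width x height region."""
    return (width * height * bits_per_pixel + 7) // 8

full_frame = coverage_mask_bytes(1920, 1080)  # e.g., quarter-size render target for a 4K display
per_tile = coverage_mask_bytes(32, 32)        # e.g., one 32x32 tile of a tile-based GPU

print(f"1920x1080 frame: {full_frame / 1024:.1f} KiB")  # ~1012.5 KiB
print(f"32x32 tile: {per_tile} bytes")                  # 512 bytes
```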
- FIG. 1 illustrates an exemplary block diagram of an exemplary image processing system 100 (referred to hereinafter as “image processing system 100”), according to some embodiments of the present disclosure.
- FIG. 2A illustrates a first tile-based rendering pipeline 200 that may be implemented by image processing system 100 of FIG. 1, according to some embodiments of the present disclosure.
- FIG. 2B illustrates a second tile-based rendering pipeline 250 that may be implemented by image processing system 100 of FIG. 1, according to some embodiments of the present disclosure.
- FIGs. 1, 2A, and 2B will be described together.
- image processing system 100 may include a GPU 102, a DPU 104, a display 130, and a frame buffer in DDR memory 140, which may be implemented on the same or different SoCs.
- GPU 102 may include, e.g., a per-tile parameter buffer 110, a pixel shader 112, an on-chip tile buffer 114, and a sub-pixel coverage masks buffer 116.
- DPU 104 may include, e.g., an SR engine 120 and a pre-trained AI model component 122.
- GPU 102 may render tiles associated with an image.
- Tile-based rendering may be achieved using different GPU architectures. For example, some may have a single-pass rendering architecture, as depicted in FIG. 2A, while others may have a two-pass architecture, as depicted in FIG. 2B.
- the corresponding tile list is retrieved by raster/tiler 202 from system memory (not shown).
- the tile list may identify which screen-space primitive data to fetch.
- the raster/tiler 202 may only fetch screen-space position data for the geometry within the tile.
- Depth test component 204 may perform hidden surface removal (HSR), along with a depth test. Information associated with the HSR and/or depth test may be maintained at depth buffer 206.
- Per-tile parameter buffer 110 may maintain information about which pixels are covered by a triangle mask, along with any geometry associated with the pixel and coverage mask, as described below in connection with FIG. 3A.
- Pixel shader 112 may apply coloring operations, e.g., such as fragment shaders, to the visible pixels.
- SR engine 120 may write its color data to the frame buffer in DDR memory 140.
- Display 130 may display the image once the SR rendering for all tiles of the image is complete.
- the GPU hardware processes all triangle positions (e.g., coverage mask positions with respect to a pixel) along with depth and stencil tests before the pixel shading stage. Only the intermediate data from those visible triangles are maintained in on-chip tile buffer 114. The hidden triangles will be removed from the later pixel shading stage by pixel shader 112. When 4X MSAA is enabled, the GPU hardware will perform the coverage test of four samples once for each screen pixel, as described below in connection with FIGs. 3A and 3B.
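The single-pass flow above can be summarized with a small software model. Everything in the sketch below (function names, data structures, the toy hidden-surface-removal step) is invented for illustration; the real stages are fixed-function GPU hardware, and only the overall data flow, per-pixel coverage bits plus geometry kept for the DPU's SR engine, mirrors the description.

```python
# Schematic, runnable model of the per-tile flow of FIG. 2A with the coverage-mask path.
# All names and structures are illustrative; they are not identifiers from the disclosure.
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]

def hidden_surface_removal(primitives: List[dict]) -> List[dict]:
    """Toy stand-in for the depth test / HSR stage: keep only the front-most primitive."""
    return sorted(primitives, key=lambda p: p["depth"])[:1]

def sample_test(pixel: Pixel, primitive: dict) -> int:
    """Toy stand-in for the 4X MSAA sample test: 4 coverage bits for this pixel."""
    return primitive["coverage"].get(pixel, 0)

def render_tile(tile_pixels: List[Pixel], primitives: List[dict]):
    visible = hidden_surface_removal(primitives)
    on_chip_tile_buffer: Dict[Pixel, dict] = {}        # geometry info for the SR engine
    sub_pixel_coverage_masks: Dict[Pixel, int] = {}    # coverage bits for the SR engine
    for pixel in tile_pixels:
        sub_pixel_coverage_masks[pixel] = sample_test(pixel, visible[0])
        on_chip_tile_buffer[pixel] = {"geometry": visible[0]["id"], "color": visible[0]["color"]}
    return on_chip_tile_buffer, sub_pixel_coverage_masks

# Toy 2x2 tile with one visible triangle that only partially covers pixel (1, 1).
primitives = [{"id": "tri0", "depth": 0.2, "color": (255, 0, 0),
               "coverage": {(0, 0): 0b1111, (0, 1): 0b1111, (1, 0): 0b1111, (1, 1): 0b0011}}]
tile_buffer, masks = render_tile([(0, 0), (0, 1), (1, 0), (1, 1)], primitives)
print(masks)  # {(0, 0): 15, (0, 1): 15, (1, 0): 15, (1, 1): 3}
```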
- FIG. 3A illustrates a diagram 300 of a single pixel in a low-resolution image, according to some embodiments of the present disclosure.
- FIG. 3B illustrates a diagram 350 of the exemplary 2x2 up-sampled pixels in a high-resolution image, according to some embodiments of the present disclosure.
- FIGs. 3A and 3B will be described together.
- the GPU hardware may perform the coverage test of four samples 320, e.g., once for each pixel 302a.
- the sample test may include partitioning pixel 302a into four quadrants based on the pixel’s center 310, and then applying a coverage mask 304 thereto.
- the coverage test may be used to identify covered sample(s) 320 and uncovered sample(s) 330.
- the geometric position of each of the covered sample(s) 320 and uncovered sample(s) 330, and the position of coverage mask 304 with respect to pixel 302a, may be an output of the coverage test, for example.
- Coverage mask 304 can be sent to sub-pixel coverage mask buffer 116 after the sample test is performed by the GPU hardware (not shown).
- the geometry information obtained from the coverage test and the coverage masks may be sent to SR engine 120 by the on-chip tile buffer 114 and the sub-pixel coverage mask buffer 116, respectively.
- SR engine 120 may perform upscaling of the pixel based on the geometry information, information associated with coverage mask 304, and the pre-trained AI model.
- SR engine 120 may upscale a low-resolution image to a 2x2 scaled high-resolution image. Therefore, SR engine 120 saves 2x2 coverage bits, one for each sub-pixel 302b.
- FIG. 3B illustrates how these sub-pixel samples are mapped to 2x2 super-samples in a high-resolution image. For example, assume the center 310 of pixel 302a is at offset (0.5, 0.5) in screen position. After SR, one pixel 302a (see FIG. 3A) will be mapped to four sub-pixels 302b in a high-resolution image (see FIG. 3B).
- the coverage bits are 1s for the covered samples 320 and 0s for the uncovered samples 330.
- one pixel is described using four bits.
- a 2x2 quad in a low-resolution image will require 2 bytes of storage to save the result.
- Coverage mask 304 may indicate to SR engine 120 whether the triangle covers all the sub-pixels 302b in the high-resolution image, or just partially covers some of them. Using this approximation, SR engine 120 may distinguish the sub-pixels 302b with covered samples 320 (e.g., inner triangle pixels) from the sub-pixels 302b with uncovered samples 330 (e.g., edge pixels) based on the 1s and 0s received from on-chip tile buffer 114.
- SR engine 120 will use larger weights for neighboring pixels during color calculation for those samples that fail the coverage test.
- the upscaled SR image is sent to display 130 directly for presentation.
- the actual rendered image, which is a quarter of the display resolution, is resolved to the frame buffer in DDR memory 140.
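A minimal software model of the 4X MSAA sample test and the 2-byte-per-quad packing described above is sketched below. The edge-function point-in-triangle test, the quadrant sample offsets, and the packing order are assumptions for illustration; real GPU sample positions and rasterization rules differ.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def edge(a: Point, b: Point, p: Point) -> float:
    """Signed edge function: positive when p lies to the left of edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def inside_triangle(p: Point, tri: Tuple[Point, Point, Point]) -> bool:
    """Point-in-triangle via edge functions (assumes counter-clockwise winding)."""
    a, b, c = tri
    return edge(a, b, p) >= 0 and edge(b, c, p) >= 0 and edge(c, a, p) >= 0

# One illustrative sample per quadrant near offsets (+-0.25, +-0.25) from the pixel
# center; actual MSAA sample positions are implementation-defined.
QUADRANT_OFFSETS = [(-0.25, -0.25), (0.25, -0.25), (-0.25, 0.25), (0.25, 0.25)]  # Q0..Q3

def coverage_mask_4x(px: int, py: int, tri: Tuple[Point, Point, Point]) -> int:
    """Return 4 coverage bits for pixel (px, py): bit i is 1 if the Qi sample is covered."""
    cx, cy = px + 0.5, py + 0.5  # pixel center at offset (0.5, 0.5)
    mask = 0
    for i, (dx, dy) in enumerate(QUADRANT_OFFSETS):
        if inside_triangle((cx + dx, cy + dy), tri):
            mask |= 1 << i
    return mask

def pack_quad(masks: List[int]) -> bytes:
    """Pack the 4-bit masks of a 2x2 pixel quad into the 2 bytes of storage noted above."""
    return bytes([(masks[1] << 4) | masks[0], (masks[3] << 4) | masks[2]])

# Example: a triangle that covers only part of pixel (0, 0).
tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
print(bin(coverage_mask_4x(0, 0, tri)))                   # 0b111: Q0-Q2 covered, Q3 uncovered
print(pack_quad([0b1111, 0b0111, 0b0011, 0b0001]).hex())  # '7f13'
```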
- FIG. 4 illustrates a diagram 400 of a degeneration of virtual pixel quadrants for 2X MSAA, according to some embodiments of the present disclosure.
- the four virtual quadrants Q0, Q1, Q2, Q3, as shown in FIG. 4, are degenerated to either a top-bottom or a left-right coverage mask 404, depending on whether the sample 420 locations are Y-major or X-major.
- in the case of top-bottom coverage, the top two quadrants have the same coverage value, and the bottom two quadrants likewise share the same value. Similarly, in the left-right case, the left two quadrants share one coverage value and the right two quadrants share another.
- FIG. 5 illustrates a diagram 500 of an exemplary coverage mask used for 8X MSAA, according to some embodiments of the disclosure.
- for 8X or 16X MSAA, more than one sample 520 in each virtual quadrant will be used for the coverage test.
- the test result of each quadrant will simply be the OR of the coverage bits of the samples in that quadrant.
- an uncovered sample 530 in quadrant Q1 fails the coverage test, while a covered sample 520 passes the coverage test.
- the coverage bit for Q1 is therefore 1.
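Following the degeneration rule for 2X MSAA and the OR rule for 8X/16X MSAA above, the per-quadrant coverage bits can be reduced to the same four-bit form. The sketch below is illustrative only; the quadrant indexing (Q0/Q1 on top, Q2/Q3 on the bottom) and the sample-to-quadrant assignment are assumptions.

```python
def quadrant_bits_8x(sample_bits, sample_to_quadrant):
    """8X/16X MSAA: OR together the coverage bits of the samples that fall in each quadrant."""
    quads = [0, 0, 0, 0]
    for covered, q in zip(sample_bits, sample_to_quadrant):
        quads[q] |= int(covered)  # a quadrant is covered if any of its samples is covered
    return quads

def quadrant_bits_2x(sample_bits, y_major: bool):
    """2X MSAA: the two samples degenerate the quadrants into top/bottom or left/right pairs."""
    s0, s1 = (int(b) for b in sample_bits)
    if y_major:                   # top-bottom coverage: Q0/Q1 share s0, Q2/Q3 share s1
        return [s0, s0, s1, s1]
    return [s0, s1, s0, s1]       # left-right coverage: Q0/Q2 share s0, Q1/Q3 share s1

# Example matching FIG. 5: one covered sample in Q1 is enough to set Q1's coverage bit.
sample_bits = [0, 0, 1, 0, 0, 0, 0, 0]           # illustrative 8X sample results
sample_to_quadrant = [0, 0, 1, 1, 2, 2, 3, 3]    # illustrative two-samples-per-quadrant layout
print(quadrant_bits_8x(sample_bits, sample_to_quadrant))  # [0, 1, 0, 0] -> coverage bit for Q1 is 1
```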
- FIG. 6 illustrates an exemplary block diagram of an apparatus 600 having an image signal processor (ISP), according to some embodiments.
- Apparatus 600 may include an application processor (AP) 660, an ISP 620, a memory 618, and input-output devices 608.
- ISP 620 may include an image enhancement unit 602 and a camera 606.
- Input-output device 608 may include user input devices 610, inertial measurement unit (IMU) 612 (e.g., translational motion sensor, rotational motion sensor, accelerometer, gyroscope, etc.), display and audio devices 614, and wireless communication devices 616.
- apparatus 600 may be an image capturing device, such as a smartphone or digital camera.
- AP 660 may be the main application processor of apparatus 600, and may host the operating system (OS) of apparatus 600 and all the applications.
- AP 660 may be any kind of general-purpose processor, such as a microprocessor, a microcontroller, a digital signal processor, or a central processing unit, and other needed integrated circuits such as glue logic.
- the term “processor” may refer to a device having one or more processing units or elements, e.g., a central processing unit (CPU) with multiple processing cores.
- AP 660 may be used to control the operations of apparatus 600 by executing instructions stored in memory 618, which can be in the same chip as AP 660 or in a separate chip from AP 660.
- AP 660 may also generate control signals and transmit them to various parts of apparatus 600 to control and monitor the operations of these parts.
- AP 660 can run the OS of apparatus 600, control the communications between the user and apparatus 600, and control the operations of various applications.
- AP 660 may be coupled to a communications circuitry and execute software to control the wireless communications functionality of apparatus 600.
- AP 660 may be coupled to ISP 620 and input-output devices 608 to control the processing and display of sensor data, e.g., image data, one or more frames, HDR images, low dynamic range (LDR) images, etc.
- ISP 620 may include software and/or hardware operatively coupled to AP 660 and input-output devices 608. In some embodiments, components, e.g., circuitry, of ISP 620 may be integrated on a single chip. In some embodiments, ISP 620 includes image processing hardware coupled to (e.g., placed between) AP 660 and image enhancement unit 602/camera 606. ISP 620 may include suitable circuitry that, when controlled by AP 660, performs functions not supported by AP 660, e.g., generating an enhanced image using the exemplary image enhancement model described above in connection with FIGs. 1, 2A, and 2B.
- ISP 620 may include a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), a microprocessor, a microcontroller, a digital signal processor, and other needed integrated circuits for its purposes.
- Image enhancement unit 602 may include GPU 102 and DPU 104 of FIG. 1.
- image enhancement unit 602 may include one or more of the units of system 100 when implemented to generate an SR image.
- image enhancement unit 602 may receive an image from camera 606. Based on the geometric information from the coverage tests, the coverage masks, and a pre-trained AI model, image enhancement unit 602 may upscale a low-resolution image into an SR image.
- FIG. 7 illustrates a flowchart of an exemplary method 700 of generating an SR image, according to some embodiments of the present disclosure.
- Exemplary method 700 may be performed by a system, e.g., such as system 100, GPU 102, DPU 104, per-tile parameter buffer 110, pixel shader 112, on-chip tile buffer 114, sub-pixel coverage masks buffer 116, SR engine 120, pre-trained AI model component 122, display 130, frame buffer in DDR memory 140, image enhancement unit 602, camera 606, ISP 620, and/or computer system 800.
- Method 700 may include steps 702-712 as described below. It is to be appreciated that some of the steps may be optional, and some of the steps may be performed simultaneously, or in a different order than shown in FIG. 7.
- the system may perform, by a GPU, a sample test for each of a plurality of pixels to identify whether one or more samples in each pixel quadrant is covered by a coverage mask.
- the GPU hardware may perform the coverage test of four samples 320, e.g., once for each pixel 302a.
- the sample test may include partitioning pixel 302a into four quadrants based on the pixel’s center 310, and then applying a coverage mask 304 thereto.
- the coverage test may be used to identify covered sample(s) 320 and uncovered sample(s) 330.
- the system may maintain, by the GPU, geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel.
- the geometric position of each of the covered sample(s) 320 and uncovered sample(s) 330, and the position of coverage mask 304 with respect to pixel 302a, may be an output of the coverage test, for example.
- the geometric position of each of the covered sample(s) and uncovered sample(s) 330 may be maintained by on-chip tile buffer 114.
- the system may maintain, by the GPU, information associated with the coverage mask used for each pixel during the sample test.
- coverage mask 304 can be sent to sub-pixel coverage mask buffer 116 after the sample test is performed by the GPU hardware (not shown).
- the system may obtain, by an SR engine of a DPU, the geometric information and the information associated with the coverage mask from the GPU.
- the geometry information obtained from the coverage test and the coverage masks may be sent to SR engine 120 by the on-chip tile buffer 114 and the sub-pixel coverage masks buffer 116, respectively.
- the system may perform, by the SR engine of the DPU, an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and an AI model of the DPU.
- SR engine 120 may perform upscaling of the pixel based on the geometry information, information associated with coverage mask 304, and the pre-trained AI model.
- SR engine 120 may upscale a low-resolution image to a 2x2 scaled high-resolution image. Therefore, SR engine 120 saves 2x2 coverage bits, one for each sub-pixel 302b.
- FIG. 3B illustrates how these sub-pixel samples are mapped to 2x2 super-samples in a high-resolution image.
- the center 310 of pixel 302a is at offset (0.5, 0.5) in screen position.
- one pixel 302a (see FIG. 3A) will be mapped to four sub-pixels 302b in a high-resolution image (see FIG. 3B).
- This is the equivalent of a (±0.25, ±0.25) offset from the center 310 of pixel 302a.
- the number of MSAA samples generated within pixel 302a is up to the GPU implementation. Usually, the four samples are not exactly at offsets (±0.25, ±0.25) from the center.
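As a worked example of this nominal mapping (the exact sample positions, as noted above, are up to the GPU implementation), the snippet below converts one low-resolution pixel into the centers of its four 2x2 sub-pixels using the (±0.25, ±0.25) offsets:

```python
def subpixel_centers(px: int, py: int):
    """Map low-resolution pixel (px, py) to its 2x2 sub-pixel centers after 2x upscaling.

    The low-resolution pixel center sits at (px + 0.5, py + 0.5); each sub-pixel center
    is offset by (+-0.25, +-0.25) from it, which corresponds to the centers of the four
    high-resolution pixels (2*px + i, 2*py + j) expressed in low-resolution coordinates.
    """
    cx, cy = px + 0.5, py + 0.5
    return [(cx + dx, cy + dy) for dy in (-0.25, 0.25) for dx in (-0.25, 0.25)]

print(subpixel_centers(3, 7))
# [(3.25, 7.25), (3.75, 7.25), (3.25, 7.75), (3.75, 7.75)]
# In high-resolution coordinates these are the centers of pixels (6, 14) through (7, 15).
```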
- Coverage mask 304 may indicate to SR engine 120 whether the triangle covers all the sub-pixels 302b (e.g., up-scaled pixels) in the high-resolution image, or just partially cover some of them.
- SR engine 120 may distinguish those sub-pixels 302b with covered samples 320 (e.g., inner triangle pixels) and those sub-pixels 302b with uncovered samples 330 (e.g., edge pixels) based on the 1s and 0s received from on-chip tile buffer 114.
- the system may generate, by the SR engine of the DPU, SR image data based on the upscale procedure performed for each of the plurality of pixels.
- the coverage bit only implies whether a sub-pixel is covered by the same triangle as the center pixel in the low-resolution image.
- SR engine 120 will use larger weights for neighboring pixels during color calculation for those samples that fail the coverage test.
- the upscaled SR image is generated based on the upscaled pixels and is sent to display 130 directly for presentation.
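One way to read the weighting rule above is as a coverage-guided blend: a covered sub-pixel keeps the color of its source pixel, while an uncovered (edge) sub-pixel borrows more heavily from neighboring low-resolution pixels. The NumPy sketch below is a hand-written approximation of that idea, not the pre-trained AI model in the DPU; the 4-neighborhood and the blend weight are arbitrary illustrative choices.

```python
import numpy as np

def upscale_2x_coverage_aware(lr: np.ndarray, coverage: np.ndarray,
                              edge_neighbor_weight: float = 0.5) -> np.ndarray:
    """2x2 upscale of a float (H, W, 3) image guided by per-sub-pixel coverage bits.

    coverage has shape (H, W, 2, 2): 1 means the sub-pixel is covered by the same
    triangle as its source pixel, 0 means it is an edge sub-pixel. Covered sub-pixels
    copy their source pixel; uncovered ones blend in the 4-neighbor average.
    """
    h, w, _ = lr.shape
    padded = np.pad(lr, ((1, 1), (1, 1), (0, 0)), mode="edge")
    neighbor_avg = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                    padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
    blended = (1 - edge_neighbor_weight) * lr + edge_neighbor_weight * neighbor_avg
    hr = np.empty((2 * h, 2 * w, 3), dtype=lr.dtype)
    for j in range(2):           # sub-pixel row within the 2x2 block
        for i in range(2):       # sub-pixel column within the 2x2 block
            cov = coverage[:, :, j, i, None].astype(lr.dtype)
            hr[j::2, i::2] = cov * lr + (1 - cov) * blended
    return hr

lr = np.random.rand(4, 4, 3).astype(np.float32)
cov = np.ones((4, 4, 2, 2), dtype=np.uint8)        # all sub-pixels covered
print(upscale_2x_coverage_aware(lr, cov).shape)    # (8, 8, 3); all-covered reduces to replication
```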
- Various embodiments can be implemented, for example, using one or more computer systems, such as computer system 800 shown in FIG. 8.
- One or more computer systems 800 can be used, for example, to implement method 700 of FIG. 7.
- computer system 800 can generate SR image data and/or train an artificial neural network model for image super-resolution, according to various embodiments.
- Computer system 800 can be any computer capable of performing the functions described herein.
- Computer system 800 includes one or more processors (also called central processing units, or CPUs), such as a processor 804.
- processor 804 is connected to a communication infrastructure 806 (e.g., a bus).
- One or more processors 804 may each be a graphics processing unit (GPU).
- a GPU is a processor that is a specialized electronic circuit designed to process mathematically intensive applications.
- the GPU may have a parallel structure that is efficient for parallel processing of large blocks of data, such as mathematically intensive data common to computer graphics applications, images, videos, etc.
- Computer system 800 also includes user input/output device(s) 803, such as monitors, keyboards, pointing devices, etc., that communicate with communication infrastructure 806 through user input/output interface(s) 802.
- Computer system 800 also includes a main or primary memory 808, such as random access memory (RAM).
- Main memory 808 may include one or more levels of cache.
- Main memory 808 has stored therein control logic (i.e., computer software) and/or data.
- Computer system 800 may also include one or more secondary storage devices or memory 810.
- Secondary memory 810 may include, for example, a hard disk drive 812 and/or a removable storage device or drive 814.
- Removable storage drive 814 may be a floppy disk drive, a magnetic tape drive, a compact disk drive, an optical storage device, tape backup device, and/or any other storage device/drive.
- Removable storage drive 814 may interact with a removable storage unit 818.
- Removable storage unit 818 includes a computer usable or readable storage device having stored thereon computer software (control logic) and/or data.
- Removable storage unit 818 may be a floppy disk, magnetic tape, compact disk, DVD, optical storage disk, and/or any other computer data storage device.
- Removable storage drive 814 reads from and/or writes to removable storage unit 818 in a well-known manner.
- secondary memory 810 may include other means, instrumentalities or other approaches for allowing computer programs and/or other instructions and/or data to be accessed by computer system 800.
- Such means, instrumentalities or other approaches may include, for example, a removable storage unit 822 and an interface 820.
- the removable storage unit 822 and the interface 820 may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM or PROM) and associated socket, a memory stick and universal serial bus (USB) port, a memory card and associated memory card slot, and/or any other removable storage unit and associated interface.
- Computer system 800 may further include a communication or network interface 824.
- Communication interface 824 enables computer system 800 to communicate and interact with any combination of remote devices, remote networks, remote entities, etc. (individually and collectively referenced as 828).
- communication interface 824 may allow computer system 800 to communicate with remote devices 828 over communication path 826, which may be wired and/or wireless, and which may include any combination of LANs, WANs, the Internet, etc.
- Control logic and/or data may be transmitted to and from computer system 800 via communication path 826.
- a tangible apparatus or article of manufacture comprising a tangible computer useable or readable medium having control logic (software) stored thereon is also referred to herein as a computer program product or program storage device.
- control logic when executed by one or more data processing devices (such as computer system 800), causes such data processing devices to operate as described herein.
- the functions described herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or encoded as instructions or code on a non- transitory computer-readable medium.
- Computer-readable media includes computer storage media. Storage media may be any available media that can be accessed by a computing device, such as system 100 in FIG. 1.
- such computer-readable media can include random-access memory (RAM), read-only memory (ROM), EEPROM, compact disc read-only memory (CD-ROM) or other optical disk storage, hard disk drive (HDD), such as magnetic disk storage or other magnetic storage devices, Flash drive, solid state drive (SSD), or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a processing system, such as a mobile device or a computer.
- Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), and floppy disk, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
- a system for performing image SR may include a GPU.
- the GPU may be configured to perform, for each of a plurality of pixels, a sample test to identify whether one or more samples in each pixel quadrant is covered by a coverage mask.
- the GPU may be configured to maintain geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel.
- the GPU may be configured to maintain information associated with the coverage mask used for each pixel during the sample test.
- the system may include a DPU with an SR engine and an AI model.
- the SR engine may be configured to obtain the geometric information and the information associated with the coverage mask from the GPU.
- the SR engine may be configured to perform an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and the AI model.
- the SR engine may be configured to generate SR image data based on the upscale procedure performed for each of the plurality of pixels.
- the system may include a display device configured to display the SR image data.
- the system may include a frame buffer in a DDR memory configured to maintain original image data associated with the plurality of pixels.
- the sample test is associated with a 4X MSAA process.
- the GPU may be configured to identify whether one sample in each pixel quadrant is covered by the coverage mask.
- the sample test may be associated with an 8X MSAA process.
- the GPU may be configured to identify whether two samples in each pixel quadrant are covered by the coverage mask.
- the sample test is associated with a 2X MSAA process.
- the GPU may be configured to identify whether one sample in each pixel located in the top two or the bottom two quadrants are covered by the coverage mask.
- the sample test may be associated with a 2X MSAA process.
- the GPU may be configured to identify whether one sample in each pixel located in the left two or the right two quadrants are covered by the coverage mask.
- an apparatus for wireless communication may include a cellular communication component configured to perform operations associated with cellular communication.
- the apparatus may include a system for performing image SR.
- the system may include a GPU.
- the GPU may be configured to perform, for each of a plurality of pixels, a sample test to identify whether one or more samples in each pixel quadrant are covered by a coverage mask.
- the GPU may be configured to maintain geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel.
- the GPU may be configured to maintain information associated with the coverage mask used for each pixel during the sample test.
- the system may include a DPU with an SR engine and an AI model.
- the SR engine may be configured to obtain the geometric information and the information associated with the coverage mask from the GPU.
- the SR engine may be configured to perform an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and the AI model.
- the SR engine may be configured to generate SR image data based on the upscale procedure performed for each of the plurality of pixels.
- the system may include a display device configured to display the SR image data.
- the system may include a frame buffer in a DDR memory configured to maintain original image data associated with the plurality of pixels.
- the sample test is associated with a 4X MSAA process.
- the GPU may be configured to identify whether one sample in each pixel quadrant is covered by the coverage mask.
- the sample test may be associated with an 8X MSAA process.
- the GPU may be configured to identify whether two samples in each pixel quadrant are covered by the coverage mask.
- the sample test is associated with a 2X MSAA process.
- the GPU may be configured to identify whether one sample in each pixel located in the top two or the bottom two quadrants are covered by the coverage mask.
- the sample test may be associated with a 2X MSAA process.
- the GPU may be configured to identify whether one sample in each pixel located in the left two or the right two quadrants are covered by the coverage mask.
- a method of performing image SR may include performing, by a GPU, a sample test for each of a plurality of pixels to identify whether one or more samples in each pixel quadrant is covered by a coverage mask.
- the method may include maintaining, by the GPU, geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel.
- the method may include maintaining, by the GPU, information associated with the coverage mask used for each pixel during the sample test.
- the method may include obtaining, by an SR engine of a DPU, the geometric information and the information associated with the coverage mask from the GPU.
- the method may include performing, by the SR engine of the DPU, an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and an AI model of the DPU.
- the method may include generating, by the SR engine of the DPU, SR image data based on the upscale procedure performed for each of the plurality of pixels.
- the method may include displaying, by a display device, the SR image data.
- the method may include maintaining, by a frame buffer in a DDR memory, original image data associated with the plurality of pixels.
- the sample test may be associated with a 4X MSAA process.
- the performing, by the GPU, the sample test for each of the plurality of pixels to identify whether one or more samples in each pixel quadrant are covered by the coverage mask may include identifying whether one sample in each pixel quadrant is covered by the coverage mask.
- the sample test may be associated with an 8X MSAA process.
- the performing, by the GPU, the sample test for each of the plurality of pixels to identify whether one or more samples in each pixel quadrant are covered by the coverage mask may include identifying whether two samples in each pixel quadrant are covered by the coverage mask.
- the sample test may be associated with a 2X MSAA process.
- the performing, by the GPU, the sample test for each of the plurality of pixels to identify whether one or more samples in each pixel quadrant is covered by the coverage mask may include identifying whether one sample in each pixel located in the top two or the bottom two quadrants are covered by the coverage mask.
- the performing, by the GPU, the sample test for each of the plurality of pixels to identify whether one or more samples in each pixel quadrant are covered by the coverage mask may include identifying whether one sample in each pixel located in the left two or the right two quadrants are covered by the coverage mask.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Image Processing (AREA)
Abstract
According to one aspect of the present disclosure, a system for performing image super-resolution (SR) is provided. The system may include a graphical processing unit (GPU). The GPU may perform, for each of a plurality of pixels, a sample test to identify whether one or more samples in each pixel quadrant are covered by a coverage mask. The GPU may maintain geometric information associated with each of the samples. The system may include a data processing unit (DPU) with an SR engine and an AI model. The SR engine may obtain the geometric information and the information associated with the coverage mask from the GPU. The SR engine may perform an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and the AI model. The SR engine may generate SR image data based on the upscale procedure.
Description
METHOD AND APPARATUS FOR GENERATING SUPER-RESOLUTION IMAGE BASED ON SAMPLE TEST GEOMETRY INFORMATION AND COVERAGE MASK
BACKGROUND
[0001] Embodiments of the present disclosure relate to apparatuses and methods for operating an image signal processor (ISP).
[0002] An image/video capturing device, such as a camera or camera array, can be used to capture an image/video or a picture of a scene. Cameras or camera arrays have been included in many handheld devices, especially since the advent of social media that allows users to upload pictures and videos of themselves, friends, family, pets, or landscapes on the Internet with ease and in real-time. Examples of camera components that operate together to capture an image/video include lens(es), image sensor(s), ISP(s), and/or encoders, just to name a few components thereof. The lens, for example, may receive and focus light onto one or more image sensors that are configured to detect photons. When photons impinge on the image sensor, an image signal corresponding to the scene is generated and sent to the ISP. The ISP performs various operations associated with the image signal to generate one or more processed images of the scene that can then be output to a user, stored in memory, or output to the cloud.
SUMMARY
[0003] According to one aspect of the present disclosure, a system for performing image super-resolution (SR) is provided. The system may include a graphical processing unit (GPU). The GPU may be configured to perform, for each of a plurality of pixels, a sample test to identify whether one or more samples in each pixel quadrant are covered by a coverage mask. The GPU may be configured to maintain geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel. The GPU may be configured to maintain information associated with the coverage mask used for each pixel during the sample test. The system may include a data processing unit (DPU) with an SR engine and an AI model. The SR engine may be configured to obtain the geometric information and the information associated with the coverage mask from the GPU. The SR engine may be configured to perform an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and the AI model. The SR engine may be configured to generate SR image data based on the upscale procedure performed for each of the plurality of pixels.
[0004] According to another aspect of the present disclosure, an apparatus for wireless communication is provided. The apparatus may include a cellular communication component configured to perform operations associated with cellular communication. The apparatus may include a system for performing image SR. The system may include a GPU. The GPU may be configured to perform, for each of a plurality of pixels, a sample test to identify whether one or more samples in each pixel quadrant are covered by a coverage mask. The GPU may be configured to maintain geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel. The GPU may be configured to maintain information associated with the coverage mask used for each pixel during the sample test. The system may include a DPU with an SR engine and an AI model. The SR engine may be configured to obtain the geometric information and the information associated with the coverage mask from the GPU. The SR engine may be configured to perform an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and the AI model. The SR engine may be configured to generate SR image data based on the upscale procedure performed for each of the plurality of pixels.
[0005] According to yet another aspect of the present disclosure, a method of performing image SR is provided. The method may include performing, by a GPU, a sample test for each of a plurality of pixels to identify whether one or more samples in each pixel quadrant are covered by a coverage mask. The method may include maintaining, by the GPU, geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel. The method may include maintaining, by the GPU, information associated with the coverage mask used for each pixel during the sample test. The method may include obtaining, by an SR engine of a DPU, the geometric information and the information associated with the coverage mask from the GPU. The method may include performing, by the SR engine of the DPU, an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and an AI model of the DPU. The method may include generating, by the SR engine of the DPU, SR image data based on the upscale procedure performed for each of the plurality of pixels.
[0006] These illustrative embodiments are mentioned not to limit or define the present disclosure, but to provide examples to aid understanding thereof. Additional embodiments are discussed in the Detailed Description, and further description is provided there.
BRIEF DESCRIPTION OF THE DRAWINGS
[0007] The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate embodiments of the present disclosure and, together with the description, further serve to explain the principles of the present disclosure and to enable a person skilled in the pertinent art to make and use the present disclosure.
[0008] FIG. 1 illustrates a block diagram of an exemplary image enhancement system, according to some embodiments of the present disclosure.
[0009] FIG. 2A illustrates a first tile-based rendering pipeline that may be implemented by an image processing system, according to some embodiments of the present disclosure.
[0010] FIG. 2B illustrates a second tile-based rendering pipeline that may be implemented by an image processing system, according to some embodiments of the present disclosure.
[0011] FIG. 3A illustrates a diagram of a single pixel in a low-resolution image, according to some embodiments of the present disclosure.
[0012] FIG. 3B illustrates a diagram of the exemplary 2x2 up-sampled pixels in a high-resolution image, according to some embodiments of the present disclosure.
[0013] FIG. 4 illustrates a diagram of an exemplary degeneration of virtual pixel quadrants for 2X MSAA, according to some embodiments of the present disclosure.
[0014] FIG. 5 illustrates a diagram of an exemplary coverage mask used for 8X MSAA, according to some embodiments of the disclosure.
[0015] FIG. 6 illustrates a block diagram of an exemplary apparatus that generates SR images according to some embodiments of the present disclosure.
[0016] FIG. 7 illustrates a flowchart of an exemplary method of generating an SR image, according to some embodiments of the present disclosure.
[0017] FIG. 8 is a block diagram illustrating an example of a computer system useful for implementing various embodiments set forth in the disclosure.
[0018] Embodiments of the present disclosure will be described with reference to the accompanying drawings.
DETAILED DESCRIPTION
[0019] Although specific configurations and arrangements are discussed, it should be understood that this is done for illustrative purposes only. A person skilled in the pertinent art will recognize that other configurations and arrangements can be used without departing from the spirit and scope of the present disclosure. It will be apparent to a person skilled in the pertinent art that the present disclosure can also be employed in a variety of other applications.
[0020] It is noted that references in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” “some embodiments,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases do not necessarily refer to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it would be within the knowledge of a person skilled in the pertinent art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
[0021] In general, terminology may be understood at least in part from usage in context. For example, the term “one or more” as used herein, depending at least in part upon context, may be used to describe any feature, structure, or characteristic in a singular sense or may be used to describe combinations of features, structures or characteristics in a plural sense. Similarly, terms, such as “a,” “an,” or “the,” again, may be understood to convey a singular usage or to convey a plural usage, depending at least in part upon context. In addition, the term “based on” may be understood as not necessarily intended to convey an exclusive set of factors and may, instead, allow for existence of additional factors not necessarily expressly described, again, depending at least in part on context.
[0022] Various aspects of method and apparatus will now be described. These apparatus and methods will be described in the following detailed description and illustrated in the accompanying drawings by various blocks, modules, units, components, circuits, steps, operations, processes, algorithms, etc. (collectively referred to as “elements”). These elements may be implemented using electronic hardware, firmware, computer software, or any combination thereof. Whether such elements are implemented as hardware, firmware, or software depends upon the particular application and design constraints imposed on the overall system.
[0023] For ease of nomenclature, the term “camera” is used herein to refer to an image capture device or other data acquisition device. Such a data acquisition device can be any device or system for acquiring, recording, measuring, estimating, determining and/or computing data representative of a scene, including but not limited to two-dimensional image data, three-dimensional image data, and/or light field data. Such a data acquisition device may include optics, sensors, and image processing electronics for acquiring data representative of a scene, using techniques that are well known in the art. One skilled in the art will recognize that many types of data acquisition devices can be used in connection with the present disclosure, and that the present disclosure is not limited to cameras. Thus, the use of the term “camera” herein is intended to be illustrative and exemplary, but should not be considered to limit the scope of the present disclosure. Specifically, any use of such term herein should be considered to refer to any suitable data acquisition device.
[0024] Image enhancement is one of the most important computer vision applications. Image enhancement units are generally deployed in all cameras, such as mobile phones or digital cameras. Image enhancement is a challenging problem since it consists of multiple units that perform various operations. These conventional units may include, e.g., a super resolution (SR) unit, a denoising unit, and/or a high dynamic range (HDR) unit. SR is a type of image enhancement that produces high-resolution frames from lower-resolution rendering inputs. With smart upscaling techniques, SR can boost framerate without a major loss of image quality. In fact, in some cases, the image quality can be visually enhanced.
[0025] A first existing technique uses an artificial intelligence (AI) algorithm to generate high-resolution images via a tensor core. To that end, the first existing technique uses a deep learning neural network that leverages the power of the tensor core to increase frame rates and improve image sharpness during super resolution upscaling. For a video game to obtain DLSS support, it first uses a supercomputer for training its respective AI networks. This is accomplished by providing the network with tens of thousands of high-resolution frame captures rendered with super-sample anti-aliasing. The network then compares the images to learn how to upscale lower-quality source frames to approximate the quality of high-resolution images. The first existing technique requires a very large AI engine, e.g., such as tensor cores. In a mobile GPU, applying this first existing technique is not feasible, and large AI cores may not be integrated into a mobile SoC any time soon due to the power-consumption constraints of mobile GPU architecture. For example, on mobile devices with only a 5 watts (W) or even 1 W power budget, a large AI processor is impractical to implement.
[0026] A second existing technique uses certain spatial upscaling algorithms to provide similar results. For example, the second existing technique uses an edge detection method, called Edge Adaptive Spatial Up-Sampling, plus a collection of other image sharpness and post-processing algorithms, to perform super resolution. Video games that implement this technique plug special filter programs into the rendering pipeline after the anti-aliased image. To achieve better results, the input image should be in a perceptual color space. Some other image processing,
e.g., such as chromatic aberration and heads-up display (HUD) effects, may be performed after FSR on the already-upscaled frame buffer. However, the second existing technique uses several image filtering steps based on kernels. These kernels create complex GPU computations that require a great deal of power, which makes this technique incompatible with low-power mobile SoC design. Moreover, the algorithm also requires sampling from frame buffers that were rendered by a previous render pass. Its unique 12-tap sampling pattern is usually in conflict with the tile partitioning of a tile-based rendering GPU architecture. This may result in more double data rate (DDR) accesses to external memory, which will introduce additional power concerns for a mobile SoC platform.
[0027] Although both the first and second existing techniques can provide applications with an image up-scaling solution while keeping decent image quality, they come with a higher cost either in hardware design or computational capacity, which may not be available on mobile devices due to their low-power SoC design.
[0028] Thus, there exists an unmet need for a powerful yet efficient solution that maintains image fidelity while achieving SR processing on a mobile SoC platform.
[0029] To overcome these and other challenges, the present disclosure provides an exemplary SoC that includes a GPU and a data processing unit (DPU). The DPU may be configured to process image data, audio data, video data, etc. In the exemplary SoC, the DPU may be configured to upscale the GPU-rendered image data via a special pre-trained AI module. The upscaling is applied to GPU-rendered frame buffer content read from the GPU. The upscaled image is sent to the display device directly for presentation. The actual rendered image, which is a quarter of the display resolution, is resolved to a frame buffer in a DDR memory. The GPU of the exemplary SoC may include a tile-based rendering architecture, and the upscaling by the DPU may be performed at the tile level. Because the present upscaling is performed as post-processing of the GPU-rendered image, and interference with the GPU pipeline is kept to a minimum, any upscaling latency may be compensated for at the system level. To avoid image blurriness caused by a loss of geometry information during post-processing of the image, the exemplary SoC of the present disclosure generates a 2x2 sub-pixel sampling coverage mask for each screen pixel during GPU rendering. These coverage masks may be one of the inputs used for training and inference in the DPU.
[0030] With Multi-Stage Anti-Aliasing (MSAA) enabled at the GPU, these sample coverage masks are a by-product of the MSAA process and hence come without sacrificing performance. In the exemplary tile-based rendering GPU architecture described herein, the coverage mask can be saved into local GPU memory (e.g., a sub-pixel coverage masks buffer).
The coverage masks may be sent to an SR engine in the DPU for upscaling. It only costs a small amount of memory to hold these bits. However, the on-screen geometry information obtained by the GPU using the coverage masks may be maintained and provided to the SR engine. Exporting the multi-sampling coverage masks to the external SR engine in the DPU to facilitate the up-sampling may enable SR image signal processing to be achieved without violating the power constraints of mobile SoC architecture. In this way, the coverage masks and their associated on-screen geometry information used by the SR engine can compensate for the lack of geometry information in the SR AI algorithm, which may enhance the image sharpness with almost no performance loss when MSAA is enabled. Additional details of the exemplary SR technique are provided below in connection with FIGs. 1, 2A, 2B, 3A, 3B, 4A, 4B, 4C, 5, 6, 7, and 8. [0031] FIG. 1 illustrates an exemplary block diagram of an exemplary image processing system 100 (referred to hereinafter as “image processing system 100”), according to some embodiments of the present disclosure. FIG. 2A illustrates a first tile-based rendering pipeline 200 that may be implemented by image processing system 100 of FIG. 1, according to some embodiments of the present disclosure. FIG. 2B illustrates a second tile-based rendering pipeline 250 that may be implemented by image processing system 100 of FIG. 1, according to some embodiments of the present disclosure. FIGs. 1, 2A, and 2B will be described together.
[0032] Referring to FIG. 1, image processing system 100 may include a GPU 102, a DPU 104, a display 130, and a frame buffer in DDR memory 140, which may be implemented on the same or different SoCs. GPU 102 may include, e.g., a per-tile parameter buffer 110, a pixel shader 112, an on-chip tile buffer 114, and a sub-pixel coverage masks buffer 116. DPU 104 may include, e.g., an SR engine 120 and a pre-trained Al model component 122.
[0033] Still referring to FIG. 1, GPU 102 may render tiles associated with an image. Tile-based rendering may be achieved using different GPU architectures. For example, some may have a single-pass rendering architecture, as depicted in FIG. 2A, while others may have a two-pass architecture, as depicted in FIG. 2B.
[0034] Referring to FIGs. 2A and 2B, when a tile operation begins, the corresponding tile list is retrieved by raster/tiler 202 from system memory (not shown). The tile list may identify which screen-space primitive data to fetch. The raster/tiler 202 may only fetch screen-space position data for the geometry within the tile. Depth test component 204 may perform hidden surface removal (HSR), along with a depth test. Information associated with the HSR and/or depth test may be maintained at depth buffer 206. Per-tile parameter buffer 110 may maintain information about which pixels are covered by a triangle mask and any geometry associated with the pixel and coverage mask, as described below in connection with FIG. 3A. Pixel shader 112 may apply coloring operations, e.g., fragment shaders, to the visible pixels. Once the tile's rendering is complete, SR engine 120 may write its color data to the frame buffer in DDR memory 140. Display 130 may display the image once the SR rendering for all tiles of the image is complete.
[0035] In the tile-based rendering GPU architecture of FIGs. 2A and 2B, the GPU hardware processes all triangle positions (e.g., coverage mask positions with respect to a pixel) along with depth and stencil tests before the pixel shading stage. Only the intermediate data from the visible triangles are maintained in on-chip tile buffer 114. The hidden triangles will be removed from the later pixel shading stage by pixel shader 112. When 4X MSAA is enabled, the GPU hardware will perform the coverage test of four samples once for each screen pixel, as described below in connection with FIGs. 3A and 3B.
[0036] FIG. 3A illustrates a diagram 300 of a single pixel in a low-resolution image, according to some embodiments of the present disclosure. FIG. 3B illustrates a diagram 350 of the exemplary 2x2 up-sampled pixels in a high-resolution image, according to some embodiments of the present disclosure. FIGs. 3A and 3B will be described together.
[0037] Referring to FIG. 3A, as mentioned above, when 4X MSAA is enabled, the GPU hardware may perform the coverage test of four samples 320 for each pixel 302a, e.g., one sample per quadrant. The sample test may include partitioning pixel 302a into four quadrants based on the pixel's center 310, and then applying a coverage mask 304 thereto. The coverage test may be used to identify covered sample(s) 320 and uncovered sample(s) 330. The geometric position of each of the covered sample(s) 320 and uncovered sample(s) 330, and the position of coverage mask 304 with respect to pixel 302a, may be outputs of the coverage test, for example. Coverage mask 304 can be sent to sub-pixel coverage mask buffer 116 after the sample test is performed by the GPU hardware (not shown). The geometry information obtained from the coverage test and the coverage masks may be sent to SR engine 120 by the on-chip tile buffer 114 and the sub-pixel coverage mask buffer 116, respectively. SR engine 120 may perform upscaling of the pixel based on the geometry information, information associated with coverage mask 304, and the pre-trained AI model.
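By way of illustration only, the following Python sketch shows a coverage test of this kind for a single pixel: one sample per virtual quadrant is tested against a triangle, and the results are packed into a 4-bit coverage mask. The edge-function test, the sample offsets, and the function names are illustrative assumptions rather than the actual GPU hardware sample pattern.

```python
# Illustrative sketch of a 4X MSAA-style coverage test for one pixel.
# The sample offsets, edge-function math, and names are assumptions for
# illustration; actual GPU hardware uses its own sample pattern.

def edge(ax, ay, bx, by, px, py):
    # Signed area of (a, b, p): >= 0 means p lies on the inner side of
    # edge a->b for a counter-clockwise triangle.
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def inside_triangle(tri, px, py):
    (x0, y0), (x1, y1), (x2, y2) = tri
    return (edge(x0, y0, x1, y1, px, py) >= 0 and
            edge(x1, y1, x2, y2, px, py) >= 0 and
            edge(x2, y2, x0, y0, px, py) >= 0)

# One sample per virtual quadrant (Q0..Q3), offset from the pixel center.
QUADRANT_OFFSETS = [(-0.25, -0.25), (0.25, -0.25), (-0.25, 0.25), (0.25, 0.25)]

def coverage_mask_4x(tri, pixel_x, pixel_y):
    cx, cy = pixel_x + 0.5, pixel_y + 0.5           # pixel center
    mask = 0
    for bit, (dx, dy) in enumerate(QUADRANT_OFFSETS):
        if inside_triangle(tri, cx + dx, cy + dy):  # covered sample -> 1
            mask |= 1 << bit                        # uncovered sample -> 0
    return mask                                     # 4-bit coverage mask

# Edge pixel example: only the Q0 sample of pixel (5, 5) is covered.
tri = ((0.0, 0.0), (10.6, 0.0), (0.0, 10.6))
print(bin(coverage_mask_4x(tri, 5, 5)))             # -> 0b1
```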
[0038] Referring to FIG. 3B, SR engine 120 may upscale a low-resolution image to a 2x2 scaled high-resolution image. Therefore, SR engine 120 saves 2x2 coverage bits for each pixel 302a, one bit for each sub-pixel 302b. FIG. 3B illustrates how these sub-pixel samples are mapped to 2x2 super-samples in a high-resolution image. For example, assume the center 310 of pixel 302a is at an offset of (0.5, 0.5) from the pixel's screen position. After SR, one pixel 302a (see FIG. 3A) will be mapped to four sub-pixels 302b in a high-resolution image (see FIG. 3B). This is the equivalent of a (±0.25, ±0.25) offset from the center 310 of pixel 302a. The number of MSAA samples generated within pixel 302a is up to the GPU implementation. Usually, the four samples are not exactly at offsets of (±0.25, ±0.25) from the center. However, they are likely distributed across the four virtual inner quadrants of pixel 302a, which can be considered a very close approximation of the up-scaled samples in the high-resolution image.
[0039] The coverage bits are 1s for the covered samples 320, and 0s for the uncovered samples 330. In memory, one pixel is described using four bits. A 2x2 quad in the low-resolution image will therefore require 2 bytes of storage to save the result. Coverage mask 304 may indicate to SR engine 120 whether the triangle covers all the sub-pixels 302b in the high-resolution image, or only partially covers some of them. Using this approximation, SR engine 120 may distinguish those sub-pixels 302b with covered samples 320 (e.g., inner triangle pixels) from those sub-pixels 302b with uncovered samples 330 (e.g., edge pixels) based on the 1s and 0s received from on-chip tile buffer 114.
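A minimal sketch of how such coverage bits might be packed and interpreted is given below; the packing of a 2x2 low-resolution quad into 2 bytes and the interior/edge classification follow the description above, while the function names and bit ordering are illustrative assumptions.

```python
# Illustrative packing and interpretation of 2x2 sub-pixel coverage bits.
# One low-resolution pixel -> 4 bits (one per sub-pixel of the up-scaled
# 2x2 block); a 2x2 quad of low-resolution pixels -> 16 bits = 2 bytes.

def pack_quad(masks_2x2):
    """Pack four 4-bit coverage masks (one per pixel of a 2x2 quad)
    into 2 bytes, matching the storage estimate above."""
    m00, m01, m10, m11 = masks_2x2
    packed = m00 | (m01 << 4) | (m10 << 8) | (m11 << 12)
    return packed.to_bytes(2, "little")

def classify_subpixels(mask):
    """Label each up-scaled sub-pixel as interior (covered by the same
    triangle as the pixel center) or edge (uncovered)."""
    return ["interior" if (mask >> bit) & 1 else "edge" for bit in range(4)]

print(classify_subpixels(0b0111))                        # three interior, one edge
print(pack_quad([0b1111, 0b0111, 0b0011, 0b0001]).hex()) # 2-byte quad record
```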
[0040] The storage of coverage mask 304 depends on tile size. For example, if the driver of GPU 102 determines that the tile size is 256x192 pixels in width and height for a given render pass, the coverage bits will need 256 x 192 x 4 bits = 24 KB of intermediate graphics memory in on-chip tile buffer 114. This memory usage is associated with the entire frame, and it can be compressed to save storage per GPU implementation. When a sample fails a coverage test, information associated with the covering neighbor triangle is not saved. Instead, the coverage bit only implies whether a sub-pixel is covered by the same triangle as the center pixel in the low-resolution image. SR engine 120 will use larger weights for neighbor pixels during color calculation for those samples that fail the coverage test. Finally, the upscaled SR image is sent to display 130 directly for presentation. The actual rendered image, which is a quarter of the display resolution, is resolved to the frame buffer in DDR memory 140.
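By way of illustration, the storage estimate and the neighbor-weighting behavior described above can be sketched as follows. The 256x192 tile yields 24 KB of coverage bits, and uncovered (edge) sub-pixels lean more heavily on neighboring pixels; the specific 0.75/0.25 weights are illustrative assumptions, since the actual weighting is produced by the pre-trained AI model.

```python
# Illustrative coverage-bit storage estimate and a toy weighting rule for
# sub-pixels that fail the coverage test. The 0.75/0.25 weights are
# assumptions; the real SR engine derives its weighting from the AI model.

def coverage_bits_bytes(tile_w, tile_h, bits_per_pixel=4):
    return tile_w * tile_h * bits_per_pixel // 8

print(coverage_bits_bytes(256, 192) // 1024, "KB")   # -> 24 KB, as above

def subpixel_color(center_rgb, neighbor_rgb, covered):
    # Covered sub-pixels follow the center pixel; uncovered (edge) sub-pixels
    # lean more heavily on the neighboring pixel's color.
    w_center = 0.75 if covered else 0.25
    return tuple(w_center * c + (1.0 - w_center) * n
                 for c, n in zip(center_rgb, neighbor_rgb))

print(subpixel_color((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), covered=False))
```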
[0041] FIG. 4 illustrates a diagram 400 of a degeneration of virtual pixel quadrants for 2X MSAA, according to some embodiments of the present disclosure. Referring to FIG. 4, in the case of 2X MSAA, the four virtual quadrants Q0, Q1, Q2, Q3 shown in FIG. 4 are degenerated to either a top-bottom or a left-right coverage mask 404, depending on whether the sample 420 locations are Y-major or X-major. Four bits are still used to save the coverage mask. In the case of top-bottom coverage, the top two quadrants have the same coverage value, and the bottom two quadrants likewise share the same value. Similarly, for the left-right case, the left two quadrants share one coverage value and the right two quadrants share another.
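The degeneration for 2X MSAA may be sketched as follows: the two sample coverage bits are replicated into the four virtual quadrants depending on whether the sample locations are Y-major (top-bottom) or X-major (left-right). The quadrant-to-bit assignment used here is an illustrative assumption.

```python
# Illustrative expansion of a 2X MSAA result into a 4-bit quadrant mask.
# Bit layout assumption: Q0 = top-left, Q1 = top-right,
#                        Q2 = bottom-left, Q3 = bottom-right.

def degenerate_2x(sample_a_covered, sample_b_covered, y_major):
    a, b = int(sample_a_covered), int(sample_b_covered)
    if y_major:
        # Top-bottom case: both top quadrants take sample A,
        # both bottom quadrants take sample B.
        q0, q1, q2, q3 = a, a, b, b
    else:
        # Left-right case: both left quadrants take sample A,
        # both right quadrants take sample B.
        q0, q1, q2, q3 = a, b, a, b
    return q0 | (q1 << 1) | (q2 << 2) | (q3 << 3)

print(bin(degenerate_2x(True, False, y_major=True)))   # top covered only  -> 0b11
print(bin(degenerate_2x(True, False, y_major=False)))  # left covered only -> 0b101
```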
[0042] FIG. 5 illustrates a diagram 500 of an exemplary coverage mask used for 8X MSAA, according to some embodiments of the disclosure. For example, referring to FIG. 5, in the case of 8X or 16X MSAA, more than one sample 520 in each virtual quadrant will be used for the coverage test. The test result of each quadrant is simply the logical OR of the coverage bits of its samples. As shown in FIG. 5, an uncovered sample 530 in quadrant Q1 fails the coverage test, while a covered sample 520 passes the coverage test. After ORing these values, the coverage bit for Q1 is 1.
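A corresponding sketch of the per-quadrant OR reduction for 8X or 16X MSAA is given below: a quadrant's coverage bit is the logical OR of the coverage bits of the samples falling within it, so a quadrant containing at least one covered sample reports 1. The list-based representation is an illustrative assumption.

```python
# Illustrative per-quadrant OR reduction for 8X/16X MSAA.
# samples_per_quadrant: four lists of per-sample coverage booleans (Q0..Q3).

def quadrant_mask(samples_per_quadrant):
    mask = 0
    for q, samples in enumerate(samples_per_quadrant):
        if any(samples):          # OR of the sample coverage bits
            mask |= 1 << q
    return mask

# 8X MSAA example: two samples per quadrant. Q1 has one covered and one
# uncovered sample, so its coverage bit is still 1 after the OR.
print(bin(quadrant_mask([[True, True], [True, False],
                         [False, False], [False, False]])))  # -> 0b11
```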
[0043] FIG. 6 illustrates an exemplary block diagram of an apparatus 600 having an image signal processor (ISP), according to some embodiments. Apparatus 600 may include an application processor (AP) 660, an ISP 620, a memory 618, and input-output devices 608. ISP 620 may include an image enhancement unit 602 and a camera 606. Input-output devices 608 may include user input devices 610, an inertial measurement unit (IMU) 612 (e.g., translational motion sensor, rotational motion sensor, accelerometer, gyroscope, etc.), display and audio devices 614, and wireless communication devices 616. In some embodiments, apparatus 600 may be an image capturing device, such as a smartphone or digital camera.
[0044] AP 660 may be the main application processor of apparatus 600, and may host the operating system (OS) of apparatus 600 and all the applications. AP 660 may be any kind of general-purpose processor, such as a microprocessor, a microcontroller, a digital signal processor, or a central processing unit, and other needed integrated circuits such as glue logic. The term “processor” may refer to a device having one or more processing units or elements, e.g., a central processing unit (CPU) with multiple processing cores. AP 660 may be used to control the operations of apparatus 600 by executing instructions stored in memory 618, which can be in the same chip as AP 660 or in a separate chip from AP 660. AP 660 may also generate control signals and transmit them to various parts of apparatus 600 to control and monitor the operations of these parts. In some embodiments, AP 660 can run the OS of apparatus 600, control the communications between the user and apparatus 600, and control the operations of various applications. For example, AP 660 may be coupled to communications circuitry and execute software to control the wireless communications functionality of apparatus 600. In another example, AP 660 may be coupled to ISP 620 and input-output devices 608 to control the processing and display of sensor data, e.g., image data, one or more frames, HDR images, low dynamic range (LDR) images, etc.
[0045] ISP 620 may include software and/or hardware operatively coupled to AP 660 and input-output devices 608. In some embodiments, components, e.g., circuitry, of ISP 620 may be integrated on a single chip. In some embodiments, ISP 620 includes image processing hardware coupled to (e.g., placed between) AP 660 and image enhancement unit 602/camera 606. ISP 620 may include suitable circuitry that, when controlled by AP 660, performs functions not supported by AP 660, e.g., generating an enhanced image using the exemplary image enhancement model described above in connection with FIGs. 1, 2A, and 2B. In various embodiments, ISP 620 may include a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), a microprocessor, a microcontroller, a digital signal processor, and other needed integrated circuits for its purposes.
[0046] Image enhancement unit 602 may include GPU 102 and DPU 104 of FIG. 1. In some embodiments, image enhancement unit 602 may include one or more of the units of system 100 when implemented to generate an SR image. For example, image enhancement unit 602 may receive an image from camera 606. Based on the geometric information from the coverage tests, the coverage masks, and a pre-trained AI model, image enhancement unit 602 may upscale a low-resolution image into an SR image.
[0047] FIG. 7 illustrates a flowchart of an exemplary method 700 of performing image SR, according to some embodiments of the present disclosure. Exemplary method 700 may be performed by a system, e.g., such as system 100, GPU 102, DPU 104, per-tile parameter buffer 110, pixel shader 112, on-chip tile buffer 114, sub-pixel coverage masks buffer 116, SR engine 120, pre-trained AI model component 122, display 130, frame buffer in DDR memory 140, image enhancement unit 602, camera 606, ISP 620, and/or computer system 800. Method 700 may include steps 702-712 as described below. It is to be appreciated that some of the steps may be optional, and some of the steps may be performed simultaneously, or in a different order than shown in FIG. 7.
[0048] Referring to FIG. 7, at 702, the system may perform, by a GPU, a sample test for each of a plurality of pixels to identify whether one or more samples in each pixel quadrant are covered by a coverage mask. For example, referring to FIG. 3A, when 4X MSAA is enabled, the GPU hardware may perform the coverage test of four samples 320 for each pixel 302a, e.g., one sample per quadrant. The sample test may include partitioning pixel 302a into four quadrants based on the pixel's center 310, and then applying a coverage mask 304 thereto. The coverage test may be used to identify covered sample(s) 320 and uncovered sample(s) 330.
[0049] At 704, the system may maintain, by the GPU, geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel. For example, referring to FIGs. 1 and 3A, the geometric position of each of the covered sample(s) 320 and uncovered sample(s) 330, and the position of coverage mask 304 with respect to pixel 302a, may be outputs of the coverage test. The geometric position of each of the covered sample(s) 320 and uncovered sample(s) 330 may be maintained by on-chip tile buffer 114.
[0050] At 706, the system may maintain, by the GPU, information associated with the coverage mask used for each pixel during the sample test. For example, referring to FIGs. 1 and 3A, coverage mask 304 can be sent to sub-pixel coverage mask buffer 116 after the sample test is performed by the GPU hardware (not shown).
[0051] At 708, the system may obtain, by an SR engine of a DPU, the geometric information and the information associated with the coverage mask from the GPU. For example, referring to FIG. 1, the geometry information obtained from the coverage test and the coverage masks may be sent to SR engine 120 by the on-chip tile buffer 114 and the sub-pixel coverage masks buffer 116, respectively.
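By way of illustration only, the per-tile data obtained by the SR engine at this step, namely the low-resolution tile colors together with the matching sub-pixel coverage masks, may be modeled as in the following Python sketch. The structure name, field names, and layout are illustrative assumptions and do not correspond to an actual buffer format described herein.

```python
# Illustrative model of the per-tile data the GPU exports to the DPU's SR
# engine: low-resolution tile colors plus one 4-bit coverage mask per pixel.
# Names and fields are assumptions, not the actual buffer layout.
from dataclasses import dataclass
from typing import List

@dataclass
class TileExport:
    tile_x: int                 # tile origin in the low-resolution frame
    tile_y: int
    width: int                  # tile size in low-resolution pixels
    height: int
    colors: List[List[tuple]]   # [height][width] RGB tuples from the tile buffer
    coverage: List[List[int]]   # [height][width] 4-bit sub-pixel coverage masks

def export_tile(tile: TileExport):
    """Hand a finished tile to the SR engine (placeholder for the DPU call)."""
    assert len(tile.colors) == tile.height and len(tile.coverage) == tile.height
    return tile  # in the real system, this data would be consumed by the SR engine

tile = TileExport(0, 0, 2, 2,
                  colors=[[(1, 0, 0), (0, 1, 0)], [(0, 0, 1), (1, 1, 1)]],
                  coverage=[[0b1111, 0b0111], [0b0011, 0b0001]])
export_tile(tile)
```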
[0052] At 710, the system may perform, by the SR engine of the DPU, an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and an AI model of the DPU. For example, referring to FIGs. 1, 3A, and 3B, SR engine 120 may perform upscaling of the pixel based on the geometry information, information associated with coverage mask 304, and the pre-trained AI model. SR engine 120 may upscale a low-resolution image to a 2x2 scaled high-resolution image. Therefore, SR engine 120 saves 2x2 coverage bits for each pixel 302a, one bit for each sub-pixel 302b. FIG. 3B illustrates how these sub-pixel samples are mapped to 2x2 super-samples in a high-resolution image. For example, assume the center 310 of pixel 302a is at an offset of (0.5, 0.5) from the pixel's screen position. After SR, one pixel 302a (see FIG. 3A) will be mapped to four sub-pixels 302b in a high-resolution image (see FIG. 3B). This is the equivalent of a (±0.25, ±0.25) offset from the center 310 of pixel 302a. The number of MSAA samples generated within pixel 302a is up to the GPU implementation. Usually, the four samples are not exactly at offsets of (±0.25, ±0.25) from the center. However, they are likely distributed across the four virtual inner quadrants of pixel 302a, which can be considered a very close approximation of the up-scaled samples in the high-resolution image. The coverage bits are 1s for the covered samples 320, and 0s for the uncovered samples 330. In memory, one pixel is described using four bits. A 2x2 quad in the low-resolution image will require 2 bytes of storage to save the result. Coverage mask 304 may indicate to SR engine 120 whether the triangle covers all the sub-pixels 302b (e.g., up-scaled pixels) in the high-resolution image, or only partially covers some of them. Using this approximation, SR engine 120 may distinguish those sub-pixels 302b with covered samples 320 (e.g., inner triangle pixels) from those sub-pixels 302b with uncovered samples 330 (e.g., edge pixels) based on the 1s and 0s received from on-chip tile buffer 114.
[0053] At 712, the system may generate, by the SR engine of the DPU, SR image data based on the upscale procedure performed for each of the plurality of pixels. For example, referring to FIGs. 1 and 3B, the storage of coverage mask 304 depends on tile size. For example, if the driver of GPU 102 determines that the tile size is 256x192 pixels in width and height for a given render pass, the coverage bits will need 256 x 192 x 4 bits = 24 KB of intermediate graphics memory in on-chip tile buffer 114. This memory usage is associated with the entire frame, and it can be compressed to save storage per GPU implementation. When a sample fails a coverage test, information associated with the covering neighbor triangle is not saved. Instead, the coverage bit only implies whether a sub-pixel is covered by the same triangle as the center pixel in the low-resolution image. SR engine 120 will use larger weights for neighbor pixels during color calculation for those samples that fail the coverage test. Finally, the upscaled SR image is generated based on the upscaled pixels and is sent to display 130 directly for presentation.
[0054] Various embodiments can be implemented, for example, using one or more computer systems, such as computer system 800 shown in FIG. 8. One or more computer systems 800 can be used, for example, to implement method 700 of FIG. 7. For example, computer system 800 can generate SR image data and/or implement the exemplary image enhancement techniques described herein, according to various embodiments. Computer system 800 can be any computer capable of performing the functions described herein.
[0055] Computer system 800 includes one or more processors (also called central processing units, or CPUs), such as a processor 804. Processor 804 is connected to a communication infrastructure 806 (e.g., a bus). One or more processors 804 may each be a graphics processing unit (GPU). In an embodiment, a GPU is a specialized electronic circuit designed to process mathematically intensive applications. The GPU may have a parallel structure that is efficient for parallel processing of large blocks of data, such as mathematically intensive data common to computer graphics applications, images, videos, etc. [0056] Computer system 800 also includes user input/output device(s) 803, such as monitors, keyboards, pointing devices, etc., that communicate with communication infrastructure 806 through user input/output interface(s) 802.
[0057] Computer system 800 also includes a main or primary memory 808, such as
random access memory (RAM). Main memory 808 may include one or more levels of cache. Main memory 808 has stored therein control logic (i.e., computer software) and/or data. Computer system 800 may also include one or more secondary storage devices or memory 810. Secondary memory 810 may include, for example, a hard disk drive 812 and/or a removable storage device or drive 814. Removable storage drive 814 may be a floppy disk drive, a magnetic tape drive, a compact disk drive, an optical storage device, a tape backup device, and/or any other storage device/drive. Removable storage drive 814 may interact with a removable storage unit 818. Removable storage unit 818 includes a computer usable or readable storage device having stored thereon computer software (control logic) and/or data. Removable storage unit 818 may be a floppy disk, magnetic tape, compact disk, DVD, optical storage disk, and/or any other computer data storage device. Removable storage drive 814 reads from and/or writes to removable storage unit 818 in a well-known manner.
[0058] According to an exemplary embodiment, secondary memory 810 may include other means, instrumentalities or other approaches for allowing computer programs and/or other instructions and/or data to be accessed by computer system 800. Such means, instrumentalities or other approaches may include, for example, a removable storage unit 822 and an interface 820. Examples of the removable storage unit 822 and the interface 820 may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM or PROM) and associated socket, a memory stick and universal serial bus (USB) port, a memory card and associated memory card slot, and/or any other removable storage unit and associated interface.
[0059] Computer system 800 may further include a communication or network interface 824. Communication interface 824 enables computer system 800 to communicate and interact with any combination of remote devices, remote networks, remote entities, etc. (individually and collectively referenced as 828). For example, communication interface 824 may allow computer system 800 to communicate with remote devices 828 over communication path 826, which may be wired and/or wireless, and which may include any combination of LANs, WANs, the Internet, etc. Control logic and/or data may be transmitted to and from computer system 800 via communication path 826.
[0060] In an embodiment, a tangible apparatus or article of manufacture comprising a tangible computer useable or readable medium having control logic (software) stored thereon is also referred to herein as a computer program product or program storage device. This includes, but is not limited to, computer system 800, main memory 808, secondary memory 810, and removable storage units 818 and 822, as well as tangible articles of manufacture
embodying any combination of the foregoing. Such control logic, when executed by one or more data processing devices (such as computer system 800), causes such data processing devices to operate as described herein.
[0061] Based on the teachings contained in this disclosure, it will be apparent to persons skilled in the relevant art(s) how to make and use embodiments of the present disclosure using data processing devices, computer systems and/or computer architectures other than that shown in FIG. 8. For example, embodiments may operate with software, hardware, and/or operating system implementations other than those described herein.
[0062] In various aspects of the present disclosure, the functions described herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or encoded as instructions or code on a non- transitory computer-readable medium. Computer-readable media includes computer storage media. Storage media may be any available media that can be accessed by a computing device, such as system 100 in FIG. 1. By way of example, and not limitation, such computer-readable media can include random-access memory (RAM), read-only memory (ROM), EEPROM, compact disc read-only memory (CD-ROM) or other optical disk storage, hard disk drive (HDD), such as magnetic disk storage or other magnetic storage devices, Flash drive, solid state drive (SSD), or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a processing system, such as a mobile device or a computer. Disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), and floppy disk where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
[0063] According to one aspect of the present disclosure, a system for performing image SR is provided. The system may include a GPU. The GPU may be configured to perform, for each of a plurality of pixels, a sample test to identify whether one or more samples in each pixel quadrant are covered by a coverage mask. The GPU may be configured to maintain geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel. The GPU may be configured to maintain information associated with the coverage mask used for each pixel during the sample test. The system may include a DPU with an SR engine and an AI model. The SR engine may be configured to obtain the geometric information and the information associated with the coverage mask from the GPU. The SR engine may be configured to perform an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and the AI model. The SR engine may be configured to generate SR image data based on the upscale procedure performed for each of the plurality of pixels.
[0064] In some embodiments, the system may include a display device configured to display the SR image data. [0065] In some embodiments, the system may include a frame buffer in a DDR memory configured to maintain original image data associated with the plurality of pixels.
[0066] In some embodiments, the sample test is associated with a 4X MSAA process. In some embodiments, to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant is covered by the coverage mask, the GPU may be configured to identify whether one sample in each pixel quadrant is covered by the coverage mask.
[0067] In some embodiments, the sample test may be associated with an 8X MSAA process. In some embodiments, to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant are covered by the coverage mask, the GPU may be configured to identify whether two samples in each pixel quadrant are covered by the coverage mask.
[0068] In some embodiments, the sample test is associated with a 2X MSAA process. In some embodiments, to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant are covered by the coverage mask, the GPU may be configured to identify whether one sample in each pixel located in the top two or the bottom two quadrants are covered by the coverage mask.
[0069] In some embodiments, the sample test may be associated with a 2X MSAA process. In some embodiments, to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant is covered by the coverage mask, the GPU may be configured to identify whether one sample in each pixel located in the left two or the right two quadrants are covered by the coverage mask.
[0070] According to another aspect of the present disclosure, an apparatus for wireless communication is provided. The apparatus may include a cellular communication component configured to perform operations associated with cellular communication. The apparatus may include a system for performing image SR. The system may include a GPU. The GPU may be configured to perform, for each of a plurality of pixels, a sample test to identify whether one or more samples in each pixel quadrant are covered by a coverage mask. The GPU may be configured to maintain geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel. The GPU may be configured to maintain
information associated with the coverage mask used for each pixel during the sample test. The system may include a DPU with an SR engine and an AI model. The SR engine may be configured to obtain the geometric information and the information associated with the coverage mask from the GPU. The SR engine may be configured to perform an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and the AI model. The SR engine may be configured to generate SR image data based on the upscale procedure performed for each of the plurality of pixels.
[0071] In some embodiments, the system may include a display device configured to display the SR image data.
[0072] In some embodiments, the system may include a frame buffer in a DDR memory configured to maintain original image data associated with the plurality of pixels.
[0073] In some embodiments, the sample test is associated with a 4X MSAA process. In some embodiments, to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant is covered by the coverage mask, the GPU may be configured to identify whether one sample in each pixel quadrant is covered by the coverage mask.
[0074] In some embodiments, the sample test may be associated with an 8X MSAA process. In some embodiments, to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant are covered by the coverage mask, the GPU may be configured to identify whether two samples in each pixel quadrant are covered by the coverage mask.
[0075] In some embodiments, the sample test is associated with a 2X MSAA process. In some embodiments, to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant is covered by the coverage mask, the GPU may be configured to identify whether one sample in each pixel located in the top two or the bottom two quadrants are covered by the coverage mask.
[0076] In some embodiments, the sample test may be associated with a 2X MSAA process. In some embodiments, to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant are covered by the coverage mask, the GPU may be configured to identify whether one sample in each pixel located in the left two or the right two quadrants are covered by the coverage mask.
[0077] According to yet another aspect of the present disclosure, a method of performing image SR is provided. The method may include performing, by a GPU, a sample
test for each of a plurality of pixels to identify whether one or more samples in each pixel quadrant are covered by a coverage mask. The method may include maintaining, by the GPU, geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel. The method may include maintaining, by the GPU, information associated with the coverage mask used for each pixel during the sample test. The method may include obtaining, by an SR engine of a DPU, the geometric information and the information associated with the coverage mask from the GPU. The method may include performing, by the SR engine of the DPU, an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and an AI model of the DPU. The method may include generating, by the SR engine of the DPU, SR image data based on the upscale procedure performed for each of the plurality of pixels.
[0078] In some embodiments, the method may include displaying, by a display device, the SR image data.
[0079] In some embodiments, the method may include maintaining, by a frame buffer in a DDR memory, original image data associated with the plurality of pixels.
[0080] In some embodiments, the sample test may be associated with a 4X MSAA process. In some embodiments, the performing, by the GPU, the sample test for each of the plurality of pixels to identify whether one or more samples in each pixel quadrant are covered by the coverage mask may include identifying whether one sample in each pixel quadrant is covered by the coverage mask.
[0081] In some embodiments, the sample test may be associated with an 8X MSAA process. In some embodiments, the performing, by the GPU, the sample test for each of the plurality of pixels to identify whether one or more samples in each pixel quadrant are covered by the coverage mask may include identifying whether two samples in each pixel quadrant are covered by the coverage mask.
[0082] In some embodiments, the sample test may be associated with a 2X MSAA process. In some embodiments, the performing, by the GPU, the sample test for each of the plurality of pixels to identify whether one or more samples in each pixel quadrant is covered by the coverage mask may include identifying whether one sample in each pixel located in the top two or the bottom two quadrants are covered by the coverage mask. In some embodiments, the performing, by the GPU, the sample test for each of the plurality of pixels to identify whether one or more samples in each pixel quadrant are covered by the coverage mask may include identifying whether one sample in each pixel located in the left two or the right two quadrants are covered by the coverage mask.
[0083] Embodiments of the present disclosure have been described above with the aid of functional building blocks illustrating the implementation of specified functions and relationships thereof. The boundaries of these functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed.
[0084] The Summary and Abstract sections may set forth one or more but not all exemplary embodiments of the present disclosure as contemplated by the inventor(s), and thus, are not intended to limit the present disclosure and the appended claims in any way.
[0085] Various functional blocks, modules, and steps are disclosed above. The particular arrangements provided are illustrative and without limitation. Accordingly, the functional blocks, modules, and steps may be re-ordered or combined in different ways than in the examples provided above. Likewise, some embodiments include only a subset of the functional blocks, modules, and steps, and any such subset is permitted.
[0086] The breadth and scope of the present disclosure should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.
Claims
1. A system for performing image super-resolution (SR), comprising: a graphical processing unit (GPU) configured to: perform, for each of a plurality of pixels, a sample test to identify whether one or more samples in each pixel quadrant are covered by a coverage mask; maintain geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel; and maintain information associated with the coverage mask used for each pixel during the sample test; and a data processing unit (DPU) comprising an SR engine and an artificial intelligence (AI) model, wherein the SR engine is configured to: obtain the geometric information and the information associated with the coverage mask from the GPU; perform an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and the AI model; and generate SR image data based on the upscale procedure performed for each of the plurality of pixels.
2. The system of claim 1, further comprising: a display device configured to: display the SR image data.
3. The system of claim 1, further comprising: a frame buffer in a double data rate (DDR) memory configured to: maintain original image data associated with the plurality of pixels.
4. The system of claim 1, wherein: the sample test is associated with a 4X multi-stage anti-aliasing (MSAA) process, and to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant are covered by the coverage mask, the GPU is configured to: identify whether one sample in each pixel quadrant is covered by the coverage mask.
5. The system of claim 1, wherein: the sample test is associated with an 8X multi-stage anti-aliasing (MSAA) process, and to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant are covered by the coverage mask, the GPU is configured to: identify whether two samples in each pixel quadrant are covered by the coverage mask.
6. The system of claim 1, wherein: the sample test is associated with a 2X multi-stage anti-aliasing (MSAA) process, and to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant are covered by the coverage mask, the GPU is configured to: identify whether one sample in each pixel located in the top two or the bottom two quadrants are covered by the coverage mask.
7. The system of claim 1, wherein: the sample test is associated with a 2X multi-stage anti-aliasing (MSAA) process, and to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant is covered by the coverage mask, the GPU is configured to: identify whether one sample in each pixel located in the left two or the right two quadrants are covered by the coverage mask.
8. An apparatus for wireless communication, comprising: a cellular communication component configured to perform operations associated with cellular communication; and a system for performing image super resolution (SR), comprising: a graphical processing unit (GPU) configured to: perform, for each of a plurality of pixels, a sample test to identify whether one or more samples in each pixel quadrant are covered by a coverage mask; maintain geometric information associated with each of the samples
based on an outcome of the sample test performed for each pixel; and maintain information associated with the coverage mask used for each pixel during the sample test; and a data processing unit (DPU) comprising an SR engine and an artificial intelligence (AI) model, wherein the SR engine is configured to: obtain the geometric information and the information associated with the coverage mask from the GPU; perform an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and the AI model; and generate super-resolution (SR) image data based on the upscale procedure performed for each of the plurality of pixels.
9. The apparatus of claim 8, wherein the system for performing SR further comprises: a display device configured to: display the SR image data.
10. The apparatus of claim 8, wherein the system for performing SR further comprises: a frame buffer in a double data rate (DDR) memory configured to: maintain original image data associated with the plurality of pixels.
11. The apparatus of claim 8, wherein: the sample test is associated with a 4X multi-stage anti-aliasing (MSAA) process, and to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant are covered by the coverage mask, the GPU is configured to: identify whether one sample in each pixel quadrant is covered by the coverage mask.
12. The apparatus of claim 8, wherein: the sample test is associated with an 8X multi-stage anti-aliasing (MSAA) process, and to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant are covered by the coverage mask, the GPU is configured to:
identify whether two samples in each pixel quadrant are covered by the coverage mask.
13. The apparatus of claim 8, wherein: the sample test is associated with a 2X multi-stage anti-aliasing (MSAA) process, and to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant are covered by the coverage mask, the GPU is configured to: identify whether one sample in each pixel located in the top two or the bottom two quadrants are covered by the coverage mask.
14. The apparatus of claim 8, wherein: the sample test is associated with a 2X multi-stage anti-aliasing (MSAA) process, and to perform, for each of the plurality of pixels, the sample test to identify whether one or more samples in each pixel quadrant are covered by the coverage mask, the GPU is configured to: identify whether one sample in each pixel located in the left two or the right two quadrants are covered by the coverage mask.
15. A method of performing image super resolution (SR), comprising: performing, by a graphical processing unit (GPU), a sample test for each of a plurality of pixels to identify whether one or more samples in each pixel quadrant is covered by a coverage mask; maintaining, by the GPU, geometric information associated with each of the samples based on an outcome of the sample test performed for each pixel; maintaining, by the GPU, information associated with the coverage mask used for each pixel during the sample test; obtaining, by an SR engine of a data processing unit (DPU), the geometric information and the information associated with the coverage mask from the GPU; performing, by the SR engine of the DPU, an upscale procedure for each of the plurality of pixels based on the geometric information, the information associated with the coverage mask, and an AI model of the DPU; and generating, by the SR engine of the DPU, an SR image data based on the upscale procedure performed for each of the plurality of pixels.
16. The method of claim 15, further comprising: displaying, by a display device, the SR image data.
17. The method of claim 15, further comprising: maintaining, by a frame buffer in a double data rate (DDR) memory, original image data associated with the plurality of pixels.
18. The method of claim 15, wherein: the sample test is associated with a 4X multi-stage anti-aliasing (MSAA) process, and the performing, by the GPU, the sample test for each of the plurality of pixels to identify whether one or more samples in each pixel quadrant are covered by the coverage mask comprises: identifying whether one sample in each pixel quadrant is covered by the coverage mask.
19. The method of claim 15, wherein: the sample test is associated with an 8X multi-stage anti-aliasing (MSAA) process, and the performing, by the GPU, the sample test for each of the plurality of pixels to identify whether one or more samples in each pixel quadrant is covered by the coverage mask comprises: identifying whether two samples in each pixel quadrant are covered by the coverage mask.
20. The method of claim 15, wherein: the sample test is associated with a 2X multi-stage anti-aliasing (MSAA) process, and the performing, by the GPU, the sample test for each of the plurality of pixels to identify whether one or more samples in each pixel quadrant is covered by the coverage mask comprises: identifying whether one sample in each pixel located in the top two or the bottom two quadrants are covered by the coverage mask; or identifying whether one sample in each pixel located in the left two or the right two quadrants are covered by the coverage mask.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2022/051459 WO2024118077A1 (en) | 2022-11-30 | 2022-11-30 | Method and apparatus for generating super-resolution image based on sample test geometry information and coverage mask |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024118077A1 (en) | 2024-06-06 |
Family
ID=91324737
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024118077A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7369140B1 (en) * | 2005-06-03 | 2008-05-06 | Nvidia Corporation | System, apparatus and method for subpixel shifting of sample positions to anti-alias computer-generated images |
US20100295853A1 (en) * | 2009-05-21 | 2010-11-25 | Sony Computer Entertainment America Inc. | Method and apparatus for rendering image based projected shadows with multiple depth aware blurs |
US20170345214A1 (en) * | 2016-05-30 | 2017-11-30 | Hong Kong Applied Science & Technology Research Institute Company Limited | High Resolution (HR) Panorama Generation Without Ghosting Artifacts Using Multiple HR Images Mapped to a Low-Resolution 360-Degree Image |
US20210073944A1 (en) * | 2019-09-09 | 2021-03-11 | Nvidia Corporation | Video upsampling using one or more neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22967414; Country of ref document: EP; Kind code of ref document: A1 |