WO2022178786A1

WO2022178786A1 - Image processor and image processing device

Info

Publication number: WO2022178786A1
Application number: PCT/CN2021/077975
Authority: WO
Inventors: 董镇江; 谢环; 蒋东龙; 曾奋元
Original assignee: 华为技术有限公司
Priority date: 2021-02-25
Filing date: 2021-02-25
Publication date: 2022-09-01
Also published as: CN116671096A

Abstract

The present application relates to the technical field of image processing, and provides an image processor and an image processing device, for use in reducing performance requirements of the image processor when performing SVGF processing on an image. The image processor comprises an SVGF accelerator. The SVGF accelerator comprises: a time domain processing engine for performing, according to a previous frame image of the image to be processed, time domain filtering processing on the image to be processed, so as to obtain a first image, wherein parameter information of a pixel point in the first image is first parameter information; a momentum processing engine comprising a first cache for caching the first parameter information, and a momentum calculation circuit for performing filtering momentum processing on the first image according to the first parameter information to obtain a second image, wherein parameter information of a pixel point in the second image is second parameter information; and a spatial domain processing engine comprising a second cache for caching the second parameter information, and a spatial domain calculation circuit for performing spatial domain filtering processing on the second image according to the second parameter information to obtain a target image.

Description

Image processor and image processing device

Technical field

The present application relates to the technical field of image processing, and in particular, to an image processor and an image processing device.

Background

Spatiotemporal variance-guided filtering (SVGF) is a denoising method that uses spatiotemporal reprojection and feature-buffer-driven bilateral filtering to blur high-variance regions. It is widely used for ray tracing (RT) noise reduction in image processors (graphics processing units, GPUs). The SVGF processing flow mainly includes three steps: temporal pass processing, filter momentum pass processing, and spatial pass processing. Each step requires a large amount of computation.

In the prior art, the SVGF method is usually implemented in software by a streaming multiprocessor (SM) in the GPU. In a specific implementation, three threads running on the SM execute the three steps of SVGF in sequence, and the input data and output data of each of the three threads are stored in buffers in the GPU that are coupled to the SM. However, because the SVGF method involves a large amount of computation and data, the buffers need a large read/write bandwidth and a small read/write delay; that is, the method places high performance requirements on the GPU. For example, when a desktop GPU processes an image with a resolution of 1080P, the display frame rate can reach only about 30 frames per second (fps), while the performance of the GPU in most mobile terminals is only 1/10 that of a desktop GPU or lower, so the method cannot be used on devices with lower GPU performance.

SUMMARY OF THE INVENTION

The present application provides an image processor and an image processing device, which can be used to reduce the performance requirements of the image processor for SVGF processing of images, and at the same time improve the image processing rate.

To achieve the above object, the application adopts the following technical solutions:

In a first aspect, an image processor is provided. The image processor includes an SVGF accelerator, and the SVGF accelerator includes: a time domain processing engine, configured to perform time domain filtering on an image to be processed according to the previous frame of the image to be processed, to obtain a first image, where the parameter information of the pixels in the first image is first parameter information; a momentum processing engine, including a first buffer for buffering the first parameter information (for example, the first parameter information of one or more pixels in the first image) and a momentum calculation circuit for performing filtering momentum processing on the first image according to the first parameter information, to obtain a second image, where the parameter information of the pixels in the second image is second parameter information; and a spatial domain processing engine, including a second buffer for buffering the second parameter information (for example, the second parameter information of one or more pixels in the second image) and a spatial domain calculation circuit for performing spatial domain filtering on the second image according to the second parameter information, to obtain a target image, where the parameter information of the pixels in the target image is target parameter information.

In the above technical solution, the momentum processing engine and the spatial domain processing engine each include their own buffer. When performing filtering momentum processing or spatial domain filtering, each engine can cache in its buffer the parameter information of the pixels that must be reused across multiple consecutive processing operations, and obtain from the upstream processing engine only the parameter information of the most recently processed pixels. The corresponding processing is then performed based on the cached parameter information and the newly obtained parameter information. In this way, the amount of data transmitted between the processing engines is minimized, which greatly reduces the data transmission delay, increases the image processing rate, and reduces the performance requirements placed on the image processor when it performs SVGF processing on images.
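The data flow described above can be illustrated with a minimal Python sketch. It is a one-dimensional toy model, not the accelerator's actual arithmetic: the function names, the 50/50 blend in the time domain stage, and the plain averaging in the two windowed stages are assumptions chosen only to show that each downstream engine receives one new value per step and reuses the rest from its own small buffer.

```python
from collections import deque

def time_domain_engine(pixels, previous):
    """Toy time domain filter: blend each value with the previous frame (assumed 50/50)."""
    for cur, hist in zip(pixels, previous):
        yield 0.5 * cur + 0.5 * hist

def windowed_engine(upstream, window):
    """Consume one new value per step and reuse the others from a local buffer."""
    cache = deque(maxlen=window)   # stands in for the first or second buffer
    for value in upstream:
        cache.append(value)        # only the newest value crosses the engine boundary
        yield sum(cache) / len(cache)   # plain average is an illustrative assumption

def svgf_pipeline(pixels, previous, window2=3, window3=3):
    first = time_domain_engine(pixels, previous)      # first image
    second = windowed_engine(first, window2)          # momentum processing engine
    return list(windowed_engine(second, window3))     # spatial domain processing engine
```

Because the stages are chained as generators, no full intermediate image is materialized between them, which mirrors the reduced inter-engine traffic that the technical solution aims for.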

In a possible implementation manner of the first aspect, the time domain processing engine includes a time domain calculation circuit, and the time domain calculation circuit includes: an information acquisition circuit, configured to acquire, for a first pixel in the image to be processed, historical parameter information of a plurality of pixels in a first preset window in which the historical pixel corresponding to the first pixel in the previous frame is located, where the historical parameter information includes geometric coordinate information, historical luminance information, and historical momentum information, and the first pixel is any pixel in the image to be processed; a signal generation circuit, configured to generate a control signal according to the geometric coordinate information of the plurality of pixels; and a first convolution circuit, configured to: when the control signal is valid, determine first luminance information and first momentum information of the first pixel in the first image according to the historical luminance information and the historical momentum information of the plurality of pixels, respectively; when the control signal is invalid, determine the first luminance information and the first momentum information of the first pixel in the first image according to preset luminance information and preset momentum information, respectively; and determine a first variance according to the first momentum information of the first pixel. The above possible implementation manner provides a simple and effective circuit structure for the time domain calculation circuit. With this circuit structure, time domain filtering can be performed quickly, pixel by pixel, on the image to be processed, and the historical parameter information of historical pixels can be reused across multiple consecutive processing operations, which increases the rate of time domain filtering.

In a possible implementation manner of the first aspect, the signal generation circuit is specifically configured to: generate a valid control signal when, among the plurality of pixels in the first preset window, there is a preset area in which the geometric coordinate information of at least one pixel satisfies a specified coordinate range, where the specified coordinate range is related to the geometric coordinate information of the first pixel; and generate an invalid control signal when the geometric coordinate information of none of the plurality of pixels in the first preset window satisfies the specified coordinate range. The above possible implementation manner provides a way for the signal generation circuit to generate the control signal according to the geometric coordinate information of the plurality of pixels in the first preset window, which effectively prevents pixels whose geometric coordinate information differs greatly from that of the first pixel from affecting the time domain filtering of the first pixel.
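The validity test can be sketched as follows. This is a hedged illustration only: the text does not define the geometric coordinate information or the specified coordinate range, so the sketch assumes a single depth value per pixel and a range of the form |depth - current depth| <= tolerance. The function name and the `tolerance` parameter are hypothetical.

```python
def control_signal(window_depths, current_depth, tolerance=0.1):
    """Return True (valid) if at least one history pixel falls inside the assumed range.

    window_depths: depths of the history pixels in the first preset window (assumed form
                   of the geometric coordinate information).
    current_depth: depth of the first pixel, which defines the centre of the range.
    """
    return any(abs(d - current_depth) <= tolerance for d in window_depths)
```

For instance, `control_signal([1.0, 5.0], 1.05)` is valid because one history pixel lies close to the current pixel, while `control_signal([3.0, 5.0], 1.0)` is invalid and would cause the preset values to be used.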

In a possible implementation manner of the first aspect, the first convolution circuit includes: a first luminance convolution circuit, configured to determine, when the control signal is valid, the first luminance information of the first pixel in the first image according to the first convolution weights and the luminance information of the pixels in the preset area; a first momentum convolution circuit, configured to determine, when the control signal is valid, the first momentum information of the first pixel in the first image according to the first convolution weights and the momentum information of the pixels in the preset area; and a first variance calculation circuit, configured to determine the first variance of the first pixel according to the first momentum information. The above possible implementation manner provides a simple and effective circuit structure for the first convolution circuit. The first luminance convolution circuit and the first momentum convolution circuit can operate simultaneously, so that the first luminance information and the first momentum information of a pixel are quickly determined through convolution operations, and the first variance calculation circuit can determine the first variance based on the determined first momentum information. This increases the rate at which the first convolution circuit determines the first luminance information, the first momentum information, and the first variance of a pixel, so that the first parameter information of any pixel in the image to be processed can be determined quickly.
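A minimal Python sketch of the first convolution circuit's behaviour is given below. It rests on two assumptions that the text does not state: the momentum information is taken to be a pair (first raw moment, second raw moment) of luminance, and the variance is taken to be the second moment minus the square of the first, clamped at zero. The function names are hypothetical.

```python
def weighted_average(values, weights):
    """Normalized weighted sum, used here as the convolution result."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total

def first_convolution(valid, luma, moments, weights, preset_luma, preset_moments):
    """Return (first luminance, first momentum pair, first variance) for one pixel."""
    if valid:
        # Control signal valid: convolve the history values of the preset area.
        out_luma = weighted_average(luma, weights)
        m1 = weighted_average([m[0] for m in moments], weights)
        m2 = weighted_average([m[1] for m in moments], weights)
    else:
        # Control signal invalid: fall back to the preset values.
        out_luma = preset_luma
        m1, m2 = preset_moments
    variance = max(0.0, m2 - m1 * m1)   # assumed variance-from-moments relation
    return out_luma, (m1, m2), variance
```

The luminance and momentum averages do not depend on each other, which is what allows the two convolution circuits to run in parallel; only the variance step waits for the momentum result.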

In a possible implementation manner of the first aspect, the time domain calculation circuit further includes a first weight calculation circuit, configured to determine the first convolution weight of each pixel in the preset area. In the above possible implementation manner, the first weight calculation circuit determines the first convolution weight of each pixel in the preset area, which improves the accuracy of the first convolution weights.

In a possible implementation manner of the first aspect, the historical parameter information further includes the number of historical calculations of the historical pixel, and the time domain calculation circuit further includes an update circuit, configured to: when the control signal is valid, determine the number of historical calculations of the historical pixel plus one as the number of historical calculations of the first pixel; and when the control signal is invalid, determine the number of historical calculations of the first pixel to be 0. In the above possible implementation manner, the update circuit determines the number of historical calculations of the first pixel, which makes it possible to quickly select the second parameter information of the pixel in the second image during the subsequent filtering momentum processing.
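The update rule is simple enough to state directly in Python. The function name is hypothetical; the behaviour follows the two cases described above.

```python
def update_history_count(valid, historical_count):
    """Number of historical calculations of the first pixel.

    valid: state of the control signal.
    historical_count: count stored for the corresponding pixel in the previous frame.
    """
    return historical_count + 1 if valid else 0
```

A pixel whose history is repeatedly found valid therefore accumulates a growing count, while any invalid control signal resets it to 0.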

In a possible implementation manner of the first aspect, the time domain processing engine further includes a third buffer, configured to buffer the historical parameter information and/or the initial parameter information of the first pixel in the image to be processed, where the initial parameter information includes at least one of the following: initial luminance information, initial momentum information, a motion vector, and geometric coordinate information. In the above possible implementation manner, the third buffer can be used to store the first parameter information of the pixels that need to be reused across multiple consecutive processing operations, which greatly reduces the data transmission delay and thereby increases the image processing rate.

In a possible implementation manner of the first aspect, the first parameter information includes first luminance information, first momentum information, and a first variance, and the momentum calculation circuit includes: a second convolution circuit, configured to determine, for a first pixel in the first image, convolution luminance information and convolution momentum information of the first pixel according to the first luminance information and the first momentum information of a plurality of pixels in a second preset window in which the first pixel is located, and further configured to determine a convolution variance of the first pixel according to the convolution momentum information; and a selection circuit, configured to select the first luminance information and the first variance as the second luminance information and the second variance of the first pixel in the second image, respectively, or to select the convolution luminance information and the convolution variance as the second luminance information and the second variance of the first pixel in the second image, respectively. The above possible implementation manner provides a simple and effective circuit structure for the momentum calculation circuit. With this circuit structure, filtering momentum processing can be performed quickly, pixel by pixel, on the first image, and the first parameter information of the pixels stored in the first buffer can be reused during the processing, which increases the rate of filtering momentum processing.

In a possible implementation manner of the first aspect, the second convolution circuit includes: a second luminance convolution circuit, configured to perform a convolution operation on the second convolution weights of the plurality of pixels and the first luminance information of the plurality of pixels, to obtain the convolution luminance information of the first pixel; a second momentum convolution circuit, configured to perform a convolution operation on the second convolution weights of the plurality of pixels and the first momentum information of the plurality of pixels, to obtain the convolution momentum information of the first pixel; and a second variance calculation circuit, configured to determine the convolution variance of the first pixel according to the convolution momentum information. The above possible implementation manner provides a simple and effective circuit structure for the second convolution circuit. The second luminance convolution circuit and the second momentum convolution circuit can operate simultaneously, so that the convolution luminance information and the convolution momentum information of a pixel in the first image are quickly determined through convolution operations, and the second variance calculation circuit can determine the convolution variance based on the determined convolution momentum information. This increases the rate at which the second convolution circuit determines the convolution luminance information, the convolution momentum information, and the convolution variance.

In a possible implementation manner of the first aspect, the momentum calculation circuit further includes a second weight calculation circuit, configured to determine the second convolution weight of each of the plurality of pixels. In the above possible implementation manner, the second weight calculation circuit determines the second convolution weight of each of the plurality of pixels, which improves the accuracy of the second convolution weights.

In a possible implementation manner of the first aspect, the first parameter information further includes the number of historical calculations, and the selection circuit is specifically configured to: when the number of historical calculations of the first pixel is greater than or equal to a preset number of times, select the first luminance information and the first variance as the second luminance information and the second variance of the first pixel in the second image, respectively; and when the number of historical calculations of the first pixel is less than the preset number of times, select the convolution luminance information and the convolution variance as the second luminance information, the second momentum information and the second variance of the first pixel in the second image. The above possible implementation manner provides a simple and effective manner of selecting the second luminance information, the second momentum information and the second variance.
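The selection can be sketched in Python as a single comparison. The function name and the default threshold of 4 are assumptions for illustration; the text only refers to "a preset number of times". Each candidate is modelled as a (luminance, variance) tuple.

```python
def select_second_params(history_count, first, convolved, preset_count=4):
    """Choose the (luminance, variance) pair forwarded to the second image.

    first:     (first luminance, first variance) from the time domain stage.
    convolved: (convolution luminance, convolution variance) from the second convolution circuit.
    preset_count: hypothetical threshold on the number of historical calculations.
    """
    # Enough history: trust the temporally accumulated values.
    # Too little history: fall back to the spatially convolved estimate.
    return first if history_count >= preset_count else convolved
```

The design intent, as far as the text indicates, is that a pixel with little accumulated history has an unreliable temporal estimate, so the estimate taken over its neighbourhood is used instead.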

In a possible implementation manner of the first aspect, the spatial domain calculation circuit includes: a convolution conversion circuit, configured to extract, for a first pixel in the second image, the second parameter information of a plurality of valid pixels in each of a plurality of wavelet transform windows, where the plurality of wavelet transform windows are obtained through a wavelet transform with a preset number of steps, and the second parameter information includes the second luminance information and the second variance; a third weight calculation circuit, configured to determine, for each of the plurality of wavelet transform windows, the third convolution weights of the plurality of valid pixels in the wavelet transform window according to the second variances of those valid pixels; and a third convolution circuit, configured to determine, for each of the plurality of wavelet transform windows, third luminance information and a third variance of the first pixel according to the third convolution weights of the plurality of valid pixels in the wavelet transform window together with the second luminance information and the second variances of those valid pixels, respectively, and further configured to determine target luminance information and a target variance of the first pixel in the target image according to the third luminance information and the third variances of the first pixel corresponding to the plurality of wavelet transform windows, respectively. The above possible implementation manner provides a simple and effective circuit structure for the spatial domain calculation circuit. With this circuit structure, spatial domain filtering can be performed quickly, pixel by pixel, on the second image, and the second parameter information of the pixels stored in the second buffer can be reused during the processing, which increases the rate of spatial domain filtering.

In a possible implementation manner of the first aspect, the third weight calculation circuit is specifically configured to: determine, for the plurality of valid pixels, a depth weight, a normal vector weight, and a luminance weight according to the second variances, the second luminance information, and the geometric coordinate information of the plurality of valid pixels in the second image; and determine the third convolution weights of the plurality of valid pixels according to the depth weights, the normal vector weights, and the luminance weights of the plurality of valid pixels. In the above possible implementation manner, the third weight calculation circuit can quickly and effectively determine the third convolution weight of each of the plurality of valid pixels according to the second variance, the second luminance information, and the geometric coordinate information in the second parameter information of the plurality of valid pixels.
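One way the three weights could be formed and combined is sketched below in Python. The formulas are assumptions, loosely modelled on the edge-stopping functions commonly used with SVGF and simplified (the depth term omits the screen-space depth gradient); the text does not specify them, nor that the three weights are multiplied. The function name and the `sigma_*` and `eps` parameters are hypothetical.

```python
import math

def third_weight(depth_p, depth_q, normal_p, normal_q, luma_p, luma_q, variance_p,
                 sigma_z=1.0, sigma_n=128.0, sigma_l=4.0, eps=1e-6):
    """Weight of valid pixel q when filtering centre pixel p (illustrative formulas)."""
    # Depth weight: falls off as the depths of p and q diverge.
    w_depth = math.exp(-abs(depth_p - depth_q) / sigma_z)
    # Normal vector weight: falls off as the unit normals of p and q diverge.
    dot = sum(a * b for a, b in zip(normal_p, normal_q))
    w_normal = max(0.0, dot) ** sigma_n
    # Luminance weight: tolerance scales with the standard deviation at p.
    w_luma = math.exp(-abs(luma_p - luma_q) / (sigma_l * math.sqrt(variance_p) + eps))
    # Combining by product is an assumption.
    return w_depth * w_normal * w_luma
```

With these formulas, a neighbour identical to the centre pixel receives weight 1.0, and a neighbour whose normal is perpendicular to the centre pixel's normal receives weight 0.0, so geometry edges are not blurred across.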

In a possible implementation manner of the first aspect, the third convolution circuit includes: a third luminance convolution circuit, configured to perform a convolution operation on the third convolution weights of the plurality of valid pixels and the second luminance information of the plurality of valid pixels, to obtain the third luminance information of the first pixel; and a third variance calculation circuit, configured to determine the third variance of the first pixel according to the third convolution weights of the plurality of valid pixels and the second variances of the plurality of valid pixels. The above possible implementation manner provides a simple and effective circuit structure for the third convolution circuit. The third luminance convolution circuit can be used to quickly determine the third luminance information of the first pixel through a convolution operation, and the third variance calculation circuit can be used to quickly determine the third variance of the first pixel.
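The two outputs of the third convolution circuit can be sketched in Python as follows. The normalization is an assumption borrowed from the usual SVGF formulation, in which luminance is a normalized weighted sum and variance is propagated with squared weights; the text states only which inputs each circuit uses. The function name is hypothetical.

```python
def third_convolution(weights, luma, variance):
    """Return (third luminance, third variance) for one wavelet transform window.

    weights:  third convolution weights of the valid pixels.
    luma:     second luminance information of the valid pixels.
    variance: second variances of the valid pixels.
    """
    total = sum(weights)
    out_luma = sum(w * l for w, l in zip(weights, luma)) / total
    # Variance of a weighted mean of independent samples: squared weights (assumed).
    out_var = sum(w * w * v for w, v in zip(weights, variance)) / (total * total)
    return out_luma, out_var
```

For two equally weighted pixels with luminance 2.0 and 4.0 and unit variance, this gives a luminance of 3.0 and a variance of 0.5, reflecting the noise reduction obtained by averaging.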

In a possible implementation manner of the first aspect, the first buffer stores the first parameter information of a first preset number of pixels in the first image that the momentum calculation circuit requires when performing filtering momentum processing on the pixels in the first image, where the first preset number is the square of the size of the second preset window; and the second buffer stores the second parameter information of a second preset number of pixels in the second image that the spatial domain calculation circuit requires when performing spatial domain filtering on the pixels in the second image, where the second preset number is the square of (Size_Window3+(Step-1)×2), Size_Window3 represents the size of the initial convolution window of the wavelet transform, and Step represents the number of wavelet transform steps. The above possible implementation manner reduces the storage space required by the image processor when performing SVGF processing on the image to be processed, thereby achieving a minimum-cache configuration in the different processing engines.
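The two buffer sizes can be evaluated with a short Python helper. The function and parameter names are hypothetical; the formulas are the ones given in the text, with `size_window2` standing for the size of the second preset window.

```python
def min_cache_pixel_counts(size_window2, size_window3, step):
    """Pixel counts held by the first and second buffers in the minimum-cache scheme."""
    first_preset_number = size_window2 ** 2
    second_preset_number = (size_window3 + (step - 1) * 2) ** 2
    return first_preset_number, second_preset_number
```

As an illustrative calculation with assumed values, a 3-pixel second preset window, a 3-pixel initial wavelet window, and 5 wavelet transform steps give 9 pixels for the first buffer and (3 + 4×2)² = 121 pixels for the second buffer.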

In a possible implementation manner of the first aspect, the first buffer stores the first parameter information of a third preset number of pixels in the first image that the momentum calculation circuit requires when performing filtering momentum processing on the pixels in the first image, where the third preset number can be [Width×(Size_Window2×Step-1)+Size_Window2], and Width represents the width of the image to be processed; and the second buffer stores the second parameter information of a fourth preset number of pixels in the second image that the spatial domain calculation circuit requires when performing spatial domain filtering on the pixels in the second image, where the fourth preset number is [Width×(Size_Window3×Step-1)+Size_Window3×Step]. The above possible implementation manner reduces the bandwidth required by the image processor when performing SVGF processing on the image to be processed, thereby achieving a minimum-bandwidth configuration.
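The line-buffer style sizes of this scheme can likewise be evaluated in Python. The function and parameter names are hypothetical, and `size_window2` is assumed to denote the size of the second preset window, which the text does not define explicitly; the two formulas are taken from the text as written.

```python
def min_bandwidth_pixel_counts(width, size_window2, size_window3, step):
    """Pixel counts held by the first and second buffers in the minimum-bandwidth scheme."""
    third_preset_number = width * (size_window2 * step - 1) + size_window2
    fourth_preset_number = width * (size_window3 * step - 1) + size_window3 * step
    return third_preset_number, fourth_preset_number
```

With assumed values of a 1920-pixel image width, 3-pixel windows, and 5 steps, the buffers hold 26883 and 26895 pixels respectively. This is far more than the minimum-cache scheme requires, which is the trade-off made in exchange for fetching each pixel's parameter information fewer times.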

In a second aspect, an image processing device is provided. The image processing device includes an image processor, and the image processor includes a streaming multiprocessor and the SVGF accelerator of the image processor provided in the first aspect or any possible implementation manner of the first aspect.

It can be understood that any image processing device provided above includes all the contents of the image processor provided above. Therefore, for the beneficial effects that it can achieve, refer to the beneficial effects of the image processor provided above; details are not repeated here.

Description of drawings

FIG. 1 is a schematic structural diagram of an image processing device according to an embodiment of the present application;

FIG. 2 is a schematic structural diagram of a GPU according to an embodiment of the present application;

FIG. 3 is a schematic diagram of processing multiple image blocks by multiple processing cores of a GPU according to an embodiment of the present application;

FIG. 4 is a schematic structural diagram of an SVGF accelerator in a GPU provided by an embodiment of the present application;

FIG. 5 is a schematic structural diagram of a time domain calculation circuit provided by an embodiment of the present application;

FIG. 6 is a schematic diagram of a plurality of preset regions in a first preset window provided by an embodiment of the present application;

FIG. 7 is a schematic structural diagram of a time-domain processing engine provided by an embodiment of the present application;

FIG. 8 is a schematic structural diagram of a momentum calculation circuit provided by an embodiment of the present application;

FIG. 9 is a schematic structural diagram of a momentum processing engine according to an embodiment of the present application;

FIG. 10 is a schematic diagram of a plurality of wavelet transform windows provided by an embodiment of the present application;

FIG. 11 is a schematic structural diagram of a spatial domain calculation circuit provided by an embodiment of the present application;

FIG. 12 is a schematic structural diagram of a spatial domain processing engine provided by an embodiment of the present application;

FIG. 13 is a schematic diagram of a caching mode provided by an embodiment of the present application;

FIG. 14 is a schematic diagram of another caching mode provided by an embodiment of the present application;

FIG. 15 is a schematic flowchart of an SVGF processing provided by an embodiment of the present application;

FIG. 16 is a schematic diagram of a first preset window, a second preset window, and a wavelet transform window in the process of pixel-by-pixel processing of two adjacent pixels by three processing engines in an SVGF accelerator provided by an embodiment of the present application.

Detailed description

The making and using of the various embodiments are discussed in detail below. It should be appreciated, however, that many of the applicable inventive concepts provided herein can be embodied in a wide variety of specific contexts. The specific embodiments discussed are merely illustrative of specific ways to make and use the description and the technology, and do not limit the scope of the application.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art.

Various circuits or other components may be described or referred to as "for" performing one or more tasks. In this context, "for" is used to imply structure by indicating that the circuit/component includes structure (e.g., circuitry) that performs the one or more tasks during operation. Thus, the specified circuit/component may be said to be used to perform the task even when the specified circuit/component is not currently operational (e.g., not turned on). Circuits/components used with the phrase "for" include hardware, such as circuits that perform operations, and the like.

The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application. In this application, "at least one" means one or more, and "plurality" means two or more. "And/or" describes the association relationship of the associated objects and indicates that three kinds of relationships are possible. For example, A and/or B can indicate: A exists alone, A and B exist at the same time, or B exists alone, where A and B can be singular or plural. The character "/" generally indicates that the associated objects are in an "or" relationship. "At least one of the following items" or similar expressions refer to any combination of these items, including any combination of single items or plural items. For example, at least one of a, b or c may represent: a, b, c, a and b, a and c, b and c, or a, b and c, where each of a, b and c can be single or multiple.

In the embodiments of the present application, the words "first" and "second" are used to distinguish objects with similar names or functions. Those skilled in the art can understand that the words "first" and "second" do not limit the quantity or the order of execution. The term "coupled" is used to denote electrical connection, including direct connection through wires or terminals or indirect connection through other devices. Therefore, "coupling" should be regarded as an electronic communication connection in a broad sense.

It should be noted that, in this application, words such as "exemplary" or "for example" are used to represent examples, instances or illustrations. Any embodiment or design described in this application as "exemplary" or "such as" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present the related concepts in a specific manner.

The technical solution of the present application can be applied to an image processing device having an image processor (graphics processing unit, GPU). The image processing device can be a mobile phone, a tablet computer, a computer, a notebook computer, a video camera, a camera, a wearable device, an in-vehicle device (for example, a device in a car, a bicycle, an electric vehicle, an airplane, a ship, a train, or a high-speed train), a virtual reality (VR) device, an augmented reality (AR) device, an intelligent robot, or the like. For the convenience of description, the above-mentioned devices are collectively referred to as image processing devices in this application.

FIG. 1 is a schematic structural diagram of an image processing device provided by an embodiment of the application. The device is illustrated by taking a mobile phone as an example. The device may include a memory 101, a processor 102, a sensor component 103, a multimedia component 104, an audio component 105, a power supply component 106, and the like.

Each component of the device is described in detail below with reference to FIG. 1:

The memory 101 can be used to store data, software programs and modules. It mainly includes a program storage area and a data storage area, where the program storage area can store an operating system and an application program required for at least one function, such as a sound playback function or an image playback function, and the data storage area can store data created according to the use of the device, such as audio data, image data, and a phone book. Additionally, the device may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. In this embodiment of the present application, the memory 101 may include one or more storage devices, and the one or more storage devices may be partially or fully integrated in the processor 102. For example, the memory 101 includes a double data rate (DDR) memory, which may be integrated in the processor 102.

The processor 102 is the control center of the device. It uses various interfaces and lines to connect the parts of the entire device, and, by running or executing the software programs and/or modules stored in the memory 101 and calling the data stored in the memory 101, performs the various functions of the device and processes data, thereby monitoring the device as a whole. Optionally, the processor 102 may include one or more processing units. For example, the processor 102 may include a central processing unit (CPU) and a graphics processor (graphics processing unit, GPU). The CPU mainly handles the operating system, the user interface, applications, and the like. The GPU is a processor specially designed to process images. In addition, the processor 102 may further include other hardware circuits or accelerators, such as field programmable gate arrays or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof, which is not specifically limited in this embodiment of the present application.

The sensor component 103 includes one or more sensors for providing status assessments of various aspects of the device. The sensor component 103 may include a light sensor, which is used to detect the distance between an external object and the device, or is used in imaging applications, that is, as an integral part of a still camera or a video camera. In addition, the sensor component 103 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor. Through the sensor component 103, the acceleration/deceleration, orientation, and open/closed state of the device, the relative positioning of its components, temperature changes of the device, and the like can be detected.

The multimedia component 104 provides a screen that serves as an output interface between the device and the user. The screen may be a touch panel, in which case the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may not only sense the boundaries of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe action. In addition, the multimedia component 104 further includes at least one camera, for example, a front camera and/or a rear camera. When the device is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each of the front and rear cameras can be a fixed optical lens system or can have focal length and optical zoom capability.

The audio component 105 may provide an audio interface between the user and the device. For example, the audio component 105 may include an audio circuit, a speaker, and a microphone. The audio circuit can convert received audio data into an electrical signal and transmit it to the speaker, which converts it into a sound signal for output. Conversely, the microphone converts a collected sound signal into an electrical signal, which the audio circuit receives and converts into audio data; the audio data is then output for sending, for example, to another such device, or is output to the processor 102 for further processing.

Power supply assembly 106 is used to provide power to the various components of the device, and power supply assembly 106 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to the device.

Although not shown, the device may further include a wireless fidelity (WiFi) module, a Bluetooth module, and the like, which are not described again in this embodiment of the present application. Those skilled in the art can understand that the structure of the device shown in FIG. 1 does not constitute a limitation on the device, and the device may include more or fewer components than shown, combine some components, or use a different arrangement of components.

FIG. 2 is a schematic structural diagram of an image processor (GPU) according to an embodiment of the present application. The GPU may include N processing cores (cores), denoted core1 to coreN, where N is a positive integer. As shown in FIG. 2, each of the N processing cores may include a streaming multiprocessor (SM) 201, a spatiotemporal variance-guided filtering (SVGF) accelerator 202, and a buffer 203. The SM 201 is a general-purpose processor in the processing core of the GPU. The SVGF accelerator 202, which may also be referred to as an SVGF processor, is a dedicated processor in the processing core of the GPU for executing the SVGF method. The SM 201 and the SVGF accelerator 202 may share the buffer 203. Further, the GPU may also be coupled to the memory of the image processing device in which the GPU is located, and the memory may be used to store the image to be processed, as well as data related to the SM 201 and the SVGF accelerator 202 during image processing, which is not specifically limited in this embodiment of the present application. In FIG. 2, the case in which the memory includes a DDR and the GPU is coupled to the DDR is used as an example for description.

When the GPU processes images, the N processing cores can be used to process different images at the same time, or to process different image blocks (tiles) of the same image at the same time. Exemplarily, as shown in FIG. 3, taking the case in which the N processing cores process different image blocks (tiles) of the same image as an example, the image to be processed may be divided into multiple image blocks. The sizes of the multiple image blocks may be the same, the number of image blocks may be less than or equal to N, and the multiple image blocks may be processed by different processing cores among the N processing cores. For each of the N processing cores, the SM 201 in the processing core can be used to perform general processing on the image block, for example, to obtain parameter information of the pixels in the image block, and the SVGF accelerator 202 can be used to perform SVGF processing on the image block.

It should be noted that the process in which the N processing cores process different images at the same time is similar to the process in which the N processing cores process different image blocks of the same image at the same time. This application mainly concerns the hardware implementation by which the SVGF accelerator 202 performs SVGF processing on images or image blocks. The following takes one of the N processing cores as an example to describe in detail the specific structure of the SVGF accelerator 202 in that processing core.

For ease of understanding, concepts related to the temporal filtering processing, the filtering momentum processing, and the spatial filtering processing involved in the SVGF method are explained here first.

Temporal filtering processing: refers to the method of filtering the next frame image by using the previous frame image in multiple consecutive frame images. For example, performing temporal filtering processing on a pixel may refer to using the pixel in the previous frame of image and multiple surrounding pixels to filter the pixel in the next frame of image.

Filter momentum processing: refers to a method of selectively filtering a pixel by using the pixel and multiple surrounding pixels in the same frame image, according to the number of times the pixel has undergone temporal filtering (also known as the number of historical calculations). For example, performing filter momentum processing on a pixel may mean that when the number of times the pixel has undergone temporal filtering is greater than a preset value, the pixel and multiple surrounding pixels in the same frame image are used to filter the pixel.

Spatial filtering processing: refers to a method of filtering a pixel point and multiple surrounding pixels in the same frame image. For example, performing spatial filtering processing on a pixel may refer to using the pixel and multiple pixels in different surrounding ranges to perform multiple filtering on the pixel respectively.

FIG. 4 is a schematic structural diagram of an SVGF accelerator in a GPU provided by an embodiment of the present application. The SVGF accelerator includes a temporal processing engine (TPE) 1, a momentum processing engine (moment processing engine, MPE) 2 and a spatial processing engine (SPE) 3. The MPE 2 includes a first buffer 21 and a momentum calculation circuit 22, and the SPE 3 includes a second buffer 31 and a spatial calculation circuit 32.

In this embodiment of the present application, the TPE 1 may be used to perform temporal filtering processing on the to-be-processed image according to the previous frame of the to-be-processed image to obtain a first image, where the parameter information of the pixels in the first image is the first parameter information. The TPE 1 can also be used to send the first parameter information to the MPE 2. The first buffer 21 in the MPE 2 can be used to buffer the first parameter information; for example, the first buffer 21 can be used to buffer the first parameter information of one or more pixels in the first image. The momentum calculation circuit 22 in the MPE 2 can be used to perform filter momentum processing on the first image according to the first parameter information to obtain a second image, where the parameter information of the pixels in the second image is the second parameter information. The momentum calculation circuit 22 can also be used to send the second parameter information to the SPE 3. The second buffer 31 in the SPE 3 can be used to buffer the second parameter information; for example, the second buffer 31 can be used to buffer the second parameter information of one or more pixels in the second image. The spatial calculation circuit 32 is used to perform spatial filtering processing on the second image according to the second parameter information to obtain the target image, where the parameter information of the pixels in the target image is the target parameter information.

Optionally, the TPE 1 may include a time domain calculation circuit 11 and a third buffer 12. The time domain calculation circuit 11 is specifically configured to perform temporal filtering processing on the to-be-processed image according to the previous frame of the to-be-processed image to obtain the first image. The third buffer 12 can be used to buffer the parameter information of the pixels in the previous frame image, which may be called historical parameter information; that is, the third buffer 12 can be used to buffer the historical parameter information of one or more pixels in the previous frame image. Additionally or alternatively, the third buffer 12 can be used to buffer the parameter information of the pixels in the to-be-processed image, which may be called initial parameter information; that is, the third buffer 12 can also be used to buffer the initial parameter information of one or more pixels in the to-be-processed image.

The to-be-processed image and the previous frame image may be two complete frames of images, or may be two corresponding image blocks in the two frames of images. For example, if the size of one frame of image is 1920×1080, the size of the to-be-processed image and the previous frame image may both be 1920×1080. Alternatively, if the size of one frame of image is 1920×1080 and one frame of image is divided into 6 image blocks, each of size 640×540, then the size of the to-be-processed image and the previous frame image may both be 640×540.
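The block-size arithmetic in the example above can be checked with a short sketch. The function name `split_into_tiles` and the regular grid layout are illustrative assumptions and do not appear in the application; the only facts taken from the text are the 1920×1080 frame and the 6 blocks of 640×540, which together imply a grid of 3 columns by 2 rows.

```python
def split_into_tiles(width, height, cols, rows):
    """Return (x, y, w, h) for each tile of a regular cols x rows grid.

    Hypothetical helper: the application only states the frame and block
    sizes, not how the blocks are enumerated.
    """
    if width % cols or height % rows:
        raise ValueError("frame size must be divisible by the tile grid")
    tile_w, tile_h = width // cols, height // rows
    # Enumerate tiles row by row, left to right.
    return [(c * tile_w, r * tile_h, tile_w, tile_h)
            for r in range(rows) for c in range(cols)]


# A 1920x1080 frame split into 3 columns and 2 rows gives 6 blocks of 640x540.
tiles = split_into_tiles(1920, 1080, cols=3, rows=2)
```

Each tuple could then be handed to a different processing core, provided the number of blocks does not exceed N.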

In addition, the parameter information of a pixel may include one or more of the following parameters: geometric coordinate information, luminance information, momentum (moment) information, a motion vector, a variance, and the number of historical calculations. The geometric coordinate information may include a normal, a depth, and a coordinate position. That is, any of the initial parameter information, the historical parameter information, and the first parameter information may include one or more of the above parameters. Optionally, the initial parameter information of the pixels in the to-be-processed image may be determined by the SM in the GPU, and the historical parameter information of the pixels in the previous frame image may be obtained after the SVGF accelerator processes the previous frame image.

Furthermore, when performing temporal filtering processing on the to-be-processed image, the time domain calculation circuit 11 may process the pixels in the to-be-processed image one by one in a row-by-row or column-by-column order. Similarly, when the momentum calculation circuit 22 performs filter momentum processing on the first image, and when the spatial calculation circuit 32 performs spatial filtering processing on the second image, the pixels in the first image and the second image can also be processed one by one in a row-by-row or column-by-column order. The specific structures of the time domain calculation circuit 11, the momentum calculation circuit 22 and the spatial calculation circuit 32 are described below.

In this embodiment of the present application, as shown in FIG. 5 , the time domain calculation circuit 11 may include: an information acquisition circuit 111 , a signal generation circuit 112 and a first convolution circuit 113 . The relevant descriptions about the information acquisition circuit 111 , the signal generation circuit 112 , and the first convolution circuit 113 are as follows.

In a possible embodiment, the information acquisition circuit 111 is configured to: for a first pixel in the to-be-processed image, obtain the historical parameter information of multiple pixels in a first preset window in which the historical pixel corresponding to the first pixel in the previous frame image is located, where the first pixel is any pixel in the to-be-processed image.

The initial parameter information of the first pixel may include a first coordinate position, a first motion vector, initial brightness information and initial momentum information. The first motion vector may be the motion vector between the first coordinate position of the first pixel and the historical coordinate position of the historical pixel corresponding to the first pixel in the previous frame image. For example, if the first coordinate position of the first pixel is (0, 0) and the first motion vector is (2, 2), then the historical coordinate position of the historical pixel corresponding to the first pixel in the previous frame image is (0,0)+(2,2)=(2,2).

In addition, the first preset window is a preset convolution window used for performing temporal filtering processing on the pixels in the to-be-processed image. As the pixels in the to-be-processed image are processed in a row-by-row or column-by-column order, the first preset window slides, so that for different first pixels, the multiple pixels in the first preset window are also different.

Specifically, the information acquisition circuit 111 may determine, according to the first coordinate position of the first pixel and the first motion vector, the historical pixel corresponding to the first pixel in the previous frame image, that is, determine the historical coordinate position of the historical pixel. The information acquisition circuit 111 may then acquire, based on the historical coordinate position, the historical parameter information of the multiple pixels within the first preset window in which the historical pixel is located. Optionally, the information acquisition circuit 111 uses the historical coordinate position of the historical pixel as the center of the first preset window, and acquires the historical parameter information of the multiple pixels in the first preset window. Exemplarily, if the first preset window is a 3×3 window and the historical coordinate position is (2, 2), then the multiple pixels in the first preset window may be the pixels at coordinate positions (1,1) to (3,3) in the previous frame image.
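The lookup performed by the information acquisition circuit 111 can be illustrated with the following minimal sketch, which reproduces the numbers of the examples above. The function names `historical_position` and `window_positions` are hypothetical, and the sketch only enumerates coordinates; it does not model the third buffer 12 or any bounds checking.

```python
def historical_position(coord, motion_vector):
    """Historical coordinate = first coordinate position + first motion vector."""
    return (coord[0] + motion_vector[0], coord[1] + motion_vector[1])


def window_positions(center, size=3):
    """Coordinates covered by a size x size window centred on `center`.

    The window is listed row by row; this ordering is an assumption.
    """
    half = size // 2
    cx, cy = center
    return [(cx + dx, cy + dy)
            for dy in range(-half, half + 1)
            for dx in range(-half, half + 1)]


# First pixel at (0, 0) with motion vector (2, 2) maps to (2, 2) in the
# previous frame; the 3x3 window then spans (1, 1) to (3, 3).
hist = historical_position((0, 0), (2, 2))
window = window_positions(hist, size=3)
```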

In a possible embodiment, the signal generation circuit 112 is configured to generate a control signal according to the geometric coordinate information of the multiple pixels. Optionally, the signal generation circuit 112 is specifically configured to: generate a valid control signal when the geometric coordinate information of at least one pixel in a preset area among the multiple pixels in the first preset window satisfies a specified coordinate range, where, for example, the valid control signal may be a high level and the specified coordinate range is related to the geometric coordinate information of the first pixel; and generate an invalid control signal when the geometric coordinate information of none of the multiple pixels in the first preset window satisfies the specified coordinate range, where, for example, the invalid control signal may be a low level.

The first preset window may be divided into multiple preset areas, and the shapes and sizes of the multiple preset areas may be different. For example, as shown in FIG. 6, the first preset window is a 3×3 window, and the multiple preset areas may include a first preset area and a second preset area, where the first preset area may be a "field"-shaped ("Tian"-shaped, 田) area and the second preset area may be an "L"-shaped area.

In addition, the geometric coordinate information of each of the multiple pixels may include a historical normal, a historical depth, and a historical coordinate position; the geometric coordinate information of the first pixel may include a first normal, a first depth, and a first coordinate position. Optionally, the above-mentioned specified coordinate range may specifically include: the historical coordinate position is within the length and width of the previous frame image, the difference between the first depth and the historical depth is less than a preset ratio, and the difference between the first normal and the historical normal is less than a preset difference.

Specifically, the multiple preset areas included in the first preset window may be configured with priorities, and the signal generation circuit 112 may process them in order of priority from high to low. If the geometric coordinate information of at least one pixel in a given preset area satisfies the specified coordinate range, for example, the historical coordinate position of the at least one pixel is within the length and width of the previous frame image, the difference between the first depth and the historical depth of the at least one pixel is less than the preset ratio, and the difference between the first normal and the historical normal of the at least one pixel is less than the preset difference, then the signal generation circuit 112 generates a valid control signal. If none of the pixels in a preset area satisfies the specified coordinate range, for example, the historical coordinate positions of all the pixels in the preset area are outside the length and width of the previous frame image, or the difference between the first depth and the historical depth of every pixel is greater than or equal to the preset ratio, or the difference between the first normal and the historical normal of every pixel is greater than or equal to the preset difference, the signal generation circuit 112 may repeat the above operations on the pixels in the preset area of the next priority. If none of the pixels in any of the preset areas included in the first preset window satisfies the specified coordinate range, the signal generation circuit 112 may generate an invalid control signal. For convenience of description, a preset area in which the geometric coordinate information of at least one pixel satisfies the specified coordinate range is referred to as the target preset area.
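The priority-ordered check described above can be sketched in software as follows. This is only an illustration under stated assumptions: the `Geometry` class, the function names, the relative depth comparison, the use of one minus the dot product as the normal difference, and the default thresholds of 0.1 are all hypothetical, since the application does not define how the differences are measured. The order of the `regions` list is taken as the priority order.

```python
from dataclasses import dataclass


@dataclass
class Geometry:
    x: int
    y: int
    depth: float
    normal: tuple  # assumed to be a unit vector (nx, ny, nz)


def in_specified_range(cur, hist, width, height, depth_ratio, normal_diff):
    """Check one historical pixel against the first pixel (assumed metrics)."""
    inside = 0 <= hist.x < width and 0 <= hist.y < height
    depth_ok = abs(cur.depth - hist.depth) < depth_ratio * abs(cur.depth)
    dot = sum(a * b for a, b in zip(cur.normal, hist.normal))
    normal_ok = (1.0 - dot) < normal_diff
    return inside and depth_ok and normal_ok


def control_signal(cur, regions, width, height, depth_ratio=0.1, normal_diff=0.1):
    """Return (valid, index of the target preset area or None)."""
    for index, region in enumerate(regions):  # highest priority first
        if any(in_specified_range(cur, h, width, height, depth_ratio, normal_diff)
               for h in region):
            return True, index
    return False, None


cur = Geometry(5, 5, 1.0, (0.0, 0.0, 1.0))
# First area: every historical pixel lies outside the previous frame.
field_area = [Geometry(-1, 0, 1.0, (0.0, 0.0, 1.0)),
              Geometry(0, -1, 1.0, (0.0, 0.0, 1.0))]
# Second area: one pixel with matching depth and normal.
l_area = [Geometry(4, 4, 1.05, (0.0, 0.0, 1.0))]

result = control_signal(cur, [field_area, l_area], 1920, 1080)
no_match = control_signal(cur, [field_area], 1920, 1080)
```

In this example the first area fails, so the check falls through to the second area, which becomes the target preset area.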

In a possible embodiment, the first convolution circuit 113 is configured to: when the control signal is valid, determine the first brightness information and the first momentum information of the first pixel in the first image according to the historical brightness information and the historical momentum information of the pixels in the target preset area, respectively; when the control signal is invalid, determine the first brightness information and the first momentum information of the first pixel in the first image according to preset brightness information and preset momentum information, respectively; and determine the first variance according to the first momentum information.

Optionally, the first convolution circuit 113 may include a first luminance convolution circuit, a first momentum convolution circuit, and a first variance calculation circuit. The first luminance convolution circuit is configured to: when the control signal is valid, perform a convolution operation based on the historical brightness information of the pixels in the target preset area to obtain the first brightness information of the first pixel; when the control signal is invalid, determine the preset brightness information as the first brightness information of the first pixel, where, for example, the preset brightness information may be 0; and determine the brightness information obtained by alpha blending the first brightness information with the historical brightness information of the pixels in the target preset area as the first brightness information of the first pixel in the first image. The first momentum convolution circuit is configured to: when the control signal is valid, perform a convolution operation based on the historical momentum information of the pixels in the target preset area to obtain the first momentum information of the first pixel; when the control signal is invalid, determine the preset momentum information as the first momentum information of the first pixel, where, for example, the preset momentum information may be 0; and determine the momentum information obtained by alpha blending the first momentum information with the historical momentum information of the pixels in the target preset area as the first momentum information of the first pixel in the first image. The first variance calculation circuit is configured to determine the first variance of the first pixel in the first image according to the first momentum information.

Exemplarily, suppose the control signal is valid, the target preset area is a "field"-shaped preset area including four pixels, and each of the four pixels corresponds to a first convolution weight. The first luminance convolution circuit can then perform a convolution operation according to the first convolution weights of the four pixels and the historical brightness information of the four pixels, and alpha blend the brightness information obtained by the convolution operation with the historical brightness information of the four pixels to obtain the first brightness information of the first pixel in the first image. The first momentum convolution circuit can perform a convolution operation according to the first convolution weights of the four pixels and the historical momentum information of the four pixels, and alpha blend the momentum information obtained by the convolution operation with the historical momentum information of the four pixels to obtain the first momentum information of the first pixel in the first image. The first variance calculation circuit may determine the first variance according to the obtained first momentum information. For example, if the first momentum information includes the momentum values of the red (R) channel and the green (G) channel, the first variance calculation circuit determines the first variance based on the momentum values of the R channel and the G channel.
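The weighted convolution over the four pixels and the derivation of a variance from two momentum values can be sketched as follows. Two points are assumptions rather than statements of the application: the convolution result is normalized by the sum of the weights, and the R and G channels are taken to hold the first and second moments of the luminance, so that the variance is the second moment minus the square of the first, clamped at zero. The latter follows common SVGF software implementations; the application only says that the variance is determined from the R and G channel values. Alpha blending is left out because the text does not give a blend factor.

```python
def weighted_convolution(values, weights):
    """Weighted sum of `values`, normalized by the sum of `weights` (assumed)."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(v * w for v, w in zip(values, weights)) / total


def variance_from_moments(first_moment, second_moment):
    """Assumed relation: Var = E[x^2] - (E[x])^2, clamped to be non-negative."""
    return max(0.0, second_moment - first_moment * first_moment)


# Four pixels of a "field"-shaped area with equal weights (illustrative values).
weights = [1.0, 1.0, 1.0, 1.0]
hist_luma = [1.0, 2.0, 3.0, 4.0]
hist_m1 = [1.0, 2.0, 3.0, 4.0]     # assumed R channel: first moment
hist_m2 = [1.0, 4.0, 9.0, 16.0]    # assumed G channel: second moment

luma = weighted_convolution(hist_luma, weights)   # 2.5
m1 = weighted_convolution(hist_m1, weights)       # 2.5
m2 = weighted_convolution(hist_m2, weights)       # 7.5
variance = variance_from_moments(m1, m2)          # 7.5 - 6.25 = 1.25
```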

Further, the first convolution weights corresponding to the pixels in different preset areas among the multiple preset areas included in the first preset window may be different. The first convolution weights may be configured or may be obtained through calculation, and the value of a first convolution weight may be greater than or equal to 0 and less than or equal to 1. For example, the multiple preset areas include a "field"-shaped ("Tian"-shaped) preset area and an "L"-shaped preset area; the first convolution weights of the pixels in the "field"-shaped preset area are obtained by calculation, and the first convolution weights of the pixels in the "L"-shaped preset area are configured (for example, the first convolution weight of each pixel may be configured as 1).

Optionally, the time domain calculation circuit 11 may further include a first weight calculation circuit 114. The first weight calculation circuit 114 is configured to: determine the normal weight, depth weight and brightness weight of each pixel in the target preset area according to the geometric coordinate information of that pixel; and determine the first convolution weight of each pixel in the target preset area according to the normal weight, depth weight and brightness weight of that pixel.

Specifically, for each pixel in the target preset area, the geometric coordinate information of the pixel may include a historical coordinate position, a historical normal and a historical depth, and the parameter information of the pixel further includes historical brightness information. The first weight calculation circuit 114 may then determine the normal weight of the pixel according to the differences in historical coordinate position and historical normal between the pixel and the historical pixel corresponding to the first pixel, determine the depth weight of the pixel according to the differences in historical coordinate position and historical depth, and determine the brightness weight of the pixel according to the differences in historical coordinate position and historical brightness information. After that, the first weight calculation circuit 114 may multiply the normal weight, the depth weight and the brightness weight of the pixel by their respective preset proportions, and determine the sum of the three products as the first convolution weight of the pixel. It should be noted that the preset proportions may be set in advance, and the respective preset proportions of the normal weight, depth weight, and brightness weight may be the same or different, which is not specifically limited in this embodiment of the present application.
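The final combination step, in which the three weights are scaled by their preset proportions and summed, can be written as a one-line sketch. The function name and the example proportions are hypothetical. The application does not specify how the individual normal, depth and brightness weights are derived from the differences, so they are simply taken as inputs here.

```python
def first_convolution_weight(normal_w, depth_w, luma_w,
                             proportions=(1 / 3, 1 / 3, 1 / 3)):
    """Sum of the three weights, each scaled by its preset proportion.

    If each weight lies in [0, 1] and the proportions sum to 1, the result
    also lies in [0, 1], matching the range stated for the first
    convolution weight. The default proportions are an assumption.
    """
    p_normal, p_depth, p_luma = proportions
    return normal_w * p_normal + depth_w * p_depth + luma_w * p_luma


# Illustrative values: 1.0 * 0.5 + 0.5 * 0.25 + 0.0 * 0.25 = 0.625
weight = first_convolution_weight(1.0, 0.5, 0.0, proportions=(0.5, 0.25, 0.25))
```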

Further, the time domain calculation circuit 11 may further include: an update circuit 115 . The update circuit 115 is used for: when the control signal is valid, the value obtained by adding one to the historical calculation times of the historical pixel point is determined as the historical calculation times of the first pixel point, for example, the historical calculation times of the historical pixel point is 0, the historical calculation times of the first pixel is 0+1=1; when the control signal is invalid, it is determined that the historical calculation times of the first pixel is 0.
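The behaviour of the update circuit 115 amounts to a counter that is incremented while the history remains valid and reset otherwise. A minimal sketch, with a hypothetical function name:

```python
def update_history_count(control_valid, historical_count):
    """Number of historical calculations for the first pixel.

    Valid control signal: historical pixel's count plus one.
    Invalid control signal: the count restarts at 0.
    """
    return historical_count + 1 if control_valid else 0


count_valid = update_history_count(True, 0)     # 0 + 1 = 1, as in the example
count_invalid = update_history_count(False, 5)  # history discarded
```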

Exemplarily, FIG. 7 is a schematic structural diagram of a TPE 1 provided in an embodiment of the present application. The TPE 1 includes a time domain calculation circuit 11 and a third buffer 12, where the third buffer 12 can be used to buffer the historical parameter information of the multiple pixels in the first preset window. The time domain calculation circuit 11 includes an information acquisition circuit 111, a signal generation circuit 112, a first convolution circuit 113, a first weight calculation circuit 114 and an update circuit 115, and the first convolution circuit 113 includes a first luminance convolution circuit, a first momentum convolution circuit and a first variance calculation circuit. Specifically, for a first pixel in the to-be-processed image, the temporal filtering process may include the following: the information acquisition circuit 111 acquires the historical parameter information of the multiple pixels in the first preset window in the previous frame image; the signal generation circuit 112 generates a control signal according to the geometric coordinate information of the multiple pixels; the first weight calculation circuit 114 determines the first convolution weights (preconfigured or obtained by calculation); the first luminance convolution circuit determines the first brightness information of the first pixel in the first image according to the first convolution weight and historical brightness information of each pixel in the target preset area; the first momentum convolution circuit determines the first momentum information of the first pixel in the first image according to the first convolution weight and historical momentum information of each pixel in the target preset area; the first variance calculation circuit determines the first variance according to the first momentum information; and the update circuit 115 updates the number of historical calculations of the first pixel. For a detailed description of each of the foregoing circuits, reference may be made to the foregoing description, and details are not described again in this embodiment of the present application.

After the time domain calculation circuit 11 completes the temporal filtering processing of the first pixel in the to-be-processed image according to the above method, the time domain calculation circuit 11 can send the first parameter information of the first pixel to the MPE 2, and the first buffer 21 in the MPE 2 can buffer the first parameter information of the first pixel. For example, the first parameter information of the first pixel may include geometric coordinate information, first brightness information, first momentum information, a first variance, the number of historical calculations, and the like.

In this embodiment of the present application, as shown in FIG. 8 , the momentum calculation circuit 22 may include: a second convolution circuit 221 and a selection circuit 222 . The relevant descriptions about the second convolution circuit 221 and the selection circuit 222 are as follows.

In a possible embodiment, for a first pixel in the first image, where the first pixel is any pixel in the first image, the second convolution circuit 221 is configured to: perform a convolution operation on the first brightness information of multiple pixels in a second preset window in which the first pixel is located, to obtain convolution brightness information of the first pixel; perform a convolution operation on the first momentum information of the multiple pixels in the second preset window, to obtain convolution momentum information of the first pixel; and determine a convolution variance of the first pixel according to the convolution momentum information.

The second preset window is a preset convolution window used to perform filter momentum processing on the pixels in the first image; for example, the second preset window may be a 7×7 window. As the pixels in the first image are processed in a row-by-row or column-by-column order, the second preset window slides, so that for different first pixels, the multiple pixels in the second preset window are also different.

Optionally, the second convolution circuit 221 may include: a second luminance convolution circuit, a second momentum convolution circuit, and a second variance calculation circuit. The second luminance convolution circuit is configured to perform a convolution operation according to the convolution weights of the plurality of pixels in the second preset window and the first brightness information of the plurality of pixels, so as to obtain the convolution brightness information of the first pixel. The second momentum convolution circuit is configured to perform a convolution operation according to the convolution weights of the plurality of pixels in the second preset window and the first momentum information of the plurality of pixels, so as to obtain the convolution momentum information of the first pixel. The second variance calculation circuit is configured to determine the convolution variance of the first pixel according to the convolution momentum information.

Exemplarily, if the second preset window includes 49 pixels and each of the 49 pixels corresponds to a convolution weight, the second luminance convolution circuit can perform a convolution operation on the convolution weights of the 49 pixels and the first brightness information of the 49 pixels to obtain the convolution brightness information of the first pixel; the second momentum convolution circuit can perform a convolution operation on the convolution weights of the 49 pixels and the first momentum information of the 49 pixels to obtain the convolution momentum information of the first pixel; and the second variance calculation circuit can determine the convolution variance of the first pixel according to the convolution momentum information. For example, if the convolution momentum information includes the momentum values of the R channel and the G channel, the second variance calculation circuit determines the convolution variance of the first pixel according to the momentum values of the R channel and the G channel.
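The following Python sketch illustrates the 49-pixel example above in software form. It is only an illustration under stated assumptions, not the circuit itself: the function names are hypothetical, the normalization by the sum of weights is an assumption, and treating the R and G channels as the first and second moments (so that the variance is the second moment minus the squared first moment, clamped at zero) follows common SVGF practice rather than anything stated in this application.

```python
def convolve_window(values, weights):
    """Weighted average of per-pixel values over one window (flattened lists).

    Assumption: the result is normalized by the sum of the weights.
    """
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def convolution_variance(first_moment, second_moment):
    """Assumed variance estimate: E[x^2] - E[x]^2, clamped at zero."""
    return max(0.0, second_moment - first_moment * first_moment)


# A 7x7 window holds 49 pixels; uniform weights are used here for simplicity.
window_pixels = 49
weights = [1.0] * window_pixels
first_brightness = [0.5] * window_pixels
moment_r = [0.5] * window_pixels   # assumed first moment (R channel)
moment_g = [0.35] * window_pixels  # assumed second moment (G channel)

conv_brightness = convolve_window(first_brightness, weights)
conv_moment_r = convolve_window(moment_r, weights)
conv_moment_g = convolve_window(moment_g, weights)
conv_variance = convolution_variance(conv_moment_r, conv_moment_g)
```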

Further, the momentum calculation circuit 22 may further include a second weight calculation circuit 223. The second weight calculation circuit 223 is configured to: determine the normal weight, depth weight and brightness weight of each pixel in the second preset window according to the geometric coordinate information of the pixel; and determine the second convolution weight of the pixel according to its normal weight, depth weight and brightness weight.

Specifically, for each pixel in the second preset window, the geometric coordinate information of the pixel may include a first coordinate position, a first normal direction and a first depth, and the pixel also has first brightness information. If the first coordinate position is within the length and width range of the first image, the second weight calculation circuit can be used to: determine the normal weight of the pixel according to the difference in the first normal direction between the pixel and the first pixel; determine the depth weight of the pixel according to the difference in the first depth between the pixel and the first pixel; determine the brightness weight of the pixel according to the difference in the first brightness information between the pixel and the first pixel; and multiply the normal weight, depth weight and brightness weight of the pixel by their respective preset proportions, and take the sum of the three products as the second convolution weight of the pixel. It should be noted that the proportions may be set in advance, and the respective preset proportions of the normal weight, depth weight and brightness weight may be the same or different, which is not specifically limited in this embodiment of the present application. In addition, for a pixel whose first coordinate position is not within the length and width range of the first image, the second convolution weight of the pixel may be 0.
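A minimal Python sketch of this weight calculation is given below. The application does not specify how each individual weight is derived from the corresponding difference, so the clamped dot product for the normal weight and the exponential falloff for the depth and brightness weights are assumptions chosen for illustration; the function name, the dictionary keys and the equal default proportions are likewise hypothetical.

```python
import math


def second_convolution_weight(center, neighbor, width, height,
                              proportions=(1 / 3, 1 / 3, 1 / 3)):
    """Blend normal, depth and brightness weights into one convolution weight.

    `center` and `neighbor` are dicts with keys "pos", "normal", "depth"
    and "brightness" (hypothetical layout).
    """
    x, y = neighbor["pos"]
    # A pixel outside the image contributes nothing.
    if not (0 <= x < width and 0 <= y < height):
        return 0.0

    # Assumed forms: similar normals, depths and brightness give weights near 1.
    dot = sum(a * b for a, b in zip(center["normal"], neighbor["normal"]))
    normal_weight = max(0.0, dot)
    depth_weight = math.exp(-abs(center["depth"] - neighbor["depth"]))
    brightness_weight = math.exp(
        -abs(center["brightness"] - neighbor["brightness"]))

    # Multiply each weight by its preset proportion and sum the products.
    p_normal, p_depth, p_brightness = proportions
    return (p_normal * normal_weight
            + p_depth * depth_weight
            + p_brightness * brightness_weight)


center = {"pos": (10, 10), "normal": (0.0, 0.0, 1.0),
          "depth": 2.0, "brightness": 0.5}
same_surface = dict(center, pos=(11, 10))
outside = dict(center, pos=(-1, 10))

weight_inside = second_convolution_weight(center, same_surface, 64, 64)
weight_outside = second_convolution_weight(center, outside, 64, 64)
```

With identical normal, depth and brightness, the in-range neighbor receives a weight of about 1, while the out-of-range neighbor receives 0.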

In a possible embodiment, for a first pixel in the first image, where the first pixel is any pixel in the first image, the selection circuit 222 is configured to: select the first brightness information, the first momentum information and the first variance as, respectively, the second brightness information, the second momentum information and the second variance of the first pixel in the second image; or select the convolution brightness information, the convolution momentum information and the convolution variance as, respectively, the second brightness information, the second momentum information and the second variance of the first pixel in the second image.

Specifically, the first parameter information of the first pixel further includes the number of historical calculations, and the selection circuit 222 is specifically configured to: when the number of historical calculations of the first pixel is greater than or equal to a preset number, select the first brightness information, the first momentum information and the first variance as, respectively, the second brightness information, the second momentum information and the second variance of the first pixel in the second image; and when the number of historical calculations of the first pixel is less than the preset number, select the convolution brightness information, the convolution momentum information and the convolution variance as, respectively, the second brightness information, the second momentum information and the second variance of the first pixel in the second image. It should be noted that the preset number may be set in advance; for example, it may be 3 or 4, which is not specifically limited in this embodiment of the present application.
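The selection rule can be summarized by the following Python sketch. The function name and the tuple layout are hypothetical, and the default preset number of 4 is just one of the example values mentioned above.

```python
def select_second_parameters(first_params, convolved_params,
                             history_count, preset_count=4):
    """Pick the (brightness, momentum, variance) triple passed on to the SPE.

    Pixels with enough temporal history keep their first parameters; pixels
    with too little history fall back to the spatially convolved values.
    """
    if history_count >= preset_count:
        return first_params
    return convolved_params


first_params = (0.50, (0.50, 0.30), 0.05)      # temporally filtered values
convolved_params = (0.48, (0.48, 0.28), 0.05)  # values from the 7x7 convolution

stable_pixel = select_second_parameters(first_params, convolved_params, 6)
new_pixel = select_second_parameters(first_params, convolved_params, 1)
```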

Exemplarily, FIG. 9 is a schematic structural diagram of an MPE 2 provided in an embodiment of the present application. The MPE 2 includes a first buffer 21 and a momentum calculation circuit 22, and the first buffer 21 is used to buffer the first parameter information of the pixels. The momentum calculation circuit 22 includes a second convolution circuit 221, a selection circuit 222 and a second weight calculation circuit 223, and the second convolution circuit 221 includes a second luminance convolution circuit, a second momentum convolution circuit and a second variance calculation circuit. Specifically, for the first pixel in the first image, the filtering momentum processing may include: when the first parameter information of the multiple pixels in the second preset window has been acquired, the second weight calculation circuit 223 determines the second convolution weight of each of the multiple pixels in the second preset window; the second luminance convolution circuit performs a convolution operation according to the second convolution weights and the first brightness information of the multiple pixels in the second preset window to obtain the convolution brightness information of the first pixel; the second momentum convolution circuit performs a convolution operation according to the second convolution weights and the first momentum information of the multiple pixels in the second preset window to obtain the convolution momentum information of the first pixel; the second variance calculation circuit determines the convolution variance according to the convolution momentum information; and the selection circuit 222 selects, according to the number of historical calculations of the first pixel, either the first brightness information, the first momentum information and the first variance, or the convolution brightness information, the convolution momentum information and the convolution variance, as the second brightness information, the second momentum information and the second variance, respectively, of the first pixel in the second image. For a detailed description of each of the foregoing circuits, reference may be made to the foregoing description, and details are not described herein again.

After the momentum calculation circuit 22 completes the filtering momentum processing of the first pixel in the first image according to the above method, the momentum calculation circuit 22 can send the second parameter information of the first pixel to the SPE 3, and the second buffer 31 in the SPE 3 can buffer the second parameter information of the first pixel. For example, the second parameter information of the first pixel may include geometric coordinate information, second brightness information, second momentum information, a second variance, the number of historical calculations, and the like.

In the embodiment of the present application, when the spatial domain calculation circuit 32 performs spatial filtering processing on the second image according to the second parameter information, it can use a wavelet transform. That is, the spatial domain calculation circuit 32 can perform multiple spatial filtering passes on the same pixel of the second image, and the size of the wavelet transform window used in each pass is obtained according to the wavelet transform. For example, as shown in FIG. 10, if the number of wavelet transform steps is 3 and the initial window is 3×3, the spatial domain calculation circuit 32 needs to perform four spatial filtering passes on the same pixel in the second image: the wavelet transform window in the first pass is 3×3, and the wavelet transform windows in the second to fourth passes (which can also be called dilated convolution windows at this point) are 5×5, 7×7 and 9×9, respectively.
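The window sizes in this example can be reproduced with the short Python sketch below. It assumes, based only on the FIG. 10 example, that each wavelet transform step enlarges the window by 2 pixels per side pair and that the initial window counts as an additional pass; the function name is hypothetical.

```python
def wavelet_window_sizes(initial_size=3, steps=3):
    """Window edge length for each spatial filtering pass.

    Assumption: the initial window is used once, and each of the `steps`
    wavelet transform steps adds 2 to the edge length.
    """
    return [initial_size + 2 * k for k in range(steps + 1)]


sizes = wavelet_window_sizes(initial_size=3, steps=3)
print(sizes)  # [3, 5, 7, 9]
```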

As shown in FIG. 11 , the spatial domain calculation circuit 32 may include: a convolution conversion circuit 321 , a third weight calculation circuit 322 and a third convolution circuit 323 . When using different wavelet transform windows for spatial filtering, the convolution conversion circuit 321 , the third weight calculation circuit 322 and the third convolution circuit 323 can all be used to perform the corresponding functions below.

In a possible embodiment, the convolution conversion circuit 321 is configured to extract the second parameter information of multiple effective pixels in each of the multiple wavelet transform windows of a first pixel in the second image. The multiple wavelet transform windows are obtained through a wavelet transform with a preset number of steps, the second parameter information may include second brightness information and a second variance, and the first pixel may be any pixel in the second image. The first pixel may be located at the center position of the wavelet transform window, and the multiple effective pixels may include the pixels located at the center position of the wavelet transform window, at its four vertices, and at the center positions of its four sides. For example, as shown in FIG. 10, when the wavelet transform window is 3×3, the convolution conversion circuit 321 extracts the second parameter information of the multiple pixels (i.e., the effective pixels) in the wavelet transform window, and in this case the second parameter information can be the second parameter information output by the MPE 2. When the wavelet transform window is 5×5, 7×7 or 9×9, the convolution conversion circuit 321 extracts the second parameter information of the pixels (i.e., the effective pixels) located at the center position, the four vertices and the center positions of the four sides of the wavelet transform window, and in this case the second parameter information is the second parameter information obtained after the previous spatial filtering pass. In FIG. 8 , the first pixel is represented by a black box, and the effective pixels are represented by boxes filled with dots.
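The positions of the effective pixels can be sketched in Python as offsets relative to the first pixel at the window center. The function name is hypothetical; the nine positions (center, four vertices, four side centers) follow the description above.

```python
def effective_pixel_offsets(window_size):
    """(row, column) offsets of the nine effective pixels in one window.

    The offsets are measured from the first pixel at the window center and
    cover the center, the four vertices and the centers of the four sides.
    """
    half = window_size // 2
    return [(dy * half, dx * half)
            for dy in (-1, 0, 1)
            for dx in (-1, 0, 1)]


offsets_3x3 = effective_pixel_offsets(3)  # every pixel of the 3x3 window
offsets_5x5 = effective_pixel_offsets(5)  # 9 of the 25 pixels
```

For a 3×3 window this yields all nine pixels, while for larger windows only nine sparse positions are read, which is why the number of convolution taps stays constant as the window grows.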

In a possible embodiment, the third weight calculation circuit 322 is configured to: for each of the multiple wavelet transform windows, determine the third convolution weights of the multiple effective pixels according to the second variances of the multiple effective pixels in the wavelet transform window. Specifically, for each of the multiple effective pixels, the third weight calculation circuit 322 is specifically configured to: determine the depth weight, normal weight and brightness weight according to the second variance, the second brightness information and the geometric coordinate information of the effective pixel; and determine the third convolution weight of the effective pixel according to its normal weight, depth weight and brightness weight. Exemplarily, for a 3×3 wavelet transform window, the third weight calculation circuit 322 may be used to: perform Gaussian blur processing on the second variances of the 9 effective pixels in the wavelet transform window to obtain a blurred variance; and, for each of the 9 effective pixels, determine the depth weight, normal weight and brightness weight according to the blurred variance and the second brightness information and geometric coordinate information of the effective pixel, and then determine the third convolution weight.
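As a rough illustration of the Gaussian blur applied to the nine second variances, the Python sketch below uses a 3×3 binomial kernel. The kernel coefficients are an assumption (the application does not give them), and the function and constant names are hypothetical.

```python
# Assumed 3x3 Gaussian (binomial) kernel; its coefficients sum to 16.
GAUSS_3X3 = [
    [1, 2, 1],
    [2, 4, 2],
    [1, 2, 1],
]


def blurred_variance(second_variances):
    """Gaussian-weighted average of a 3x3 grid of second variances."""
    total = sum(GAUSS_3X3[r][c] * second_variances[r][c]
                for r in range(3)
                for c in range(3))
    return total / 16.0


flat = [[2.0] * 3 for _ in range(3)]
spike = [[0.0, 0.0, 0.0],
         [0.0, 16.0, 0.0],
         [0.0, 0.0, 0.0]]

flat_result = blurred_variance(flat)    # a constant field is unchanged
spike_result = blurred_variance(spike)  # an isolated spike is attenuated
```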

In a possible embodiment, the third convolution circuit 323 is configured to: for each of the multiple wavelet transform windows, determine the third brightness information and the third variance of the first pixel according to the third convolution weights, the second brightness information and the second variances of the multiple effective pixels in the wavelet transform window; and determine the target brightness information and the target variance of the first pixel in the target image according to the third brightness information and the third variances corresponding to the multiple wavelet transform windows.

Optionally, the third convolution circuit 323 may include a third luminance convolution circuit and a third variance calculation circuit. The third luminance convolution circuit is used to: for each of the multiple wavelet transform windows, perform a convolution operation according to the third convolution weights and the second brightness information of the multiple effective pixels in the wavelet transform window to obtain the third brightness information of the first pixel, so that, for the multiple wavelet transform windows, the first pixel corresponds to multiple third convolution weights and multiple pieces of third brightness information; and determine the target brightness information of the first pixel according to the multiple third convolution weights and the multiple pieces of third brightness information. For example, the target brightness information of the first pixel may be the ratio of the sum of the multiple pieces of third brightness information to the sum of the multiple third convolution weights.
In addition, the third variance calculation circuit is configured to: for each of the multiple wavelet transform windows, determine the squares of the third convolution weights of the multiple effective pixels in the wavelet transform window, and perform a convolution operation on the weight squares and the second variances of the multiple effective pixels to obtain the third variance of the first pixel, so that, for the multiple wavelet transform windows, the first pixel corresponds to multiple third convolution weights and multiple third variances; and determine the target variance of the first pixel in the target image according to the multiple third convolution weights and the multiple third variances of the first pixel corresponding to the multiple wavelet transform windows. For example, the target variance of the first pixel may be the ratio of the sum of the multiple third variances to the sum of the multiple third convolution weights.
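The accumulation over windows described for the third convolution circuit can be sketched in Python as follows. This is an interpretation under assumptions: the per-window "third convolution weight" is taken to be the sum of the effective pixel weights in that window, the per-window brightness and variance are left unnormalized until the final ratio, and all function names are hypothetical.

```python
def spatial_pass(weights, brightness, variances):
    """Process one wavelet transform window.

    Returns (weight_sum, third_brightness, third_variance), where the
    variance term uses the squared weights as described above.
    """
    weight_sum = sum(weights)
    third_brightness = sum(w * b for w, b in zip(weights, brightness))
    third_variance = sum(w * w * v for w, v in zip(weights, variances))
    return weight_sum, third_brightness, third_variance


def combine_passes(passes):
    """Ratio of summed brightness (or variance) to summed weights."""
    total_weight = sum(p[0] for p in passes)
    if total_weight == 0:
        return 0.0, 0.0
    target_brightness = sum(p[1] for p in passes) / total_weight
    target_variance = sum(p[2] for p in passes) / total_weight
    return target_brightness, target_variance


# Two windows with nine effective pixels each and uniform unit weights.
one_window = spatial_pass([1.0] * 9, [0.5] * 9, [0.2] * 9)
target_brightness, target_variance = combine_passes([one_window, one_window])
```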

Exemplarily, FIG. 12 is a schematic structural diagram of an SPE 3 provided in this embodiment of the present application. The SPE 3 includes a second buffer 31 and a spatial domain calculation circuit 32, and the second buffer 31 is used for buffering the second parameter information of the pixels. The spatial domain calculation circuit 32 includes a convolution conversion circuit 321, a third weight calculation circuit 322 and a third convolution circuit 323, and the third convolution circuit 323 includes a third luminance convolution circuit and a third variance calculation circuit. Specifically, for the first pixel in the second image, the spatial filtering process may include: the convolution conversion circuit 321 extracts the second parameter information of the multiple effective pixels in each of the multiple wavelet transform windows of the first pixel; the third weight calculation circuit 322 determines the third convolution weights of the multiple effective pixels in each wavelet transform window; the third luminance convolution circuit determines the target brightness information of the first pixel in the target image; and the third variance calculation circuit determines the target variance of the first pixel in the target image. For a detailed description of each of the foregoing circuits, reference may be made to the foregoing description, and details are not described herein again.

Further, the above-mentioned first buffer 11, second buffer 21 and third buffer 31 may be collectively referred to as the local buffer of the SVGF accelerator. The buffer shared by the SM and the SVGF accelerator in the GPU may be referred to as a secondary buffer, and a memory coupled to the GPU in the image processing device where the GPU is located may be referred to as a secondary memory. When the SVGF accelerator performs SVGF processing, the different parameter information of the above-mentioned pixels may be stored in the local buffer, or may be stored in the secondary buffer or the secondary memory.

In a possible embodiment, FIG. 13 shows a storage method provided in this embodiment of the present application, in which a minimum local buffer can be implemented. Specifically, the local buffer stores: 1). The first parameter information of a first preset number of pixels in the first image, required when the MPE 2 performs filtering momentum processing on the pixels in the first image. The first preset number can be the square of Size_Window2, where Size_Window2 represents the size of the second preset window, and the first parameter information of each pixel can include first brightness information, first momentum information, a first variance and the number of historical calculations (optionally, the first parameter information may also include geometric coordinate information of each pixel, such as normal and depth). 2). The second parameter information of a second preset number of pixels in the second image, required when the SPE 3 performs spatial filtering processing on the pixels in the second image. The second preset number can be the square of (Size_Window3+(Step-1)×2), where Size_Window3 represents the size of the initial convolution window of the wavelet transform and Step represents the number of wavelet transform steps, and the second parameter information may include second brightness information and a second variance (optionally, the second parameter information may also include geometric coordinate information of each pixel, such as normal and depth).

In this case, the secondary buffer or the secondary memory may store: 1). The historical parameter information of each pixel in the previous frame image, which may include geometric coordinate information, historical brightness information, historical momentum information and the number of historical calculations, as well as the geometric coordinate information and motion vector of each pixel in the image to be processed. 2). The first parameter information, output by the TPE 1, of a third preset number of pixels in the first image required for processing by the MPE 2. The third preset number may be [Width×(Size_Window2×Step-1)+Size_Window2], where Width represents the width of the image to be processed, and the first parameter information of each pixel may include first brightness information, first momentum information, a first variance and the number of historical calculations. 3). The second parameter information, output by the MPE 2, of a fourth preset number of pixels in the second image required for processing by the SPE 3. The fourth preset number may be [Width×(Size_Window3×Step-1)+Size_Window3×Step], and the second parameter information may include second brightness information and a second variance (optionally, the second parameter information may also include geometric coordinate information of each pixel, such as normal and depth).
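To make the four preset numbers concrete, the Python sketch below evaluates the expressions given for this storage method. The formulas are transcribed as written; the function name, the dictionary keys and the example values (a 1920-pixel-wide image, a 7×7 second preset window, a 3×3 initial wavelet window and 3 steps) are illustrative assumptions only.

```python
def storage_pixel_counts(width, size_window2, size_window3, step):
    """Number of pixels whose parameter information each store must hold."""
    return {
        # Local buffer, MPE side: one full second preset window.
        "local_mpe": size_window2 ** 2,
        # Local buffer, SPE side: the largest wavelet transform window.
        "local_spe": (size_window3 + (step - 1) * 2) ** 2,
        # Secondary buffer or memory: TPE output waiting for the MPE.
        "secondary_mpe": width * (size_window2 * step - 1) + size_window2,
        # Secondary buffer or memory: MPE output waiting for the SPE.
        "secondary_spe": (width * (size_window3 * step - 1)
                          + size_window3 * step),
    }


counts = storage_pixel_counts(width=1920, size_window2=7,
                              size_window3=3, step=3)
```

With these example values, each local store holds 49 pixels, while the secondary stores hold 38407 and 15369 pixels respectively, which shows how much smaller the local buffer can be in this storage method.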

In the above storage mode, when the MPE 2 and the SPE 3 process pixel by pixel, the local buffer and the secondary memory update the parameter information of the pixels in a similar way. The processing of the pixels of the first image by the MPE 2 is taken as an example below. Specifically, every time the second preset window slides by one pixel, the MPE 2 needs to update the first parameter information of the first column of pixels on the right side of the second preset window. At this point, the first parameter information of the last pixel in that column can be obtained from the TPE 1, and the first parameter information of the remaining pixels in that column can be obtained from the secondary buffer or the secondary memory. The local buffer needs to delete the first parameter information of the column of pixels that slides out of the second preset window, and the secondary buffer or the secondary memory can delete the first parameter information of the first pixel in that column. In FIG. 13, the first column of pixels on the right side of the second preset window is represented as Col-1, the last pixel in Col-1 is represented as pix-1, the column of pixels that slides out of the second preset window is represented as Col-2, and the first pixel in Col-2 is represented as pix1.

In another possible embodiment, FIG. 14 shows another storage method provided in this embodiment of the present application, in which the minimum bandwidth requirement of the SVGF accelerator can be achieved. Specifically, the local buffer stores: 1). The first parameter information, output by the TPE 1, of a third preset number of pixels in the first image required for processing by the MPE 2. The third preset number may be [Width×(Size_Window2×Step-1)+Size_Window2], where Width represents the width of the image to be processed, and the first parameter information of each pixel may include first brightness information, first momentum information, a first variance and the number of historical calculations. 2). The second parameter information, output by the MPE 2, of a fourth preset number of pixels in the second image required for processing by the SPE 3. The fourth preset number may be [Width×(Size_Window3×Step-1)+Size_Window3×Step], and the second parameter information may include second brightness information and a second variance (optionally, the second parameter information may also include geometric coordinate information of each pixel, such as normal and depth).

In this case, the secondary buffer or the secondary memory may store the historical parameter information of each pixel in the previous frame image, which may include geometric coordinate information, historical brightness information, historical momentum information and the number of historical calculations, as well as the geometric coordinate information and motion vector of each pixel in the image to be processed.

In the above storage mode, when the MPE 2 and the SPE 3 process pixel by pixel, the local buffer updates the parameter information of the pixels in a similar way for both engines. The processing of the pixels of the first image by the MPE 2 is taken as an example below. Specifically, every time the second preset window slides by one pixel, the MPE 2 needs to update the first parameter information of the first column of pixels on the right side of the second preset window. At this point, the first parameter information of the last pixel in that column can be obtained from the TPE 1, and the first parameter information of the remaining pixels in that column is obtained from the local buffer. The local buffer may delete the first parameter information of the first pixel in the column of pixels that slides out of the second preset window. In FIG. 14, the first column of pixels on the right side of the second preset window is represented as Col-1, the last pixel in Col-1 is represented as pix-1, and the first pixel of the column of pixels that slides out of the second preset window is represented as pix1.
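The update pattern of this second storage method resembles a rolling line buffer, which the Python sketch below imitates in software. It is a simplified illustration, not the described hardware: the buffer capacity used here is Width×(window−1)+window pixels for a raster-order stream, which is an assumption and is not the same expression as the third preset number above; the function name is hypothetical.

```python
from collections import deque


def stream_windows(pixels, width, window):
    """Yield each full window as pixels arrive one at a time in raster order.

    Only the newest pixel comes from the upstream engine; every other pixel
    of the window is read from the local buffer, and the oldest pixel is
    dropped automatically when a new one is appended.
    """
    capacity = width * (window - 1) + window  # assumed minimal capacity
    local_buffer = deque(maxlen=capacity)
    for index, pixel in enumerate(pixels):
        local_buffer.append(pixel)  # evicts the oldest entry when full
        column = index % width
        # Emit only when the buffer is full and the window does not wrap
        # around the end of an image row.
        if len(local_buffer) == capacity and column >= window - 1:
            yield [[local_buffer[r * width + c] for c in range(window)]
                   for r in range(window)]


# A 4x4 image whose pixel values equal their raster index, 2x2 window.
windows = list(stream_windows(range(16), width=4, window=2))
```

Each arriving pixel costs one transfer from the upstream engine and no reads from the secondary memory, which is the property that keeps the bandwidth requirement low in this storage method.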

Under the above two storage modes, TPE 1, MPE 2 and SPE 3 can perform SVGF processing on the to-be-processed image according to the processing flow shown in FIG. 15. Specifically: S11. TPE 1 performs time domain filtering processing on the image to be processed pixel by pixel; S12. TPE 1 determines whether the processing of all pixels is completed; if completed, the TPE 1 processing ends, and if not completed, the flow returns to S11 and S13 is executed; S13. TPE 1 determines whether the first parameter information of the number of pixels required for MPE 2 processing is complete; if complete, S21 is executed, and if not, the flow returns to S11. S21. MPE 2 performs filtering momentum processing on the first image pixel by pixel; S22. MPE 2 determines whether the processing of all pixels is completed; if completed, the MPE 2 processing ends, and if not completed, the flow returns to S21 and S23 is executed; S23. MPE 2 determines whether the second parameter information of the number of pixels required for SPE 3 processing is complete; if complete, S31 is executed, and if not, the flow returns to S21. S31. SPE 3 performs spatial filtering processing on the second image pixel by pixel; S32. SPE 3 determines whether the processing of all pixels is completed; if completed, the SPE 3 processing ends, and if not completed, the flow returns to S31. Correspondingly, FIG. 16 takes the pixel-by-pixel processing of two adjacent pixels by TPE 1, MPE 2 and SPE 3 as an example, and shows the moments at which TPE 1, MPE 2 and SPE 3 are started, together with schematic diagrams of the first preset window, the second preset window and the wavelet transform window corresponding to TPE 1, MPE 2 and SPE 3 before and after sliding.
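The staggered start of the three engines in this flow can be illustrated with the small Python simulation below. It is a simplified software model under assumptions: each engine handles at most one pixel per tick, a downstream engine processes its next pixel once the upstream engine is far enough ahead (or has finished), and the lookahead values are hypothetical parameters rather than quantities defined in this application.

```python
def simulate_pipeline(n_pixels, mpe_lookahead, spe_lookahead):
    """Return per-tick (tpe_done, mpe_done, spe_done) pixel counts."""
    tpe_done = mpe_done = spe_done = 0
    log = []
    while spe_done < n_pixels:
        # S11/S12: TPE keeps filtering until every pixel is processed.
        if tpe_done < n_pixels:
            tpe_done += 1
        # S13/S21: MPE runs once TPE has produced the pixels it needs.
        if (mpe_done < n_pixels
                and tpe_done >= min(n_pixels, mpe_done + mpe_lookahead)):
            mpe_done += 1
        # S23/S31: SPE runs once MPE has produced the pixels it needs.
        if (spe_done < n_pixels
                and mpe_done >= min(n_pixels, spe_done + spe_lookahead)):
            spe_done += 1
        log.append((tpe_done, mpe_done, spe_done))
    return log


log = simulate_pipeline(n_pixels=4, mpe_lookahead=2, spe_lookahead=2)
```

In this toy run the MPE starts one tick after the TPE and the SPE one tick after the MPE, and all three then advance together, mirroring the overlapping start moments that FIG. 16 is described as showing.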

In the SVGF accelerator provided in the embodiment of the present application, TPE 1, MPE 2 and SPE 3 each include their own buffer. When performing the corresponding time domain filtering processing, filtering momentum processing and spatial filtering processing, each engine caches in its buffer the parameter information of the pixels that need to be reused across multiple consecutive processing steps, and obtains from the previous-level processing engine only the parameter information of the most recently processed pixels. Processing is then performed on the cached parameter information together with the newly obtained parameter information, so that the data transmitted between the processing engines can be minimized, thereby greatly reducing the delay of data transmission and improving the speed of image processing. In addition, by properly configuring the buffer space in TPE 1, MPE 2 and SPE 3, an optimal buffer ratio and/or bandwidth ratio can also be achieved, thereby further optimizing the performance of the SVGF accelerator.

Based on this, an embodiment of the present application further provides an image processing device. The image processing device includes an image processor, and the image processor may include any of the SVGF accelerators provided above. For a specific description of the SVGF accelerator, refer to the relevant descriptions above, which are not repeated in this embodiment of the present application.

In the several embodiments provided in this application, it should be understood that the image processing device and the SVGF accelerator described above are only schematic. For example, the division of the modules or units is only a division by logical function, and there may be other divisions in an actual implementation; for example, multiple units or components may be combined or integrated into another device, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection of devices or units through some interfaces, and may be in electrical, mechanical or other forms.

The units described as separate components may or may not be physically separated, and the components shown as units may be one physical unit or multiple physical units, that is, they may be located in one place or distributed across multiple different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware.

Finally, it should be noted that the above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto, and any changes or replacements within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

An image processor, characterized in that the image processor includes a spatiotemporal variance-guided filtering SVGF accelerator, and the SVGF accelerator includes:

a time domain processing engine, configured to perform time domain filtering processing on the to-be-processed image according to the previous frame image of the image to be processed to obtain a first image, where the parameter information of the pixels in the first image is the first parameter information;

a momentum processing engine, comprising:

a first buffer for buffering the first parameter information;

a momentum calculation circuit, configured to perform filtering momentum processing on the first image according to the first parameter information to obtain a second image, and the parameter information of the pixels in the second image is the second parameter information;

A spatial domain processing engine, including:

a second buffer for buffering the second parameter information;

A spatial domain calculation circuit, configured to perform spatial filtering processing on the second image according to the second parameter information to obtain a target image.
The image processor according to claim 1, wherein the time domain processing engine comprises a time domain calculation circuit, and the time domain calculation circuit comprises:

The information acquisition circuit is configured to, for the first pixel in the to-be-processed image, acquire historical parameter information of a plurality of pixels in a first preset window in which the historical pixel corresponding to the first pixel in the previous frame image is located, where the historical parameter information includes geometric coordinate information, historical brightness information and historical momentum information, and the first pixel is any pixel in the image to be processed;

a signal generating circuit, configured to generate a control signal according to the geometric coordinate information of the plurality of pixel points;

a first convolution circuit, configured to, when the control signal is valid, determine first brightness information and first momentum information of the first pixel in the first image according to the historical brightness information and the historical momentum information of the plurality of pixels, respectively;

The first convolution circuit is further configured to, when the control signal is invalid, determine the first brightness information and the first momentum information of the first pixel in the first image according to preset brightness information and preset momentum information, respectively;

The first convolution circuit is further configured to determine a first variance according to the first momentum information of the first pixel point.
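As a non-limiting illustration of the variance step recited above, the following Python sketch assumes that the first momentum information carries the first and second raw moments of brightness. That convention is common in published SVGF descriptions but is not stated in the claim; the function name `variance_from_moments` and the clamping at zero are likewise assumptions made for this example.

```python
def variance_from_moments(m1: float, m2: float) -> float:
    """Estimate variance from raw moments as E[x^2] - E[x]^2.

    m1 and m2 are assumed to be the first and second raw moments of
    brightness; the result is clamped at 0 to absorb rounding error.
    """
    return max(0.0, m2 - m1 * m1)


# Hypothetical brightness samples used only to build example moments.
samples = [1.0, 2.0, 3.0]
m1 = sum(samples) / len(samples)
m2 = sum(s * s for s in samples) / len(samples)
first_variance = variance_from_moments(m1, m2)
```

Clamping keeps the estimate non-negative when the two moments have been accumulated with limited precision.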
The image processor according to claim 2, wherein the signal generating circuit is specifically used for:

A valid control signal is generated when the geometric coordinate information of at least one pixel in the preset area among the plurality of pixels in the first preset window satisfies the specified coordinate range, where the specified coordinate range is related to the geometric coordinate information of the first pixel;

An invalid control signal is generated when none of the geometric coordinate information of the plurality of pixels in the first preset window satisfies the specified coordinate range.
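As a non-limiting illustration of the control signal generation recited above, the following Python sketch assumes that the geometric coordinate information is a single depth value per pixel and that the specified coordinate range is a tolerance band around the depth of the first pixel. The names `control_signal`, `preset_area` and `tolerance` are hypothetical, and the sketch simply treats every case that is not valid as invalid.

```python
from typing import Sequence


def control_signal(window_depths: Sequence[float],
                   preset_area: Sequence[int],
                   first_pixel_depth: float,
                   tolerance: float = 0.1) -> bool:
    """Return True (valid) if at least one pixel of the preset area lies
    inside the range [depth - tolerance, depth + tolerance].

    window_depths: assumed depth of each pixel in the first preset window.
    preset_area:   indices into window_depths that form the preset area.
    """
    lo = first_pixel_depth - tolerance
    hi = first_pixel_depth + tolerance
    return any(lo <= window_depths[i] <= hi for i in preset_area)


# Two of the four historical pixels are close to the current depth.
valid = control_signal([1.0, 1.05, 3.0, 4.0], [0, 1, 2, 3], 1.02)
# No historical pixel is close to the current depth.
invalid = control_signal([3.0, 4.0, 5.0, 6.0], [0, 1, 2, 3], 1.0)
```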
The image processor according to claim 3, wherein the first convolution circuit comprises:

a first luminance convolution circuit, configured to, when the control signal is valid, determine the first brightness information of the first pixel in the first image according to the first convolution weight and the brightness information of the pixels in the preset area;

A first momentum convolution circuit, configured to, when the control signal is valid, determine the first momentum information of the first pixel in the first image according to the first convolution weight and the momentum information of the pixels in the preset area;

A first variance calculation circuit, configured to determine the first variance of the first pixel point according to the first momentum information.
The image processor according to claim 4, wherein the time domain calculation circuit further comprises:

The first weight calculation circuit is used to determine the first convolution weight of each pixel in the preset area.
The image processor according to any one of claims 2-5, wherein the historical parameter information further includes the historical calculation times of the historical pixel points, and the time domain calculation circuit further includes:

an update circuit, configured to determine the value obtained by adding one to the historical calculation times of the historical pixel point as the historical calculation times of the first pixel point when the control signal is valid;

The updating circuit is further configured to determine that the number of historical calculation times of the first pixel is 0 when the control signal is invalid.
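As a non-limiting illustration of the update circuit recited above, the following Python sketch increments the historical calculation count while the control signal is valid and resets it to 0 otherwise. The function name and the example sequence of control signals are hypothetical.

```python
def update_history_count(historical_count: int, control_valid: bool) -> int:
    """Add one to the historical count when the control signal is valid;
    reset the count to 0 when the control signal is invalid."""
    return historical_count + 1 if control_valid else 0


# Track one pixel over four frames with an assumed control signal sequence.
count = 0
counts = []
for control_valid in [True, True, False, True]:
    count = update_history_count(count, control_valid)
    counts.append(count)
```

The reset on an invalid signal means the count reflects only the uninterrupted run of frames in which history was reusable.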
The image processor according to any one of claims 2-6, wherein the time domain processing engine further comprises:

A third buffer, configured to buffer the historical parameter information and/or the initial parameter information of the first pixel in the image to be processed, where the initial parameter information includes at least one of the following: initial brightness information, initial momentum information, motion vector, geometric coordinate information.
The image processor according to any one of claims 1-7, wherein the first parameter information includes first luminance information, first momentum information and first variance, and the momentum calculation circuit includes:

The second convolution circuit is configured to, for the first pixel in the first image, determine convolution luminance information and convolution momentum information of the first pixel according to the first brightness information and the first momentum information, respectively, of the plurality of pixels in the second preset window where the first pixel is located;

The second convolution circuit is further configured to determine the convolution variance of the first pixel point according to the convolution momentum information;

a selection circuit, configured to select the first brightness information and the first variance as the second brightness information and the second variance, respectively, of the first pixel in the second image, or to select the convolution luminance information and the convolution variance as the second luminance information and the second variance, respectively, of the first pixel in the second image.
The image processor according to claim 8, wherein the second convolution circuit comprises:

A second luminance convolution circuit, configured to perform a convolution operation according to the second convolution weights of the plurality of pixels and the first luminance information of the plurality of pixels, so as to obtain the convolution luminance information of the first pixel;

A second momentum convolution circuit, configured to perform a convolution operation according to the second convolution weights of the plurality of pixels and the first momentum information of the plurality of pixels, so as to obtain the convolution momentum information of the first pixel;

The second variance calculation circuit is configured to determine the convolution variance of the first pixel point according to the convolution momentum information.
The image processor according to claim 9, wherein the momentum calculation circuit further comprises:

The second weight calculation circuit is configured to determine the second convolution weight of each pixel point in the plurality of pixel points.
The image processor according to any one of claims 8-10, wherein the first parameter information further includes the number of historical calculations, and the selection circuit is specifically configured to:

When the number of historical calculations of the first pixel is greater than or equal to a preset number of times, the first brightness information and the first variance are selected as the second luminance information and the second variance, respectively, of the first pixel in the second image;

When the number of historical calculations of the first pixel is less than the preset number of times, the convolution brightness information and the convolution variance are selected as the second luminance information and the second variance, respectively, of the first pixel in the second image.
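As a non-limiting illustration of the selection circuit recited above, the following Python sketch picks between the temporally filtered values and the convolved values by comparing the historical calculation count with a preset count. The type `PixelParams`, the function name and the default `preset_count` of 4 are assumptions chosen for the example, not values taken from the claims.

```python
from typing import NamedTuple


class PixelParams(NamedTuple):
    brightness: float
    variance: float


def select_second_params(first: PixelParams,
                         convolved: PixelParams,
                         history_count: int,
                         preset_count: int = 4) -> PixelParams:
    """Keep the first (temporal) values once enough history exists;
    otherwise fall back to the convolved values."""
    return first if history_count >= preset_count else convolved


first = PixelParams(brightness=0.80, variance=0.02)
convolved = PixelParams(brightness=0.75, variance=0.05)
long_history = select_second_params(first, convolved, history_count=6)
short_history = select_second_params(first, convolved, history_count=1)
```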
The image processor according to any one of claims 1-11, wherein the spatial domain calculation circuit comprises:

A convolution conversion circuit, configured to separately extract the second parameter information of multiple valid pixels in each of multiple wavelet transform windows of the first pixel in the second image, where the multiple wavelet transform windows are obtained through wavelet transformation of a preset number of steps, and the second parameter information includes second luminance information and second variance;

The third weight calculation circuit is configured to, for each wavelet transform window in the multiple wavelet transform windows, determine the third convolution weights of the multiple valid pixels according to the second variance of the multiple valid pixels in the wavelet transform window;

The third convolution circuit is configured to, for each wavelet transform window in the multiple wavelet transform windows, determine the third brightness information and the third variance of the first pixel according to the third convolution weights of the multiple valid pixels in the wavelet transform window and the second brightness information and the second variance, respectively, of the multiple valid pixels;

The third convolution circuit is further configured to determine the target brightness information and the target variance of the first pixel in the target image according to the third luminance information and the third variance, respectively, of the first pixel corresponding to the multiple wavelet transform windows.
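As a non-limiting illustration of how the wavelet transform windows recited above might be enumerated, the following Python sketch assumes an a-trous style scheme in which the tap spacing doubles at each step and a tap counts as a valid pixel when it falls inside the image. The function name, the square tap pattern, the `radius` parameter and the doubling rule are assumptions, not details taken from the claims.

```python
from typing import List, Tuple


def atrous_windows(x: int, y: int, width: int, height: int,
                   levels: int = 3, radius: int = 1
                   ) -> List[List[Tuple[int, int]]]:
    """List the valid tap coordinates of each window around pixel (x, y).

    Level k uses a spacing of 2**k between taps (assumed). Taps that fall
    outside the width x height image are dropped as invalid pixels.
    """
    windows = []
    for level in range(levels):
        step = 1 << level
        taps = []
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                qx, qy = x + dx * step, y + dy * step
                if 0 <= qx < width and 0 <= qy < height:
                    taps.append((qx, qy))
        windows.append(taps)
    return windows


interior = atrous_windows(4, 4, 16, 16)  # every tap stays inside the image
corner = atrous_windows(0, 0, 8, 8)      # taps with negative coordinates drop out
```

Because only the tap spacing grows, each window reads the same number of candidate pixels regardless of its footprint.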
The image processor according to claim 12, wherein the third weight calculation circuit is specifically used for:

For the plurality of valid pixels, the depth weight, the normal vector weight and the luminance weight are respectively determined according to the second variance, the second luminance information and the geometric coordinate information of the plurality of valid pixels in the second image;

The third convolution weight of the plurality of valid pixels is determined according to the depth weight, the normal vector weight and the luminance weight of the plurality of valid pixels.
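As a non-limiting illustration of the weight calculation recited above, the following Python sketch assumes that the geometric coordinate information provides a depth value and a unit normal vector per pixel, and that the third convolution weight is the product of the three component weights, as in published SVGF formulations. The exponential and power forms, the sigma parameters and all function names are assumptions; the depth term in particular is simplified and omits any depth gradient.

```python
import math
from typing import Sequence


def depth_weight(z_p: float, z_q: float, sigma_z: float = 1.0) -> float:
    # Simplified assumption: plain exponential falloff on depth difference.
    return math.exp(-abs(z_p - z_q) / sigma_z)


def normal_weight(n_p: Sequence[float], n_q: Sequence[float],
                  sigma_n: float = 128.0) -> float:
    # Cosine between unit normals, clamped at 0 and sharpened by a power.
    dot = sum(a * b for a, b in zip(n_p, n_q))
    return max(0.0, dot) ** sigma_n


def luminance_weight(l_p: float, l_q: float, variance_p: float,
                     sigma_l: float = 4.0, eps: float = 1e-6) -> float:
    # Brightness difference scaled by the local standard deviation.
    std = math.sqrt(max(variance_p, 0.0))
    return math.exp(-abs(l_p - l_q) / (sigma_l * std + eps))


def third_convolution_weight(z_p, z_q, n_p, n_q, l_p, l_q, variance_p):
    # Assumed combination rule: product of the three component weights.
    return (depth_weight(z_p, z_q)
            * normal_weight(n_p, n_q)
            * luminance_weight(l_p, l_q, variance_p))


up = (0.0, 0.0, 1.0)
side = (1.0, 0.0, 0.0)
same = third_convolution_weight(1.0, 1.0, up, up, 0.5, 0.5, 0.01)
across_edge = third_convolution_weight(1.0, 1.0, up, side, 0.5, 0.5, 0.01)
```

With a product rule, a neighbor that fails any single test (here, a perpendicular normal) contributes nothing to the filtered result.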
The image processor according to claim 12 or 13, wherein the third convolution circuit comprises:

A third luminance convolution circuit, configured to perform a convolution operation according to the third convolution weights of the plurality of valid pixels and the second luminance information of the plurality of valid pixels, so as to obtain the third brightness information;

and a third variance calculation circuit, configured to determine the third variance of the first pixel point according to the third convolution weight of the multiple valid pixel points and the second variance of the multiple valid pixel points.
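As a non-limiting illustration of the third luminance convolution circuit and the third variance calculation circuit recited above, the following Python sketch normalizes the weighted brightness by the weight sum and scales the variance by squared weights, following published SVGF formulations. These formulas are assumptions relative to the claims, and the function name is hypothetical.

```python
from typing import Sequence, Tuple


def third_brightness_and_variance(weights: Sequence[float],
                                  brightness: Sequence[float],
                                  variance: Sequence[float]
                                  ) -> Tuple[float, float]:
    """Filter brightness and variance over the valid pixels of one window.

    brightness' = sum(w * l) / sum(w)
    variance'   = sum(w^2 * v) / sum(w)^2   (assumed SVGF-style rule)
    """
    w_sum = sum(weights)
    if w_sum <= 0.0:
        raise ValueError("at least one valid pixel needs a positive weight")
    l_out = sum(w * l for w, l in zip(weights, brightness)) / w_sum
    v_out = sum(w * w * v for w, v in zip(weights, variance)) / (w_sum * w_sum)
    return l_out, v_out


third_brightness, third_variance = third_brightness_and_variance(
    weights=[1.0, 1.0], brightness=[2.0, 4.0], variance=[1.0, 1.0])
```

Averaging two equally weighted pixels halves the variance in this example, which is the behavior expected when independent noisy samples are averaged.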
An image processing device, characterized in that the image processing device includes an image processor, and the image processor includes a streaming multiprocessor and the SVGF accelerator of the image processor according to any one of claims 1-14.