CN117495707A - Gradient-guided multi-frame image denoising method, device, equipment and medium

Gradient-guided multi-frame image denoising method, device, equipment and medium

Info

Publication number: CN117495707A
Application number: CN202311559753.1A
Authority: CN (China)
Prior art keywords: image, current, pixel, images, texture
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 曾嘉明
Assignee (current and original): Spreadtrum Communications Shanghai Co Ltd
Filing and priority date: 2023-11-21
Publication date: 2024-02-02

Classifications

    • G06T 5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 7/13 — Image analysis; segmentation; edge detection
    • G06T 7/40 — Image analysis; analysis of texture
    • G06T 7/90 — Image analysis; determination of colour characteristics
    • G06V 10/764 — Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06T 2207/20221 — Indexing scheme for image analysis or enhancement; image combination; image fusion; image merging


Abstract

The invention provides a gradient-guided multi-frame image denoising method, apparatus, device and medium. The method comprises: acquiring multiple frames of images of the same scene and determining a reference image and N frames of other images; aligning the N other frames with the reference image; calculating gradient information of the reference image and creating a gradient guide map from it so as to divide the reference image into texture regions; and calculating a denoised pixel value for each target pixel on the reference image to obtain a denoised image. When the denoised value of a current target pixel is calculated, the current target pixel, the current surrounding pixels within a range set according to the type of texture region in which it lies, and the pixels at the corresponding positions of these pixels on the N aligned frames are determined as the current pixels to be fused, and the denoised value is obtained from the pixel values and position information of the current pixels to be fused. The invention improves the image denoising effect, preserves image detail and reduces computational complexity.

Description

Gradient-guided multi-frame image denoising method, device, equipment and medium
Technical Field
The present invention relates to the field of image fusion technologies, and in particular, to a gradient-guided multi-frame image denoising method, apparatus, device, and medium.
Background
With the progress of computer and multimedia technologies, the demand for photography keeps increasing. However, owing to the characteristics of digital imaging devices, captured images are often corrupted by noise, which degrades image quality. Existing single-frame denoising algorithms are computationally simple but tend to lose image detail. Multi-frame denoising algorithms were developed to address this problem: several images captured at different moments are fused with weights, suppressing noise while avoiding damage to image detail. A typical multi-frame denoising pipeline is: (1) frame selection, finding the sharpest reference frame; (2) image alignment; (3) weight calculation; (4) image fusion.
Fusing multiple frames in the temporal domain preserves image detail to the greatest extent. For scenes with heavy noise, however, temporal denoising alone is not sufficient to achieve satisfactory results, so temporal and spatial denoising are usually combined. Ordinary spatial denoising tends to blur image detail, while complex spatial denoising algorithms preserve detail better but increase computational complexity, which is unacceptable given the real-time requirements of photographing on digital devices.
It is therefore necessary to propose a gradient-guided multi-frame image denoising method, apparatus, device and medium to solve the above problems.
Disclosure of Invention
The invention aims to provide a gradient-guided multi-frame image denoising method, device, equipment and medium, to solve the problems that ordinary temporal-spatial denoising algorithms struggle to preserve image detail while complex temporal-spatial denoising algorithms are computationally expensive.
In a first aspect, the present invention provides a gradient-guided multi-frame image denoising method, the method comprising:
and acquiring multi-frame images continuously shot for the same shooting scene.
And selecting one frame of reference image from the multi-frame images, and determining N frames of other images, wherein N is a positive integer.
And according to the reference image, carrying out alignment processing on the other images of the N frames to obtain an aligned image of the N frames.
And calculating gradient information of the reference image by using an edge detection operator, wherein the gradient information comprises longitudinal texture confidence and transverse texture confidence.
And creating a gradient guide map according to the gradient information so as to divide texture areas of the reference image.
And respectively calculating the denoised pixel value of each target pixel point on the reference image to obtain a denoised image.
In a possible embodiment, when the denoised pixel value of the current target pixel is calculated, the type of texture region where the current target pixel is located is judged. According to that type, the current target pixel, the current surrounding pixels within a set range around it, and the pixels at the corresponding positions of the current target pixel and the current surrounding pixels on the N aligned frames are determined as the current pixels to be fused. The fusion weight of each current pixel to be fused is calculated from its pixel value and position information, and the denoised pixel value of the current target pixel is calculated from the fusion weights and pixel values of the current pixels to be fused.
In one possible embodiment, the types of texture regions include a longitudinal texture region, a quasi-longitudinal texture region, a flat texture region, a quasi-lateral texture region, and a lateral texture region.
Judging the type of the texture region where the current target pixel is located includes:
calculating the difference between the absolute value of the lateral texture confidence of the current target pixel and the absolute value of its longitudinal texture confidence;
when the difference is smaller than a first threshold, judging the type of the texture region where the current target pixel is located to be the longitudinal texture region;
when the difference is greater than or equal to the first threshold and smaller than a second threshold, judging it to be the quasi-longitudinal texture region;
when the difference is greater than or equal to the second threshold and smaller than a third threshold, judging it to be the flat texture region;
when the difference is greater than or equal to the third threshold and smaller than a fourth threshold, judging it to be the quasi-lateral texture region;
and when the difference is greater than or equal to the fourth threshold, judging it to be the lateral texture region.
In one possible embodiment, calculating the fusion weight of each current pixel to be fused according to its pixel value and position information includes:
calculating the pixel weight of each current pixel to be fused from its pixel value;
calculating the distance weight of each current pixel to be fused from its position information;
and calculating the fusion weight of each current pixel to be fused from its pixel weight and distance weight.
In a possible embodiment, calculating the denoised pixel value of the current target pixel according to the fusion weights and pixel values of the current pixels to be fused includes:
accumulating the fusion weights of the current pixels to be fused to obtain the weight sum of the current target pixel;
calculating the product of the fusion weight of each current pixel to be fused and the corresponding pixel value, and accumulating the products to obtain the fusion sum of the current target pixel;
and calculating the ratio of the fusion sum to the weight sum of the current target pixel to obtain its denoised pixel value.
In one possible embodiment, selecting a frame of reference image from the multiple frames of images and determining N frames of other images includes:
calculating gradient information of the multiple frames to evaluate their sharpness, selecting the sharpest image as the reference image, removing images whose sharpness is unsatisfactory, and determining the remaining N frames as the other images.
In one possible embodiment, aligning the N frames of other images according to the reference image to obtain N frames of aligned images includes:
downsampling the reference image and the N frames of other images each into a multi-layer Gaussian pyramid, where the multi-layer Gaussian pyramid represents several image layers of different resolutions and each layer comprises several image blocks;
and performing layer-by-layer image-block alignment between each other image and the reference image, from the highest layer to the lowest, to obtain N frames of aligned images.
In a second aspect, the invention also provides a gradient-guided multi-frame image denoising apparatus, which comprises a module/unit for executing the method of any one of the possible designs of the first aspect. These modules/units may be implemented by hardware, or may be implemented by hardware executing corresponding software.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor and a memory. Wherein the memory is for storing one or more computer programs; the one or more computer programs, when executed by the processor, enable the electronic device to implement the method of any one of the possible designs of the first aspect described above.
In a fourth aspect, there is also provided in an embodiment of the invention a computer readable storage medium comprising a computer program which, when run on an electronic device, causes the electronic device to carry out the method of any one of the possible designs of the first aspect described above.
In a fifth aspect, embodiments of the present invention also provide a computer program product which, when run on an electronic device, causes the electronic device to perform the method of any one of the possible designs of the first aspect.
The invention has the following beneficial effects. Under the guidance of gradient information, the reference image is divided into texture regions; target pixels in different texture regions are fused spatially with pixels of different spatial distributions and fused temporally with the pixels at the corresponding positions on the N aligned frames. When the fusion weight of a target pixel is calculated, both the pixel-value weight and the distance weight are considered, so the denoising effect on the reference image is markedly improved while image detail is preserved as far as possible. At the same time, the computational complexity of the fusion is reduced, meeting the real-time requirement of photographing on digital devices.
Drawings
Fig. 1 is a schematic flow chart of the gradient-guided multi-frame image denoising method according to the present invention.
Fig. 2 is a flow chart of an implementation of the gradient-guided multi-frame image denoising method.
Fig. 3 is a schematic diagram of gradient guided pixel fusion of the gradient guided multi-frame image denoising method according to the present invention.
Fig. 4 is a schematic diagram of a gradient-guided multi-frame image denoising apparatus according to the present invention.
Fig. 5 is a schematic structural diagram of an electronic device according to the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions in the embodiments of the present invention will be clearly and completely described below, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Technical terms related to the present invention are explained below.
1. YUV: YUV is a color encoding format in which Y represents luminance (Luma), i.e. the gray value, and U and V represent chrominance (Chroma), describing the color and its saturation.
2. Tuning parameters: parameters of different strengths are set to adapt to different scenes.
3. Longitudinal texture confidence G_x: an index measuring the longitudinal texture in an image, typically calculated as the longitudinal texture intensity of each pixel.
4. Lateral texture confidence G_y: an index measuring the lateral texture in an image, typically calculated as the lateral texture intensity of each pixel.
With the development of technology, expectations for photographic quality keep rising. Because the imaging element is affected by thermal noise, images shot directly by digital devices (such as cameras, digital single-lens reflex cameras and tablet computers) show obvious noise, harming the viewing experience. Ordinary temporal-spatial denoising algorithms struggle to preserve image detail, while complex temporal-spatial denoising algorithms have high computational complexity and cannot meet the real-time requirement of photographing on digital devices.
To solve the above technical problems, an embodiment of the present invention provides a gradient-guided multi-frame image denoising method. It can be applied to digital devices with a photographing function, such as mobile phones, digital single-lens reflex cameras and tablet computers, and can be implemented in hardware on an image signal processor (Image Signal Processor, ISP) chip to increase processing speed. Referring to figs. 1 and 2, the method includes:
S101: and acquiring multi-frame images continuously shot for the same shooting scene.
S102: and selecting one frame of reference image from the multi-frame images, and determining N frames of other images, wherein N is a positive integer.
S103: and according to the reference image, carrying out alignment processing on other images of the N frames to obtain an aligned image of the N frames.
S104: computing gradient information of the reference image using an edge detection operator, wherein the gradient information includes longitudinal texture confidence G x And lateral texture confidence G y . Preferably, the edge detection operator may adopt a Sobel (Sobel) operator, a Roberts (Roberts) operator, a Prewitt operator, a Laplace (Laplace) operator, a Canny (Canny) operator, or a gradient detection operator.
S105: and creating a gradient guide map according to the gradient information so as to divide texture areas of the reference image.
S106: and respectively calculating the denoised pixel value of each target pixel point on the reference image to obtain a denoised image.
In a preferred embodiment, when the denoised pixel value of the current target pixel is calculated, the type of the texture region where the current target pixel is located is judged. According to that type, the current target pixel, the current surrounding pixels within the set range around it, and the pixels at the corresponding positions of the current target pixel and the current surrounding pixels on the N aligned frames are determined as the current pixels to be fused. The fusion weight of each current pixel to be fused is calculated from its pixel value and position information, and the denoised pixel value of the current target pixel is calculated from the fusion weights and pixel values of the current pixels to be fused. Preferably, the set ranges corresponding to different types of texture region differ.
In this embodiment, different setting ranges are used for target pixels in different texture regions so as to achieve the best denoising effect. For example, pixels in a longitudinal texture region are closely arranged in the vertical direction, so a suitable setting range can be chosen along the longitudinal direction; selecting the pixels to be fused within an appropriate spatial distribution controls their number, which reduces the amount and complexity of the fusion computation and meets the real-time requirement of photographing on digital devices. Temporal denoising is performed between the current target pixel and the pixels at the corresponding positions on the N aligned frames, while the current surrounding pixels and their corresponding pixels on the aligned frames also participate, giving combined spatial-temporal denoising that improves the denoising effect. The gradient-guided multi-frame denoising algorithm provided by the invention adopts different fusion modes for different texture regions under the guidance of gradient information; it can significantly suppress noise while keeping edges smooth, is not prone to edge artifacts, and preserves image detail well.
In a specific embodiment, selecting one frame of reference image from the multiple frames and determining N frames of other images includes: calculating gradient information of the multiple frames to evaluate their sharpness, selecting the sharpest image as the reference image, removing images whose sharpness is unsatisfactory, and determining the remaining N frames as the other images.
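As a minimal sketch of this frame-selection step (not the patent's exact implementation), sharpness can be scored by the mean gradient magnitude of each frame; the sharpest frame becomes the reference, and frames scoring below a cutoff are discarded. The `sharpness_ratio` cutoff and the use of OpenCV's Sobel here are illustrative assumptions:

```python
import numpy as np
import cv2

def select_reference(frames, sharpness_ratio=0.8):
    """Pick the sharpest frame as the reference and keep the rest
    as 'other' frames, dropping frames that are too blurry.

    frames: list of single-channel (Y-plane) float32 arrays.
    sharpness_ratio: illustrative cutoff, not a value from the patent.
    """
    scores = []
    for f in frames:
        gx = cv2.Sobel(f, cv2.CV_32F, 1, 0, ksize=3)
        gy = cv2.Sobel(f, cv2.CV_32F, 0, 1, ksize=3)
        scores.append(float(np.mean(np.abs(gx) + np.abs(gy))))  # mean gradient magnitude
    ref_idx = int(np.argmax(scores))
    others = [f for i, f in enumerate(frames)
              if i != ref_idx and scores[i] >= sharpness_ratio * scores[ref_idx]]
    return frames[ref_idx], others
```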
In a specific embodiment, aligning the N frames of other images according to the reference image to obtain N frames of aligned images includes: downsampling the reference image and the N other frames each into a multi-layer Gaussian pyramid, where the pyramid represents several image layers of different resolutions and each layer comprises several image blocks; then performing layer-by-layer image-block alignment between each other image and the reference image, from the highest layer to the lowest, to obtain N frames of aligned images. In other words, the reference image and the N other frames are each downsampled into a 4-layer Gaussian pyramid. When a current other image is aligned, starting from the highest layer, the image blocks of each layer are matched against the blocks of the corresponding layer of the reference image to find, for each block, the best-matching block position on the reference image; the matched positions are passed down layer by layer until the positions of all blocks of the current other image aligned with the reference frame are obtained, completing the alignment of the current other image with the reference image.
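A sketch of this coarse-to-fine block alignment, assuming float32 Y planes, 16×16 blocks and a ±4-pixel search window at each level (block size and search radius are illustrative choices, not values from the patent):

```python
import numpy as np
import cv2

def build_pyramid(img, levels=4):
    """Downsample into a Gaussian pyramid; level 0 is full resolution."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

def match_block(ref_lvl, oth_lvl, y, x, block, init, search):
    """Exhaustive search around an inherited offset, minimizing the
    sum of absolute differences (SAD) between blocks."""
    h, w = ref_lvl.shape
    patch = ref_lvl[y:y + block, x:x + block]
    best_sad, best = np.inf, init
    for dy in range(init[0] - search, init[0] + search + 1):
        for dx in range(init[1] - search, init[1] + search + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy <= h - block and 0 <= xx <= w - block:
                sad = np.abs(patch - oth_lvl[yy:yy + block, xx:xx + block]).sum()
                if sad < best_sad:
                    best_sad, best = sad, (dy, dx)
    return best

def align(ref, other, block=16, search=4, levels=4):
    """Per-block motion from 'other' to 'ref': offsets found at a coarse
    level are doubled and refined at the next finer level."""
    ref_pyr = build_pyramid(ref, levels)
    oth_pyr = build_pyramid(other, levels)
    offsets = None
    for lvl in range(levels - 1, -1, -1):          # coarsest -> finest
        r, o = ref_pyr[lvl], oth_pyr[lvl]
        nby, nbx = r.shape[0] // block, r.shape[1] // block
        cur = np.zeros((nby, nbx, 2), dtype=np.int32)
        for by in range(nby):
            for bx in range(nbx):
                if offsets is None:
                    init = (0, 0)
                else:                               # inherit and double the coarser offset
                    cy = min(by // 2, offsets.shape[0] - 1)
                    cx = min(bx // 2, offsets.shape[1] - 1)
                    init = (2 * int(offsets[cy, cx, 0]),
                            2 * int(offsets[cy, cx, 1]))
                cur[by, bx] = match_block(r, o, by * block, bx * block,
                                          block, init, search)
        offsets = cur
    return offsets  # (dy, dx) per full-resolution block
```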
In a specific embodiment, after the N aligned frames are obtained, the method further includes filtering the reference image to obtain a filtered image, the filtering being Gaussian filtering. In that case, calculating the gradient information of the reference image with the edge detection operator means calculating the gradient information of the filtered image; the gradient guide map created from the gradient information divides the filtered image into texture regions; and the denoised pixel values are calculated for the target pixels on the filtered image to obtain the denoised image.
Preferably, calculating the gradient information of the filtered image with the edge detection operator includes obtaining the gradient information through the Sobel convolution factors. The Sobel operator uses two 3×3 convolution kernels (kernel_x and kernel_y), where
kernel_x = {-1, 0, 1, -2, 0, 2, -1, 0, 1}; kernel_y = {1, 2, 1, 0, 0, 0, -1, -2, -1}.
In this embodiment, using 3×3 convolution kernels greatly reduces the amount of computation and meets the real-time requirement of photographing on digital devices.
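A minimal sketch of this gradient computation, expressing the two kernels above with NumPy and OpenCV (`cv2.filter2D` computes correlation rather than convolution, which only flips the sign of the responses; the signs cancel once absolute values are taken in the region decision below):

```python
import numpy as np
import cv2

kernel_x = np.array([[-1, 0, 1],
                     [-2, 0, 2],
                     [-1, 0, 1]], dtype=np.float32)
kernel_y = np.array([[ 1,  2,  1],
                     [ 0,  0,  0],
                     [-1, -2, -1]], dtype=np.float32)

def texture_confidence(y_plane):
    """Filter the (Gaussian-filtered) Y plane with the two 3x3 Sobel
    kernels; returns the longitudinal confidence G_x and lateral G_y."""
    f = y_plane.astype(np.float32)
    g_x = cv2.filter2D(f, -1, kernel_x)
    g_y = cv2.filter2D(f, -1, kernel_y)
    return g_x, g_y
```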
In a preferred embodiment, the types of texture regions include longitudinal texture regions, quasi-longitudinal texture regions, flat texture regions, quasi-lateral texture regions and lateral texture regions. Judging the type of the texture region where the current target pixel is located includes: calculating the difference between the absolute value of the lateral texture confidence G_y of the current target pixel and the absolute value of its longitudinal texture confidence G_x. If the difference is smaller than a first threshold, the type of the texture region where the current target pixel is located is judged to be a longitudinal texture region; if the difference is greater than or equal to the first threshold and smaller than a second threshold, a quasi-longitudinal texture region; if greater than or equal to the second threshold and smaller than a third threshold, a flat texture region; if greater than or equal to the third threshold and smaller than a fourth threshold, a quasi-lateral texture region; and if greater than or equal to the fourth threshold, a lateral texture region. Preferably, the first threshold is -128, the second threshold is -25, the third threshold is 25, and the fourth threshold is 128.
Specifically, referring to fig. 3, the reference image is divided into 5 types of texture regions according to the gradient guide map:

grad_guide = 1, if |G_y| − |G_x| < thr_1;
grad_guide = 2, if thr_1 ≤ |G_y| − |G_x| < thr_2;
grad_guide = 3, if thr_2 ≤ |G_y| − |G_x| < thr_3;
grad_guide = 4, if thr_3 ≤ |G_y| − |G_x| < thr_4;
grad_guide = 5, if |G_y| − |G_x| ≥ thr_4;

where grad_guide represents the gradient guide map; thr_1 to thr_4 are the first to fourth thresholds; 1 represents a longitudinal texture region, 2 a quasi-longitudinal texture region, 3 a flat texture region, 4 a quasi-lateral texture region, and 5 a lateral texture region.
In this embodiment, based on the longitudinal texture confidence G_x and the lateral texture confidence G_y of the target pixel, the type of texture region where the target pixel is located can be determined, and different fusion modes are adopted for different texture regions, reducing noise to the greatest extent while preserving image detail and improving the denoising effect.
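The five-way region decision follows directly from the thresholds above; a sketch using the patent's preferred values −128, −25, 25 and 128:

```python
import numpy as np

T1, T2, T3, T4 = -128, -25, 25, 128   # preferred thresholds

def gradient_guide(g_x, g_y):
    """Label every pixel 1..5 from diff = |G_y| - |G_x|
    (lateral minus longitudinal texture confidence)."""
    diff = np.abs(g_y) - np.abs(g_x)
    guide = np.full(diff.shape, 3, dtype=np.uint8)  # 3: flat
    guide[diff < T2] = 2                            # 2: quasi-longitudinal
    guide[diff < T1] = 1                            # 1: longitudinal
    guide[diff >= T3] = 4                           # 4: quasi-lateral
    guide[diff >= T4] = 5                           # 5: lateral
    return guide
```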
In a preferred embodiment, calculating the fusion weight of each current pixel to be fused from its pixel value and position information includes: calculating the pixel weight of each current pixel to be fused from its pixel value; calculating the distance weight of each current pixel to be fused from its position information; and calculating the fusion weight of each current pixel to be fused from its pixel weight and distance weight. In this embodiment, both weights are considered when the fusion weight is calculated: the pixel weight measures the contribution of each current pixel's value, while the distance weight accounts for the influence of each current pixel's spatial position on the fusion.
In a specific embodiment, calculating the denoised pixel value of the current target pixel from the fusion weights and pixel values of the current pixels to be fused includes: accumulating the fusion weights of the current pixels to be fused to obtain the weight sum of the current target pixel; calculating the product of each fusion weight and the corresponding pixel value and accumulating the products to obtain the fusion sum of the current target pixel; and calculating the ratio of the fusion sum to the weight sum to obtain the denoised pixel value of the current target pixel.
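Combining the weight sum, fusion sum and ratio for one target pixel gives the sketch below. The Gaussian forms of the pixel weight and distance weight (with strengths `sigma_p`, `sigma_d`) and the per-region window shapes are assumptions for illustration; the patent only requires these weights to be adjustable, non-zero tuning parameters and the setting ranges to differ per region:

```python
import numpy as np

# Illustrative per-region settings: (r_y, r_x, sigma_p, sigma_d).
# A longitudinal region gets a tall window, a lateral region a wide one.
REGION_PARAMS = {
    1: (3, 0, 10.0, 2.0),   # longitudinal
    2: (2, 1, 10.0, 2.0),   # quasi-longitudinal
    3: (2, 2, 10.0, 2.0),   # flat
    4: (1, 2, 10.0, 2.0),   # quasi-lateral
    5: (0, 3, 10.0, 2.0),   # lateral
}

def denoise_pixel(images, ty, tx, region):
    """Spatio-temporal weighted fusion for the target pixel (ty, tx).

    images: list of float32 Y planes; images[0] is the (filtered)
            reference, images[1:] are the N aligned frames.
    region: texture label 1..5 from the gradient guide map.
    """
    r_y, r_x, sigma_p, sigma_d = REGION_PARAMS[region]
    center = float(images[0][ty, tx])
    h, w = images[0].shape
    w_sum = 0.0
    fuse_sum = 0.0
    for img in images:
        for dy in range(-r_y, r_y + 1):
            for dx in range(-r_x, r_x + 1):
                y, x = ty + dy, tx + dx
                if not (0 <= y < h and 0 <= x < w):
                    continue
                p = float(img[y, x])
                w_pix = np.exp(-(p - center) ** 2 / (2.0 * sigma_p ** 2))
                w_dist = np.exp(-(dy * dy + dx * dx) / (2.0 * sigma_d ** 2))
                weight = w_pix * w_dist       # fusion weight
                w_sum += weight               # weight sum
                fuse_sum += weight * p        # fusion sum
    return fuse_sum / w_sum                   # ratio = denoised value
```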
Referring to fig. 3, the fusion denoising of the reference image and the N aligned frames is explained below; for convenience of explanation, the reference image and the N aligned frames are together referred to as the images to be fused.
Specifically, the denoised pixel value of the current target pixel located in the longitudinal texture region satisfies the following calculation formula:

merge_out_1 = ( Σ_{i=0..N} Σ_{|x|≤r_x} Σ_{|y|≤r_y} w_{i,x,y} · p_{i,x,y} ) / w_sum, with w_sum = Σ_{i=0..N} Σ_{|x|≤r_x} Σ_{|y|≤r_y} w_{i,x,y}

wherein merge_out_1 represents the denoised pixel value of the current target pixel located in the longitudinal texture region; p_{i,x,y} represents the pixel value at position (x, y) of the i-th image to be fused, (x, y) being the position relative to the current target pixel; the fusion weight w_{i,x,y} combines the pixel weight for the longitudinal texture region, which may be an adjustable parameter with a non-zero value, with the distance weight for the longitudinal texture region, which may likewise be an adjustable parameter with a non-zero value; d_{i,x,y} represents the distance between the (x, y) position of the i-th image to be fused and the current target pixel; w_sum represents the weight sum of the current target pixel; and r_x, r_y limit the value ranges of x and y respectively, i.e. the setting range of the current target pixel in the longitudinal texture region.
Referring to fig. 3, it can be seen that the first row is a longitudinal texture region located at the same position on the reference image and the N aligned frames. The black pixels in the first-row reference image represent current target pixels located in the longitudinal texture region, and the gray pixels represent current surrounding pixels located in the longitudinal texture region; the black pixels in the first-row aligned images represent the pixels at the positions on the aligned images corresponding to the current target pixels, and the gray pixels represent the pixels at the positions corresponding to the current surrounding pixels.
Specifically, the denoised pixel value of the current target pixel located in the quasi-longitudinal texture region satisfies the following calculation formula:

merge_out_2 = ( Σ_{i=0..N} Σ_{|x|≤r_x} Σ_{|y|≤r_y} w_{i,x,y} · p_{i,x,y} ) / w_sum, with w_sum = Σ_{i=0..N} Σ_{|x|≤r_x} Σ_{|y|≤r_y} w_{i,x,y}

wherein merge_out_2 represents the denoised pixel value of the current target pixel located in the quasi-longitudinal texture region; p_{i,x,y} represents the pixel value at position (x, y) of the i-th image to be fused, (x, y) being the position relative to the current target pixel; the fusion weight w_{i,x,y} combines the pixel weight for the quasi-longitudinal texture region, which may be an adjustable parameter with a non-zero value, with the distance weight for the quasi-longitudinal texture region, which may likewise be an adjustable parameter with a non-zero value; d_{i,x,y} represents the distance between the (x, y) position of the i-th image to be fused and the current target pixel; w_sum represents the weight sum of the current target pixel; and r_x, r_y limit the value ranges of x and y respectively, i.e. the setting range of the current target pixel in the quasi-longitudinal texture region.
Referring to fig. 3, the second row is a quasi-longitudinal texture region located at the same position on the reference image and the N-frame aligned image, wherein black pixel points in the second row of reference images represent current target pixel points located in the quasi-longitudinal texture region, and gray pixel points represent current surrounding pixel points located in the quasi-longitudinal texture region; the black pixel points in the second row of aligned images represent the pixel points at the corresponding positions of the current target pixel points in the quasi-longitudinal texture area on the aligned images, and the gray pixel points represent the pixel points at the corresponding positions of the current surrounding pixel points in the quasi-longitudinal texture area on the aligned images.
Specifically, the denoised pixel value of the current target pixel located in the flat texture region satisfies the following calculation formula:

merge_out_3 = ( Σ_{i=0..N} Σ_{|x|≤r_x} Σ_{|y|≤r_y} w_{i,x,y} · p_{i,x,y} ) / w_sum, with w_sum = Σ_{i=0..N} Σ_{|x|≤r_x} Σ_{|y|≤r_y} w_{i,x,y}

wherein merge_out_3 represents the denoised pixel value of the current target pixel located in the flat texture region; p_{i,x,y} represents the pixel value at position (x, y) of the i-th image to be fused, (x, y) being the position relative to the current target pixel; the fusion weight w_{i,x,y} combines the pixel weight for the flat texture region, which may be an adjustable parameter with a non-zero value, with the distance weight for the flat texture region, which may likewise be an adjustable parameter with a non-zero value; d_{i,x,y} represents the distance between the (x, y) position of the i-th image to be fused and the current target pixel; w_sum represents the weight sum of the current target pixel; and r_x, r_y limit the value ranges of x and y respectively, i.e. the setting range of the current target pixel in the flat texture region.
Referring to fig. 3, the third row is a flat texture region located at the same position on the reference image and the N aligned frames, where the black pixels in the third-row reference image represent current target pixels located in the flat texture region, and the gray pixels represent current surrounding pixels located in the flat texture region; the black pixels in the third-row aligned images represent the pixels at the positions on the aligned images corresponding to the current target pixels, and the gray pixels represent the pixels at the positions corresponding to the current surrounding pixels.
Specifically, the denoised pixel value of the current target pixel located in the quasi-lateral texture region satisfies the following calculation formula:

merge_out_4 = ( Σ_{i=0..N} Σ_{|x|≤r_x} Σ_{|y|≤r_y} w_{i,x,y} · p_{i,x,y} ) / w_sum, with w_sum = Σ_{i=0..N} Σ_{|x|≤r_x} Σ_{|y|≤r_y} w_{i,x,y}

wherein merge_out_4 represents the denoised pixel value of the current target pixel located in the quasi-lateral texture region; p_{i,x,y} represents the pixel value at position (x, y) of the i-th image to be fused, (x, y) being the position relative to the current target pixel; the fusion weight w_{i,x,y} combines the pixel weight for the quasi-lateral texture region, which may be an adjustable parameter with a non-zero value, with the distance weight for the quasi-lateral texture region, which may likewise be an adjustable parameter with a non-zero value; d_{i,x,y} represents the distance between the (x, y) position of the i-th image to be fused and the current target pixel; w_sum represents the weight sum of the current target pixel; and r_x, r_y limit the value ranges of x and y respectively, i.e. the setting range of the current target pixel in the quasi-lateral texture region.
Referring to fig. 3, the fourth row is a quasi-lateral texture region located at the same position on the reference image and the N-frame aligned image, wherein black pixel points in the fourth row of reference image represent current target pixel points located in the quasi-lateral texture region, and gray pixel points represent current surrounding pixel points located in the quasi-lateral texture region; the black pixel points in the fourth row of aligned images represent pixel points positioned at the corresponding positions of the current target pixel points in the quasi-transverse texture area on the aligned images, and the gray pixel points represent pixel points positioned at the corresponding positions of the current surrounding pixel points in the quasi-transverse texture area on the aligned images.
Specifically, the denoised pixel value of the current target pixel located in the lateral texture region satisfies the following calculation formula:

merge_out_5 = ( Σ_{i=0..N} Σ_{|x|≤r_x} Σ_{|y|≤r_y} w_{i,x,y} · p_{i,x,y} ) / w_sum, with w_sum = Σ_{i=0..N} Σ_{|x|≤r_x} Σ_{|y|≤r_y} w_{i,x,y}

wherein merge_out_5 represents the denoised pixel value of the current target pixel located in the lateral texture region; p_{i,x,y} represents the pixel value at position (x, y) of the i-th image to be fused, (x, y) being the position relative to the current target pixel; the fusion weight w_{i,x,y} combines the pixel weight for the lateral texture region, which may be an adjustable parameter with a non-zero value, with the distance weight for the lateral texture region, which may likewise be an adjustable parameter with a non-zero value; d_{i,x,y} represents the distance between the (x, y) position of the i-th image to be fused and the current target pixel; w_sum represents the weight sum of the current target pixel; and r_x, r_y limit the value ranges of x and y respectively, i.e. the setting range of the current target pixel in the lateral texture region.
Referring to fig. 3, it can be seen that the fifth row is a lateral texture region located at the same position on the reference image and the N-frame aligned image, where black pixels in the fifth row of reference image represent current target pixels located in the lateral texture region, and gray pixels represent current surrounding pixels located in the lateral texture region; black pixel points in the fifth row of aligned images represent pixel points positioned at corresponding positions of the current target pixel points of the transverse texture area on the aligned images, and gray pixel points represent pixel points positioned at corresponding positions of the current surrounding pixel points of the transverse texture area on the aligned images.
According to the gradient guide map grad_guide, the reference image is divided into different texture regions, and different fusion modes are used for the target pixels of the different texture regions to obtain the denoised image:

merge_out = merge_out_1, if grad_guide = 1; merge_out_2, if grad_guide = 2; merge_out_3, if grad_guide = 3; merge_out_4, if grad_guide = 4; merge_out_5, if grad_guide = 5;

where merge_out represents the denoised image; grad_guide represents the gradient guide map; 1 represents a longitudinal texture region, 2 a quasi-longitudinal texture region, 3 a flat texture region, 4 a quasi-lateral texture region, and 5 a lateral texture region; and merge_out_1 to merge_out_5 represent the denoised pixel values of target pixels located in the longitudinal, quasi-longitudinal, flat, quasi-lateral and lateral texture regions, respectively.
The gradient-guided multi-frame image denoising method operates in the YUV domain of the image and denoises all three Y/U/V channels. Within a frame, pixels in different texture regions are fused spatially with pixels of different spatial distributions; across the temporal multi-frame stack, each pixel is fused in the temporal domain with the identically distributed pixels on the other images. Based on the texture-region division, temporal and spatial weights are computed for the pixels of each region using pixels of different spatial distributions, and each pixel obtains its clean, denoised value by weighted averaging. This improves the denoising effect and preserves image detail while reducing computational complexity, meeting the real-time requirement of photographing on digital devices.
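For orientation, the steps above can be strung together on the Y plane as follows. This sketch reuses the helpers sketched earlier (`select_reference`, `align`, `texture_confidence`, `gradient_guide`, `denoise_pixel`); `warp_by_blocks`, which shifts each block of an image by its (dy, dx) offset, is an assumed helper, and the Gaussian pre-filter follows the optional filtering embodiment above:

```python
import numpy as np
import cv2

def warp_by_blocks(img, offsets, block=16):
    """Assumed helper: shift each block by its (dy, dx) motion vector
    (nearest-pixel, no blending -- a sketch, not production warping)."""
    out = img.copy()
    h, w = img.shape
    for by in range(offsets.shape[0]):
        for bx in range(offsets.shape[1]):
            dy, dx = int(offsets[by, bx, 0]), int(offsets[by, bx, 1])
            y0, x0 = by * block, bx * block
            ys, xs = y0 + dy, x0 + dx
            if 0 <= ys <= h - block and 0 <= xs <= w - block:
                out[y0:y0 + block, x0:x0 + block] = img[ys:ys + block,
                                                        xs:xs + block]
    return out

def gradient_guided_denoise(frames):
    """End-to-end sketch of the method on a burst of float32 Y planes."""
    ref, others = select_reference(frames)                        # S101-S102
    aligned = [warp_by_blocks(o, align(ref, o)) for o in others]  # S103
    ref_f = cv2.GaussianBlur(ref, (3, 3), 0)                      # optional pre-filter
    g_x, g_y = texture_confidence(ref_f)                          # S104
    guide = gradient_guide(g_x, g_y)                              # S105
    out = np.empty_like(ref_f)
    h, w = ref_f.shape
    for ty in range(h):                                           # S106
        for tx in range(w):
            out[ty, tx] = denoise_pixel([ref_f] + aligned, ty, tx,
                                        int(guide[ty, tx]))
    return out
```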
In addition, the invention also provides a gradient-guided multi-frame image denoising apparatus, which can be applied to digital devices with a photographing function and to image signal processor chips. Referring to fig. 4, the apparatus includes: an acquisition unit 401 for acquiring multiple frames of images continuously shot of the same shooting scene; a determining unit 402 for selecting one frame of reference image from the multiple frames and determining N frames of other images, where N is a positive integer; an alignment unit 403 for aligning the N frames of other images with the reference image to obtain N frames of aligned images; an edge detection unit 404 for calculating gradient information of the reference image using an edge detection operator, the gradient information including the longitudinal texture confidence and the lateral texture confidence; a region dividing unit 405 for creating a gradient guide map from the gradient information so as to divide the reference image into texture regions; and a denoising unit 406 for calculating the denoised pixel value of each target pixel on the reference image to obtain a denoised image. For the relevant details of each step, reference may be made to the functional descriptions of the corresponding functional modules in the method embodiments above, which are not repeated here.
In a preferred embodiment, when calculating the denoised pixel value of the current target pixel, the denoising unit 406 judges the type of the texture region where the current target pixel is located; according to that type, it determines the current target pixel, the current surrounding pixels within the set range around it, and the pixels at the corresponding positions of the current target pixel and the current surrounding pixels on the N aligned frames as the current pixels to be fused; it calculates the fusion weight of each current pixel to be fused from its pixel value and position information; and it calculates the denoised pixel value of the current target pixel from the fusion weights and pixel values of the current pixels to be fused.
In a preferred embodiment, the types of texture regions include longitudinal texture regions, quasi-longitudinal texture regions, flat texture regions, quasi-lateral texture regions, and lateral texture regions.
The denoising unit 406 determines the type of texture region where the current target pixel point is located, and is specifically configured to:
and calculating to obtain the difference value between the absolute value of the transverse texture confidence coefficient of the current target pixel point and the absolute value of the longitudinal texture confidence coefficient of the current target pixel point.
The difference value is smaller than a first threshold value, and the type of the texture area where the current target pixel point is located is judged to be a longitudinal texture area.
The difference value is larger than or equal to a first threshold value and smaller than a second threshold value, and the type of the texture area where the current target pixel point is located is judged to be a quasi-longitudinal texture area.
And judging the type of the texture area where the current target pixel point is positioned as a flat texture area, wherein the difference value is larger than or equal to the second threshold value and smaller than the third threshold value.
And judging the type of the texture area where the current target pixel point is positioned as a quasi-transverse texture area, wherein the difference value is larger than or equal to the third threshold value and smaller than the fourth threshold value.
And judging the type of the texture area where the current target pixel point is positioned as a transverse texture area according to the difference value which is larger than or equal to a fourth threshold value.
In a preferred embodiment, the denoising unit 406 calculates a fusion weight of each current pixel to be fused according to the pixel value and the position information of each current pixel to be fused, which is specifically configured to:
and calculating the pixel weight of each current pixel to be fused according to the pixel value of each current pixel to be fused.
And calculating the distance weight of each current pixel to be fused according to the position information of each current pixel to be fused.
And calculating the fusion weight of each current pixel point to be fused according to the pixel weight and the distance weight of each current pixel point to be fused.
In a preferred embodiment, the denoising unit 406 calculates a denoised pixel value of the current target pixel according to the fusion weight and the pixel value of each current pixel to be fused, which is specifically configured to:
and accumulating the fusion weight of each current pixel point to be fused to obtain the weight sum of the current target pixel point.
And respectively calculating the product value of the fusion weight of each current pixel point to be fused and the corresponding pixel value, and accumulating each product value to obtain the fusion sum of the current target pixel points.
And calculating the ratio of the fusion sum and the weight sum of the current target pixel point to obtain the denoised pixel value of the current target pixel point.
In a preferred embodiment, the determining unit 402 selects one frame of reference image from the multiple frames of images, and determines N frames of other images, specifically for:
and calculating gradient information of the multi-frame images to evaluate the definition of the multi-frame images, selecting the image with the highest definition from the multi-frame images to be determined as a reference image, removing images with unsatisfactory definition from the multi-frame images, and determining the rest N frames of images as other images.
In a preferred embodiment, the alignment unit 403 performs alignment processing on N frames of other images according to the reference image to obtain N frames of aligned images, which is specifically used for:
the reference image and the N frames of other images are respectively downsampled into a multi-layer Gaussian pyramid image, wherein the multi-layer Gaussian pyramid image represents a plurality of image layers with different resolutions, and each image layer comprises a plurality of image blocks.
And respectively carrying out layer-by-layer image block alignment processing on other images of each frame and the reference image from the highest image layer to the lowest image layer to obtain N frames of aligned images.
In other embodiments, the invention discloses an electronic device. Referring to fig. 5, it may include: one or more processors 501; a memory 502; a display 503; one or more applications (not shown); and one or more computer programs 504, the components being connected via one or more communication buses 505. The one or more computer programs 504 are stored in the memory 502 and configured to be executed by the one or more processors 501, and comprise instructions that may be used to perform the steps of the embodiments corresponding to figs. 1 and 4.
From the foregoing description of the embodiments, it will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of functional modules is illustrated, and in practical application, the above-described functional allocation may be implemented by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to implement all or part of the functions described above. The specific working processes of the above-described systems, devices and units may refer to the corresponding processes in the foregoing method embodiments, which are not described herein.
The functional units in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the embodiments of the present invention may be essentially or a part contributing to the prior art or all or part of the technical solution may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to perform all or part of the steps of the method described in the embodiments of the present invention. And the aforementioned storage medium includes: flash memory, removable hard disk, read-only memory, random access memory, magnetic or optical disk, and the like.
While embodiments of the present invention have been described in detail hereinabove, it will be apparent to those skilled in the art that various modifications and variations can be made to these embodiments. It is to be understood that such modifications and variations are within the scope and spirit of the present invention as set forth in the following claims. Moreover, the invention described herein is capable of other embodiments and of being practiced or of being carried out in various ways. Unless otherwise defined, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this invention belongs. As used herein, the word "comprising" and the like means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof without precluding other elements or items.

Claims (10)

1. A gradient-guided multi-frame image denoising method, comprising:
acquiring multi-frame images continuously shot aiming at the same shooting scene;
selecting one frame of reference image from the multi-frame images, and determining N frames of other images, wherein N is a positive integer;
according to the reference image, carrying out alignment processing on the N frames of other images to obtain N frames of aligned images;
Calculating gradient information of the reference image by utilizing an edge detection operator, wherein the gradient information comprises longitudinal texture confidence and transverse texture confidence;
creating a gradient guide map according to the gradient information so as to divide texture areas of the reference image;
and respectively calculating the denoised pixel value of each target pixel point on the reference image to obtain a denoised image.
2. The method according to claim 1, wherein when calculating the denoised pixel value of a current target pixel, determining the type of a texture area where the current target pixel is located, determining the current target pixel, current surrounding pixels within a set range around the current target pixel, and pixels at corresponding positions of the current target pixel and the current surrounding pixels on the aligned image of N frames as current pixels to be fused according to the type of the texture area, calculating fusion weights of each current pixel to be fused according to pixel values and position information of each current pixel to be fused, and calculating the denoised pixel value of each current target pixel according to the fusion weights and the pixel values of each current pixel to be fused.
3. The method of claim 2, wherein the types of texture regions include a longitudinal texture region, a quasi-longitudinal texture region, a flat texture region, a quasi-lateral texture region, and a lateral texture region;
judging the type of the texture area where the current target pixel point is located, including:
calculating to obtain a difference value between the absolute value of the transverse texture confidence coefficient of the current target pixel point and the absolute value of the longitudinal texture confidence coefficient of the current target pixel point;
the difference value is smaller than a first threshold value, and the type of the texture area where the current target pixel point is located is judged to be the longitudinal texture area;
the difference value is larger than or equal to the first threshold value and smaller than a second threshold value, and the type of the texture area where the current target pixel point is located is judged to be the quasi-longitudinal texture area;
the difference value is larger than or equal to the second threshold value and smaller than a third threshold value, and the type of the texture area where the current target pixel point is located is judged to be the flat texture area;
the difference value is larger than or equal to the third threshold value and smaller than a fourth threshold value, and the type of the texture area where the current target pixel point is located is judged to be the quasi-transverse texture area;
And judging the type of the texture area where the current target pixel point is positioned as the transverse texture area, wherein the difference value is larger than or equal to the fourth threshold value.
4. A method according to claim 2 or 3, wherein calculating the fusion weight of each current pixel to be fused according to the pixel value and the position information of each current pixel to be fused comprises:
calculating the pixel weight of each current pixel to be fused according to the pixel value of each current pixel to be fused;
calculating the distance weight of each current pixel to be fused according to the position information of each current pixel to be fused;
and calculating the fusion weight of each current pixel point to be fused according to the pixel weight and the distance weight of each current pixel point to be fused.
5. The method of claim 4, wherein calculating the denoised pixel value of the current target pixel according to the fusion weight and the pixel value of each current pixel to be fused comprises:
accumulating the fusion weight of each current pixel point to be fused to obtain the weight sum of the current target pixel points;
Calculating the product value of the fusion weight of each current pixel point to be fused and the corresponding pixel value, and accumulating each product value to obtain the fusion sum of the current target pixel points;
and calculating the ratio of the fusion sum and the weight sum of the current target pixel point to obtain the denoised pixel value of the current target pixel point.
6. A method according to any one of claims 1-3, wherein selecting a frame of reference image from the plurality of frames of images and determining N frames of other images comprises:
and calculating gradient information of the multi-frame images to evaluate the definition of the multi-frame images, selecting an image with highest definition from the multi-frame images to be determined as a reference image, removing images with unsatisfactory definition from the multi-frame images, and determining the rest N frames of images as other images.
7. A method according to any one of claims 1-3, wherein the aligning the N frames of the other images according to the reference image to obtain N frames of aligned images comprises:
respectively downsampling the reference image and the N frames of other images into a multi-layer Gaussian pyramid image, wherein the multi-layer Gaussian pyramid image represents a plurality of image layers with different resolutions, and each image layer comprises a plurality of image blocks;
And respectively carrying out layer-by-layer image block alignment processing on the other images of each frame and the reference image from the highest image layer to the lowest image layer to obtain N frames of aligned images.
8. A gradient-guided multi-frame image denoising apparatus, comprising:
an acquisition unit configured to acquire a plurality of frame images continuously shot for the same shooting scene;
the determining unit is used for selecting one frame of reference image from the multi-frame images and determining N frames of other images, wherein N is a positive integer;
the alignment unit is used for carrying out alignment processing on the N frames of other images according to the reference image to obtain N frames of aligned images;
an edge detection unit for calculating gradient information of the reference image by using an edge detection operator, wherein the gradient information comprises longitudinal texture confidence and transverse texture confidence;
the region dividing unit is used for creating a gradient guide image according to the gradient information so as to divide texture regions of the reference image;
and the denoising unit is used for respectively calculating the denoised pixel value of each target pixel point on the reference image to obtain a denoised image.
9. An electronic device, comprising: a processor and a memory for storing a computer program; the processor is configured to execute the computer program stored in the memory, to cause the electronic device to perform the method of any one of claims 1 to 7.
10. A computer readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, implements the method of any of claims 1 to 7.
Application CN202311559753.1A — filed 2023-11-21, priority date 2023-11-21 — Gradient-guided multi-frame image denoising method, device, equipment and medium — status: Pending

Priority Applications (1)

• CN202311559753.1A (priority and filing date 2023-11-21) — Gradient-guided multi-frame image denoising method, device, equipment and medium

Publications (1)

• CN117495707A — published 2024-02-02

Family ID: 89674405
Country: CN


Legal Events

• PB01 — Publication
• SE01 — Entry into force of request for substantive examination