CN115018726A - U-Net-based image non-uniform blur kernel estimation method - Google Patents


Info

Publication number
CN115018726A
Authority
CN
China
Prior art keywords
image
kernel
blur
convolution
uniform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210640745.9A
Other languages
Chinese (zh)
Inventor
张克廷
黄建杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cntv Wuxi Co ltd
Original Assignee
Cntv Wuxi Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cntv Wuxi Co ltd filed Critical Cntv Wuxi Co ltd
Priority to CN202210640745.9A priority Critical patent/CN115018726A/en
Publication of CN115018726A publication Critical patent/CN115018726A/en
Pending legal-status Critical Current

Classifications

    • G06T5/73
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y02T10/40 Engine management systems

Abstract

The invention relates to a U-Net-based image non-uniform blur-kernel estimation method, which comprises the following steps: acquiring a number of sharp natural images; generating non-uniform blur kernels and blurring the sharp images with them to obtain corresponding blurred images, forming a training data set; building a U-Net-based network model for estimating the non-uniform blur kernel; designing a loss function for training the network model; selecting an optimization algorithm and optimizing the network model with the training data set; and inputting the blurred image to be processed into the model to obtain its non-uniform blur kernel. The advantages of the invention are that it better matches the non-uniform blur found in real images, makes good use of the overall information of the image when estimating local motion kernels, and better estimates the respective blur of different regions.

Description

U-Net-based image non-uniform blur kernel estimation method
Technical Field
The invention relates to a U-Net-based image non-uniform blur-kernel estimation method, and belongs to the technical field of image processing.
Background
Image blur is a common form of image-quality degradation encountered during image acquisition in digital image processing. It has many causes; motion blur is a typical one, arising mainly when the camera and the scene, or different objects in the scene, move relative to each other during a long exposure. From the standpoint of image formation, different degrees and types of relative motion correspond to different motion-blur kernels, and therefore produce different degrees and types of blur in the captured image.
In addition, image blur can be divided into uniform and non-uniform blur according to whether the blur is the same across different regions of the image. Uniform blur means the motion-blur kernel is the same over the whole image, which is the relatively simple case; non-uniform blur means the blur kernels differ between regions, which is more complicated and closer to what is encountered in practice.
The invention mainly considers non-uniform blur and provides a method for estimating a non-uniform blur kernel from a blurred image. Whether the blur kernel is estimated accurately plays a crucial role in recovering a sharp image from a blurred one. In practice, however, often only a single blurred image is available, so estimating the motion-blur kernel from that single image is a difficult problem.
In the prior art, there are mainly two classes of methods for estimating the blur kernel: traditional iterative-optimization methods based on statistical priors of natural images, and deep-learning methods that learn from large amounts of data. In recent years, deep-neural-network models have been widely applied to problems across computer vision with great success. For image deblurring, various deep-network models have been developed that achieve good results in estimating the kernel from a blurred image or in recovering a sharp image directly from it.
However, among existing deep-learning methods for estimating a non-uniform image blur kernel, some train on image patches to learn the blur kernel of part of the image, while others learn from the whole image but do not consider the correlation between the blur kernels of adjacent image patches, which limits the accuracy of the kernel estimate.
Disclosure of Invention
The invention provides a U-Net-based method for estimating a non-uniform image blur kernel, which overcomes the defects of the prior art and improves the accuracy of blur-kernel estimation.
The technical solution of the invention is as follows: a U-Net-based method for estimating a non-uniform image blur kernel comprises the following steps:
Step S1: acquiring a number of sharp natural images: the data set used to train the model is constructed from these sharp natural images.
Step S2: generating non-uniform blur kernels, and blurring the sharp images with these kernels to obtain corresponding blurred images, forming a training data set: first, a number of different non-uniform blur kernels are generated by manual sampling, and then corresponding blurred images are generated from the sharp natural images by convolution, forming the data set required to train the model;
Step S3: building a U-Net-based network model for estimating the non-uniform blur kernel: the non-uniform blur kernel of a blurred image is estimated with a U-Net-type network model comprising an encoder module, a decoder module and skip connections; the input of the model is a 3-channel non-uniformly blurred color image, and its output is the non-uniform blur kernel causing the blur, a 1-channel gray-scale image;
Step S4: designing the loss function for training the network model: the loss function optimized by the model is

$$\mathcal{L}(k) = \sum_i \left\| u_i * k_i - v_i \right\|^2$$

a cost function expressing that, for a given blurred image, re-blurring the corresponding sharp image with the estimated kernel should reproduce the original blurred image as closely as possible; here i is a pixel position in the sharp image u, k_i is the local blur kernel at pixel i, u_i is the local region of the sharp image centered on pixel i and of the same size as the kernel k_i, v is the blurred image corresponding to the sharp image u, and * is the convolution operator;
Step S5: selecting an optimization algorithm and optimizing the network model with the training data set: the cost function is optimized by gradient descent, specifically the Adam algorithm, which dynamically adjusts the per-parameter learning rate using first- and second-moment estimates of the gradient, making parameter updates more stable; the learning rate is 0.0001 and is halved after every 200 epochs over the training set, for a total of 800 epochs;
Step S6: inputting the blurred image to be processed into the model to obtain its non-uniform blur kernel.
Preferably, the encoder module in step S3 includes convolutional layers and down-sampling layers, and encodes the input by gradually reducing the feature-map size while increasing the number of channels so as to extract features from the input image; its basic units and network-structure parameters are set as follows:
(1) two convolutional layers, kernel size 3×3, stride 1, 16 channels;
(2) a down-sampling layer: 2×2 max pooling with stride 2;
(3) two convolutional layers, kernel size 3×3, stride 1, 32 channels;
(4) a down-sampling layer: 2×2 max pooling with stride 2;
(5) two convolutional layers, kernel size 3×3, stride 1, 64 channels;
(6) a down-sampling layer: 2×2 max pooling with stride 2;
(7) two convolutional layers, kernel size 3×3, stride 1, 128 channels;
(8) a down-sampling layer: 2×2 max pooling with stride 2;
(9) two convolutional layers, kernel size 3×3, stride 1, 256 channels;
the decoder module includes up-sampling layers and convolutional layers, and restores the original resolution step by step by gradually increasing the feature-map size while reducing the number of channels, finally yielding the non-uniform blur-kernel image corresponding to the input image; its basic units and network-structure parameters are set as follows:
(1) an up-sampling layer: bilinear interpolation with scale factor 2×2;
(2) two convolutional layers, kernel size 3×3, stride 1, 128 channels;
(3) an up-sampling layer: bilinear interpolation with scale factor 2×2;
(4) two convolutional layers, kernel size 3×3, stride 1, 64 channels;
(5) an up-sampling layer: bilinear interpolation with scale factor 2×2;
(6) two convolutional layers, kernel size 3×3, stride 1, 32 channels;
(7) an up-sampling layer: bilinear interpolation with scale factor 2×2;
(8) one convolutional layer, kernel size 3×3, stride 1, 16 channels;
(9) one convolutional layer, kernel size 1×1, stride 1, 1 channel;
the purpose of the skip connections is feature fusion: during up-sampling in the decoder module, the feature map obtained at the same level during down-sampling in the encoder is fused with the current feature map, i.e. their channels are concatenated, before the next operation; four skip connections are used in this network:
(1) the 16-channel feature map produced by unit 1 of the encoder is connected to the 32-channel input feature map of decoder unit 8;
(2) the 32-channel feature map produced by unit 3 of the encoder is connected to the 64-channel input feature map of decoder unit 6;
(3) the 64-channel feature map produced by unit 5 of the encoder is connected to the 128-channel input feature map of decoder unit 4;
(4) the 128-channel feature map produced by unit 7 of the encoder is connected to the 256-channel input feature map of decoder unit 2;
throughout the network model, the nonlinear activation function after every convolutional layer except the last 1×1 convolution is the rectified linear unit (ReLU); so that the numerical range of the model output, i.e. the estimated blur kernel, lies within [0,1], the activation after the last 1×1 convolution is the Sigmoid function.
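As a quick consistency check on the unit list above, the feature-map shapes can be traced through the encoder and decoder by hand. The sketch below (plain Python, function name chosen for illustration) assumes the 3×3 convolutions use padding 1 so they preserve spatial size (the text does not state the padding explicitly) and that the input height and width are divisible by 16:

```python
def unet_output_shape(channels, height, width):
    """Trace (channels, height, width) through the encoder/decoder layout
    described above: channel counts 16-32-64-128-256 down, then back to 1."""
    c, h, w = channels, height, width
    # Encoder: pairs of 3x3 convs (assumed 'same' padding), each pair
    # followed by a 2x2 max-pool with stride 2.
    for ch in (16, 32, 64, 128):
        c = ch                  # two 3x3 convs, stride 1
        h, w = h // 2, w // 2   # 2x2 max-pool, stride 2
    c = 256                     # bottom unit: two 3x3 convs, 256 channels
    # Decoder: 2x2 bilinear upsampling, then convs that reduce channels.
    # (Skip connections concatenate encoder features before each conv pair,
    # temporarily doubling the channel count, without changing the output.)
    for ch in (128, 64, 32):
        h, w = h * 2, w * 2     # bilinear upsampling, scale 2x2
        c = ch                  # two 3x3 convs
    h, w = h * 2, w * 2         # final upsampling (decoder unit 7)
    c = 16                      # unit 8: one 3x3 conv, 16 channels
    c = 1                       # unit 9: 1x1 conv + Sigmoid, 1 channel
    return c, h, w
```

For a 3-channel 256×256 input this traces to (1, 256, 256): a 1-channel kernel map of the same spatial size as the input image, matching the model description.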
Preferably, in step S2, generating the non-uniform blur kernels, blurring the sharp images with these kernels to obtain corresponding blurred images, and forming the training data set specifically comprises:
Step S21: constructing a generative model of the motion-blur kernel: the local blur kernel k of the image is modeled as a local motion vector, i.e. k = (k_l, k_θ), where k_l is the length of the motion vector, representing its magnitude and ranging from 0 to 20 pixels, and k_θ is the angle of the motion vector, representing its direction, ranging from 0 to 180 degrees and sampled at 15-degree intervals; to represent the blur kernel in a two-dimensional image region, it is written in Cartesian coordinates:

$$k_x = k_l \cos(k_\theta), \qquad k_y = k_l \sin(k_\theta)$$

where k_x and k_y are the x-axis and y-axis coordinates of the vector k;
Step S22: sampling the parameters of the blur-kernel generative model to generate different blur kernels for adjacent regions, finally obtaining the overall non-uniform blur kernel; the generative model of step S21 produces only a single local kernel, and since the blur varies over the image, kernels for different regions must be generated when constructing the training data set; these kernels cannot be completely random, so when blur-kernel samples for the whole image region are generated, their parameters must be constrained to reflect realistic conditions.
For two kernels of adjacent regions, k^m = (k_l^m, k_θ^m) and k^n = (k_l^n, k_θ^n), the parameter differences Δk_l = k_l^m − k_l^n and Δk_θ = k_θ^m − k_θ^n are obtained by sampling the Laplace distribution f(x | μ, b) with parameters μ = 0 and b = 1:

$$f(x \mid \mu, b) = \frac{1}{2b}\exp\!\left(-\frac{|x-\mu|}{b}\right)$$

this distribution is sparse: most sampled values lie near 0 and only a few are large; in this way, the blur-kernel parameters of most adjacent regions are close to each other, and only a small fraction of adjacent regions show a significant change in the blur kernel;
Step S23: for a given sharp image, generating the corresponding blurred image with the non-uniform blur kernel; the sharp image is blurred with the non-uniform blur kernel k generated in step S22; given a sharp image u, the pixel of the corresponding blurred image at position i is

$$v_i = u_i * k_i + n_i$$

where k_i is the local blur kernel at pixel i obtained in step S22, u_i is the local region of the sharp image centered on pixel i and of the same size as the kernel k_i, and n_i is additive Gaussian noise with mean 0 and intensity 0.01; within each small local region, the blur kernel k_i at every pixel position is the same;
Step S24: forming the training data set from the non-uniform blur kernels and the corresponding blurred images; the blur kernel k generated in step S22, the corresponding blurred image v and the original sharp image u form one piece of training data, and the same operations are applied to all sharp images collected in step S1 to obtain the whole training data set.
Preferably, in steps S3, S4 and S5, PyTorch is used to build the network model, design its loss function and select the optimization algorithm to train the model.
The advantages of the invention are: 1) when the motion vectors of adjacent image regions are modeled, the differences between their parameters follow a sparse distribution, so that the motion of local image regions is consistent while the motion of different regions can still change substantially, which better matches the non-uniform blur of real images;
2) the whole blurred image, rather than a single image patch, is used to estimate the blur kernel and learn the distribution of motion vectors over the entire image, so the overall information of the image is well exploited when estimating local blur kernels;
3) U-Net is an effective, classic network model in the field of image segmentation; using it to estimate the motion-blur kernels of different regions exploits its strength in region segmentation, so the respective blur of different regions is estimated more accurately.
Drawings
FIG. 1 is a schematic flow chart of the estimation method of the image non-uniform blur kernel based on U-Net in the invention.
FIG. 2 is a schematic diagram of one embodiment of a non-uniformly blurred image.
FIG. 3 is a schematic structural diagram of a network model for estimating the non-uniform blur kernel based on U-Net in step S3 in FIG. 1.
Fig. 4 is a schematic flowchart of the training data set construction in step S2 in fig. 1.
FIG. 5 is a schematic diagram of one embodiment of a non-uniform blur kernel.
FIG. 6 is a diagram of one embodiment of a blurred image and corresponding region blur kernel.
Detailed Description
The present invention will be described in further detail with reference to examples and specific embodiments.
Aiming at the problems of the prior art, the invention provides a U-Net-based method for estimating a non-uniform image blur kernel. To make better use of the overall information of the blurred image, a U-Net deep neural network structure is adopted and the whole blurred image is used as input to learn the blur kernel. U-Net is a classic model in the field of image segmentation that makes good use of the overall prior information of an image, so the distribution of the motion-blur kernel over the whole image can be learned. Moreover, estimating the motion-blur kernels of different regions with U-Net exploits its ability to segment different regions, so the respective blur kernels of the different regions can be estimated more accurately.
In addition, to better learn the correlation between the blur kernels of adjacent regions in the image, the training data are constructed so that the differences between the motion-vector parameters of adjacent image-patch regions follow a sparse distribution; the motion of local image regions is thus consistent while the motion of different regions may change substantially, which better models the non-uniform blur kernels of real images and improves the accuracy of kernel estimation.
As shown in fig. 1, a U-Net-based method for estimating a non-uniform image blur kernel includes the following steps:
Step S1: acquiring some sharp natural images: the data set used to train the model is constructed mainly from these sharp natural images.
Step S2: generating non-uniform blur kernels, and blurring the sharp images with these kernels to obtain corresponding blurred images, forming a training data set: image blur has many causes, such as defocus blur and motion blur, and the blurring process can be expressed mathematically as the convolution of a sharp image with a blur kernel. If the blur kernel differs between regions of the image, it is called a non-uniform blur kernel, which is also the common case in reality. First, different non-uniform blur kernels are generated by manual sampling, and then the sharp natural images are converted into corresponding blurred images by convolution, forming the data set required to train the model.
In a specific embodiment, a non-uniformly blurred image is shown in fig. 2. In this image the two side portions are blurred severely while the middle portion is relatively sharp, so the kernels causing the blur differ between regions. It is an object of the invention to estimate, from the blurred image, the non-uniform blur kernel that caused its blur.
Step S3: building a U-Net-based network model for estimating the non-uniform blur kernel: U-Net is a classic fully convolutional deep neural network with a U-shaped encoder-decoder structure, widely used in tasks such as semantic segmentation and image restoration.
The non-uniform blur kernel of a blurred image is estimated with a U-Net-type network model whose structure is shown in fig. 3.
The network model mainly comprises an encoder module (left half), a decoder module (right half), and skip connections between them. Its input is a 3-channel non-uniformly blurred color image, and its output is the non-uniform blur kernel causing the blur, a 1-channel gray-scale image.
The encoder module consists of basic units such as convolutional layers and down-sampling layers, and extracts features from the input image by gradually reducing the feature-map size while increasing the number of channels, thereby encoding the input. Its basic units and network-structure parameters are set as follows:
(1) two convolutional layers, kernel size 3×3, stride 1, 16 channels;
(2) a down-sampling layer: 2×2 max pooling with stride 2;
(3) two convolutional layers, kernel size 3×3, stride 1, 32 channels;
(4) a down-sampling layer: 2×2 max pooling with stride 2;
(5) two convolutional layers, kernel size 3×3, stride 1, 64 channels;
(6) a down-sampling layer: 2×2 max pooling with stride 2;
(7) two convolutional layers, kernel size 3×3, stride 1, 128 channels;
(8) a down-sampling layer: 2×2 max pooling with stride 2;
(9) two convolutional layers, kernel size 3×3, stride 1, 256 channels;
The decoder module consists of basic units such as up-sampling layers and convolutional layers, and gradually restores the original resolution by increasing the feature-map size while reducing the number of channels, finally yielding the non-uniform blur-kernel image corresponding to the input image. Its basic units and network-structure parameters are set as follows:
(1) an up-sampling layer: bilinear interpolation with scale factor 2×2;
(2) two convolutional layers, kernel size 3×3, stride 1, 128 channels;
(3) an up-sampling layer: bilinear interpolation with scale factor 2×2;
(4) two convolutional layers, kernel size 3×3, stride 1, 64 channels;
(5) an up-sampling layer: bilinear interpolation with scale factor 2×2;
(6) two convolutional layers, kernel size 3×3, stride 1, 32 channels;
(7) an up-sampling layer: bilinear interpolation with scale factor 2×2;
(8) one convolutional layer, kernel size 3×3, stride 1, 16 channels;
(9) one convolutional layer, kernel size 1×1, stride 1, 1 channel.
The purpose of the skip connections is feature fusion: during up-sampling in the decoder module, the feature map obtained at the same level during down-sampling in the encoder is fused with the current feature map, i.e. their channels are concatenated, before the next operation. Four skip connections are used in this network:
(1) the 16-channel feature map produced by unit 1 of the encoder is connected to the 32-channel input feature map of decoder unit 8;
(2) the 32-channel feature map produced by unit 3 of the encoder is connected to the 64-channel input feature map of decoder unit 6;
(3) the 64-channel feature map produced by unit 5 of the encoder is connected to the 128-channel input feature map of decoder unit 4;
(4) the 128-channel feature map produced by unit 7 of the encoder is connected to the 256-channel input feature map of decoder unit 2;
In addition, throughout the network model the nonlinear activation function after every convolutional layer except the last 1×1 convolution is the rectified linear unit (ReLU); so that the numerical range of the model output, i.e. the estimated blur kernel, lies within [0,1], the activation after the last 1×1 convolution is the Sigmoid function.
Step S4: designing the loss function for training the network model: the loss function optimized by the model is

$$\mathcal{L}(k) = \sum_i \left\| u_i * k_i - v_i \right\|^2$$

a cost function expressing that, for a given blurred image, re-blurring the corresponding sharp image with the estimated kernel should reproduce the original blurred image as closely as possible. Here i is a pixel position in the sharp image u, k_i is the local blur kernel at pixel i, u_i is the local region of the sharp image centered on pixel i and of the same size as the kernel k_i, v is the blurred image corresponding to the sharp image u, and * is the convolution operator.
Step S5: selecting an optimization algorithm and optimizing the network model with the training data set: the cost function is optimized by gradient descent, specifically the Adam algorithm, which dynamically adjusts the per-parameter learning rate using first- and second-moment estimates of the gradient, making parameter updates more stable. The learning rate is 0.0001 and is halved after every 200 epochs over the training set, for a total of 800 epochs.
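The learning-rate schedule described in step S5 (initial rate 0.0001, halved after every 200 epochs, 800 epochs in total) is a plain step decay. A minimal sketch, with the function name chosen here for illustration:

```python
def learning_rate(epoch, base_lr=1e-4, halve_every=200, total_epochs=800):
    """Step-decay learning rate from step S5: start at base_lr and halve it
    after every `halve_every` epochs over the training set."""
    if not 0 <= epoch < total_epochs:
        raise ValueError("epoch must lie in [0, total_epochs)")
    return base_lr * 0.5 ** (epoch // halve_every)
```

With PyTorch, the same schedule can be obtained by wrapping the Adam optimizer in `torch.optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.5)`.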
Step S6: inputting the blurred image to be processed into the model to obtain its non-uniform blur kernel.
In step S2, non-uniform blur kernels are generated and the sharp images are blurred with them to obtain corresponding blurred images; the flow of constructing the training data set, shown in fig. 4, specifically includes:
Step S21: constructing a generative model of the motion-blur kernel: the local blur kernel k of the image is modeled as a local motion vector, i.e. k = (k_l, k_θ), where k_l is the length of the motion vector, representing its magnitude and ranging from 0 to 20 pixels, and k_θ is the angle of the motion vector, representing its direction, ranging from 0 to 180 degrees and sampled at 15-degree intervals. To represent the blur kernel in a two-dimensional image region, it is written in Cartesian coordinates:

$$k_x = k_l \cos(k_\theta), \qquad k_y = k_l \sin(k_\theta)$$

where k_x and k_y are the x-axis and y-axis coordinates of the vector k.
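The conversion above is ordinary polar-to-Cartesian. A small helper (name chosen for illustration), taking the angle in degrees as in the sampling scheme:

```python
import math

def kernel_to_cartesian(k_l, k_theta_deg):
    """Convert a local motion-blur kernel, given by length k_l (in pixels)
    and angle k_theta (in degrees), to its Cartesian x/y components."""
    t = math.radians(k_theta_deg)
    return k_l * math.cos(t), k_l * math.sin(t)
```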
Step S22: sampling the parameters of the blur-kernel generative model to generate different blur kernels for adjacent regions, finally obtaining the overall non-uniform blur kernel. The generative model of the previous step produces only a single local kernel, and since the blur varies over the image, kernels for different regions must be generated when constructing the training data set. These kernels cannot be completely random, because the blur of adjacent regions is generally similar while the blur of different regions may differ greatly; therefore, when the blur-kernel samples of the whole image region are generated, certain constraints must be placed on the parameters of these kernels so as to reflect the real situation as far as possible.
For two kernels of adjacent regions, k^m = (k_l^m, k_θ^m) and k^n = (k_l^n, k_θ^n), the parameter differences Δk_l = k_l^m − k_l^n and Δk_θ = k_θ^m − k_θ^n are obtained by sampling the Laplace distribution f(x | μ, b) with parameters μ = 0 and b = 1:

$$f(x \mid \mu, b) = \frac{1}{2b}\exp\!\left(-\frac{|x-\mu|}{b}\right)$$

This distribution is sparse: most sampled values lie near 0 and only a few are large. In this way, the blur-kernel parameters of most adjacent regions are relatively close, and only a small fraction of adjacent regions show a significant change in the blur kernel. In a specific embodiment, a sample non-uniform blur kernel is shown in FIG. 5. As the figure shows, the motion vectors characterizing the blur kernels of adjacent regions mostly vary slowly, while between a few regions the blur kernel changes significantly.
Step S23: for a given sharp image, the corresponding blurred image is generated using the non-uniform blur kernel. With the non-uniform blur kernel k generated in the previous step, the sharp images can be blurred. Given a sharp image u, the pixel of the corresponding blurred image at position i is

$$v_i = u_i * k_i + n_i$$

where k_i is the local blur kernel at pixel i obtained in the previous step, u_i is the local region of the sharp image centered on pixel i and of the same size as the kernel k_i, and n_i is additive Gaussian noise with mean 0 and intensity 0.01. It should be noted that within a given small local region, the blur kernel k_i at every pixel position is the same.
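The forward model of step S23 can be sketched directly in plain Python. The helper below applies a (possibly different) kernel at every pixel with zero padding at the image border; it is written as correlation, which for symmetric kernels differs from true convolution only by a kernel flip, and it interprets the stated noise intensity 0.01 as the standard deviation. Names, and the option to disable the noise with sigma=0, are illustrative:

```python
import random

def local_blur_pixel(u, k, i, j):
    """Correlate the odd-sized kernel k with the patch of image u centered
    at pixel (i, j); pixels outside the image are treated as zero."""
    s = len(k)
    r = s // 2
    h, w = len(u), len(u[0])
    total = 0.0
    for a in range(s):
        for b in range(s):
            y, x = i + a - r, j + b - r
            if 0 <= y < h and 0 <= x < w:
                total += u[y][x] * k[a][b]
    return total

def synthesize_blurred(u, kernels, sigma=0.01, rng=random):
    """Build the blurred image v with v_i = u_i * k_i + n_i, where
    kernels[i][j] is the local kernel at pixel (i, j) and the noise
    n_i is Gaussian with mean 0 and standard deviation sigma."""
    return [
        [local_blur_pixel(u, kernels[i][j], i, j)
         + (rng.gauss(0.0, sigma) if sigma > 0 else 0.0)
         for j in range(len(u[0]))]
        for i in range(len(u))
    ]
```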
Step S24: the non-uniform blur kernels and the corresponding blurred images form the training data set. The blur kernel k generated in the previous step, the corresponding blurred image v and the original sharp image u constitute one piece of training data, and the same operations are applied to all sharp images collected in step S1 to obtain the whole training data set.
In steps S3, S4 and S5, PyTorch is used to build the network model, design its loss function and select the optimization algorithm to train the model. PyTorch is an open-source deep-learning framework that is simple, flexible, fast and efficient; it can build deep neural networks of various structures, provides automatic differentiation, and can use a GPU to accelerate computation, greatly shortening model training time.
In a specific embodiment, after the network model is trained, the blurred image shown in FIG. 2 is input into the model to obtain its non-uniform blur kernel; the estimated local kernels are then mapped back onto the corresponding regions of the image to visualize the correspondence between the blurred image and the kernels, as shown in FIG. 6. The figure shows that the kernels estimated in the two side regions have a larger scale while those in the middle region have a smaller scale, which matches the visual impression of the blurred image in FIG. 2 and indicates that the proposed method estimates the non-uniform blur kernel of the image effectively. Once the blur kernel of the image is obtained, classical image deblurring algorithms can be applied to restore a sharp image.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, it is possible to make various changes and modifications without departing from the inventive concept, and these changes and modifications are all within the scope of the present invention.

Claims (4)

1. A U-Net-based method for estimating a non-uniform image blur kernel, characterized by comprising the following steps:
step S1: acquiring a plurality of sharp natural images: the sharp natural images are used to construct a data set for training a model;
step S2: generating non-uniform blur kernels, and blurring the sharp images with these kernels to obtain corresponding blurred images that form a training data set: first, a plurality of different non-uniform blur kernels are generated by manual sampling, and then corresponding blurred images are generated from the sharp natural images by convolution, forming the data set required to train the model;
step S3: building a U-Net-based network model for estimating the non-uniform blur kernel: the non-uniform blur kernel of a blurred image is estimated with a U-Net-type network model comprising an encoder module, a decoder module and skip connections; the input of the model is a 3-channel non-uniformly blurred color image, and its output, the non-uniform blur kernel causing the blur, is a 1-channel grayscale image;
step S4: designing a loss function for network model training: the loss function optimized by the model is L(k), a cost function requiring that, for a given blurred image, re-blurring the corresponding sharp image with the estimated kernel reproduces the original blurred image as closely as possible; it is computed as:
L(k) = Σ_i (u_i * k_i − v_i)²
where i is the position of a pixel in the sharp image u, k_i is the local blur kernel at pixel i, u_i is the local region of the sharp image centered on pixel i with the same size as the kernel k_i, v is the blurred image corresponding to the sharp image u, v_i is its value at pixel i, and * is the convolution operation symbol;
step S5: selecting an optimization algorithm and optimizing the network model with the training data set: the cost function is minimized by gradient descent, specifically the Adam algorithm, which dynamically adjusts the learning rate of each parameter using first- and second-moment estimates of the gradients, making parameter updates more stable; the learning rate is 0.0001 and is halved after every 200 epochs over the training set, for a total of 800 training epochs;
step S6: inputting the blurred image to be processed into the model to obtain its non-uniform blur kernel.
2. The method as claimed in claim 1, characterized in that the encoder module in step S3 comprises convolution layers and down-sampling layers; the encoder encodes the input by progressively reducing the feature-map size and increasing the number of channels, thereby extracting features from the input image; the basic units and network structure parameters of the encoder module are:
(1) two convolution layers, kernel size 3 × 3, stride 1, 16 channels;
(2) a down-sampling layer: max pooling with a 2 × 2 window, stride 2;
(3) two convolution layers, kernel size 3 × 3, stride 1, 32 channels;
(4) a down-sampling layer: max pooling with a 2 × 2 window, stride 2;
(5) two convolution layers, kernel size 3 × 3, stride 1, 64 channels;
(6) a down-sampling layer: max pooling with a 2 × 2 window, stride 2;
(7) two convolution layers, kernel size 3 × 3, stride 1, 128 channels;
(8) a down-sampling layer: max pooling with a 2 × 2 window, stride 2;
(9) two convolution layers, kernel size 3 × 3, stride 1, 256 channels;
the decoder module comprises up-sampling layers and convolution layers; it progressively restores the original resolution by increasing the feature-map size and decreasing the number of channels, finally producing the non-uniform blur-kernel image corresponding to the input image; the basic units and network structure parameters of the decoder module are:
(1) an up-sampling layer: bilinear interpolation with a 2 × 2 upscaling factor;
(2) two convolution layers, kernel size 3 × 3, stride 1, 128 channels;
(3) an up-sampling layer: bilinear interpolation with a 2 × 2 upscaling factor;
(4) two convolution layers, kernel size 3 × 3, stride 1, 64 channels;
(5) an up-sampling layer: bilinear interpolation with a 2 × 2 upscaling factor;
(6) two convolution layers, kernel size 3 × 3, stride 1, 32 channels;
(7) an up-sampling layer: bilinear interpolation with a 2 × 2 upscaling factor;
(8) one convolution layer, kernel size 3 × 3, stride 1, 16 channels;
(9) one convolution layer, kernel size 1 × 1, stride 1, 1 channel;
the purpose of the skip connections is feature fusion: during the decoder's up-sampling, the feature map obtained at the same level of the encoder's down-sampling path is fused with the current feature map, i.e., their channels are concatenated, before the next operation; the network uses 4 skip connections:
(1) the 16-channel feature map produced by unit (1) of the encoder is connected to the 32-channel input feature map of decoder unit (8);
(2) the 32-channel feature map produced by unit (3) of the encoder is connected to the 64-channel input feature map of decoder unit (6);
(3) the 64-channel feature map produced by unit (5) of the encoder is connected to the 128-channel input feature map of decoder unit (4);
(4) the 128-channel feature map produced by unit (7) of the encoder is connected to the 256-channel input feature map of decoder unit (2);
in the whole network model, the nonlinear activation function after every convolution layer except the final 1 × 1 convolution is a rectified linear unit (ReLU); so that the numerical range of the model output, i.e., the estimated blur kernel, lies within [0, 1], the activation function after the final 1 × 1 convolution is a Sigmoid function.
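The architecture enumerated in this claim can be transcribed into a compact PyTorch sketch. One assumption is made here: the claim does not state the padding of the 3 × 3 convolutions, so padding 1 is used so that feature-map sizes match across the skip connections.

```python
import torch
import torch.nn as nn

def double_conv(cin, cout):
    # Two 3x3 convolutions (stride 1), each followed by ReLU, as in claim 2.
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class KernelUNet(nn.Module):
    """U-Net of claim 2: 3-channel blurred image in, 1-channel kernel map out."""
    def __init__(self):
        super().__init__()
        self.enc1 = double_conv(3, 16)        # encoder unit (1)
        self.enc2 = double_conv(16, 32)       # encoder unit (3)
        self.enc3 = double_conv(32, 64)       # encoder unit (5)
        self.enc4 = double_conv(64, 128)      # encoder unit (7)
        self.bottleneck = double_conv(128, 256)  # encoder unit (9)
        self.pool = nn.MaxPool2d(2, stride=2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear",
                              align_corners=False)
        # Decoder convs take upsampled channels + concatenated skip channels.
        self.dec4 = double_conv(256 + 128, 128)  # decoder unit (2)
        self.dec3 = double_conv(128 + 64, 64)    # decoder unit (4)
        self.dec2 = double_conv(64 + 32, 32)     # decoder unit (6)
        self.dec1 = nn.Sequential(               # decoder unit (8): one conv
            nn.Conv2d(32 + 16, 16, 3, padding=1), nn.ReLU(inplace=True))
        self.head = nn.Conv2d(16, 1, 1)          # 1x1 conv, then Sigmoid

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))
        e4 = self.enc4(self.pool(e3))
        b = self.bottleneck(self.pool(e4))
        d4 = self.dec4(torch.cat([self.up(b), e4], dim=1))
        d3 = self.dec3(torch.cat([self.up(d4), e3], dim=1))
        d2 = self.dec2(torch.cat([self.up(d3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up(d2), e1], dim=1))
        return torch.sigmoid(self.head(d1))  # output constrained to [0, 1]
```

With four 2 × 2 poolings, input height and width should be multiples of 16 so the upsampled maps align with the skip features.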
3. The method as claimed in claim 2, characterized in that, in step S2, generating the non-uniform blur kernels, blurring the sharp images with them to obtain corresponding blurred images, and constructing the training data set specifically comprises:
step S21: constructing a generative model of the motion-blur kernel: the local blur kernel k of the image is modeled as a local motion vector, i.e., k = (k_l, k_θ), where k_l is the length of the motion vector, representing its magnitude, ranging from 0 to 20 pixels; k_θ is the angle of the motion vector, representing its direction, ranging from 0 to 180 degrees with a sampling interval of 15 degrees; to represent the blur kernel in a two-dimensional image region, it is written in Cartesian-coordinate form:
k_x = k_l cos(k_θ), k_y = k_l sin(k_θ)
where k_x and k_y are the x-axis and y-axis coordinates of the vector k, respectively;
step S22: sampling the parameters of the blur-kernel generative model to produce different blur kernels for adjacent regions, finally obtaining the overall non-uniform blur kernel: the generative model of step S21 yields only a single local kernel; because the blur varies across the image, kernels for different regions must be generated when constructing the training data set, but these kernels cannot be completely random: when generating a blur-kernel sample for the whole image region, their parameters must be constrained to reflect realistic conditions;
for two kernels of adjacent regions, k^m = (k_l^m, k_θ^m) and k^n = (k_l^n, k_θ^n), the parameter differences Δk_l = k_l^m − k_l^n and Δk_θ = k_θ^m − k_θ^n are obtained by sampling the Laplacian distribution f(x | μ, b) with parameters μ = 0 and b = 1:
f(x | μ, b) = (1 / (2b)) exp(−|x − μ| / b)
this distribution is sparse: most sampled values lie near 0 and only a few are large; in this way, the blur-kernel parameters of most adjacent regions remain close, and only the kernels of a few adjacent regions change markedly;
step S23: for a given sharp image, generating the corresponding blurred image with the non-uniform blur kernel: the sharp image is blurred using the non-uniform blur kernel k generated in step S22; given a sharp image u, the pixel at position i of the corresponding blurred image is:
v_i = u_i * k_i + n_i
where k_i is the local blur kernel at pixel i obtained in step S22, u_i is the local region of the sharp image centered on pixel i with the same size as the kernel k_i, and n_i is additive Gaussian noise with mean 0 and intensity 0.01; within a given small local region, the blur kernel k_i is the same at every pixel position;
step S24: forming the training data set from the non-uniform blur kernels and the corresponding blurred images: the blur kernel k generated in step S22, the corresponding blurred image v, and the original sharp image u constitute one piece of training data; applying the same operations to all the sharp images collected in step S1 yields the whole training data set.
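Rasterizing a sampled motion vector (k_l, k_θ) into a 2-D kernel image, as needed when forming the training pairs, might look like the sketch below. The line-drawing scheme (point sampling along the vector through the kernel center) and the kernel canvas size are assumptions for illustration; the claim only gives the Cartesian form k_x = k_l cos(k_θ), k_y = k_l sin(k_θ).

```python
import numpy as np

def motion_kernel(length, angle_deg, size=21):
    """Rasterise a motion vector (length, angle) into a normalised 2-D blur
    kernel: k_x = l*cos(theta), k_y = l*sin(theta), drawn as a line segment
    through the kernel centre by sampling points along the vector."""
    theta = np.deg2rad(angle_deg)
    kx, ky = length * np.cos(theta), length * np.sin(theta)
    k = np.zeros((size, size))
    c = size // 2
    # Oversample the segment so every crossed pixel receives some mass.
    n = max(int(np.ceil(length)) * 2, 1)
    for t in np.linspace(-0.5, 0.5, n + 1):
        x, y = int(round(c + t * kx)), int(round(c + t * ky))
        if 0 <= x < size and 0 <= y < size:
            k[y, x] += 1.0
    return k / k.sum()  # normalise so the kernel preserves brightness
```

A zero-length vector degenerates to the identity kernel (a delta at the center), which leaves the image unchanged under convolution.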
4. The method for estimating a non-uniform image blur kernel based on U-Net as claimed in any one of claims 1 to 3, characterized in that, in steps S3, S4 and S5, PyTorch is used to build the network model, design the model's loss function, and select an optimization algorithm to train the model.
CN202210640745.9A 2022-06-07 2022-06-07 U-Net-based image non-uniform blur kernel estimation method Pending CN115018726A (en)


Publications (1)

Publication Number: CN115018726A (en); Publication Date: 2022-09-06


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116228607A (en) * 2023-05-09 2023-06-06 荣耀终端有限公司 Image processing method and electronic device
CN116228607B (en) * 2023-05-09 2023-09-29 荣耀终端有限公司 Image processing method and electronic device


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination