CN114240764B - De-blurring convolutional neural network training method, device, equipment and storage medium - Google Patents


Info

Publication number
CN114240764B
Authority
CN
China
Prior art keywords
fuzzy
image
neural network
convolutional neural
training
Prior art date
Legal status (assumed; not a legal conclusion)
Active
Application number
CN202111342163.4A
Other languages
Chinese (zh)
Other versions
CN114240764A
Inventor
丁贵广
王泽润
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University
Priority: CN202111342163.4A
Publication of CN114240764A
Application granted
Publication of CN114240764B
Legal status: Active


Classifications

    • G06T 5/73 — Image enhancement or restoration: Deblurring; Sharpening
    • G06N 3/045 — Neural network architectures: Combinations of networks
    • G06N 3/048 — Neural network architectures: Activation functions
    • G06N 3/08 — Neural networks: Learning methods
    • G06T 2207/20081 — Indexing scheme for image analysis or enhancement: Training; Learning
    • G06T 2207/20084 — Indexing scheme for image analysis or enhancement: Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application provides a deblurring convolutional neural network training method, device, equipment and storage medium. The method comprises the following steps: acquiring a blurred image training set, wherein the blurred image training set comprises a local fuzzy training set and a global fuzzy training set; constructing an initial deblurring convolutional neural network comprising a fuzzy region sensing network and a deblurring network, wherein the deblurring network comprises a fuzzy region perception attention module and a deblurring module; training the fuzzy region sensing network and the fuzzy region perception attention module on the local fuzzy training set respectively, and inputting the blurred image training set to the deblurring module for training, to obtain an intermediate deblurring convolutional neural network; and alternately inputting the local fuzzy training set and the global fuzzy training set into the intermediate deblurring convolutional neural network for joint training, to obtain the final deblurring convolutional neural network. The method enables the deblurring convolutional neural network to better match practical application scenarios and improves the deblurring effect.

Description

De-blurring convolutional neural network training method, device, equipment and storage medium
Technical Field
The application relates to the technical field of artificial intelligence, in particular to deep learning and image processing, and specifically to a deblurring convolutional neural network training method, device, equipment and storage medium.
Background
Image blur typically arises from relative motion between the camera and the scene during exposure, and can be broadly divided into two categories. One is global blur, caused mainly by camera shake during the exposure time, which creates relative motion with the whole scene and blurs the entire captured image. The other is local blur, caused mainly by moving people or objects in the scene, which creates relative motion with the camera and blurs only parts of the captured image. As the optical anti-shake functions of photographic equipment mature, deblurring locally blurred images is becoming the more common blur scenario to address.
With the rapid development of deep learning, convolutional neural networks have been widely applied to image deblurring tasks because of their capacity to model complex blur kernels, greatly improving deblurring quality. However, existing convolutional network designs and data set acquisition efforts focus mainly on global blur; they do not generalize to actual usage scenarios, and may fail to remove the blur or may damage the sharp parts of the image. At the same time, for lack of a corresponding data set, current work cannot even be evaluated on the common phenomenon of local blur.
Disclosure of Invention
Based on the above problems, and in order to improve the deblurring effect, the application provides a deblurring convolutional neural network training method, device, equipment and storage medium.
According to a first aspect of the present application, there is provided a deblurring convolutional neural network training method, comprising:
Acquiring a fuzzy image training set, wherein the fuzzy image training set comprises a local fuzzy training set and a global fuzzy training set; the fuzzy images in the local fuzzy training set are obtained by calculating background images and foreground image blocks which move in a preset mode;
constructing an initial deblurring convolutional neural network comprising a fuzzy region sensing network and a deblurring network, wherein the deblurring network comprises a fuzzy region perception attention module and a deblurring module;
Inputting the local fuzzy training set into the fuzzy region sensing network, outputting a fuzzy region prediction result, and training the fuzzy region sensing network according to the fuzzy region prediction result and a first loss function;
inputting the blurred image training set into the deblurring module, outputting a deblurring result, and training the deblurring module according to the deblurring result and a second loss function;
Inputting the local fuzzy training set into the fuzzy region perception attention module, outputting an attention result, and training the fuzzy region perception attention module according to the attention result and a third loss function;
Constructing an intermediate defuzzified convolutional neural network according to the trained fuzzy region sensing network, the defuzzified module and the fuzzy region sensing attention module;
And alternately inputting the local fuzzy training set and the global fuzzy training set into the intermediate defuzzified convolutional neural network, outputting a prediction result, and performing joint training according to the prediction result and a fourth loss function, to obtain a final defuzzified convolutional neural network.
Wherein the local fuzzy training set comprises a plurality of fuzzy images, and each fuzzy image is obtained by the following modes:
acquiring a clear image from the deblurred data set as a background image;
acquiring at least one image from the semantic segmentation data set, and extracting all foreground image blocks in each image;
Acquiring a motion sequence with the length of n, wherein n is an odd number greater than or equal to 3;
moving all foreground image blocks in the at least one image on the background image at the same time according to the motion sequence to obtain a motion image of continuous frames;
and averaging the moving images of the continuous frames to obtain the blurred image.
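The five steps above can be sketched as follows. This is an illustrative NumPy reconstruction under simplifying assumptions (a single grayscale background, one foreground image block with a boolean mask, and motions given as top-left positions); the function and variable names are our own, not the patent's:

```python
import numpy as np

def synthesize_local_blur(background, patch, mask, motions):
    """Synthesize a locally blurred image: paste the foreground patch onto the
    background at each (row, col) position in the motion sequence, then average
    the resulting frames, as in the steps above (single-channel toy case)."""
    assert len(motions) >= 3 and len(motions) % 2 == 1  # motion sequence length n: odd, >= 3
    h, w = patch.shape
    frames = []
    for (dy, dx) in motions:
        frame = background.astype(np.float64).copy()
        region = frame[dy:dy + h, dx:dx + w]
        region[mask] = patch[mask]          # composite the foreground block over the background
        frames.append(frame)
    blurred = np.mean(frames, axis=0)       # temporal average of consecutive frames -> local blur
    blur_mask = np.zeros(background.shape, dtype=bool)
    for (dy, dx) in motions:                # union of patch positions = blur-region label
        blur_mask[dy:dy + h, dx:dx + w] |= mask
    return blurred, blur_mask
```

The returned blur-region mask can serve as the generated blur-region labels used by the losses described later (an assumption on our part, since the patent only states that such labels are generated).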
In an embodiment of the present application, the acquiring moving images of consecutive frames includes:
According to the motion sequence, all foreground image blocks in the at least one image are moved on the background image once to obtain a combined image; wherein the combined image comprises all foreground image blocks and the background image in the at least one image;
And after the motion sequence is completed, taking a plurality of combined images as the motion images of the continuous frames.
In some embodiments of the present application, the number of the fuzzy region perception attention modules in the initial deblurring network is i, wherein i is an integer greater than or equal to 1; the third loss function is calculated as:
L_BSA = Σ_i CrossEntropy(M_SA^i, M_seg)
where L_BSA is the third loss function, M_SA^i is the attention output result of the i-th fuzzy region perception attention module, and M_seg is the fuzzy region labeling information generated for the fuzzy images in the local fuzzy training set.
In some embodiments of the application, when the local fuzzy training set is input to an intermediate defuzzified convolutional neural network, the intermediate defuzzified convolutional neural network comprises the trained fuzzy region-aware network and the trained defuzzified network; when the global fuzzy training set is input to the intermediate defuzzification convolutional neural network, the intermediate defuzzification convolutional neural network only comprises the trained defuzzification module.
And performing joint training according to the prediction result and a fourth loss function to obtain a final deblurring convolutional neural network, wherein the method comprises the following steps of:
acquiring a first prediction result when the local fuzzy training set is input to the intermediate defuzzified convolutional neural network;
Acquiring the fourth loss function; wherein the fourth loss function is a weighted sum of the first, second, and third loss functions;
And performing joint training according to the first prediction result and the fourth loss function to obtain a final deblurring convolutional neural network.
In addition, the performing the joint training according to the prediction result and the fourth loss function to obtain a final deblurring convolutional neural network includes:
acquiring a second prediction result when the global fuzzy training set is input to the intermediate defuzzified convolutional neural network;
acquiring the fourth loss function; wherein the fourth loss function is the second loss function;
And performing joint training according to the second prediction result and the fourth loss function to obtain a final deblurring convolutional neural network.
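The alternating joint-training scheme of the steps above can be sketched as a loop. The StubModel class and its method names below are hypothetical stand-ins, not the patent's API: even steps feed a local-blur batch through the full network (weighted sum of the first three losses), odd steps feed a global-blur batch through the deblurring module alone (second loss only):

```python
class StubModel:
    """Minimal stand-in for the intermediate network; records which loss path ran."""
    def __init__(self):
        self.calls = []
    def loss_full(self, batch):      # local batch: full network, weighted sum of losses 1-3
        self.calls.append("full")
        return 0.0
    def loss_deblur(self, batch):    # global batch: deblurring module only, second loss
        self.calls.append("deblur")
        return 0.0
    def update(self, loss):
        pass                         # parameter update would go here

def joint_train(model, local_batches, global_batches, steps):
    """Alternate between the two training sets: even steps take a local batch,
    odd steps take a global batch."""
    for step in range(steps):
        if step % 2 == 0:
            model.update(model.loss_full(local_batches[(step // 2) % len(local_batches)]))
        else:
            model.update(model.loss_deblur(global_batches[(step // 2) % len(global_batches)]))
```

A strict one-to-one alternation is our assumption; the patent only states that the two training sets are input alternately.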
According to a second aspect of the present application, there is provided a deblurring convolutional neural network training device, comprising:
The acquisition module is used for acquiring a fuzzy image training set, wherein the fuzzy image training set comprises a local fuzzy training set and a global fuzzy training set; the fuzzy images in the local fuzzy training set are obtained by calculating background images and foreground image blocks which move in a preset mode;
the first construction module is used for constructing an initial deblurring convolutional neural network comprising a fuzzy region sensing network and a deblurring network, wherein the deblurring network comprises a fuzzy region perception attention module and a deblurring module;
the first training module is used for inputting the local fuzzy training set into the fuzzy region sensing network, outputting a fuzzy region prediction result, and training the fuzzy region sensing network according to the fuzzy region prediction result and a first loss function;
the second training module is used for inputting the blurred image training set into the deblurring module, outputting a deblurring result and training the deblurring module according to the deblurring result and a second loss function;
The third training module is used for inputting the local fuzzy training set into the fuzzy region perception attention module, outputting an attention result, and training the fuzzy region perception attention module according to the attention result and a third loss function;
The second construction module is used for constructing an intermediate defuzzified convolutional neural network according to the trained fuzzy region sensing network, the defuzzified module and the fuzzy region sensing attention module;
And the fourth training module is used for alternately inputting the local fuzzy training set and the global fuzzy training set into the intermediate defuzzified convolutional neural network, outputting a prediction result, and performing joint training according to the prediction result and a fourth loss function, to obtain the final defuzzified convolutional neural network.
In an embodiment of the present application, the local fuzzy training set includes a plurality of fuzzy images, and the acquiring module is configured to acquire each of the fuzzy images, including:
acquiring a clear image from the deblurred data set as a background image;
acquiring at least one image from the semantic segmentation data set, and extracting all foreground image blocks in each image;
Acquiring a motion sequence with the length of n, wherein n is an odd number greater than or equal to 3;
moving all foreground image blocks in the at least one image on the background image at the same time according to the motion sequence to obtain a motion image of continuous frames;
and averaging the moving images of the continuous frames to obtain the blurred image.
In some embodiments of the application, the acquisition module is further to:
According to the motion sequence, all foreground image blocks in the at least one image are moved on the background image once to obtain a combined image; wherein the combined image comprises all foreground image blocks and the background image in the at least one image;
And after the motion sequence is completed, taking a plurality of combined images as the motion images of the continuous frames.
In some embodiments of the present application, the number of the fuzzy region perception attention modules in the initial deblurring network is i, wherein i is an integer greater than or equal to 1; the third loss function is calculated as:
L_BSA = Σ_i CrossEntropy(M_SA^i, M_seg)
where L_BSA is the third loss function, M_SA^i is the attention output result of the i-th fuzzy region perception attention module, and M_seg is the fuzzy region labeling information generated for the fuzzy images in the local fuzzy training set.
In some embodiments of the application, the fourth training module is configured to:
When the local fuzzy training set is input to an intermediate defuzzification convolutional neural network, the intermediate defuzzification convolutional neural network comprises the trained fuzzy region sensing network and the trained defuzzification network;
When the global fuzzy training set is input to the intermediate defuzzification convolutional neural network, the intermediate defuzzification convolutional neural network only comprises the trained defuzzification module.
In some embodiments of the application, the fourth training module is configured to:
acquiring a first prediction result when the local fuzzy training set is input to the intermediate defuzzified convolutional neural network;
Acquiring the fourth loss function; wherein the fourth loss function is a weighted sum of the first, second, and third loss functions;
And performing joint training according to the first prediction result and the fourth loss function to obtain a final deblurring convolutional neural network.
In some embodiments of the application, the fourth training module is further configured to:
acquiring a second prediction result when the global fuzzy training set is input to the intermediate defuzzified convolutional neural network;
acquiring the fourth loss function; wherein the fourth loss function is the second loss function;
And performing joint training according to the second prediction result and the fourth loss function to obtain a final deblurring convolutional neural network.
According to a third aspect of the present application, there is provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to the embodiment of the first aspect of the present application when executing the computer program.
According to a fourth aspect of the present application there is provided a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the method of the first aspect of the present application.
According to the technical scheme provided by the embodiments of the application, a locally blurred image is computed from a background image and foreground image blocks moving in a preset mode, yielding a training set of locally blurred images; this provides a new way of obtaining locally blurred images and addresses the current lack of locally blurred image data sets. In addition, a fuzzy region sensing network and a fuzzy region perception attention module are introduced into the deblurring convolutional neural network, and a parts-first, joint-later training method is applied using the global and local fuzzy training sets. This improves model training efficiency, enables the deblurring convolutional neural network to deblur both locally and globally blurred images, greatly improves image quality, and broadens the scenarios in which the deblurring convolutional neural network is applicable.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flowchart of a method for training a defuzzified convolutional neural network according to an embodiment of the present application;
FIG. 2 is a flowchart of acquiring a blurred image according to an embodiment of the present application;
FIG. 3 is a schematic diagram of acquiring a blurred image according to an embodiment of the present application;
FIG. 4 is a flowchart of the joint training of a deblurring convolutional neural network according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a deblurring convolutional neural network in an embodiment of the present application;
FIG. 6 is a comparative graph of deblurring effects of the present application on an actual acquired blurred image;
FIG. 7 is a block diagram of a deblurring convolutional neural network training device according to an embodiment of the present application;
FIG. 8 is a block diagram of a computer device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present application and should not be construed as limiting the application.
It should be noted that local image blur, that is, blur caused mainly by fast-moving objects, is a common blur scenario that needs to be solved; as the optical anti-shake functions of photographic equipment mature, deblurring locally blurred images will be required in an ever wider range of practical application scenarios.
At present, image deblurring with deep learning requires pairs of sharp and blurred images as training data: a designed convolutional neural network is trained with the mean squared error between the network output and the corresponding sharp image as the loss function, and the network parameters are adjusted continually until training is complete. However, existing convolutional neural networks focus mainly on deblurring globally blurred images, where the blur is caused chiefly by rapid camera motion, and a corresponding data set for locally blurred scenes is lacking; as a result, in actual application scenarios they may fail to remove the blur or may damage the sharp parts of the image.
Based on the above problems, the application provides a method, a device, equipment and a storage medium for training a defuzzified convolutional neural network.
Fig. 1 is a flowchart of a method for training a defuzzified convolutional neural network according to an embodiment of the present application. It should be noted that, the method for training a defuzzified convolutional neural network according to the embodiment of the present application may be applied to the device for training a defuzzified convolutional neural network according to the embodiment of the present application, and the device may be configured in a computer device. As shown in fig. 1, the method comprises the steps of:
Step 101, obtaining a blurred image training set, wherein the blurred image training set comprises a local blurred training set and a global blurred training set.
In the embodiment of the application, each training sample in the blurred image training set is an image pair; that is, blurred images and sharp images exist in pairs, in one-to-one correspondence. Furthermore, to adapt the deblurring convolutional neural network to actual scenes, the blurred image training set includes both a local fuzzy training set and a global fuzzy training set. The global fuzzy training set comprises a large number of globally blurred images and sharp images; in the embodiment of the application, an existing deblurring data set can be used for it. The local fuzzy training set comprises a number of locally blurred images and sharp images, wherein each locally blurred image is computed from a background image and foreground image blocks moving in a preset mode.
Step 102, constructing an initial deblurring convolutional neural network, which comprises a fuzzy region sensing network and a deblurring network; wherein the deblurring network includes a fuzzy region perception attention module and a deblurring module.
It can be appreciated that, in order to fit practical application scenarios, a fuzzy region sensing network and a fuzzy region perception attention module are introduced into the initial deblurring convolutional neural network. The fuzzy region sensing network identifies the blurred regions in the image, and its output is fed through a convolution into the deblurring network to obtain a sharp picture. The fuzzy region perception attention module further emphasizes the blurred region on top of the deblurring performed by the deblurring module. In addition, the backbone of the initial deblurring convolutional neural network may be a conventional U-Net structure comprising a plurality of convolutional layers and activation layers.
It should be noted that the number of fuzzy region perception attention modules in the deblurring network is i, where i is an integer greater than or equal to 1; the specific number can be determined according to the practical situation. Each fuzzy region perception attention module adopts a spatial attention mechanism:
M_SA = Sigmoid(Conv_7×7(F))   (1)
F_Att = F ⊙ M_SA   (2)
where M_SA is the output of the activation layer in each fuzzy region perception attention module, F is the input feature tensor, Conv_7×7(·) is a convolution operation with kernel size 7×7, Sigmoid(·) is the normalizing activation function, and F_Att is the output of each fuzzy region perception attention module.
Specifically, the fuzzy region perception attention module works as follows: features are extracted from the input feature tensor by a convolution operation and normalized by the activation function, which amounts to obtaining a spatial weight for each position of the input feature; the result is then multiplied elementwise with the input feature, applying attention to it. During learning, the network can thus tell from the weights which positions are blurred regions, pay more attention to deblurring those regions, learn the main content, and avoid redundant expression, improving the deblurring of locally blurred images.
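A toy single-channel version of the spatial attention in formulas (1) and (2), written in NumPy under the assumption of one input channel (the patent's module operates on multi-channel feature tensors; the names here are our own):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def blur_aware_attention(feature, kernel):
    """Spatial attention per formulas (1)-(2): convolve the input feature map
    with a 7x7 kernel, squash with a sigmoid into per-position weights M_SA,
    then gate the feature elementwise: F_Att = F * M_SA."""
    h, w = feature.shape
    k = kernel.shape[0]
    padded = np.pad(feature, k // 2)        # "same" padding so the weight map matches the input
    conv = np.empty_like(feature, dtype=np.float64)
    for y in range(h):
        for x in range(w):
            conv[y, x] = np.sum(padded[y:y + k, x:x + k] * kernel)
    m_sa = sigmoid(conv)                    # spatial weights in (0, 1), formula (1)
    return feature * m_sa                   # formula (2), elementwise product
```

With a zero kernel every weight is sigmoid(0) = 0.5, so the feature is uniformly halved; a trained kernel instead concentrates weight on blurred positions.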
Step 103, inputting the local fuzzy training set into the fuzzy region sensing network, outputting a fuzzy region prediction result, and training the fuzzy region sensing network according to the fuzzy region prediction result and the first loss function.
Because the fuzzy region sensing network is used to identify blurred regions in an image, the local fuzzy training set is used to train it specifically. A locally blurred image is input into the fuzzy region sensing network to obtain a fuzzy region prediction result. A loss value is then computed with the first loss function from the fuzzy region prediction result, the fuzzy region labels corresponding to the locally blurred image, and region consistency, and the parameters of the fuzzy region sensing network are adjusted continually until the prediction result meets expectations. The first loss function may be a cross entropy loss, as shown in formula (3):
L_LBP = CrossEntropy(M_blur, M_seg)   (3)
where L_LBP is the first loss function, CrossEntropy(·) is a pixel-level cross entropy loss function, M_blur is the fuzzy region prediction result, and M_seg is the fuzzy region labeling information corresponding to the locally blurred image.
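A minimal sketch of the pixel-level cross entropy in formula (3), assuming the blur-map prediction is given as per-pixel probabilities (binary case; function and variable names are our own):

```python
import numpy as np

def pixel_cross_entropy(pred, target, eps=1e-12):
    """Pixel-level binary cross entropy between a predicted blur map M_blur
    (probabilities in [0, 1]) and the blur-region labels M_seg, averaged over pixels."""
    pred = np.clip(pred, eps, 1.0 - eps)    # avoid log(0)
    return float(np.mean(-(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))))
```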
Step 104, inputting the blurred image training set into the deblurring module, outputting a deblurring result, and training the deblurring module according to the deblurring result and the second loss function.
It can be understood that the functions of each part of the networks in the initial defuzzified convolutional neural network are different, and in order to improve the training efficiency of each part of the networks, in the embodiment of the application, the corresponding training set is used for training each part of the networks independently.
The function of the deblurring module is to output sharp images from blurred images, covering both globally and locally blurred inputs, so both the global and local fuzzy training sets are used as its training data. After the blurred image training set is input into the deblurring module, a deblurring result, that is, a predicted sharp image, is obtained. The predicted sharp image is compared with the real sharp image corresponding to the blurred input, and the parameters of the deblurring module are adjusted continually according to the loss value computed with the second loss function, until the predicted sharp image meets expectations. The second loss function may be a conventional mean squared error.
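The conventional mean squared error used as the second loss can be sketched as (names are our own):

```python
import numpy as np

def deblur_mse(pred_sharp, true_sharp):
    """Second loss: mean squared error between the deblurring module's output
    and the corresponding real sharp image."""
    diff = np.asarray(pred_sharp, dtype=np.float64) - np.asarray(true_sharp, dtype=np.float64)
    return float(np.mean(diff ** 2))
```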
Step 105, inputting the local fuzzy training set into a fuzzy region perception attention module, outputting an attention result, and training the fuzzy region perception attention module according to the attention result and a third loss function.
In the embodiment of the application, the fuzzy region perception attention module focuses on the locally blurred image, so that the network concentrates its deblurring on the blurred region, learns the main content, and strengthens the deblurring effect. That is, the fuzzy region perception attention module acts mainly in locally blurred scenes, so it is trained with the local fuzzy training set. After the local fuzzy training set is input into the fuzzy region perception attention module, an attention result is obtained. A loss value is computed from the attention result with the third loss function, and the parameters of the attention module are adjusted continually, training the module.
It should be noted that, in the embodiment of the present application, the third loss function is calculated as shown in formula (4):
L_BSA = Σ_i CrossEntropy(M_SA^i, M_seg)   (4)
where M_SA^i is the attention output of the i-th fuzzy region perception attention module and M_seg is the fuzzy region labeling information generated for the fuzzy images in the local fuzzy training set.
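A sketch of the third loss, assuming (from the surrounding definitions) that it is the pixel-level cross entropy between each attention module's weight map M_SA^i and the generated labels M_seg, summed over the i attention modules; names and the summation are our reading, not a verbatim formula:

```python
import numpy as np

def third_loss(attention_maps, m_seg, eps=1e-12):
    """L_BSA: pixel-level cross entropy of each attention weight map against the
    generated blur-region labels M_seg, summed over the attention modules."""
    total = 0.0
    for m_sa in attention_maps:             # one weight map per attention module
        p = np.clip(m_sa, eps, 1.0 - eps)   # avoid log(0)
        total += float(np.mean(-(m_seg * np.log(p) + (1.0 - m_seg) * np.log(1.0 - p))))
    return total
```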
Step 106, constructing an intermediate deblurring convolutional neural network according to the trained fuzzy region sensing network, the deblurring module and the fuzzy region perception attention module.
That is, after the fuzzy region sensing network, the deblurring module and the fuzzy region perception attention module in the initial deblurring convolutional neural network have each been trained separately, the intermediate deblurring convolutional neural network is obtained.
Step 107, alternately inputting the local fuzzy training set and the global fuzzy training set into the intermediate deblurring convolutional neural network, outputting a prediction result, and performing joint training according to the prediction result and the fourth loss function, to obtain the final deblurring convolutional neural network.
It can be understood that the defuzzification convolutional neural network provided by the application combines the fuzzy region sensing network, the defuzzification module and the fuzzy region sensing attention module, and functions of all parts in the defuzzification convolutional neural network are combined with each other in actual use and jointly act on defuzzification processing of an image to obtain a clear image corresponding to a local fuzzy image and a global fuzzy image. Therefore, after the fuzzy area sensing network, the defuzzification module and the fuzzy area sensing attention module are respectively and independently trained, the constructed intermediate defuzzification convolutional neural network is required to be jointly trained so as to enable the obtained defuzzification convolutional neural network to achieve the expected defuzzification effect.
In the embodiment of the application, in order to enable the finally obtained deblurred convolutional neural network to meet the local fuzzy scene and be suitable for the global fuzzy scene, the local fuzzy training set and the global fuzzy training set are alternately input into the middle deblurred convolutional neural network in the training process. Meanwhile, according to the prediction result and the real clear image, a loss value is calculated through a fourth loss function and used for adjusting parameter values in a network structure, so that training of the deblurring convolutional neural network is achieved. The fourth loss function is calculated by the first loss function, the second loss function and the third loss function.
According to the deblurring convolutional neural network training method provided by the embodiment of the application, the local blurred image is obtained by calculating the foreground image block and the background image which move in a preset mode, so that a training set of the local blurred image is obtained, a new method is provided for obtaining the local blurred image, and the problem of lack of a local blurred image data set at present is solved. In addition, a fuzzy region sensing network and a fuzzy region sensing attention module are introduced into the defuzzification convolutional neural network, and the global fuzzy training set and the local fuzzy training set are used for carrying out a partial-first-last combined training method, so that the model training efficiency is improved, the defuzzification convolutional neural network can realize defuzzification processing of local fuzzy images and global fuzzy images, the image quality is greatly improved, and the scene applicability of the convolutional neural network for defuzzification is improved.
For the acquisition of the local blur training set in the above embodiment, the present application provides another embodiment. Fig. 2 is a flowchart of obtaining each blurred image according to an embodiment of the present application; as shown in fig. 2, the steps of obtaining each blurred image include:
Step 201, obtaining a clear image from a deblurring dataset as the background image.
In the embodiment of the present application, the deblurring dataset may be the high-quality deblurring dataset REDS, or any other existing deblurring dataset; the present application is not limited in this respect. In addition, the deblurring dataset used in this step may be the same as the one used for the global blur training set in embodiments of the present application.
Step 202, at least one image is obtained from the semantic segmentation dataset and all foreground image blocks in each image are extracted.
In the embodiment of the present application, the semantic segmentation data set used may be the high-quality semantic segmentation data set PASCAL VOC 2012, or may be other semantic segmentation data sets. One or more images are randomly extracted from the semantic segmentation data set, and all foreground image blocks are extracted for combining with the background image to simulate a local blurred scene.
In step 203, a motion sequence with a length n is obtained, where n is an odd number greater than or equal to 3.
It will be appreciated that, in simulating the partially blurred scene, the foreground image blocks act as moving objects in the scene and the background image acts as the background of the captured scene. The foreground image blocks move according to a preset motion pattern, namely a motion sequence. The motion sequence comprises a random initial position of the foreground image blocks on the background image and a motion method for each frame; the length of the motion sequence is the number of times each foreground image block moves.
In an embodiment of the present application, the per-frame motion method of a foreground image block may include the following categories: moving 1 pixel horizontally or resting; moving 1 pixel vertically or resting; zooming out by 0.1×, zooming in by 0.1×, or resting; rotating 1 degree clockwise, rotating 1 degree counterclockwise, or resting; and other motion categories. When the motion sequence is generated, the motion method for each frame is randomly selected from the above categories.
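The motion-sequence generation described above can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the function name `make_motion_sequence`, the parameter names, and the exact encoding of the per-frame moves are assumptions; only the move categories (1-pixel translation, ±0.1× scaling, ±1-degree rotation, or rest) and the random initial position come from the text.

```python
import random

# Per-frame motion choices, as described above (values are illustrative encodings).
HORIZONTAL = [-1, 0, 1]   # move 1 px left, rest, or move 1 px right
VERTICAL = [-1, 0, 1]     # move 1 px up, rest, or move 1 px down
SCALE = [0.9, 1.0, 1.1]   # zoom out 0.1x, rest, or zoom in 0.1x
ROTATE = [-1, 0, 1]       # rotate 1 degree counterclockwise, rest, or clockwise

def make_motion_sequence(n, image_w, image_h, block_w, block_h, seed=None):
    """Return a random start position and a motion sequence of n per-frame steps."""
    rng = random.Random(seed)
    # Random initial position of the foreground block on the background image.
    start = (rng.randrange(image_w - block_w), rng.randrange(image_h - block_h))
    # One randomly chosen motion method per frame.
    steps = [
        {
            "dx": rng.choice(HORIZONTAL),
            "dy": rng.choice(VERTICAL),
            "scale": rng.choice(SCALE),
            "angle": rng.choice(ROTATE),
        }
        for _ in range(n)
    ]
    return start, steps
```

With `n = 7`, this yields the seven per-frame moves used in the worked example later in the text.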
Step 204, moving all foreground image blocks in the at least one image on the background image simultaneously according to the motion sequence, and acquiring the moving images of the consecutive frames.
In the embodiment of the present application, the moving images of the consecutive frames may be acquired as follows: according to the motion sequence, all foreground image blocks in the at least one image are moved once on the background image to obtain a combined image, where the combined image comprises all foreground image blocks in the at least one image and the background image; after the motion sequence is completed, the resulting combined images are taken as the moving images of the consecutive frames.
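The per-step compositing can be sketched as below under simplifying assumptions: a single rectangular foreground block is pasted onto the background at successive positions, one combined image per step. Scaling and rotation are omitted for brevity (only the 1-pixel translations are applied), and `composite_frames` is an illustrative name, not from the patent.

```python
import numpy as np

def composite_frames(background, block, start, steps):
    """Paste `block` onto a copy of `background` once per motion step,
    returning the list of combined images (the consecutive frames)."""
    h, w = block.shape[:2]
    x, y = start
    frames = []
    for step in steps:
        # Apply this frame's translation (bounds checking omitted in this sketch).
        x += step["dx"]
        y += step["dy"]
        frame = background.copy()
        frame[y:y + h, x:x + w] = block  # foreground overwrites the background
        frames.append(frame)
    return frames
```

In the patent's setting all foreground blocks from the semantic-segmentation images would be moved simultaneously; this sketch shows the mechanism for one block.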
Step 205, the moving images of the continuous frames are averaged to obtain a blurred image.
It will be appreciated that local blur is caused by the motion of an object, so averaging the moving images of the consecutive frames corresponds to applying that motion to the image, producing the blurred image.
It should be noted that, in the embodiment of the present application, each blurred image in the local blur training set corresponds to a clear image, and this clear image is the middle frame of the moving images of the consecutive frames, i.e., the image after the ((n+1)/2)-th movement. For example, if the length of the motion sequence is 7, the consecutive-frame sequence consists of 7 combined images, and the combined image produced by the 4th movement is taken as the clear image corresponding to the blurred image. This also explains why the length of the motion sequence must be an odd number greater than or equal to 3: an odd length guarantees a unique middle frame.
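Step 205 and the middle-frame ground-truth rule can be sketched together; `blur_and_ground_truth` is an illustrative name, and the only assumptions are the ones stated in the text (mean over the n frames, sharp target at the middle frame).

```python
import numpy as np

def blur_and_ground_truth(frames):
    """Average the consecutive frames into the blurred image, and take the
    middle frame (index n // 2, e.g. the 4th of 7) as the paired clear image."""
    stack = np.stack(frames).astype(np.float64)
    blurred = stack.mean(axis=0)
    sharp = stack[len(frames) // 2]
    return blurred, sharp
```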
In addition, in the embodiment of the application, because a semantic segmentation dataset is used, a blur-region label can be generated simultaneously for each generated blurred image. To illustrate the above blurred-image acquisition process more intuitively, fig. 3 gives an example. As shown in fig. 3, (a) shows 4 of the obtained moving images of the consecutive frames (only 4 are shown for illustration), (b) is the blurred image obtained by averaging the generated moving images of the consecutive frames, and (c) is the blur-region labeling information corresponding to the blurred image.
According to the deblurring convolutional neural network training method provided by the embodiment of the application, a clear image is selected from the deblurring dataset as the background image, and foreground image blocks extracted from images in the semantic segmentation dataset are moved according to a motion sequence, thereby simulating a locally blurred scene. Moving images of consecutive frames are obtained from this simulation and averaged to obtain the locally blurred image, providing a new method for obtaining locally blurred images and addressing the current lack of locally blurred image datasets.
To describe the deblurring convolutional neural network training method in further detail, its joint training will now be described. Fig. 4 is a flowchart of the joint training of a deblurring convolutional neural network according to an embodiment of the present application; as shown in fig. 4, it comprises the following steps:
Step 401, alternately inputting the local blur training set and the global blur training set into the intermediate deblurring convolutional neural network.
In the embodiment of the application, inputting the two training sets alternately means that after the local blur training set has been input into the intermediate deblurring convolutional neural network for training, the global blur training set is input for training, then the local blur training set again, and so on. Either the local blur training set or the global blur training set may be used first; the application is not limited in this respect.
In the embodiment of the application, when the local blur training set is input to the intermediate deblurring convolutional neural network, the intermediate deblurring convolutional neural network comprises the trained blur-region-aware network and the trained deblurring network; when the global blur training set is input, the intermediate deblurring convolutional neural network comprises only the trained deblurring module.
Step 402, obtaining a first prediction result when the local blur training set is input to the intermediate deblurring convolutional neural network.
The first prediction result comprises the prediction result of the trained blur-region-aware network and the prediction result of the trained deblurring network.
Step 403, obtaining a fourth loss function; wherein the fourth loss function is a weighted sum of the first, second, and third loss functions.
Wherein the fourth loss function is shown in formula (5):
L = λ1·L_MSE + λ2·L_LBP + λ3·L_BSA (5)
where L is the fourth loss function, L_MSE is the first loss function, L_LBP is the second loss function, L_BSA is the third loss function, and λ1, λ2, and λ3 are coefficients determined experimentally.
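Formula (5) transcribes directly into code. The default coefficient values below are placeholders, since the patent only says λ1, λ2, and λ3 are determined experimentally.

```python
def fourth_loss(l_mse, l_lbp, l_bsa, lam1=1.0, lam2=1.0, lam3=1.0):
    """Formula (5): weighted sum of the first (MSE), second (LBP),
    and third (BSA) loss terms."""
    return lam1 * l_mse + lam2 * l_lbp + lam3 * l_bsa
```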
Step 404, performing joint training according to the first prediction result and the fourth loss function.
It can be understood that, according to the first prediction result and the loss value obtained through the fourth loss function, the parameter values in the network are continuously adjusted, thereby training the intermediate deblurring convolutional neural network, until the first prediction result meets expectations; the global blur training set is then input into the intermediate deblurring convolutional neural network instead. If the output still meets expectations after switching to the global blur training set, the current network is the final deblurring convolutional neural network.
Step 405, obtaining a second prediction result when the global blur training set is input to the intermediate deblurring convolutional neural network.
Step 406, obtaining the fourth loss function; here the fourth loss function is the second loss function.
Because the intermediate deblurring convolutional neural network comprises only the trained deblurring module when the global blur training set is input, the current fourth loss function is equivalent to the second loss function, i.e., the conventional mean squared error is used.
Step 407, performing joint training according to the second prediction result and the fourth loss function.
It can be appreciated that, according to the second prediction result and the mean squared error, the parameter values in the network are continuously adjusted, thereby training the intermediate deblurring convolutional neural network, until the second prediction result meets expectations; the local blur training set is then input into the intermediate deblurring convolutional neural network instead. If the output still meets expectations after switching to the local blur training set, the current network is the final deblurring convolutional neural network.
Step 408, obtaining the final deblurring convolutional neural network.
According to the deblurring convolutional neural network training method provided by the embodiment of the application, each part of the deblurring convolutional neural network is trained separately and then trained jointly, which improves both network training efficiency and the training result. In addition, the local blur training set and the global blur training set are input alternately into the intermediate deblurring convolutional neural network, so that the final deblurring convolutional neural network suits both locally blurred and globally blurred scenes; meanwhile, the network structure and loss function are adapted flexibly to the different training sets, achieving targeted training, further improving training efficiency, and improving the image deblurring effect of the convolutional neural network.
Based on the above description, to make the structure of the deblurring convolutional neural network in some embodiments of the present application more intuitive, fig. 5 gives an example. As shown in fig. 5, taking a synthesized locally blurred image as an example, the blurred image is first input into the blur-region-aware network to predict the blurred region; the feature tensor obtained by convolving the output is input into the deblurring network, and a clear image is obtained through the combined action of the deblurring module and the blur-region-aware attention module.
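The data flow of fig. 5 can be sketched schematically as below. This is only a structural sketch: `region_net` and `deblur_net` stand in for the patent's blur-region-aware network and deblurring network, whose internal layers are not specified here, and `deblur_forward` is an illustrative name.

```python
def deblur_forward(blurred_image, region_net, deblur_net):
    """Fig. 5 data flow: predict the blurred region first, then deblur
    conditioned on that prediction (deblur module + attention module)."""
    region_map = region_net(blurred_image)        # blurred-region prediction
    return deblur_net(blurred_image, region_map)  # clear-image output
```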
To demonstrate the deblurring effect of the network on locally blurred images, the application synthesizes test images from consecutive frames captured by a high-frame-rate camera held static, so that the acquired data is closer to an actual shooting scene. The acquired blurred images are input into the deblurring convolutional neural network for testing, and the deblurring effect is compared with that of an existing method. Fig. 6 compares deblurring effects on actually acquired blurred images: in each group of images, the leftmost image is the original blurred image, the middle image is the deblurring result of the existing method, and the rightmost image is the result of the deblurring convolutional neural network of the present application. The comparison shows that the deblurring convolutional neural network markedly improves local blur in an image.
To implement the above method, an embodiment of the application provides a deblurring convolutional neural network training device.
Fig. 7 is a block diagram of a deblurring convolutional neural network training device according to an embodiment of the present application. As shown in fig. 7, the deblurring convolutional neural network training device includes:
an obtaining module 710, configured to obtain a blurred image training set, where the blurred image training set includes a local blur training set and a global blur training set; the blurred images in the local blur training set are computed from background images and foreground image blocks moving in a preset pattern;
a first construction module 720, configured to construct an initial deblurring convolutional neural network comprising a blur-region-aware network and a deblurring network; wherein the deblurring network comprises a blur-region-aware attention module and a deblurring module;
a first training module 730, configured to input the local blur training set into the blur-region-aware network, output a blur-region prediction result, and train the blur-region-aware network according to the blur-region prediction result and a first loss function;
a second training module 740, configured to input the blurred image training set into the deblurring module, output a deblurring result, and train the deblurring module according to the deblurring result and a second loss function;
a third training module 750, configured to input the local blur training set into the blur-region-aware attention module, output an attention result, and train the blur-region-aware attention module according to the attention result and a third loss function;
a second construction module 760, configured to construct an intermediate deblurring convolutional neural network from the trained blur-region-aware network, deblurring module, and blur-region-aware attention module;
and a fourth training module 770, configured to alternately input the local blur training set and the global blur training set into the intermediate deblurring convolutional neural network, output a prediction result, and perform joint training according to the prediction result and a fourth loss function to obtain the final deblurring convolutional neural network.
In an embodiment of the present application, the local blur training set includes a plurality of blurred images, and the obtaining module 710 is configured to obtain each blurred image by:
acquiring a clear image from a deblurring dataset as the background image;
acquiring at least one image from a semantic segmentation dataset, and extracting all foreground image blocks in each image;
acquiring a motion sequence of length n, where n is an odd number greater than or equal to 3;
moving all foreground image blocks in the at least one image on the background image simultaneously according to the motion sequence, and acquiring the moving images of the consecutive frames;
and averaging the moving images of the consecutive frames to obtain a blurred image.
In some embodiments of the present application, the obtaining module 710 is further configured to:
move all foreground image blocks in the at least one image on the background image once according to the motion sequence to obtain a combined image, where the combined image comprises all foreground image blocks in the at least one image and the background image;
and after the motion sequence is completed, take the resulting combined images as the moving images of the consecutive frames.
In some embodiments of the present application, the number of blur-region-aware attention modules in the initial deblurring network is i, where i is an integer greater than or equal to 1; the third loss function is calculated as shown in formula (4).
In some embodiments of the application, the fourth training module 770 is configured such that:
when the local blur training set is input into the intermediate deblurring convolutional neural network, the intermediate deblurring convolutional neural network comprises the trained blur-region-aware network and the trained deblurring network;
when the global blur training set is input into the intermediate deblurring convolutional neural network, the intermediate deblurring convolutional neural network comprises only the trained deblurring module.
In some embodiments of the application, the fourth training module 770 is configured to:
obtain a first prediction result when the local blur training set is input to the intermediate deblurring convolutional neural network;
obtain a fourth loss function, where the fourth loss function is a weighted sum of the first loss function, the second loss function, and the third loss function;
and perform joint training according to the first prediction result and the fourth loss function to obtain the final deblurring convolutional neural network.
In some embodiments of the present application, the fourth training module 770 is further configured to:
obtain a second prediction result when the global blur training set is input to the intermediate deblurring convolutional neural network;
obtain a fourth loss function, where the fourth loss function is the second loss function;
and perform joint training according to the second prediction result and the fourth loss function to obtain the final deblurring convolutional neural network.
According to the technical solution provided by the embodiment of the application, a locally blurred image is computed from a background image and foreground image blocks moving in a preset pattern, yielding a training set of locally blurred images; this provides a new method for obtaining locally blurred images and addresses the current lack of locally blurred image datasets. In addition, a blur-region-aware network and a blur-region-aware attention module are introduced into the deblurring convolutional neural network, and the global and local blur training sets are used in a separate-then-joint training strategy, which improves model training efficiency, enables the deblurring convolutional neural network to deblur both locally and globally blurred images, greatly improves image quality, and broadens the scene applicability of the convolutional neural network for deblurring.
To achieve the above embodiments, the present application also provides a computer device and a computer-readable storage medium.
FIG. 8 is a block diagram of a computer device for implementing deblurring convolutional neural network training according to an embodiment of the present application. Computer devices are intended to represent various forms of digital computers, such as laptops, desktops, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The computer device may also represent various forms of mobile apparatuses, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing apparatuses.
As shown in fig. 8, the computer device includes: a memory 810, a processor 820, and a computer program 830 stored on the memory and executable on the processor. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executing within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to an interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system).
Memory 810 is a computer-readable storage medium provided by the present application. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the deblurring convolutional neural network training method provided by the present application. The computer readable storage medium of the present application stores computer instructions for causing a computer to perform the deblurring convolutional neural network training method provided by the present application.
The memory 810, as a computer-readable storage medium, can be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the deblurring convolutional neural network training method in the embodiment of the present application (e.g., the obtaining module 710, the first construction module 720, the first training module 730, the second training module 740, the third training module 750, the second construction module 760, and the fourth training module 770 shown in fig. 7). The processor 820 executes the various functional applications and data processing of the server by running the non-transitory software programs, instructions, and modules stored in the memory 810, i.e., implements the deblurring convolutional neural network training method in the above method embodiments.
The memory 810 may include a program storage area and a data storage area, where the program storage area may store an operating system and at least one application program required for functionality, and the data storage area may store data created by the use of the computer device for the deblurring convolutional neural network training method, and the like. In addition, the memory 810 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 810 optionally includes memory located remotely relative to the processor 820, which may be connected via a network to an electronic device for implementing the deblurring convolutional neural network training method. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The computer device for the deblurring convolutional neural network training method may further comprise: an input device 840 and an output device 850. The processor 820, memory 810, input device 840, and output device 850 may be connected by a bus or otherwise, as exemplified in fig. 8.
The input device 840 may receive input numeric or character information and generate key signal inputs related to user settings and function control of an electronic device used to implement the defuzzified convolutional neural network training method, such as a touch screen, keypad, mouse, trackpad, touchpad, pointer stick, one or more mouse buttons, trackball, joystick, or like input device. The output device 850 may include a display device, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and additional implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order from that shown or discussed, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system. Alternatively, if implemented in hardware, they may be implemented, as is well known in the art, by any one or a combination of the following techniques: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like. While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that variations, modifications, substitutions, and alterations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims (6)

1. A method for training a deblurring convolutional neural network, comprising:
acquiring a blurred-image training set, wherein the blurred-image training set comprises a local-blur training set and a global-blur training set; the blurred images in the local-blur training set are computed from a background image and foreground image blocks moving in a preset manner; the local-blur training set comprises a plurality of blurred images, each obtained as follows: obtaining a sharp image from a deblurring data set as the background image; obtaining at least one image from a semantic segmentation data set and extracting all foreground image blocks in each image; obtaining a motion sequence of length n, where n is an odd number greater than or equal to 3; moving all the foreground image blocks of the at least one image on the background image simultaneously according to the motion sequence, wherein each move yields a composite image containing the background image and all the foreground image blocks; after the motion sequence is completed, taking the resulting composite images as consecutive motion frames; and averaging the consecutive motion frames to obtain the blurred image;
constructing an initial deblurring convolutional neural network comprising a blur-region perception network and a deblurring network, wherein the deblurring network comprises a blur-region-aware attention module and a deblurring module, the number of blur-region-aware attention modules in the initial deblurring convolutional neural network is i, and i is an integer greater than or equal to 1;
inputting the local-blur training set into the blur-region perception network, outputting a blur-region prediction result, and training the blur-region perception network according to the blur-region prediction result and a first loss function;
inputting the blurred-image training set into the deblurring module, outputting a deblurring result, and training the deblurring module according to the deblurring result and a second loss function;
inputting the local-blur training set into the blur-region-aware attention module, outputting an attention result, and training the blur-region-aware attention module according to the attention result and a third loss function, wherein the third loss function is calculated as follows:
where L3 denotes the third loss function, Ai denotes the attention output of the i-th blur-region-aware attention module, and M denotes the blur-region annotation information generated for the blurred images in the local-blur training set;
constructing an intermediate deblurring convolutional neural network from the trained blur-region perception network, the trained deblurring module, and the trained blur-region-aware attention module; and
alternately inputting the local-blur training set and the global-blur training set into the intermediate deblurring convolutional neural network, outputting a prediction result, and performing joint training according to the prediction result and a fourth loss function to obtain a final deblurring convolutional neural network, wherein when the local-blur training set is input into the intermediate deblurring convolutional neural network, the intermediate deblurring convolutional neural network comprises the trained blur-region perception network and the trained deblurring network, and when the global-blur training set is input into the intermediate deblurring convolutional neural network, the intermediate deblurring convolutional neural network comprises only the trained deblurring module.
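Outside the claim language, the blurred-image synthesis procedure of claim 1 (composite the moving foreground patches over a sharp background once per step of the motion sequence, then average the resulting frames) can be sketched in a few lines of numpy. The function name, array shapes, and the fixed base position of the patch are illustrative assumptions, not part of the patent:

```python
import numpy as np

def synthesize_local_blur(background, fg_patches, fg_masks, motion_seq):
    """Simulate local motion blur by averaging n composited frames.

    background: HxWx3 float array (a sharp image).
    fg_patches / fg_masks: lists of foreground patches and binary masks.
    motion_seq: list of (dy, dx) offsets; per the claim, its length n
    is an odd number >= 3.
    """
    assert len(motion_seq) >= 3 and len(motion_seq) % 2 == 1
    frames = []
    for dy, dx in motion_seq:
        frame = background.copy()
        for patch, mask in zip(fg_patches, fg_masks):
            h, w = mask.shape
            y, x = 40 + dy, 40 + dx  # hypothetical base position of the patch
            region = frame[y:y + h, x:x + w]
            region[mask > 0] = patch[mask > 0]  # paste foreground over background
        frames.append(frame)
    # Averaging the consecutive motion frames produces the blurred image:
    # pixels the foreground only partially covers become blended (motion-blurred).
    return np.mean(frames, axis=0)
```

Pixels covered by the foreground in every frame stay sharp foreground, pixels never covered stay sharp background, and pixels covered in only some frames receive intermediate values, which is exactly the locally blurred boundary the training set needs.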
2. The method of claim 1, wherein performing the joint training according to the prediction result and the fourth loss function to obtain the final deblurring convolutional neural network comprises:
acquiring a first prediction result when the local-blur training set is input into the intermediate deblurring convolutional neural network;
acquiring the fourth loss function, wherein the fourth loss function is a weighted sum of the first, second, and third loss functions; and
performing joint training according to the first prediction result and the fourth loss function to obtain the final deblurring convolutional neural network.
3. The method of claim 1, wherein performing the joint training according to the prediction result and the fourth loss function to obtain the final deblurring convolutional neural network comprises:
acquiring a second prediction result when the global-blur training set is input into the intermediate deblurring convolutional neural network;
acquiring the fourth loss function, wherein the fourth loss function is the second loss function; and
performing joint training according to the second prediction result and the fourth loss function to obtain the final deblurring convolutional neural network.
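Claims 2 and 3 together define the alternating joint-training rule: on a local-blur batch the fourth loss is a weighted sum of the first, second, and third losses, while on a global-blur batch it reduces to the second loss alone. A minimal sketch of that selection logic follows; the function names, the weight defaults, and the batch interleaving order are illustrative assumptions:

```python
def fourth_loss(batch_type, l1, l2, l3, w1=1.0, w2=1.0, w3=1.0):
    """Select the joint-training loss.

    'local'  -> weighted sum of first, second, and third losses (claim 2);
    'global' -> the second (deblurring) loss alone (claim 3).
    The weights w1..w3 are hypothetical; the patent does not fix their values.
    """
    if batch_type == "local":
        return w1 * l1 + w2 * l2 + w3 * l3
    if batch_type == "global":
        return l2
    raise ValueError(f"unknown batch type: {batch_type}")

def alternate_batches(local_set, global_set):
    """Interleave local-blur and global-blur batches for joint training."""
    for loc, glo in zip(local_set, global_set):
        yield "local", loc
        yield "global", glo
```

Because the blur-region perception network and the attention supervision only apply when blur-region annotations exist (the synthetic local-blur data), the globally blurred batches contribute gradients only through the deblurring module.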
4. A deblurring convolutional neural network training apparatus, comprising:
an acquisition module configured to acquire a blurred-image training set, wherein the blurred-image training set comprises a local-blur training set and a global-blur training set; the blurred images in the local-blur training set are computed from a background image and foreground image blocks moving in a preset manner; the local-blur training set comprises a plurality of blurred images, each obtained as follows: obtaining a sharp image from a deblurring data set as the background image; obtaining at least one image from a semantic segmentation data set and extracting all foreground image blocks in each image; obtaining a motion sequence of length n, where n is an odd number greater than or equal to 3; moving all the foreground image blocks of the at least one image on the background image simultaneously according to the motion sequence, wherein each move yields a composite image containing the background image and all the foreground image blocks; after the motion sequence is completed, taking the resulting composite images as consecutive motion frames; and averaging the consecutive motion frames to obtain the blurred image;
a first construction module configured to construct an initial deblurring convolutional neural network comprising a blur-region perception network and a deblurring network, wherein the deblurring network comprises a blur-region-aware attention module and a deblurring module, the number of blur-region-aware attention modules in the initial deblurring convolutional neural network is i, and i is an integer greater than or equal to 1;
a first training module configured to input the local-blur training set into the blur-region perception network, output a blur-region prediction result, and train the blur-region perception network according to the blur-region prediction result and a first loss function;
a second training module configured to input the blurred-image training set into the deblurring module, output a deblurring result, and train the deblurring module according to the deblurring result and a second loss function;
a third training module configured to input the local-blur training set into the blur-region-aware attention module, output an attention result, and train the blur-region-aware attention module according to the attention result and a third loss function, wherein the third loss function is calculated as follows:
where L3 denotes the third loss function, Ai denotes the attention output of the i-th blur-region-aware attention module, and M denotes the blur-region annotation information generated for the blurred images in the local-blur training set;
a second construction module configured to construct an intermediate deblurring convolutional neural network from the trained blur-region perception network, the trained deblurring module, and the trained blur-region-aware attention module; and
a fourth training module configured to alternately input the local-blur training set and the global-blur training set into the intermediate deblurring convolutional neural network, output a prediction result, and perform joint training according to the prediction result and a fourth loss function to obtain a final deblurring convolutional neural network, wherein when the local-blur training set is input into the intermediate deblurring convolutional neural network, the intermediate deblurring convolutional neural network comprises the trained blur-region perception network and the trained deblurring network, and when the global-blur training set is input into the intermediate deblurring convolutional neural network, the intermediate deblurring convolutional neural network comprises only the trained deblurring module.
5. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the deblurring convolutional neural network training method of any one of claims 1 to 3.
6. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the deblurring convolutional neural network training method of any one of claims 1 to 3.
CN202111342163.4A 2021-11-12 2021-11-12 De-blurring convolutional neural network training method, device, equipment and storage medium Active CN114240764B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111342163.4A CN114240764B (en) 2021-11-12 2021-11-12 De-blurring convolutional neural network training method, device, equipment and storage medium


Publications (2)

Publication Number Publication Date
CN114240764A CN114240764A (en) 2022-03-25
CN114240764B true CN114240764B (en) 2024-04-23

Family

ID=80749272

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111342163.4A Active CN114240764B (en) 2021-11-12 2021-11-12 De-blurring convolutional neural network training method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114240764B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114491146B (en) * 2022-04-01 2022-07-12 广州智慧城市发展研究院 Video image processing method suitable for video monitoring equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110909860A (en) * 2018-09-14 2020-03-24 华为技术有限公司 Method and device for initializing neural network parameters
CN111539884A (en) * 2020-04-21 2020-08-14 温州大学 Neural network video deblurring method based on multi-attention machine mechanism fusion
WO2021184894A1 (en) * 2020-03-20 2021-09-23 深圳市优必选科技股份有限公司 Deblurred face recognition method and system and inspection robot
CN113570516A (en) * 2021-07-09 2021-10-29 湖南大学 Image blind motion deblurring method based on CNN-Transformer hybrid self-encoder

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9430817B2 (en) * 2013-11-12 2016-08-30 Microsoft Technology Licensing, Llc Blind image deblurring with cascade architecture



Similar Documents

Publication Publication Date Title
Ma et al. Toward fast, flexible, and robust low-light image enhancement
JP7093886B2 (en) Image processing methods and devices, electronic devices and storage media
US10534998B2 (en) Video deblurring using neural networks
Zhang et al. Unifying motion deblurring and frame interpolation with events
CN109670558B (en) Digital image completion using deep learning
US9430817B2 (en) Blind image deblurring with cascade architecture
Huang et al. Video super-resolution via bidirectional recurrent convolutional networks
AU2019451948B2 (en) Real-time video ultra resolution
Yu et al. A unified learning framework for single image super-resolution
Ren et al. Deblurring dynamic scenes via spatially varying recurrent neural networks
US11641446B2 (en) Method for video frame interpolation, and electronic device
CN109903315B (en) Method, apparatus, device and readable storage medium for optical flow prediction
CN111932464B (en) Super-resolution model using and training method, device, equipment and medium
US11741579B2 (en) Methods and systems for deblurring blurry images
US11308361B1 (en) Checkerboard artifact free sub-pixel convolution
CN114240764B (en) De-blurring convolutional neural network training method, device, equipment and storage medium
Mai et al. Deep unrolled low-rank tensor completion for high dynamic range imaging
Peng et al. PDRF: progressively deblurring radiance field for fast scene reconstruction from blurry images
Shao et al. Adapting total generalized variation for blind image restoration
Esmaeilzehi et al. HighBoostNet: a deep light-weight image super-resolution network using high-boost residual blocks
Xu et al. Sharp image estimation from a depth-involved motion-blurred image
CN112053362B (en) Image segmentation method, device, electronic equipment and storage medium
Lu et al. Two-stage single image Deblurring network based on deblur kernel estimation
Balla et al. A 4-channelled hazy image input generation and deep learning-based single image dehazing
Qi A multi-path attention network for non-uniform blind image deblurring

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant