WO2021238420A1 - Image defogging method, terminal, and computer storage medium - Google Patents


Info

Publication number: WO2021238420A1
Authority: WO (WIPO PCT)
Application number: PCT/CN2021/085694
Other languages: French (fr), Chinese (zh)
Inventor: 崔永明
Original assignee: Oppo广东移动通信有限公司
Application filed by: Oppo广东移动通信有限公司
Publication of: WO2021238420A1

Classifications

    • G06T5/73
    • G06N3/045 Combinations of networks
    • G06T3/4038 Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
    • G06T2207/10016 Video; Image sequence
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]

Abstract

An image defogging method, a terminal, and a computer storage medium. The image defogging method comprises: after determining that an image to be processed is a foggy image, determining a size parameter of the image to be processed (101); determining, according to the size parameter, a preprocessing policy corresponding to the image to be processed, wherein the preprocessing policy is used for limiting the image size (102); if the preprocessing policy is dividing the image to be processed, dividing the image to be processed to obtain sub-images corresponding to the image to be processed (103); performing defogging processing on the sub-images according to an image defogging model to obtain defogged sub-images corresponding to the sub-images (104); and performing stitching processing on the defogged sub-images according to an image stitching model to obtain a defogged image corresponding to the image to be processed (105).

Description

Image defogging method, terminal, and computer storage medium
Cross-reference to related applications
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on May 29, 2020 with application number 202010472428.1 and entitled "Image Dehazing Method, Terminal and Computer Storage Medium", the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of the present application relate to the field of terminal technology, and in particular to an image defogging method, a terminal, and a computer storage medium.
Background
With the development of computer vision and its application in traffic and security monitoring, image defogging has become an important research area of computer vision. Current image defogging methods fall into two broad categories: non-physical-model methods and physical-model methods. Non-physical-model defogging methods essentially enhance the contrast and color of the image, while physical-model defogging methods use the laws of atmospheric light scattering to build an image restoration model.
However, non-physical-model defogging methods only improve the visual effect; they cannot improve the quality of a fogged image and may even lose some of its information. Physical-model defogging methods, for their part, have long processing times and low efficiency. Existing image defogging methods are therefore still immature and cannot balance the effect and the efficiency of defogging.
Summary
The embodiments of the present application provide an image defogging method, a terminal, and a computer storage medium, which can greatly increase processing speed while improving processing accuracy, thereby achieving high-quality, high-efficiency image defogging.
The technical solutions of the embodiments of the present application are implemented as follows.
In a first aspect, an embodiment of the present application provides an image defogging method, the method comprising:
after determining that an image to be processed is a fogged image, determining a size parameter of the image to be processed;
determining, according to the size parameter, a preprocessing strategy corresponding to the image to be processed, wherein the preprocessing strategy is used to limit the image size;
if the preprocessing strategy is to divide the image to be processed, dividing the image to be processed to obtain sub-images corresponding to the image to be processed;
performing defogging processing on the sub-images according to an image defogging model to obtain defogged sub-images corresponding to the sub-images; and
performing stitching processing on the defogged sub-images according to an image stitching model to obtain a defogged image corresponding to the image to be processed.
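The five steps of the first aspect can be sketched end-to-end as follows (pure Python; `is_foggy`, `defog_model`, `stitch_model`, and `divide` are hypothetical stand-ins for components the patent describes but does not specify as code):

```python
def defog(image, is_foggy, max_pixels, defog_model, stitch_model, divide):
    """Sketch of the claimed method: determine the size parameter (101),
    choose a preprocessing strategy from it (102), optionally divide (103),
    defog (104), and stitch (105)."""
    if not is_foggy(image):                          # only fogged images are processed
        return image
    height, width = len(image), len(image[0])        # 101: size parameter
    if height * width < max_pixels:                  # 102: strategy from the size
        return defog_model(image)                    # small image: defog it whole
    sub_images = divide(image)                       # 103: divide into sub-images
    defogged = [defog_model(s) for s in sub_images]  # 104: defog each sub-image
    return stitch_model(defogged)                    # 105: stitch the results

# Toy run with identity stand-ins for the trained models.
img = [[1, 2], [3, 4]]
small = defog(img, lambda x: True, 100, lambda x: x, lambda xs: xs[0], lambda x: [x])
print(small)  # [[1, 2], [3, 4]] (below the threshold, defogged as a whole)
```

The same call with `max_pixels=2` would take the divide-defog-stitch path instead.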
In a second aspect, an embodiment of the present application provides a terminal, the terminal comprising a determining part, a dividing part, a defogging part, and a stitching part, wherein:
the determining part is configured to determine a size parameter of the image to be processed after determining that the image to be processed is a fogged image, and to determine, according to the size parameter, a preprocessing strategy corresponding to the image to be processed, wherein the preprocessing strategy is used to limit the image size;
the dividing part is configured to, if the preprocessing strategy is to divide the image to be processed, divide the image to be processed to obtain sub-images corresponding to the image to be processed;
the defogging part is configured to perform defogging processing on the sub-images according to an image defogging model to obtain defogged sub-images corresponding to the sub-images; and
the stitching part is configured to perform stitching processing on the defogged sub-images according to an image stitching model to obtain a defogged image corresponding to the image to be processed.
In a third aspect, an embodiment of the present application provides a terminal, the terminal comprising a processor and a memory storing instructions executable by the processor, wherein the instructions, when executed by the processor, implement the image defogging method described above.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a program applied to a terminal, wherein the program, when executed by a processor, implements the image defogging method described above.
The embodiments of the present application provide an image defogging method, a terminal, and a computer storage medium. After determining that an image to be processed is a fogged image, the terminal determines a size parameter of the image to be processed; determines, according to the size parameter, a preprocessing strategy corresponding to the image to be processed, wherein the preprocessing strategy is used to limit the image size; if the preprocessing strategy is to divide the image to be processed, divides the image to be processed to obtain sub-images corresponding to the image to be processed; performs defogging processing on the sub-images according to an image defogging model to obtain defogged sub-images corresponding to the sub-images; and performs stitching processing on the defogged sub-images according to an image stitching model to obtain a defogged image corresponding to the image to be processed. It can thus be seen that, in the embodiments of the present application, the terminal can defog the image to be processed with an image defogging model obtained by deep learning, and, for a large image to be processed, can first defog the sub-images obtained by dividing the image and then stitch them with an image stitching model obtained by deep learning, thereby increasing processing speed while maintaining processing accuracy. Further, the image defogging model and the image stitching model are obtained by the terminal through a very small network design based on a CNN, so the above image defogging method can run in the terminal in real time. In other words, in this application, performing image defogging based on an image defogging model and an image stitching model obtained by deep learning can greatly increase processing speed while improving processing accuracy, thereby achieving high-quality, high-efficiency image defogging.
Description of the drawings
Figure 1 is a schematic diagram of a current image defogging method;
Figure 2 is a first schematic flowchart of an implementation of the image defogging method;
Figure 3 is a second schematic flowchart of an implementation of the image defogging method;
Figure 4 is a third schematic flowchart of an implementation of the image defogging method;
Figure 5 is a fourth schematic flowchart of an implementation of the image defogging method;
Figure 6 is a schematic diagram of conventional convolution;
Figure 7 is a schematic diagram of depthwise convolution;
Figure 8 is a schematic diagram of pointwise convolution;
Figure 9 is a fifth schematic flowchart of an implementation of the image defogging method;
Figure 10 is a sixth schematic flowchart of an implementation of the image defogging method;
Figure 11 is a first schematic diagram of the composition of the terminal;
Figure 12 is a second schematic diagram of the composition of the terminal.
Detailed description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It should be understood that the specific embodiments described here are only used to explain the relevant application and do not limit it. It should also be noted that, for ease of description, only the parts related to the relevant application are shown in the drawings.
Because floating water droplets in the atmosphere refract, reflect, absorb, and re-image light, images captured in foggy weather have low contrast and distorted colors, and machine vision systems may even fail to work properly. Under severe weather conditions in which haze reduces visibility, captured pictures are affected by particulates suspended in the atmosphere (such as fog and haze), resulting in poor picture quality; it becomes difficult to distinguish the features of objects in the picture, which in turn affects picture quality in applications such as outdoor surveillance, target recognition, and traffic navigation. Clarifying the features of foggy images is therefore extremely important.
With the development of computer vision and its application in traffic and security monitoring, image defogging has become an important research area of computer vision. At present, traditional image processing methods or physical-model defogging methods are mainly used: a fogged image is input and a clear image is output. Traditional image processing methods essentially enhance the contrast and color of the image; they only improve the visual effect, cannot improve the quality of the fogged image, and may even lose some of its information. Physical-model defogging methods use the laws of atmospheric light scattering to build an image restoration model, but their processing time is long and their efficiency is low.
Specifically, on the one hand, image enhancement algorithms start directly from the image processing perspective: by enhancing the contrast of the foggy image, they highlight its features or useful information and improve its visual effect to some extent. However, such methods ignore the real cause of image degradation, so for pictures of complex scenes they cannot improve picture quality and may even lose some image information. On the other hand, model-based defogging algorithms build an atmospheric scattering model and study the physical principles of image degradation to derive the scattering effect of suspended particulates on light and its influence on the picture, restoring a more realistic picture; their defogging effect is better in complex scenes and the image information is more complete. Such algorithms defog non-sky regions well, but the results are not ideal for bright regions such as the sky, and they are computationally expensive and inefficient.
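The atmospheric scattering model referred to here is commonly written as I(x) = J(x)·t(x) + A·(1 − t(x)), where I is the observed foggy image, J the scene radiance, A the global atmospheric light, and t the transmission; this formula is standard background, not quoted from the patent. A minimal single-pixel sketch of synthesizing fog and inverting the model:

```python
def add_fog(j, t, a):
    """Atmospheric scattering model: I = J*t + A*(1 - t)."""
    return j * t + a * (1.0 - t)

def remove_fog(i, t, a, t_min=0.1):
    """Invert the model: J = (I - A) / max(t, t_min) + A.
    t_min avoids division blow-up where the transmission is tiny."""
    return (i - a) / max(t, t_min) + a

# Round trip on a single normalized intensity (hypothetical values).
j, t, a = 0.6, 0.5, 0.9
i = add_fog(j, t, a)         # foggy observation: 0.75
j_hat = remove_fog(i, t, a)  # restored radiance: 0.6
```

In practice, model-based methods spend most of their effort estimating t and A from the image, which is where the computational cost criticized above comes from.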
Figure 1 is a schematic diagram of a current image defogging method. As shown in Figure 1, the existing technical solution mainly inputs a fogged image into a defogging algorithm model and outputs a clear, defogged image, where the machine learning model is generated based on traditional algorithms. It can be seen that the prior art uses traditional algorithms to process the entire fogged image, which has the following disadvantages:
1. Traditional hand-designed algorithms generalize poorly and perform badly when the environment changes significantly.
2. When the entire image is defogged directly, processing is very slow if the input image is very large, for example 3000×4000.
That is, in the prior art, because there is no related algorithm that uses deep learning for image defogging, the defogging effect in complex scenes is not ideal; moreover, existing defogging methods suffer from low processing efficiency.
Today, as the field of artificial intelligence (AI) develops, deep learning performs better and better at image defogging. Therefore, the image defogging method proposed in this application uses AI to build an image defogging model and, at the same time, uses a constructed image stitching model to assist the defogging of large images, so that both the effect and the efficiency of defogging can be improved. Further, in this application, both the image defogging model and the image stitching model are obtained by simplifying the structure of a convolutional neural network (CNN), so that they can run on a mobile terminal with relatively few multiply-accumulate operations (MACs), thereby achieving end-to-end real-time image defogging.
Specifically, to overcome the above shortcomings, the image defogging method proposed in this application allows the terminal to defog the image to be processed with an image defogging model obtained by deep learning; at the same time, for a large image to be processed, the terminal may first defog the sub-images obtained by dividing the image with the image defogging model, and then stitch them with an image stitching model obtained by deep learning, thereby increasing processing speed while maintaining processing accuracy. Further, the image defogging model and the image stitching model are obtained by the terminal through a very small network design based on a CNN, so the above image defogging method can run in the terminal in real time. That is, in this application, performing image defogging based on an image defogging model and an image stitching model obtained by deep learning can greatly increase processing speed while improving processing accuracy, thereby achieving high-quality, high-efficiency image defogging.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
An embodiment of the present application provides an image defogging method. Figure 2 is a first schematic flowchart of an implementation of the image defogging method. As shown in Figure 2, in an embodiment of the present application, the method by which the terminal performs image defogging may include the following steps.
Step 101: after determining that the image to be processed is a fogged image, determine a size parameter of the image to be processed.
In the embodiments of the present application, after determining that the image to be processed is a fogged image, the terminal may first determine the size parameter of the image to be processed.
It should be noted that, in the embodiments of the present application, the above terminal may be any terminal with communication and storage functions, for example a tablet computer, a mobile phone, an e-reader, a remote control, a personal computer (PC), a notebook computer, an in-vehicle device, a network television, a wearable device, a personal digital assistant (PDA), a portable media player (PMP), or a navigation device.
Further, in the embodiments of the present application, the image to be processed may be an image stored in advance by the terminal, for example an image in a mobile phone album; it may also be an image captured by the terminal in real time, for example a preview image captured by a mobile phone; it may also be a frame of a video recorded by the terminal in real time, for example a frame of a surveillance video recorded by a surveillance device.
That is, in the embodiments of the present application, the image to be processed may be stored in advance by the terminal, captured by the terminal in real time, or received by the terminal from another device, which is not specifically limited in this application.
Step 102: determine, according to the size parameter, a preprocessing strategy corresponding to the image to be processed, wherein the preprocessing strategy is used to limit the image size.
In the embodiments of the present application, after determining the size parameter of the image to be processed, the terminal may further determine, according to the size parameter, the preprocessing strategy corresponding to the image to be processed.
It should be noted that, in the embodiments of the present application, the preprocessing strategy may be used to limit the image size. That is, for images with different size parameters, the terminal determines different preprocessing strategies.
For example, in the embodiments of the present application, for a large image the terminal may determine a corresponding preprocessing strategy of first dividing the image and then defogging it; for a small image the terminal may determine a corresponding preprocessing strategy of defogging it directly.
It can be seen that, in the embodiments of the present application, the preprocessing strategy determined by the terminal differs for images to be processed with different size parameters. Specifically, depending on the size parameter, the preprocessing strategy may be either no division or division.
It is understandable that, in the embodiments of the present application, the size parameter of the image to be processed may include the length and width of the image, and the length and width may be measured in pixels or in centimeters.
Further, in the embodiments of the present application, the terminal may preset a size parameter used to determine the preprocessing strategy of the image to be processed, that is, a preset size threshold.
For example, in the embodiments of the present application, when determining the preprocessing strategy corresponding to the image to be processed according to the size parameter, the terminal may compare the size parameter of the image to be processed with the preset size threshold, and may then further determine the preprocessing strategy of the image to be processed according to the comparison result.
Further, in the embodiments of the present application, after the terminal compares the size parameter of the image to be processed with the preset size threshold, if the comparison result is that the size parameter is greater than or equal to the preset size threshold, the terminal may determine that the preprocessing strategy is to divide the image to be processed; if the comparison result is that the size parameter is less than the preset size threshold, the terminal may determine that the preprocessing strategy is not to divide the image to be processed.
It should be noted that, in the embodiments of the present application, the terminal may limit the height of the image to be processed, its width, or both its height and width; correspondingly, the preset size threshold may be an upper limit on the height, an upper limit on the width, or upper limits on both the height and the width.
For example, in this application, the terminal may set the preset size threshold to 2048×1536, thereby limiting the height and width of the image to be processed according to this 2048×1536 threshold.
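This threshold comparison can be sketched as follows (the function name and the choice of testing each dimension separately are assumptions; the patent only fixes the 2048×1536 example and the "greater than or equal" rule):

```python
# Hypothetical preset size threshold taken from the 2048x1536 example above.
MAX_WIDTH, MAX_HEIGHT = 2048, 1536

def preprocessing_strategy(width, height):
    """Return 'divide' if the size parameter reaches the preset size
    threshold in either dimension, otherwise 'no_divide'."""
    if width >= MAX_WIDTH or height >= MAX_HEIGHT:
        return "divide"
    return "no_divide"

print(preprocessing_strategy(3000, 4000))  # divide
print(preprocessing_strategy(1280, 720))   # no_divide
```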
Step 103: if the preprocessing strategy is to divide the image to be processed, divide the image to be processed to obtain sub-images corresponding to the image to be processed.
In the embodiments of the present application, after the terminal determines, according to the size parameter, the preprocessing strategy corresponding to the image to be processed, if the preprocessing strategy is to divide the image to be processed, the terminal may divide the image to be processed to obtain its corresponding sub-images.
Further, in the embodiments of the present application, in order to improve the efficiency of defogging, for a large image to be processed the terminal may first divide the image into multiple sub-images of equal size and then defog the divided sub-images, which accelerates the defogging processing.
It should be noted that, in the embodiments of the present application, when dividing the image to be processed, the terminal may determine the number of sub-images according to the size parameter of the image to be processed; that is, for different size parameters, the number of sub-images obtained after division may differ. For example, an image to be processed with size parameter a1 is divided into 6 sub-images, while an image with size parameter a2 is divided into 9 sub-images, where a1 is smaller than a2.
Further, in the embodiments of the present application, when dividing the image to be processed, the terminal may also preset the number of divisions; that is, for different size parameters, the number of sub-images obtained after division is the same. For example, an image to be processed with size parameter a1 is divided into 9 sub-images, and an image with size parameter a2 is also divided into 9 sub-images, where a1 is smaller than a2.
It is understandable that, in the embodiments of the present application, if the preprocessing strategy is to divide the image to be processed, the terminal may divide the image to be processed into n sub-images of the same size, where n is an integer greater than 1.
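Such an equal-size division can be sketched as follows (pure Python over a nested-list image; the row/column grid and the divisibility assumption are illustrative, since the patent leaves the exact division scheme open):

```python
def divide_image(image, rows, cols):
    """Divide a 2-D image (list of pixel rows) into rows*cols equal tiles.
    Assumes the image dimensions are divisible by rows and cols."""
    h, w = len(image), len(image[0])
    th, tw = h // rows, w // cols  # tile height and width
    tiles = []
    for r in range(rows):
        for c in range(cols):
            tile = [row[c * tw:(c + 1) * tw] for row in image[r * th:(r + 1) * th]]
            tiles.append(tile)
    return tiles

# A 4x6 "image" divided into a 2x3 grid gives n = 6 tiles of size 2x2.
img = [[r * 6 + c for c in range(6)] for r in range(4)]
tiles = divide_image(img, 2, 3)
print(len(tiles))  # 6
print(tiles[0])    # [[0, 1], [6, 7]]
```

Each tile would then be passed to the image defogging model independently, as described in step 104 below.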
步骤104、根据图像去雾模型对子图像进行去雾处理,获得子图像对应的去雾后子图像。Step 104: Perform defogging processing on the sub-image according to the image defogging model to obtain a defogging sub-image corresponding to the sub-image.
在本申请的实施例中,如果待处理图像对应的预处理策略为划分待处理图像,那么终端在划分待处理图像,获得待处理图像对应的子图像之后,便可以根据图像去雾模型对子图像进行去雾处理,从而便可以获得子图像对应的去雾后子图像。In the embodiment of the present application, if the preprocessing strategy corresponding to the image to be processed is to divide the image to be processed, the terminal can divide the image to be processed and obtain the sub-images corresponding to the image to be processed, and then pair the image according to the image defogging model. The image is processed for defogging, so that the defogging sub-image corresponding to the sub-image can be obtained.
可以理解的是,在本申请的实施例中,由于终端将待处理图像划分为n个尺寸相同的子图像,对应的,终端在对分别对每一个子图像进行去雾处理之后,可以获得n个子图像对应的n个去雾后子图像,其中,一个子图像对应一个去雾后子图像。It is understandable that, in the embodiment of the present application, since the terminal divides the image to be processed into n sub-images of the same size, correspondingly, the terminal can obtain n sub-images after defogging each sub-image. The n sub-images correspond to n sub-images after defogging, where one sub-image corresponds to one sub-image after defogging.
It should be noted that, in the embodiments of the present application, in order to improve the effect of image defogging, the terminal may use a deep-learning image defogging model to perform defogging processing on the multiple sub-images respectively. The image defogging model may be an image defogging algorithm constructed by the terminal based on artificial intelligence (AI).
Further, in the embodiments of the present application, the image defogging model may be obtained by the terminal through training based on a convolutional neural network (CNN). Specifically, in the present application, the terminal may use the MobileNet-V2 network in CNN to determine the image defogging model.
Deep CNNs such as ResNet and DenseNet have greatly improved the accuracy of image classification. However, besides accuracy, computational complexity is also an important consideration for CNNs: an overly complex network may be very slow. To obtain the small models that terminals require, which are both accurate and efficient, lightweight CNNs such as MobileNet have been proposed; they strike a good balance between speed and accuracy.
Exemplarily, in the embodiments of the present application, on the basis of the first lightweight convolutional neural network, the terminal may modify some of the convolutional layers into Depthwise convolution + Pointwise convolution, thereby improving image processing efficiency while maintaining the processing effect. The first lightweight convolutional neural network may be the MobileNet-V2 network in CNN.
MobileNet-V2 is an improvement on MobileNet-V1. It is likewise a lightweight convolutional neural network, a deep learning network suitable for mobile terminals. Since MobileNet-V2 uses Depthwise Separable Convolution instead of the traditional convolution, splitting the original convolution into a Depthwise convolution and a Pointwise convolution, it has the advantages of fewer parameters, a smaller model, and little accuracy loss compared with some traditional convolutions.
In a Depthwise convolution, different channels are convolved with different kernels to extract features; a Pointwise convolution is a convolution applied at a single point, i.e., a single pixel position.
Together, Depthwise convolution and Pointwise convolution are called Depthwise Separable Convolution. This structure is similar to a conventional convolution operation and can be used to extract features, but compared with a conventional convolution, its number of parameters and computational cost are lower, so it is better suited to lightweight networks such as MobileNet.
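The two-step operation can be sketched in NumPy as follows; the loop-based implementation, shapes, and all-ones weights are illustrative only (real frameworks provide fused, vectorized versions):

```python
import numpy as np

def depthwise_separable_conv(x, dw_kernels, pw_kernels):
    """Depthwise then pointwise convolution (valid padding, stride 1).

    x          : H x W x C input
    dw_kernels : k x k x C, one spatial filter per input channel
    pw_kernels : C x M, the 1x1 convolution mixing channels into M maps
    """
    h, w, c = x.shape
    k = dw_kernels.shape[0]
    oh, ow = h - k + 1, w - k + 1
    dw = np.zeros((oh, ow, c))
    # Depthwise: each channel is convolved with its own kernel only.
    for ch in range(c):
        for i in range(oh):
            for j in range(ow):
                dw[i, j, ch] = np.sum(x[i:i + k, j:j + k, ch] * dw_kernels[:, :, ch])
    # Pointwise: a 1x1 convolution recombines channels into M feature maps.
    return dw @ pw_kernels

x = np.ones((5, 5, 3))        # 5x5, three-channel input as in the figures below
dw_k = np.ones((3, 3, 3))     # one 3x3 kernel per channel
pw_k = np.ones((3, 4))        # 1x1 convolution producing 4 output maps
out = depthwise_separable_conv(x, dw_k, pw_k)  # shape (3, 3, 4)
```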
It is understandable that, in the embodiments of the present application, when the terminal performs defogging processing on the sub-images according to the image defogging model, in order to improve processing efficiency, the terminal may perform defogging processing on all the sub-images simultaneously based on the image defogging model.
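The simultaneous processing of sub-images can be sketched with a thread pool; `defog` here is a hypothetical stand-in for invoking the image defogging model on one sub-image, not the actual model:

```python
from concurrent.futures import ThreadPoolExecutor

def defog(sub_image):
    # Hypothetical stand-in for model inference on one sub-image;
    # here it just shifts pixel values symbolically.
    return [max(p - 5, 0) for p in sub_image]

# Three sub-images represented as flat pixel lists for illustration.
sub_images = [[10, 20], [30, 40], [50, 60]]
with ThreadPoolExecutor() as pool:
    # map preserves order: one defogged sub-image per input sub-image.
    defogged = list(pool.map(defog, sub_images))
```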
It is understandable that, in the embodiments of the present application, after performing defogging processing on the sub-images according to the image defogging model, the terminal can obtain one defogged sub-image corresponding to each sub-image.
It should be noted that, in the embodiments of the present application, when constructing the image defogging model, the terminal may first divide a first image sample set to obtain first training data and first test data, where the first image sample set includes fogged images and the clear images corresponding to the fogged images; then construct a first network model based on the first lightweight convolutional neural network and train the first network model with the first training data to obtain an initial defogging model; and finally test the initial defogging model with the first test data to obtain the image defogging model.
Step 105: Perform stitching processing on the defogged sub-images according to the image stitching model to obtain the defogged image corresponding to the image to be processed.
In the embodiments of the present application, after the terminal performs defogging processing on the sub-images according to the image defogging model and obtains the defogged sub-images corresponding to the sub-images, the terminal can perform stitching processing on all the defogged sub-images according to the image stitching model, thereby obtaining the clear image corresponding to the image to be processed, i.e., the defogged image.
It is understandable that, in the embodiments of the present application, after performing defogging processing on the multiple sub-images of the image to be processed, the terminal can obtain multiple defogged sub-images corresponding to the multiple sub-images; the terminal then needs to stitch the multiple defogged sub-images together to finally obtain one frame of defogged image corresponding to the image to be processed.
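As a minimal illustration of reassembling the defogged sub-images (plain concatenation standing in for the learned stitching model, with tiles assumed to be equally sized and in row-major order):

```python
import numpy as np

def stitch(tiles, rows, cols):
    """Reassemble rows*cols equally sized tiles (row-major) into one image."""
    strips = [np.concatenate(tiles[r * cols:(r + 1) * cols], axis=1)
              for r in range(rows)]
    return np.concatenate(strips, axis=0)

# Nine 2x2 tiles, each filled with its own index, stitched into a 6x6 image.
tiles = [np.full((2, 2), i) for i in range(9)]
image = stitch(tiles, 3, 3)
```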
It should be noted that, in the embodiments of the present application, in order to improve the effect of image stitching, the terminal may use a deep-learning image stitching model to stitch the multiple defogged sub-images into the corresponding defogged image. The image stitching model may be an image stitching algorithm constructed by the terminal based on artificial intelligence (AI).
Further, in the embodiments of the present application, the image stitching model may be obtained by the terminal through training based on CNN. Specifically, in the present application, the terminal may use the ShuffleNet-V2 network in CNN to determine the image stitching model.
In addition to MobileNet, lightweight CNNs also include ShuffleNet, which likewise achieves a good balance between speed and accuracy.
Exemplarily, in the embodiments of the present application, on the basis of the second lightweight convolutional neural network, the terminal may remove some convolutional layers and accelerate the network by modifying some convolutional layers into Depthwise convolution + Pointwise convolution, thereby obtaining a minimal network design of the CNN that improves image processing efficiency while ensuring the stitching effect. The second lightweight convolutional neural network may be the ShuffleNet-V2 network in CNN.
ShuffleNet-V2 is an upgraded version of ShuffleNet-V1; at the same complexity, ShuffleNet-V2 is more accurate than ShuffleNet-V1 and MobileNet-V2.
To remedy the shortcomings of ShuffleNet-V1, ShuffleNet-V2 introduces a new operation, Channel Split, which divides the input channels of a module into two parts: one part is passed down directly, while the other part undergoes the actual computation. At the end of the module, the output channels of the two branches are directly concatenated, thereby avoiding the element-wise sum operation of ShuffleNet-V1. A random shuffle operation is then performed on the final output feature maps so that information is exchanged among the channels.
Specifically, at the beginning of the unit, the input feature map is split along the channel dimension into two branches with c' and c−c' channels respectively; in practice, c' = c/2. The left branch is an identity mapping, while the right branch contains three consecutive convolutions with identical input and output channels, which conforms to principle G1 (equal channel widths minimize memory access cost). Moreover, the two 1×1 convolutions are no longer group convolutions, which conforms to principle G2 (excessive group convolution increases memory access cost); besides, the two branches effectively already form two groups. The outputs of the two branches are no longer added element-wise but concatenated, followed by a channel shuffle on the concatenated result to ensure information exchange between the two branches. In fact, the concat, the channel shuffle, and the channel split of the next module unit can be merged into a single element-level operation, which conforms to principle G4.
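The channel split, concatenation, and channel shuffle described above can be sketched as follows; the transform branch is left as an identity placeholder rather than the three convolutions of the real unit:

```python
import numpy as np

def shuffle_v2_unit(x):
    """Skeleton of a ShuffleNet-V2 basic unit on an H x W x C array.

    Channel Split -> identity branch + transform branch -> concat ->
    channel shuffle. The transform branch is omitted in this sketch.
    """
    c = x.shape[-1]
    left, right = x[..., :c // 2], x[..., c // 2:]  # Channel Split, c' = c/2
    # (the right branch's 1x1 -> 3x3 depthwise -> 1x1 convolutions go here)
    out = np.concatenate([left, right], axis=-1)    # concat, not element-wise add
    # Channel shuffle: regroup so information mixes between the two branches.
    h, w, _ = out.shape
    return out.reshape(h, w, 2, c // 2).transpose(0, 1, 3, 2).reshape(h, w, c)

x = np.arange(8).reshape(1, 1, 8)
y = shuffle_v2_unit(x)  # channels interleaved: [0, 4, 1, 5, 2, 6, 3, 7]
```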
It should be noted that, in the embodiments of the present application, when constructing the image stitching model, the terminal may first divide a second image sample set to obtain second training data and second test data, where the second image sample set includes original images and the multiple decomposed images corresponding to each original image; then construct a second network model based on the second lightweight convolutional neural network and train the second network model with the second training data to obtain an initial stitching model; and finally test the initial stitching model with the second test data to obtain the image stitching model.
It can be seen that, in the present application, if the preprocessing strategy is to divide the image to be processed, the terminal may first divide the image to be processed to obtain the corresponding sub-images; then perform defogging processing on the sub-images according to the image defogging model to obtain the corresponding defogged sub-images; and finally perform stitching processing on the defogged sub-images based on the image stitching model to obtain the defogged image.
FIG. 3 is a second schematic flowchart of the implementation of the image defogging method. As shown in FIG. 3, in the embodiments of the present application, after the terminal determines the preprocessing strategy corresponding to the image to be processed according to the size parameter, i.e., after step 102, the image defogging method performed by the terminal may further include the following steps:
Step 106: If the preprocessing strategy is not to divide the image to be processed, perform defogging processing directly on the image to be processed according to the image defogging model to obtain the defogged image.
In the embodiments of the present application, after determining the preprocessing strategy corresponding to the image to be processed according to the size parameter, the terminal can perform defogging processing on the image to be processed according to the image defogging model based on the preprocessing strategy, thereby obtaining the clear image corresponding to the image to be processed, i.e., the defogged image. If the preprocessing strategy is not to divide the image to be processed, the terminal can use the image defogging model to perform defogging processing on the image to be processed directly.
Exemplarily, in the embodiments of the present application, if the preprocessing strategy is not to divide the image to be processed, the terminal does not need to divide the image to be processed; therefore, the terminal can perform defogging processing directly on the image to be processed according to the image defogging model, thereby obtaining the defogged image corresponding to the image to be processed.
It can be seen that, with the image defogging method proposed in the present application, the terminal can select different preprocessing strategies for images to be processed with different size parameters, thereby improving the efficiency of the defogging processing.
FIG. 4 is a third schematic flowchart of the implementation of the image defogging method. As shown in FIG. 4, in the embodiments of the present application, before the terminal determines the size parameter of the image to be processed, i.e., before step 101, the image defogging method performed by the terminal may further include the following steps:
Step 107: Analyze the image to be processed to obtain an analysis result.
In the embodiments of the present application, after acquiring the image to be processed, the terminal may first analyze the image to be processed to obtain an analysis result.
It is understandable that, in the embodiments of the present application, when analyzing the image to be processed, the terminal may detect feature information in the image to be processed and then obtain the analysis result of the image to be processed based on the feature information. The analysis result can characterize whether the image to be processed is foggy, or the degree of fogging of the image to be processed.
Exemplarily, in the embodiments of the present application, the terminal may use a pre-learned recognition model to analyze the image to be processed; that is, the terminal inputs the image to be processed into the recognition model, which outputs the analysis result of the image to be processed.
Step 108: Determine, according to the analysis result, whether the image to be processed is a fogged image.
In the embodiments of the present application, after analyzing the image to be processed and obtaining the analysis result, the terminal can determine, according to the analysis result, whether the image to be processed is a fogged image, i.e., whether defogging processing needs to be performed on the image to be processed.
Further, in the embodiments of the present application, since the analysis result of the image to be processed can characterize whether the image is foggy, or characterize the degree of fogging of the image, when the terminal determines from the analysis result whether the image to be processed is a fogged image, it may either directly determine a foggy image to be processed as a fogged image and a fog-free image to be processed as a non-fogged image, or determine an image with a higher degree of fogging as a fogged image and an image with a lower degree of fogging as a non-fogged image.
Further, in another example of the present application, before the terminal determines the size parameter of the image to be processed, i.e., before step 101, the terminal may first extract, for each pixel of the image to be processed, the minimum of its RGB components and store it into a grayscale image of the same size as the image to be processed; then divide the grayscale image into multiple 15×15-pixel windows and apply a minimum filter to each window, replacing all pixels of a window with the minimum pixel value of that window, thereby obtaining a dark channel image. Next, the terminal may compute the differences between all pixel values of the dark channel image and the image to be processed, accumulate all the differences to obtain a difference sum, and compare the difference sum with a difference threshold. If the difference sum is less than the difference threshold, the terminal may consider that the image to be processed does not need defogging; if the difference sum is greater than or equal to the difference threshold, the terminal may consider that the image to be processed needs defogging, and determine the image to be processed as a fogged image.
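The fog-detection procedure described in this paragraph can be sketched as follows; the non-overlapping windowing and the threshold comparison mirror the description above, and the concrete threshold value is application-specific and not given in the text:

```python
import numpy as np

def dark_channel(img, patch=15):
    """Per-pixel minimum of the RGB components, then a minimum filter over
    patch x patch windows (non-overlapping here for simplicity)."""
    gray = img.min(axis=2)
    dark = np.empty_like(gray)
    h, w = gray.shape
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            dark[i:i + patch, j:j + patch] = gray[i:i + patch, j:j + patch].min()
    return dark

def is_fogged(img, diff_threshold):
    """Accumulate the differences between the dark channel image and the
    image to be processed; at or above the threshold, treat as fogged."""
    diff_sum = np.abs(img.astype(np.int64) - dark_channel(img)[..., None]).sum()
    return diff_sum >= diff_threshold
```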
It can be seen that, in the embodiments of the present application, the image defogging method proposed in steps 101 to 108 above uses AI to build the image defogging model, and can also use the constructed image stitching model to assist the defogging of large images, thereby improving both the effect and the efficiency of the defogging processing.
Further, the image defogging method proposed in the present application performs defogging accurately through an AI image defogging algorithm, greatly improving the effect of the defogging processing; it also performs stitching through an AI image stitching algorithm, achieving good stitching results for images from multiple angles and multiple scenes. Experimental comparison shows that the deep-learning image defogging method proposed in the present application is more than twice as accurate as traditional algorithms; moreover, since a large image can be divided into multiple sub-images for parallel processing and the results then stitched together, the processing speed for large images can be increased more than fivefold.
It is understandable that the image defogging method proposed in the present application can be widely applied in fields such as video surveillance and image beautification. For example, in the surveillance field, when the fog is heavy, the camera can hardly capture the target object clearly; after the image is defogged, the target object can be seen clearly. In image beautification, when taking a selfie with a mobile phone in bad weather, or when the shot is blurry, the image defogging algorithm can automatically defog the image to make it clearer.
The present application proposes an image defogging method. After determining that an image to be processed is a fogged image, the terminal determines the size parameter of the image to be processed, and determines the preprocessing strategy corresponding to the image to be processed according to the size parameter, where the preprocessing strategy is used to limit the image size. If the preprocessing strategy is to divide the image to be processed, the terminal divides the image to be processed to obtain the corresponding sub-images, performs defogging processing on the sub-images according to the image defogging model to obtain the corresponding defogged sub-images, and performs stitching processing on the defogged sub-images according to the image stitching model to obtain the defogged image corresponding to the image to be processed. It can be seen that, in the embodiments of the present application, the terminal can use an image defogging model obtained through deep learning to defog the image to be processed; at the same time, for a large image to be processed, after using the image defogging model to defog the sub-images into which the image is divided, the terminal can use an image stitching model obtained through deep learning to stitch them, thereby increasing the processing speed while ensuring processing accuracy. Further, since the image defogging model and the image stitching model are obtained through the terminal's minimal network design on the CNN, the above image defogging method can run in the terminal in real time. That is to say, in the present application, performing image defogging based on the image defogging model and the image stitching model obtained through deep learning can greatly increase the processing speed while improving the processing accuracy, thereby achieving high-quality, high-efficiency image defogging.
Based on the above embodiments, in a further embodiment of the present application, FIG. 5 is a fourth schematic flowchart of the implementation of the image defogging method. As shown in FIG. 5, in the embodiments of the present application, before the terminal performs defogging processing on the sub-images according to the image defogging model to obtain the corresponding defogged sub-images, i.e., before step 104, the image defogging method performed by the terminal may further include the following steps:
Step 109: Divide a first image sample set to obtain first training data and first test data, where the first image sample set includes fogged images and the clear images corresponding to the fogged images.
Step 1010: Construct a first network model based on the first lightweight convolutional neural network, and train the first network model with the first training data to obtain an initial defogging model.
Step 1011: Test the initial defogging model with the first test data to obtain the image defogging model.
In the embodiments of the present application, before using the image defogging model to perform defogging processing on the image to be processed, the terminal may first construct the image defogging model.
Further, in the embodiments of the present application, the terminal may construct the first network model based on the first lightweight convolutional neural network, and improve its speed through modification. Specifically, on the basis of the first lightweight convolutional neural network, the MobileNet-V2 network, some convolutional layers may be modified into Depthwise convolution + Pointwise convolution. An image defogging model designed in this way can increase speed without reducing accuracy and can run in the terminal in real time.
Depthwise Separable Convolution decomposes a complete convolution operation into two steps: a Depthwise convolution and a Pointwise convolution.
FIG. 6 is a schematic diagram of conventional convolution processing. As shown in FIG. 6, consider a 5×5-pixel, three-channel color input image (shape 5×5×3). After a convolutional layer with 3×3 kernels (assuming 4 output channels, the kernel shape is 3×3×3×4), 4 feature maps are finally output; with same padding their size is the same as the input layer (5×5), otherwise the size becomes 3×3.
Specifically, unlike a conventional convolution operation, in a Depthwise convolution each kernel is responsible for one channel, and each channel is convolved by only one kernel, whereas in a conventional convolution each kernel operates on every channel of the input image simultaneously. FIG. 7 is a schematic diagram of Depthwise convolution processing. As shown in FIG. 7, for a 5×5-pixel, three-channel color input image (shape 5×5×3), the first convolution operation of the Depthwise convolution, unlike a conventional convolution, is performed entirely in the two-dimensional plane. The number of kernels equals the number of channels of the previous layer (channels and kernels correspond one to one), so a three-channel image yields 3 feature maps after the operation (with same padding, their size is the same as the input layer, 5×5).
The number of feature maps after a Depthwise convolution equals the number of channels of the input layer, so the feature maps cannot be expanded. Moreover, this operation convolves each channel of the input layer independently and does not effectively use the feature information of different channels at the same spatial position. A Pointwise convolution is therefore needed to combine these feature maps into new feature maps.
FIG. 8 is a schematic diagram of Pointwise convolution processing. As shown in FIG. 8, the Pointwise convolution is very similar to a conventional convolution; its kernel size is 1×1×M, where M is the number of channels of the previous layer. The convolution here therefore combines the feature maps of the previous step with weights along the depth direction to generate new feature maps, with one output feature map per kernel. For example, 4 kernels correspond to 4 output feature maps.
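The parameter saving can be checked with a small count for the example above (3×3 kernels, 3 input channels, 4 output feature maps):

```python
def standard_conv_params(k, c_in, c_out):
    # Standard convolution: one k x k x c_in kernel per output feature map.
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    # Depthwise: one k x k kernel per input channel;
    # Pointwise: one 1 x 1 x c_in kernel per output feature map.
    return k * k * c_in + c_in * c_out

standard = standard_conv_params(3, 3, 4)    # 3*3*3*4 = 108 weights
separable = separable_conv_params(3, 3, 4)  # 27 + 12 = 39 weights
```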
Comparison of MobileNet-V1 and MobileNet-V2:
MobileNet-V1: Depthwise + Pointwise, followed by a Rectified Linear Unit (ReLU) activation. MobileNet-V2: first a 1×1 Pointwise channel expansion, then Depthwise + Pointwise channel compression, followed by a linear activation. Since a Depthwise convolution operates on a single channel, increasing the preceding number of channels has little effect on its computational cost.
Comparison of MobileNet-V2 and ResNet:
ResNet residual block: the original ResNet connection contains two convolutions, while the ResNet bottleneck connection is 1×1 channel compression + convolution + 1×1 channel expansion. MobileNet-V2 inverted residual block: 1×1 channel expansion + convolution + 1×1 channel reduction.
Specifically, the network structure of MobileNet-V2 is shown in Table 1:
Table 1

Input        Operator       t    c     n    s
224²×3       conv2d         -    32    1    2
112²×32      bottleneck     1    16    1    1
112²×16      bottleneck     6    24    2    2
56²×24       bottleneck     6    32    3    2
28²×32       bottleneck     6    64    4    2
14²×64       bottleneck     6    96    3    1
14²×96       bottleneck     6    160   3    2
7²×160       bottleneck     6    320   1    1
7²×320       conv2d 1×1     -    1280  1    1
7²×1280      avgpool 7×7    -    -     1    -
1×1×1280     conv2d 1×1     -    k     -    -
Here, t is the expansion factor applied to the input channels, c is the number of output channels, n is the number of times the module is repeated, and s is the stride.
Further, in the embodiments of the present application, the terminal may first construct the first network model using the MobileNet-V2 network from among convolutional neural networks (CNNs), and then obtain the image defogging model by training and testing the first network model.
It can be understood that, in the embodiments of the present application, the terminal may first obtain a first image sample set, where the first image sample set includes fogged images and the clear images corresponding to them. That is, the first image sample set stores fogged images and clear images in one-to-one correspondence.
Exemplarily, in the embodiments of the present application, the first image sample set includes the fogged and clear images corresponding to 100 different scenes; that is, the first image sample set stores 200 frames of images.
It should be noted that, in the embodiments of the present application, the first image sample set may be collected by the terminal itself or received by the terminal from another image acquisition device; this application imposes no specific limitation.
Further, in the embodiments of the present application, after acquiring the first image sample set, the terminal may divide it to obtain the first training data and the first test data, where the first training data is used to train the first network model and the first test data is used to test the initial defogging model.
It can be understood that both the first training data and the first test data contain one-to-one pairs of fogged and clear images; however, the two sets are disjoint. That is, any pair consisting of a fogged image and its clear image in the first image sample set can be assigned only to the first training data or to the first test data, and cannot serve simultaneously as a training image in the first training data and a test image in the first test data.
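The disjoint split described above can be sketched as follows (the 80/20 ratio and the file names are illustrative assumptions, not taken from the source):

```python
import random

def split_pairs(pairs, train_ratio=0.8, seed=0):
    """Split (fogged, clear) pairs into disjoint train/test sets.

    Each pair lands in exactly one of the two sets, as required:
    a pair may never serve as both training and test data.
    """
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    cut = int(len(pairs) * train_ratio)
    return pairs[:cut], pairs[cut:]

# 100 scenes -> 100 (fogged, clear) pairs, e.g. 80 train / 20 test.
pairs = [(f"fog_{i}.png", f"clear_{i}.png") for i in range(100)]
train_data, test_data = split_pairs(pairs)
print(len(train_data), len(test_data))  # 80 20
```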
Further, in the embodiments of the present application, after the terminal constructs the first network model based on the first lightweight convolutional neural network and divides the first image sample set into first training data and first test data, the terminal may first train the first network model with the first training data to obtain an initial defogging model, and then test the initial defogging model with the first test data to obtain the final image defogging model.
It should be noted that, in the embodiments of the present application, when training the first network model with the first training data, the terminal may label the first training data; after repeated iterations over the first training data, an initial defogging model with high recognition accuracy is finally obtained.
Specifically, in the embodiments of the present application, in the process of training on the first training data to obtain the initial defogging model and then using the first test data to obtain the image defogging model, the terminal may combine multiple loss functions. Exemplarily, in this application the terminal may use Softmax loss and focal loss jointly for training.
Softmax loss is one of the most commonly used loss functions and is widely used in both image classification and segmentation tasks. It combines softmax with the cross-entropy loss, so its full name is softmax with cross-entropy loss. Open-source frameworks such as Caffe and TensorFlow implement the two in a single layer rather than as separate layers, which makes the numerical computation more stable, because the positive exponentials in the softmax can take very large values.
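A minimal sketch of the fused form (not the patent's implementation): computing -log softmax via the log-sum-exp trick avoids ever forming the large exponentials that make a separate softmax layer unstable.

```python
import math

def softmax_cross_entropy(logits, label):
    """Softmax + cross-entropy fused in one step.

    Subtracting the max logit (log-sum-exp trick) keeps every
    exponential bounded by 1, which is why frameworks fuse the two
    layers: exp(z) for a large positive z would overflow if softmax
    were computed on its own first.
    """
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    # cross-entropy = -log softmax(logits)[label]
    return log_sum - logits[label]

# A naive separate softmax would overflow here (exp(1000) -> inf),
# but the fused form stays finite.
loss = softmax_cross_entropy([1000.0, 0.0, -1000.0], 0)
print(loss)  # ~0.0: the correct class already dominates
```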
Focal loss was mainly designed to address the severe imbalance between positive and negative samples in one-stage object detection. It down-weights the large number of easy negative samples during training, and can also be understood as a form of hard example mining.
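A sketch of the binary focal loss with the commonly used gamma = 2 and alpha = 0.25 defaults (illustrative values; the patent does not specify them), showing how confidently classified easy negatives are down-weighted:

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t).

    p is the predicted probability of the positive class, y in {0, 1}.
    The (1 - p_t)**gamma factor shrinks the loss of easy, already
    well-classified samples, so the flood of easy negatives no longer
    dominates training.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy_negative = focal_loss(0.01, 0)   # confidently correct: tiny loss
hard_negative = focal_loss(0.90, 0)   # badly wrong: large loss
print(easy_negative < hard_negative)  # True
```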
It should be noted that, in the embodiments of the present application, when the terminal tests the initial defogging model with the first test data to obtain the image defogging model, it may first test the initial defogging model with the first test data to generate a first test result, and then correct the initial defogging model according to the first test result to obtain the image defogging model.
That is to say, in this application, after obtaining the initial defogging model trained on the first training data, the terminal may evaluate its defogging performance on the basis of the first test data. Specifically, the terminal may feed the fogged images in the first test data into the initial defogging model, output the defogged images, and then compare the image information of each defogged image with that of the corresponding clear image to obtain the first test result.
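The patent does not name the comparison metric; as an illustrative assumption, peak signal-to-noise ratio (PSNR) is one common way to compare a defogged output against its ground-truth clear image:

```python
import numpy as np

def psnr(img_a, img_b, max_val=255.0):
    """Peak signal-to-noise ratio between two uint8 images (higher = closer)."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy stand-ins for a defogged output and its ground-truth clear image.
rng = np.random.default_rng(0)
clear = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
noise = rng.integers(-5, 6, size=clear.shape)
defogged = np.clip(clear.astype(np.int16) + noise, 0, 255).astype(np.uint8)
score = psnr(defogged, clear)
print(score > 30)  # small residual error -> high PSNR
```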
It can be understood that, in the embodiments of the present application, after obtaining the first test result, the terminal may correct the initial defogging model obtained by training on the basis of the first test result, thereby improving the defogging effect and obtaining the optimized model, that is, the image defogging model.
It can thus be seen that, in the embodiments of this application, during the design of the CNN network structure, by pruning and optimizing the network the terminal enables the image defogging model to run quickly on the terminal with very low MACs, fully meeting the terminal's real-time detection requirements. During algorithm design, the model can also be further optimized for a variety of image scenes to improve the effect of the defogging algorithm.
This application proposes an image defogging method. The terminal can use an image defogging model obtained by deep learning to defog the image to be processed. Meanwhile, for a large image to be processed, after using the image defogging model to defog the sub-images into which the image has been divided, the terminal can use an image stitching model obtained by deep learning to stitch them, improving processing speed while ensuring processing accuracy. Further, the image defogging model and the image stitching model are obtained by the terminal through an extremely compact network design based on CNNs, so the above image defogging method can run in real time on the terminal. That is, in this application, performing image defogging based on the image defogging model and the image stitching model obtained by deep learning can greatly increase processing speed while improving processing accuracy, thereby achieving high-quality, high-efficiency image defogging.
Based on the foregoing embodiments, in another embodiment of the present application, FIG. 9 is a fifth schematic flowchart of an implementation of the image defogging method. As shown in FIG. 9, in the embodiments of the present application, before the terminal stitches the defogged sub-images according to the image stitching model to obtain the defogged image corresponding to the image to be processed, that is, before step 105, the method for the terminal to perform image defogging may further include the following steps:
Step 1012: Divide a second image sample set to obtain second training data and second test data, where the second image sample set includes original images and the multiple decomposed images corresponding to each original image.
Step 1013: Construct a second network model based on a second lightweight convolutional neural network, and train the second network model with the second training data to obtain an initial stitching model.
Step 1014: Test the initial stitching model with the second test data to obtain the image stitching model.
In the embodiments of the present application, before using the image stitching model to stitch the multiple defogged sub-images corresponding to the image to be processed, the terminal may first construct the image stitching model.
Further, in the embodiments of the present application, the terminal may construct the second network model based on the second lightweight convolutional neural network and increase its speed through modification. Specifically, on the basis of the second lightweight convolutional neural network, the ShuffleNet-V2 network, some convolutional layers can be removed and others replaced with Depthwise convolution + Pointwise convolution. An image stitching model designed in this way achieves a better stitching effect and can run in real time on the terminal.
ShuffleNet-V1 makes heavy use of 1×1 group convolutions, which violates principle G2 (excessive group convolution increases memory access cost). In addition, ShuffleNet-V1 adopts a bottleneck layer similar to that of ResNet, in which the input and output channel counts differ, violating principle G1 (equal channel widths minimize memory access cost); using too many groups also violates principle G3 (network fragmentation reduces parallelism); and its shortcut connections contain a large number of element-wise Add operations, violating principle G4 (element-wise operations cannot be ignored).
ShuffleNet-V2 remedies these defects of ShuffleNet-V1 by introducing the channel split. In its downsampling module, however, there is no channel split; instead, each branch directly receives a copy of the input and applies stride-2 downsampling, and after the branches are concatenated the spatial size of the feature map is halved while the number of channels is doubled.
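A minimal NumPy sketch of the two mechanisms described above: the channel split used in the basic unit, the channel shuffle that lets the two branches exchange information, and the copy-and-concatenate behavior of the downsampling unit (the stride-2 branches here are simple subsampling stand-ins for the real convolutional branches):

```python
import numpy as np

def channel_split(x):
    """ShuffleNet-V2 basic unit: split channels into two halves (NCHW)."""
    c = x.shape[1] // 2
    return x[:, :c], x[:, c:]

def channel_shuffle(x, groups=2):
    """Interleave channels across groups so the branches exchange information."""
    n, c, h, w = x.shape
    return x.reshape(n, groups, c // groups, h, w) \
            .transpose(0, 2, 1, 3, 4).reshape(n, c, h, w)

x = np.zeros((1, 8, 4, 4))
a, b = channel_split(x)
print(a.shape, b.shape)  # (1, 4, 4, 4) (1, 4, 4, 4)

# Downsampling unit: no split; both branches take the full input at
# stride 2, and concatenation doubles the channels while the spatial
# size halves (8 -> 16 channels, 4x4 -> 2x2 here).
left = x[:, :, ::2, ::2]   # stand-in for the stride-2 branch outputs
right = x[:, :, ::2, ::2]
y = np.concatenate([left, right], axis=1)
print(y.shape)  # (1, 16, 2, 2)
```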
The overall network structure of ShuffleNet-V2 is very similar to that of ShuffleNet-V1. An additional 1×1 convolutional layer is placed before the GlobalPool layer to mix channel features, and the number of channels in each block is scaled proportionally to obtain networks with different FLOPS; that is, a channel multiplier such as 0.5× or 1× is set for each block so that the complexity of the model can be adjusted.
Specifically, the network structure of ShuffleNet-V2 is shown in Table 2:
Table 2
[Table 2, the ShuffleNet-V2 network structure, is provided only as an image (PCTCN2021085694-appb-000001) in the original publication; its contents are not recoverable from the text.]
Further, in the embodiments of the present application, the terminal may first construct the second network model using the ShuffleNet-V2 network from among convolutional neural networks (CNNs), and then obtain the image stitching model by training and testing the second network model.
It can be understood that, in the embodiments of the present application, the terminal may first obtain the second image sample set, where the second image sample set includes the original images before decomposition and the multiple decomposed images into which each original image is divided. That is, the second image sample set stores corresponding pairs of large images before decomposition and small images after decomposition.
Exemplarily, in the embodiments of the present application, the second image sample set includes original images and decomposed images corresponding to 100 different scenes, where one frame of original image may correspond to m frames of decomposed images, m being an integer greater than 1.
It can be understood that different original images in the second image sample set may correspond to different numbers of decomposed images. For example, original image A corresponds to 6 decomposed images a1 to a6, obtained by dividing original image A, while original image B corresponds to 9 decomposed images b1 to b9, obtained by dividing original image B.
It should be noted that, in the embodiments of the present application, the second image sample set may be collected and divided by the terminal itself, or received by the terminal from another image acquisition device; this application imposes no specific limitation.
Further, in the embodiments of the present application, after acquiring the second image sample set, the terminal may divide it to obtain the second training data and the second test data, where the second training data is used to train the second network model and the second test data is used to test the initial stitching model.
It can be understood that both the second training data and the second test data contain corresponding original images and decomposed images; however, the two sets are disjoint. That is, any group consisting of an original image and its decomposed images in the second image sample set can be assigned only to the second training data or to the second test data, and cannot serve simultaneously as a training image in the second training data and a test image in the second test data.
Further, in the embodiments of the present application, after the terminal constructs the second network model based on the second lightweight convolutional neural network and divides the second image sample set into second training data and second test data, the terminal may first train the second network model with the second training data to obtain an initial stitching model, and then test the initial stitching model with the second test data to obtain the final image stitching model.
It should be noted that, in the embodiments of the present application, when training the second network model with the second training data, the terminal may label the second training data; after repeated iterations over the second training data, an initial stitching model with high recognition accuracy is finally obtained.
Specifically, in the embodiments of the present application, in the process of training on the second training data to obtain the initial stitching model and then using the second test data to obtain the image stitching model, the terminal may combine multiple loss functions. Exemplarily, in this application the terminal may use Softmax loss and focal loss jointly for training.
It should be noted that, in the embodiments of the present application, when the terminal tests the initial stitching model with the second test data to obtain the image stitching model, it may first test the initial stitching model with the second test data to generate a second test result, and then correct the initial stitching model according to the second test result to obtain the image stitching model.
That is to say, in this application, after obtaining the initial stitching model trained on the second training data, the terminal may evaluate its stitching performance on the basis of the second test data. Specifically, the terminal may feed the decomposed images in the second test data into the initial stitching model, output the stitched image, and then compare the image information of the stitched image with that of the corresponding original image to obtain the second test result.
It can be understood that, after obtaining the second test result, the terminal may correct the initial stitching model obtained by training on the basis of the second test result, thereby improving the stitching effect and obtaining the optimized model, that is, the image stitching model.
It can thus be seen that, in the embodiments of this application, during the design of the CNN network structure, by pruning and optimizing the network the terminal enables the image stitching model to run quickly on the terminal with very low MACs, fully meeting the terminal's real-time detection requirements. During algorithm design, the model can also be further optimized for a variety of image scenes to improve the effect of the stitching algorithm.
This application proposes an image defogging method. The terminal can use an image defogging model obtained by deep learning to defog the image to be processed. Meanwhile, for a large image to be processed, after using the image defogging model to defog the sub-images into which the image has been divided, the terminal can use an image stitching model obtained by deep learning to stitch them, improving processing speed while ensuring processing accuracy. Further, the image defogging model and the image stitching model are obtained by the terminal through an extremely compact network design based on CNNs, so the above image defogging method can run in real time on the terminal. That is, in this application, performing image defogging based on the image defogging model and the image stitching model obtained by deep learning can greatly increase processing speed while improving processing accuracy, thereby achieving high-quality, high-efficiency image defogging.
Based on the foregoing embodiments, in yet another embodiment of the present application, FIG. 10 is a sixth flowchart of the image defogging process. As shown in FIG. 10, the method for the terminal to defog the image to be processed may include the following steps:
Step 201: Determine whether the image to be processed is a fogged image; if so, execute step 202; otherwise, execute step 207.
In the embodiments of the present application, after acquiring the image to be processed, the terminal may first determine whether it is a fogged image. Specifically, the terminal may analyze the image to be processed to obtain an analysis result, and then judge from the analysis result whether the image to be processed is a fogged image.
Step 202: Determine the size parameter of the image to be processed.
In the embodiments of the present application, if the image to be processed is determined to be a fogged image, the terminal needs to perform the defogging procedure on it. First, the terminal needs to determine the image size of the image to be processed, that is, its size parameter.
Step 203: Determine whether the size parameter of the image to be processed is greater than or equal to a preset size threshold; if so, execute step 204; otherwise, execute step 208.
In the embodiments of the present application, the terminal may determine the preprocessing strategy corresponding to the image to be processed according to the size parameter. Specifically, the terminal may compare the size parameter of the image to be processed with the preset size threshold and determine the corresponding preprocessing strategy from the comparison result, where the preprocessing strategy is used to impose a limit on image size. That is, for images with different size parameters, the terminal determines different preprocessing strategies.
Exemplarily, in the embodiments of the present application, for a large image, the preprocessing strategy determined by the terminal may be to first divide the image and then defog it; for a small image, the strategy may be to defog it directly.
Step 204: Divide the image to be processed to obtain the sub-images corresponding to the image to be processed.
In the embodiments of the present application, if the size parameter of the image to be processed is greater than or equal to the preset size threshold, that is, the preprocessing strategy is to divide the image to be processed, the terminal may divide the image to be processed to obtain its corresponding sub-images.
To improve the efficiency of defogging, for a large image to be processed, the terminal may first divide it into multiple sub-images of equal size and then defog the divided sub-images, thereby accelerating the defogging process.
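The divide-then-recombine idea can be sketched as follows (illustrative only: the naive stitch_tiles below is the trivial inverse of the split, whereas the patent uses a learned stitching model for the recombination step, which can also blend tile seams):

```python
import numpy as np

def split_into_tiles(img, rows, cols):
    """Divide an H x W x C image into rows*cols equally sized sub-images."""
    h, w = img.shape[:2]
    th, tw = h // rows, w // cols   # assumes H, W divisible by rows, cols
    return [img[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
            for r in range(rows) for c in range(cols)]

def stitch_tiles(tiles, rows, cols):
    """Naive inverse of split_into_tiles: paste tiles back in row order."""
    return np.vstack([np.hstack(tiles[r * cols:(r + 1) * cols])
                      for r in range(rows)])

img = np.arange(6 * 9 * 3, dtype=np.uint8).reshape(6, 9, 3)
tiles = split_into_tiles(img, 2, 3)        # 6 equal sub-images
print(len(tiles), tiles[0].shape)          # 6 (3, 3, 3)
print(np.array_equal(stitch_tiles(tiles, 2, 3), img))  # True
```

In the patent's pipeline, each tile would pass through the image defogging model between the split and the stitch.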
Step 205: Obtain the defogged sub-images corresponding to the sub-images based on the image defogging model.
In the embodiments of the present application, the terminal defogs the sub-images according to the image defogging model to obtain the defogged sub-images corresponding to them. To improve the defogging effect, the terminal may use a deep-learning image defogging model to defog the multiple sub-images separately.
Exemplarily, in the embodiments of the present application, on the basis of the MobileNet-V2 network, the terminal may replace some convolutional layers with Depthwise convolution + Pointwise convolution, improving image processing efficiency while preserving the processing effect.
Step 206: Stitch the defogged sub-images using the image stitching model to obtain the defogged image.
In the embodiments of the present application, the terminal may stitch all the defogged sub-images according to the image stitching model, thereby obtaining the clear image corresponding to the image to be processed, that is, the defogged image. To improve the stitching effect, the terminal may use a deep-learning image stitching model to stitch the multiple defogged sub-images into the corresponding defogged image.
Exemplarily, in the embodiments of the present application, on the basis of the ShuffleNet-V2 network, the terminal may remove some convolutional layers and accelerate the network by replacing some convolutional layers with Depthwise convolution + Pointwise convolution. Based on this extremely compact CNN design, image processing efficiency is improved while the stitching effect is preserved.
Step 207: Skip the defogging procedure.
In the embodiments of the present application, if the image to be processed is determined not to be a fogged image, the terminal does not need to perform the defogging procedure on it.
Step 208: Obtain the defogged image corresponding to the image to be processed based on the image defogging model.
In the embodiments of the present application, if the size parameter of the image to be processed is smaller than the preset size threshold, that is, the preprocessing strategy is not to divide the image to be processed, the terminal may directly defog the image to be processed according to the image defogging model to obtain the corresponding defogged image.
进一步地,在本申请的实施例中,终端在执行步骤205或步骤208之前,终端对待处理图像进行去雾处理的方法还可以包括以下步骤:Further, in the embodiment of the present application, before the terminal performs step 205 or step 208, the method for the terminal to defog the image to be processed may further include the following steps:
步骤209、基于第一图像样本集进行深度学习,获得图像去雾模型。Step 209: Perform deep learning based on the first image sample set to obtain an image defogging model.
在本申请的实施例中,第一图像样本集包括雾化图像和雾化图像对应的清晰图像。终端将第一图像样本集划分为第一训练数据和第一测试数据,利用第一训练数据进行模型训练,获得初始去雾模型,再利用第一测试数据对初始去雾模型进行测试,获得图像去雾模型。In the embodiment of the present application, the first image sample set includes a fogged image and a clear image corresponding to the fogged image. The terminal divides the first image sample set into the first training data and the first test data, uses the first training data for model training to obtain the initial defogging model, and then uses the first test data to test the initial defogging model to obtain the image Dehazing model.
进一步地,在本申请的实施例中,终端在执行步骤206之前,终端对待处理图像进行去雾处理的方法还可以包括以下步骤:Further, in the embodiment of the present application, before the terminal performs step 206, the method for the terminal to defog the image to be processed may further include the following steps:
步骤2010、基于第二图像样本集进行深度学习,获得图像拼接模型。Step 2010: Perform deep learning based on the second image sample set to obtain an image mosaic model.
在本申请的实施例中,第二图像样本集包括原始图像和原始图像对应的多个分解图像。终端将第二图像样本集划分为第二训练数据和第二测试数据,利用第二训练数据进行模型训练,获得初始拼接模型,再利用第二测试数据对初始拼接模型进行测试,获得图像拼接模型。In the embodiment of the present application, the second image sample set includes the original image and multiple decomposed images corresponding to the original image. The terminal divides the second image sample set into second training data and second test data, uses the second training data for model training to obtain the initial splicing model, and then uses the second test data to test the initial splicing model to obtain the image splicing model .
可以理解的是,本申请提出的图像去雾方法不仅仅可用于手机相册中的图像美化,还可以用于视频监控领域,改进人脸识别等场景下。比如在视频监控中,当雾比较大时,监控设备拍摄得到的图像质量非常差,很难在视频中清晰的看到目标人物,经过图像去雾后,可以较好的还原目标人物的面貌。It is understandable that the image defogging method proposed in this application can not only be used for image beautification in mobile phone albums, but also can be used in the field of video surveillance to improve face recognition and other scenarios. For example, in video surveillance, when the fog is relatively large, the quality of the image captured by the surveillance equipment is very poor, and it is difficult to clearly see the target person in the video. After the image is defogged, the face of the target person can be better restored.
进一步地,本申请提出的图像去雾方法也可以在人脸识别中作为辅助手段,提高人脸识别效果,当人脸识别摄像头处于逆光或者其他天气不好的情况时,使用去雾算法可以显著的提高识别效果。Further, the image defogging method proposed in this application can also be used as an auxiliary means in face recognition to improve the effect of face recognition. When the face recognition camera is in backlight or other bad weather conditions, the use of the defogging algorithm can be significant To improve the recognition effect.
This application proposes an image defogging method. The terminal can use an image defogging model obtained through deep learning to defog the image to be processed. At the same time, for a large image to be processed, the terminal can divide the image into sub-images, defog each sub-image with the image defogging model, and then stitch the results with an image stitching model obtained through deep learning, thereby increasing processing speed while preserving processing accuracy. Further, the image defogging model and the image stitching model are obtained by the terminal applying a minimal network design to a CNN, so the above image defogging method can run on the terminal in real time. In other words, in this application, performing image defogging based on the deep-learned image defogging model and image stitching model can greatly increase processing speed while improving processing accuracy, thereby achieving high-quality, high-efficiency image defogging.
Based on the foregoing embodiments, in another embodiment of the present application, FIG. 11 is a first schematic diagram of the structure of the terminal. As shown in FIG. 11, the terminal 10 proposed in this embodiment of the present application may include a determining part 11, a dividing part 12, a defogging part 13, a stitching part 14, an analysis part 15, a judging part 16, and an acquiring part 17.
The determining part 11 is configured to determine a size parameter of the image to be processed after the image to be processed is determined to be a fogged image, and to determine a preprocessing strategy corresponding to the image to be processed according to the size parameter, wherein the preprocessing strategy is used to limit the image size.
The dividing part 12 is configured to divide the image to be processed to obtain sub-images corresponding to the image to be processed if the preprocessing strategy is to divide the image to be processed.
The defogging part 13 is configured to defog the sub-images according to an image defogging model to obtain the defogged sub-images corresponding to the sub-images.
The stitching part 14 is configured to stitch the defogged sub-images according to an image stitching model to obtain the defogged image corresponding to the image to be processed.
Further, in an embodiment of the present application, the analysis part 15 is configured to analyze the image to be processed to obtain an analysis result before the size parameter of the image to be processed is determined.
The judging part 16 is configured to judge, according to the analysis result, whether the image to be processed is a fogged image.
Further, in an embodiment of the present application, the determining part 11 is specifically configured to determine that the preprocessing strategy is to divide the image to be processed if the size parameter is greater than or equal to a preset size threshold, and to determine that the preprocessing strategy is not to divide the image to be processed if the size parameter is less than the preset size threshold.
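The threshold decision made by the determining part 11 can be sketched as follows. The 1024-pixel width and height limits are placeholder values; the application leaves the preset size threshold unspecified.

```python
def preprocessing_strategy(width, height, max_width=1024, max_height=1024):
    """Return 'divide' when the size parameter reaches the preset size
    threshold, and 'no_divide' otherwise. The 1024-pixel limits are
    placeholder values chosen for illustration only."""
    if width >= max_width or height >= max_height:
        return "divide"      # large image: split into sub-images first
    return "no_divide"       # small image: defog the whole image directly

print(preprocessing_strategy(4032, 3024))  # large photo -> "divide"
print(preprocessing_strategy(640, 480))    # small photo -> "no_divide"
```

Note that the boundary case (size parameter exactly equal to the threshold) falls on the "divide" side, matching the "greater than or equal to" wording above.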
Further, in an embodiment of the present application, the acquiring part 17 is configured, before the sub-images are defogged according to the image defogging model to obtain the corresponding defogged sub-images, to divide a first image sample set to obtain first training data and first test data, wherein the first image sample set includes fogged images and the clear images corresponding to the fogged images; to construct a first network model based on a first lightweight convolutional neural network and train the first network model according to the first training data to obtain an initial defogging model; and to test the initial defogging model according to the first test data to obtain the image defogging model.
Further, in an embodiment of the present application, the acquiring part 17 is specifically configured to test the initial defogging model according to the first test data to generate a first test result, and to correct the initial defogging model according to the first test result to obtain the image defogging model.
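The application does not name the metric used to compare a model output against its reference image when generating a test result; peak signal-to-noise ratio (PSNR) is one plausible choice, sketched here in pure Python under that assumption.

```python
import math

def psnr(output_pixels, reference_pixels, peak=255.0):
    """Peak signal-to-noise ratio between two equally sized images given
    as flat lists of pixel values. Higher is better; identical images
    give infinity. Using PSNR as the 'image information comparison'
    metric is an assumption, not something the application specifies."""
    mse = sum((o - r) ** 2
              for o, r in zip(output_pixels, reference_pixels)) / len(output_pixels)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)

defogged = [200, 201, 199, 200]   # hypothetical defogging-model output
clear    = [200, 200, 200, 200]   # hypothetical ground-truth clear image
score = psnr(defogged, clear)
```

A low score on the test data would then trigger the correction step, for example by further training the initial model.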
Further, in an embodiment of the present application, the acquiring part 17 is further configured, before the defogged sub-images are stitched according to the image stitching model to obtain the defogged image corresponding to the image to be processed, to divide a second image sample set to obtain second training data and second test data, wherein the second image sample set includes original images and the multiple decomposed images corresponding to each original image; to construct a second network model based on a second lightweight convolutional neural network and train the second network model according to the second training data to obtain an initial stitching model; and to test the initial stitching model according to the second test data to obtain the image stitching model.
Further, in an embodiment of the present application, the acquiring part 17 is specifically configured to test the initial stitching model according to the second test data to generate a second test result, and to correct the initial stitching model according to the second test result to obtain the image stitching model.
Further, in an embodiment of the present application, the defogging part 13 is further configured, after the preprocessing strategy corresponding to the image to be processed is determined according to the size parameter, to defog the image to be processed directly according to the image defogging model to obtain the defogged image if the preprocessing strategy is not to divide the image to be processed.
FIG. 12 is a second schematic diagram of the structure of the terminal. As shown in FIG. 12, the terminal 10 proposed in this embodiment of the present application may further include a processor 18 and a memory 19 storing instructions executable by the processor 18. Further, the terminal 10 may also include a communication interface 110 and a bus 111 for connecting the processor 18, the memory 19, and the communication interface 110.
In an embodiment of the present application, the processor 18 may be at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a controller, a microcontroller, and a microprocessor. It should be understood that, for different devices, other electronic components may be used to implement the above processor functions, which is not specifically limited in the embodiments of the present application. The terminal 10 may also include the memory 19, which may be connected to the processor 18; the memory 19 is used to store executable program code, the program code including computer operation instructions. The memory 19 may include a high-speed RAM memory, and may also include non-volatile memory, for example, at least two disk memories.
In an embodiment of the present application, the bus 111 is used to connect the communication interface 110, the processor 18, and the memory 19, and to carry the mutual communication among these components.
In an embodiment of the present application, the memory 19 is used to store instructions and data.
Further, in an embodiment of the present application, the processor 18 is configured to: determine a size parameter of the image to be processed after the image to be processed is determined to be a fogged image; determine a preprocessing strategy corresponding to the image to be processed according to the size parameter, wherein the preprocessing strategy is used to limit the image size; if the preprocessing strategy is to divide the image to be processed, divide the image to be processed to obtain sub-images corresponding to the image to be processed; defog the sub-images according to an image defogging model to obtain the defogged sub-images corresponding to the sub-images; and stitch the defogged sub-images according to an image stitching model to obtain the defogged image corresponding to the image to be processed.
In practical applications, the memory 19 may be a volatile memory, such as a Random-Access Memory (RAM); or a non-volatile memory, such as a Read-Only Memory (ROM), a flash memory, a Hard Disk Drive (HDD), or a Solid-State Drive (SSD); or a combination of the above types of memory, and it provides instructions and data to the processor 18.
In addition, the functional modules in this embodiment may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method of this embodiment. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
An embodiment of the present application proposes a terminal. The terminal can use an image defogging model obtained through deep learning to defog the image to be processed. At the same time, for a large image to be processed, the terminal can divide the image into sub-images, defog each sub-image with the image defogging model, and then stitch the results with an image stitching model obtained through deep learning, thereby increasing processing speed while preserving processing accuracy. Further, the image defogging model and the image stitching model are obtained by the terminal applying a minimal network design to a CNN, so the above image defogging method can run on the terminal in real time. In other words, in this application, performing image defogging based on the deep-learned image defogging model and image stitching model can greatly increase processing speed while improving processing accuracy, thereby achieving high-quality, high-efficiency image defogging.
An embodiment of the present application provides a computer-readable storage medium on which a program is stored; when the program is executed by a processor, the image defogging method described above is implemented.
Specifically, the program instructions corresponding to the image defogging method in this embodiment may be stored on a storage medium such as an optical disk, a hard disk, or a USB flash drive. When the program instructions corresponding to the image defogging method in the storage medium are read or executed by an electronic device, the following steps are included:
after determining that the image to be processed is a fogged image, determining a size parameter of the image to be processed;
determining a preprocessing strategy corresponding to the image to be processed according to the size parameter, wherein the preprocessing strategy is used to limit the image size;
if the preprocessing strategy is to divide the image to be processed, dividing the image to be processed to obtain sub-images corresponding to the image to be processed;
defogging the sub-images according to an image defogging model to obtain defogged sub-images corresponding to the sub-images;
stitching the defogged sub-images according to an image stitching model to obtain a defogged image corresponding to the image to be processed.
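The steps above can be sketched end-to-end as follows. The string-based "image", the two-way division, and the toy stand-ins for the learned defogging and stitching models are all placeholder assumptions, not the application's actual interfaces.

```python
def defog_pipeline(image, size, defog_model, stitch_model, threshold=1024):
    """Decide the preprocessing strategy from the size parameter, divide
    if needed, defog each sub-image, then stitch. All interfaces here
    are illustrative assumptions."""
    width, height = size
    if width < threshold and height < threshold:
        return defog_model(image)              # no division: defog directly
    mid = len(image) // 2                      # toy two-way division
    sub_images = [image[:mid], image[mid:]]
    defogged_subs = [defog_model(s) for s in sub_images]
    return stitch_model(defogged_subs)

# toy models: "defogging" strips a 'fog:' prefix, "stitching" concatenates
defog_model = lambda img: img.replace("fog:", "")
stitch_model = lambda subs: "".join(subs)

small = defog_pipeline("fog:abc", (640, 480), defog_model, stitch_model)
large = defog_pipeline("fog:abcfog:def", (4032, 3024), defog_model, stitch_model)
```

In practice both models would be the lightweight CNNs described above, and the division would produce spatial tiles rather than string halves.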
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a terminal, or a computer program product. Therefore, the present application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present application is described with reference to schematic flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of the present application. It should be understood that each flow and/or block in the schematic flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the schematic flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus; the instruction apparatus implements the functions specified in one or more flows of the schematic flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the schematic flowchart and/or one or more blocks of the block diagram.
The above are only preferred embodiments of the present application and are not intended to limit the protection scope of the present application.
Industrial applicability
The embodiments of the present application provide an image defogging method, a terminal, and a computer storage medium. After determining that the image to be processed is a fogged image, the terminal determines a size parameter of the image to be processed and determines a preprocessing strategy corresponding to the image to be processed according to the size parameter, wherein the preprocessing strategy is used to limit the image size. If the preprocessing strategy is to divide the image to be processed, the terminal divides the image to be processed to obtain sub-images corresponding to the image to be processed, defogs the sub-images according to an image defogging model to obtain the corresponding defogged sub-images, and stitches the defogged sub-images according to an image stitching model to obtain the defogged image corresponding to the image to be processed. It can thus be seen that, in the embodiments of the present application, the terminal can use an image defogging model obtained through deep learning to defog the image to be processed; at the same time, for a large image to be processed, the terminal can defog the sub-images obtained by dividing the image and then stitch them with an image stitching model obtained through deep learning, thereby increasing processing speed while preserving processing accuracy. Further, the image defogging model and the image stitching model are obtained by the terminal applying a minimal network design to a CNN, so the above image defogging method can run on the terminal in real time. In other words, in this application, performing image defogging based on the deep-learned image defogging model and image stitching model can greatly increase processing speed while improving processing accuracy, thereby achieving high-quality, high-efficiency image defogging.

Claims (20)

1. An image defogging method, the method comprising:
    after determining that an image to be processed is a fogged image, determining a size parameter of the image to be processed;
    determining a preprocessing strategy corresponding to the image to be processed according to the size parameter, wherein the preprocessing strategy is used to limit the image size;
    if the preprocessing strategy is to divide the image to be processed, dividing the image to be processed to obtain sub-images corresponding to the image to be processed;
    defogging the sub-images according to an image defogging model to obtain defogged sub-images corresponding to the sub-images; and
    stitching the defogged sub-images according to an image stitching model to obtain a defogged image corresponding to the image to be processed.
2. The method according to claim 1, wherein before the determining the size parameter of the image to be processed, the method further comprises:
    analyzing the image to be processed to obtain an analysis result; and
    judging whether the image to be processed is a fogged image according to the analysis result.
3. The method according to claim 1, wherein the determining the preprocessing strategy corresponding to the image to be processed according to the size parameter comprises:
    if the size parameter is greater than or equal to a preset size threshold, determining that the preprocessing strategy is to divide the image to be processed; and
    if the size parameter is less than the preset size threshold, determining that the preprocessing strategy is not to divide the image to be processed.
4. The method according to claim 1, wherein before the defogging the sub-images according to the image defogging model to obtain the defogged sub-images corresponding to the sub-images, the method further comprises:
    dividing a first image sample set to obtain first training data and first test data, wherein the first image sample set comprises fogged images and clear images corresponding to the fogged images;
    constructing a first network model based on a first lightweight convolutional neural network, and training the first network model according to the first training data to obtain an initial defogging model; and
    testing the initial defogging model according to the first test data to obtain the image defogging model.
5. The method according to claim 4, wherein the testing the initial defogging model according to the first test data to obtain the image defogging model comprises:
    testing the initial defogging model according to the first test data to generate a first test result; and
    correcting the initial defogging model according to the first test result to obtain the image defogging model.
6. The method according to claim 5, wherein the testing the initial defogging model according to the first test data to generate the first test result comprises:
    inputting the fogged images in the first test data into the initial defogging model, and outputting defogged images; and
    comparing image information of the defogged images with the clear images to obtain the first test result.
7. The method according to claim 1, wherein before the stitching the defogged sub-images according to the image stitching model to obtain the defogged image corresponding to the image to be processed, the method further comprises:
    dividing a second image sample set to obtain second training data and second test data, wherein the second image sample set comprises original images and multiple decomposed images corresponding to the original images;
    constructing a second network model based on a second lightweight convolutional neural network, and training the second network model according to the second training data to obtain an initial stitching model; and
    testing the initial stitching model according to the second test data to obtain the image stitching model.
8. The method according to claim 7, wherein the testing the initial stitching model according to the second test data to obtain the image stitching model comprises:
    testing the initial stitching model according to the second test data to generate a second test result; and
    correcting the initial stitching model according to the second test result to obtain the image stitching model.
9. The method according to claim 8, wherein the testing the initial stitching model according to the second test data to generate the second test result comprises:
    inputting the decomposed images in the second test data into the initial stitching model, and outputting a stitched image; and
    comparing image information of the stitched image with the original image to obtain the second test result.
10. The method according to claim 3, wherein after the determining the preprocessing strategy corresponding to the image to be processed according to the size parameter, the method further comprises:
    if the preprocessing strategy is not to divide the image to be processed, defogging the image to be processed directly according to the image defogging model to obtain the defogged image.
11. The method according to claim 3, wherein
    the preset size threshold is an upper limit of the height dimension, or an upper limit of the width dimension, or upper limits of both the height and width dimensions.
12. The method according to claim 1, wherein the dividing the image to be processed to obtain the sub-images corresponding to the image to be processed comprises:
    determining a division number corresponding to the image to be processed; and
    performing division processing according to the division number to obtain the sub-images.
13. The method according to claim 1, wherein the dividing the image to be processed to obtain the sub-images corresponding to the image to be processed comprises:
    determining a preset division number; and
    performing division processing according to the preset division number to obtain the sub-images.
14. The method according to claim 2, wherein after the judging whether the image to be processed is a fogged image according to the analysis result, the method further comprises:
    if it is determined that the image to be processed is not a fogged image, skipping the defogging processing flow.
15. The method according to claim 1, wherein the image defogging model is an image defogging algorithm constructed based on AI.
16. The method according to claim 1, wherein the image stitching model is an image stitching algorithm constructed based on AI.
17. The method according to any one of claims 1 to 16, wherein
    the image to be processed is a pre-stored image; or
    the image to be processed is an image acquired in real time; or
    the image to be processed is an image sent by another device.
18. A terminal, comprising a determining part, a dividing part, a defogging part, and a stitching part, wherein
    the determining part is configured to determine a size parameter of an image to be processed after the image to be processed is determined to be a fogged image, and to determine a preprocessing strategy corresponding to the image to be processed according to the size parameter, wherein the preprocessing strategy is used to limit the image size;
    the dividing part is configured to divide the image to be processed to obtain sub-images corresponding to the image to be processed if the preprocessing strategy is to divide the image to be processed;
    the defogging part is configured to defog the sub-images according to an image defogging model to obtain defogged sub-images corresponding to the sub-images; and
    the stitching part is configured to stitch the defogged sub-images according to an image stitching model to obtain a defogged image corresponding to the image to be processed.
  19. A terminal, comprising a processor and a memory storing instructions executable by the processor, wherein the instructions, when executed by the processor, implement the method according to any one of claims 1 to 8.
  20. A computer-readable storage medium having a program stored thereon, applied in a terminal, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 8.
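The claimed divide–defog–stitch pipeline can be sketched roughly as follows. This is only an illustrative outline under stated assumptions: the fog detector, defogging step, and stitching step below are simple placeholders (contrast statistics, contrast stretching, and plain tiling), not the AI-based defogging and stitching models of the application, and every function name and the 256-pixel size threshold are hypothetical.

```python
import numpy as np

SIZE_THRESHOLD = 256  # hypothetical upper limit on height/width (claims 11-13)

def is_foggy(image: np.ndarray) -> bool:
    # Placeholder fog analysis: treat low-contrast images as fogged.
    return float(image.std()) < 40.0

def defog(sub: np.ndarray) -> np.ndarray:
    # Placeholder for the AI defogging model: a simple contrast stretch.
    lo, hi = int(sub.min()), int(sub.max())
    if hi == lo:
        return sub.copy()
    return ((sub.astype(np.float64) - lo) * (255.0 / (hi - lo))).astype(sub.dtype)

def split_into_tiles(image: np.ndarray, n_rows: int, n_cols: int):
    # Divide the image into a grid of sub-images (claims 12/13).
    return [np.array_split(band, n_cols, axis=1)
            for band in np.array_split(image, n_rows, axis=0)]

def stitch(tiles) -> np.ndarray:
    # Reassemble the defogged sub-images; the application's AI stitching
    # model would additionally handle seams between tiles.
    return np.vstack([np.hstack(row) for row in tiles])

def process(image: np.ndarray) -> np.ndarray:
    if not is_foggy(image):
        return image  # claim 14: skip defogging for non-fogged images
    h, w = image.shape[:2]
    if h <= SIZE_THRESHOLD and w <= SIZE_THRESHOLD:
        return defog(image)  # small enough: defog directly
    n_rows = -(-h // SIZE_THRESHOLD)  # ceiling division
    n_cols = -(-w // SIZE_THRESHOLD)
    tiles = split_into_tiles(image, n_rows, n_cols)
    defogged = [[defog(t) for t in row] for row in tiles]
    return stitch(defogged)

# A uniform gray image stands in for a fogged input; it exceeds the
# threshold in width, so it is tiled, defogged per tile, and stitched.
foggy = np.full((300, 500), 128, dtype=np.uint8)
result = process(foggy)
```

Splitting oversized inputs keeps the per-inference memory of the defogging model bounded, which is the motivation the size-parameter check in claim 18 encodes.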
PCT/CN2021/085694 2020-05-29 2021-04-06 Image defogging method, terminal, and computer storage medium WO2021238420A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010472428.1A CN111626960A (en) 2020-05-29 2020-05-29 Image defogging method, terminal and computer storage medium
CN202010472428.1 2020-05-29

Publications (1)

Publication Number Publication Date
WO2021238420A1 true WO2021238420A1 (en) 2021-12-02

Family

ID=72260094

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/085694 WO2021238420A1 (en) 2020-05-29 2021-04-06 Image defogging method, terminal, and computer storage medium

Country Status (2)

Country Link
CN (1) CN111626960A (en)
WO (1) WO2021238420A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114529464A (en) * 2022-01-14 2022-05-24 电子科技大学 Underwater image recovery method based on deep learning
CN115511735A (en) * 2022-09-20 2022-12-23 北京拙河科技有限公司 Snow field gray level picture optimization method and device
CN116935289A (en) * 2023-09-13 2023-10-24 长江信达软件技术(武汉)有限责任公司 Open channel embankment detection method based on video monitoring

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626960A (en) * 2020-05-29 2020-09-04 Oppo广东移动通信有限公司 Image defogging method, terminal and computer storage medium
CN112183366A (en) * 2020-09-29 2021-01-05 重庆大学 High-voltage power line bird nest detection method, system and machine readable medium
CN115937144B (en) * 2022-12-08 2023-08-25 郑州大学 Image processing method and system in thoracoscopy

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160196637A1 (en) * 2015-01-06 2016-07-07 The Regents Of The University Of California Raw sensor image and video de-hazing and atmospheric light analysis methods and systems
CN109934781A (en) * 2019-02-27 2019-06-25 合刃科技(深圳)有限公司 Image processing method, device, terminal device and computer readable storage medium
CN110322419A (en) * 2019-07-11 2019-10-11 广东工业大学 A kind of remote sensing images defogging method and system
CN110930320A (en) * 2019-11-06 2020-03-27 南京邮电大学 Image defogging method based on lightweight convolutional neural network
CN111626960A (en) * 2020-05-29 2020-09-04 Oppo广东移动通信有限公司 Image defogging method, terminal and computer storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619618B (en) * 2018-06-04 2023-04-07 杭州海康威视数字技术股份有限公司 Surface defect detection method and device and electronic equipment
CN108898562B (en) * 2018-06-22 2022-04-12 大连海事大学 Mobile equipment image defogging method based on deep learning
CN109919887B (en) * 2019-02-25 2021-06-18 中国人民解放军陆军工程大学 Unsupervised image fusion method based on deep learning
CN110544216A (en) * 2019-08-29 2019-12-06 北京理工大学 Video defogging system based on deep learning
CN111047513B (en) * 2019-11-27 2024-01-23 中国人民解放军国防科技大学 Robust image alignment method and device for cylindrical panorama stitching

Also Published As

Publication number Publication date
CN111626960A (en) 2020-09-04

Similar Documents

Publication Publication Date Title
WO2021238420A1 (en) Image defogging method, terminal, and computer storage medium
CN109859190B (en) Target area detection method based on deep learning
CN110008806B (en) Information processing device, learning processing method, learning device, and object recognition device
CN112132156A (en) Multi-depth feature fusion image saliency target detection method and system
CN109753878B (en) Imaging identification method and system under severe weather
CN111046821B (en) Video behavior recognition method and system and electronic equipment
CN109472193A (en) Method for detecting human face and device
CN109977834B (en) Method and device for segmenting human hand and interactive object from depth image
Alajarmeh et al. Real-time framework for image dehazing based on linear transmission and constant-time airlight estimation
CN111461211B (en) Feature extraction method for lightweight target detection and corresponding detection method
CN114565539B (en) Image defogging method based on online knowledge distillation
CN114663309A (en) Image defogging method and system based on multi-scale information selection attention mechanism
CN116757986A (en) Infrared and visible light image fusion method and device
CN114241344B (en) Plant leaf disease and pest severity assessment method based on deep learning
US11783454B2 (en) Saliency map generation method and image processing system using the same
CN111582057B (en) Face verification method based on local receptive field
CN110738624B (en) Area-adaptive image defogging system and method
CN116757988A (en) Infrared and visible light image fusion method based on semantic enrichment and segmentation tasks
CN115937121A (en) Non-reference image quality evaluation method and system based on multi-dimensional feature fusion
CN115240163A (en) Traffic sign detection method and system based on one-stage detection network
CN113256528B (en) Low-illumination video enhancement method based on multi-scale cascade depth residual error network
CN112651926A (en) Method and device for detecting cracks based on recursive attention mechanism
CN112487911A (en) Real-time pedestrian detection method and device based on improved yolov3 in intelligent monitoring environment
Chen et al. GADO-Net: an improved AOD-Net single image dehazing algorithm
CN113052210B (en) Rapid low-light target detection method based on convolutional neural network

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21813152

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21813152

Country of ref document: EP

Kind code of ref document: A1