CN113989261A - Unmanned aerial vehicle visual angle infrared image photovoltaic panel boundary segmentation method based on Unet improvement - Google Patents


Info

Publication number
CN113989261A
CN113989261A
Authority
CN
China
Prior art keywords
photovoltaic panel
model
unet
training
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111321680.3A
Other languages
Chinese (zh)
Inventor
刘刚
沙万里
苏践
戴超超
戴铭
郑恩辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Jiliang University
Zhejiang Zheneng Jiahua Power Generation Co Ltd
Original Assignee
China Jiliang University
Zhejiang Zheneng Jiahua Power Generation Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Jiliang University, Zhejiang Zheneng Jiahua Power Generation Co Ltd filed Critical China Jiliang University
Priority to CN202111321680.3A
Publication of CN113989261A

Classifications

    • G06T 7/0004 — Industrial image inspection (image analysis; inspection of images, e.g. flaw detection)
    • G06F 18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/253 — Fusion techniques of extracted features
    • G06N 3/04 — Neural network architecture, e.g. interconnection topology
    • G06N 3/08 — Neural network learning methods
    • G06T 3/4046 — Scaling the whole image or part thereof using neural networks
    • G06T 3/4053 — Super resolution, i.e. output image resolution higher than sensor resolution
    • G06T 7/12 — Edge-based segmentation
    • G06T 7/13 — Edge detection
    • G06T 2207/10048 — Infrared image
    • G06T 2207/20081 — Training; learning
    • G06T 2207/20084 — Artificial neural networks [ANN]
    • G06T 2207/30108 — Industrial image inspection
    • G06T 2207/30148 — Semiconductor; IC; wafer

Abstract

The invention discloses a deep-learning-based semantic segmentation method for photovoltaic panels in infrared images. A photovoltaic panel data set is established from infrared images captured from the unmanned aerial vehicle viewing angle and is preprocessed; an improved Unet semantic segmentation deep learning model is constructed; the training set is fed into the improved Unet model batch by batch for iterative training, and a test set is used to evaluate the model obtained during training; the photovoltaic panel image to be detected under infrared light conditions is then input into the model with the minimum test loss, which processes it and outputs the segmentation result. The method applies deep learning to infrared photovoltaic panel boundary detection and improves the Unet network model so that shallow features are more prominent, raising the accuracy of photovoltaic panel segmentation.

Description

Unmanned aerial vehicle visual angle infrared image photovoltaic panel boundary segmentation method based on Unet improvement
Technical Field
The invention relates to an image processing method in the field of image processing for unmanned aerial vehicles, and in particular to a photovoltaic panel boundary segmentation method for infrared images captured from the viewing angle of an unmanned aerial vehicle.
Background
In recent years, unmanned aerial vehicle (UAV) vision has been actively popularized and applied: a camera mounted on a UAV obtains a wide viewing angle, and UAV vision offers relatively high real-time performance and flexibility. Photovoltaic panel detection in visible-light images has already been achieved, but in infrared images the continuity of thermal domains leaves edges indistinct, so photovoltaic panel boundaries cannot be judged well. Boundary segmentation of infrared images therefore still depends on manual work; the number of infrared monitoring personnel is small and cannot keep up with the labeling of a huge number of infrared images, wasting human resources. To solve these problems, the invention provides a deep-learning-based photovoltaic panel boundary detection method for infrared scenes.
To raise the intelligence level of infrared-image photovoltaic panel segmentation under the UAV viewing angle, intelligent image processing methods can be adopted to segment the infrared image and extract the photovoltaic panel region, making it convenient to locate the active photovoltaic area. In the field of computer vision, image segmentation methods abound. Traditional image segmentation methods are mainly suitable for images with obvious features and low density; as the background changes, misjudgment becomes frequent and segmentation accuracy is low. In addition, because thermal domains in infrared images are continuous, their boundaries are hard to define and carry great uncertainty, so image segmentation based on traditional approaches is ill-suited to photovoltaic segmentation in infrared scenes.
Disclosure of Invention
The invention provides a photovoltaic panel infrared image boundary segmentation method under the unmanned aerial vehicle viewing angle, improved on the deep learning model Unet; it overcomes the defects of traditional methods, improves the anti-interference capability of the model and raises the segmentation precision.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
step S1: collecting infrared photovoltaic panel images under infrared light conditions from the viewing angle of the unmanned aerial vehicle, establishing a photovoltaic panel data set under these conditions, and labeling it, as shown in fig. 3;
step S2: propagating the photovoltaic panel data set forward through a super-resolution model to obtain infrared data set images at 2 times the resolution, and dividing them into a training set and a test set after image augmentation;
step S3: constructing an improved Unet semantic segmentation deep learning model, whose structure is shown in fig. 2;
step S4: setting the training mode of the improved Unet semantic segmentation deep learning model, specifically hyper-parameters such as the number of iterations and the learning rate;
step S5: using the training set as model input, feeding it into the improved Unet semantic segmentation deep learning model batch by batch for iteration, and testing the performance of the currently trained model on the test set every 3000 iterations; when the iteration count reaches a preset threshold, stopping training and taking out the model with the minimum loss on the test set;
step S6: inputting the photovoltaic panel image to be detected under infrared light conditions into the model with the minimum loss, which processes it and outputs the segmentation result.
In step S1, the infrared photovoltaic panel images are preprocessed: the image size is unified, the pixels occupied by the photovoltaic panels in the image and their position information are labeled, and the data set is divided into a training set and a test set at an 8:2 ratio.
In step S2, the super-resolution network raises the width and height of the infrared image to 2 times those of the original, obtaining richer information while retaining and generating detail.
In step S3, an improved Unet semantic segmentation deep learning model is adopted, as shown in fig. 2.
The improved Unet semantic segmentation deep learning model modifies the original Unet semantic segmentation deep learning model as follows: the output of the first convolution module in the feature extraction part, after a dilated convolution, is added to the output of the second convolution module in the feature extraction part to form a first enhanced feature map set; the first enhanced feature map set, after a dilated convolution, is added to the output of the third convolution module in the feature extraction part to form a second enhanced feature map set; the second enhanced feature map set, after a dilated convolution, is added to the output of the fourth convolution module in the feature extraction part to form a third enhanced feature map set; the third enhanced feature map set then undergoes feature compression and size expansion to give a fourth enhanced feature map set. Finally, the fourth enhanced feature map set, the output of the first convolution module of the feature extraction part, and the upsampled output of the third convolution module of the scale reduction and feature fusion part are added together and input into the last convolution module of the feature fusion part.
A feature map set is formed by stacking a plurality of feature maps.
In a specific implementation, the improved Unet semantic segmentation deep learning model is built as follows:
The model comprises a feature extraction part and a scale reduction and feature fusion part. The feature extraction part consists of four consecutive convolution modules, and the scale reduction and feature fusion part likewise consists of four consecutive convolution modules; each convolution module is two convolution layers in sequence, each followed by an activation function. Each convolution module in the feature extraction part is immediately followed by a max-pooling layer, and each convolution module in the scale reduction and feature fusion part is immediately preceded by an upsampling layer.
The output of the fourth convolution module of the feature extraction part, after its max-pooling layer, is processed by an additional convolution module to obtain an intermediate feature map set. The intermediate feature map set, after the upsampling layer, is added to the output of the fourth convolution module of the feature extraction part and input into the first convolution module of the scale reduction and feature fusion part. The output of that first convolution module, after the upsampling layer, is added to the output of the third convolution module of the feature extraction part and input into the second convolution module of the scale reduction and feature fusion part; the output of that second convolution module, after the upsampling layer, is added to the output of the second convolution module of the feature extraction part and input into the third convolution module of the scale reduction and feature fusion part.
The output of the first convolution module of the feature extraction part, after a dilated convolution, is added to the output of the second convolution module of the feature extraction part to form a first enhanced feature map set; the first enhanced feature map set, after a dilated convolution, is added to the output of the third convolution module of the feature extraction part to form a second enhanced feature map set; the second enhanced feature map set, after a dilated convolution, is added to the output of the fourth convolution module of the feature extraction part to form a third enhanced feature map set; the third enhanced feature map set then undergoes feature compression and size expansion in turn to obtain a fourth enhanced feature map set. Finally, the output of the third convolution module of the scale reduction and feature fusion part after the upsampling layer, the output of the first convolution module of the feature extraction part, and the fourth enhanced feature map set are added together and input into the fourth convolution module of the scale reduction and feature fusion part.
The method strengthens the use of shallow features: by adding a shallow branch and exploiting the dilated convolution's ability to capture spatial information, the model's emphasis on shallow colour and contour information is reinforced; a balanced cross-entropy loss function is adopted for the loss.
The invention improves the Unet network model, builds a deep learning network, and adopts a lightweight framework to achieve real-time segmentation.
The invention applies a deep learning model to the infrared photovoltaic panel segmentation problem and improves the Unet network for the characteristics of infrared photovoltaic panels, so that the model attends more to information such as shallow contour colour.
In the Unet-based improved method for segmenting photovoltaic panel boundaries in UAV infrared images, a learning-based image processing method segments the infrared image, raising the segmentation precision; super resolution amplifies the thermal distribution information of the infrared image, which helps acquire photovoltaic panel edge information under infrared conditions.
In step S5, the iteration threshold is set to 8000-10000 iterations.
In step S5, an Adam optimizer performs the iterative weight updates. The batch size is set to 8, the total number of training cycles to 12 and the initial learning rate to 0.01; the learning rate is reduced to 1/10 at the 7th and 11th cycles, the momentum is set to 0.9, and the first 500 iterations form a warm-up stage in which the learning rate is 1/100 of the initial learning rate. The global minimum of the objective function is sought by gradient descent during training, and the model weights are updated once after each batch iteration.
In step S5, if the prepared UAV viewing-angle infrared image data set contains fewer than 500 images, transfer learning is adopted: weights are first taken from a model pre-trained on the ImageNet data set, and the training set is then used for formal training of the model. Transferring weights pre-trained on ImageNet gives the model better initial cognition. In this case the number of training iterations is set to about 30-40 and Adam is used for parameter updating, which prevents overfitting caused by a data set that is too small.
In step S5, after the model's performance is tested:
If the accuracy reaches 95%, the model has good ability to segment photovoltaic panel infrared images; the hyper-parameter file of that training run is saved, iteration continues to see whether higher accuracy can be achieved, and training stops once the test-set accuracy has not risen for 6 consecutive iteration cycles.
If the accuracy is below 95%, iterative optimization continues.
Here accuracy refers to MPA (class mean pixel accuracy), calculated as:
MPA = (1/k) · Σ_{i=1..k} ( p_ii / Σ_{j=1..k} p_ij )
where k denotes the number of classes and p_ij is the number of pixels belonging to class i but classified as class j; that is, p_ii is the number of pixels of class i that are classified correctly.
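The MPA above can be sketched directly from the definition; this is a small NumPy illustration (the function name is ours, not the patent's), averaging per-class pixel accuracy p_ii / Σ_j p_ij over the classes present in the ground truth.

```python
import numpy as np

def mean_pixel_accuracy(pred, target, k):
    """pred, target: integer label maps of equal shape; k: number of classes."""
    per_class = []
    for i in range(k):
        mask = (target == i)
        if mask.sum() == 0:        # class absent from the ground truth
            continue
        # fraction of class-i pixels predicted as class i: p_ii / sum_j p_ij
        per_class.append((pred[mask] == i).mean())
    return float(np.mean(per_class))

pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
acc = mean_pixel_accuracy(pred, target, k=2)   # (1.0 + 2/3) / 2
```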
The prior art is referred to in the art for techniques not mentioned in the present invention.
The Unet-based improved method for segmenting photovoltaic panel boundaries in infrared images under the UAV viewing angle completes the segmentation of photovoltaic panels in such images. Compared with traditional infrared image segmentation methods, it is greatly improved in precision and attends more to the edge contour information of the infrared image. Its greatest advantage is overcoming the difficulty of distinguishing boundaries caused by the continuity of infrared thermal domains, and it can aid the positioning and anomaly detection of photovoltaic panels.
The method applies deep learning to infrared photovoltaic panel boundary detection and improves the Unet network model so that shallow features are more prominent, raising the accuracy of photovoltaic panel segmentation.
Drawings
Fig. 1 is a schematic flow chart of the Unet-based improved infrared image photovoltaic panel boundary segmentation method under the unmanned aerial vehicle viewing angle in the embodiment of the present invention;
fig. 2 is a schematic diagram of an improved Unet network structure in an embodiment of the present invention (the blue dotted line is the content of the improvement);
FIG. 3 is an infrared image to be measured of a photovoltaic panel under the view angle of an unmanned aerial vehicle in the embodiment of the invention;
fig. 4 is a graph of the division result of fig. 3.
Detailed Description
In order to better understand the present invention, the following examples are further provided to illustrate the present invention, but the present invention is not limited to the following examples.
As shown in fig. 1, the overall flow of the method of the present embodiment includes the following steps:
Step S1: establishing an infrared image data set of photovoltaic panels under the UAV viewing angle. The image samples come from a patrol UAV at a power plant site; infrared images with varied scenes and different illumination and brightness are selected as the training data, with sample images shown in fig. 3. The screened infrared images are preprocessed: the original images are first labeled with the image annotation tool labelme, and the generated json label files are parsed to obtain the segmented target images; the preprocessed image data set is divided into a training set and a test set at a ratio of 8:2. Finally, images of uneven sizes are unified to 572x572 pixels and normalized.
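The step S1 preprocessing can be sketched as below. This is a hedged NumPy sketch under stated assumptions: a nearest-neighbour resize stands in for whatever resizing the authors used, normalization maps to [0, 1], and the labelme annotation parsing is omitted; all names are illustrative.

```python
import numpy as np

def preprocess_and_split(images, size=572, train_ratio=0.8, seed=0):
    """Unify images to size x size, normalise to [0, 1], split 8:2."""
    resized = []
    for im in images:
        h, w = im.shape[:2]
        rows = np.arange(size) * h // size   # nearest source row per output row
        cols = np.arange(size) * w // size
        resized.append(im[np.ix_(rows, cols)].astype(np.float32) / 255.0)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(resized))    # shuffle before splitting
    n_train = int(round(train_ratio * len(resized)))
    train = [resized[i] for i in order[:n_train]]
    test = [resized[i] for i in order[n_train:]]
    return train, test

images = [np.full((480, 640), 128, dtype=np.uint8) for _ in range(10)]
train_set, test_set = preprocess_and_split(images)
```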
Step S2: obtaining 2x high-resolution infrared data set images through forward propagation of a super-resolution model. The essence of a super-resolution network is to generate a higher-resolution image by AI; methods such as bilinear interpolation can serve the same purpose, but the AI-based approach yields richer inference, which benefits boundary judgment.
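The bilinear-interpolation baseline mentioned above can be sketched in PyTorch as follows; the patent's learned super-resolution model (its architecture is not specified) would replace this call while producing the same 2x output shape.

```python
import torch
import torch.nn.functional as F

def upscale_2x(batch):
    """batch: (N, C, H, W) infrared images -> (N, C, 2H, 2W)."""
    return F.interpolate(batch, scale_factor=2, mode="bilinear",
                         align_corners=False)

x = torch.rand(1, 1, 64, 64)
y = upscale_2x(x)
```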
Step S3: constructing the improved Unet deep learning model. The Unet network is widely applied in the medical field and is good at capturing tiny features, which makes it well suited to acquiring edge-detail information; to improve the Unet network's extraction of shallow contour information, its ability to extract shallow contour features is enhanced.
The input image, of size 572x572, undergoes two 3x3 convolution operations to obtain a 568x568 shallow feature map set f1. Feature map set f1 contains 64 feature maps, i.e. 64 channels; the number above each feature map set in fig. 2 denotes the number of feature maps it contains.
The network then divides into 3 paths. Two are consistent with the original Unet: one downsampling and several 3x3 convolutions yield a feature map set f2 at scale 280x280; continuing the downsampling and convolution operations yields a feature map set f3 at scale 136x136 and then a 64x64 feature map set f4, until the scale is compressed to a 32x32 feature map set f5.
Upsampling then enlarges the scale, first giving a 56x56 feature map set f4′. Feature map set f4′ fuses feature map set f4 and is expressed as:
f4′ = U′(f5) + C(f4)
where U′(·) denotes two convolution blocks (a 3x3 convolution and the ReLU activation function) plus one upsampling operation, and C(·) denotes a cropping operation that cuts f4 from 64x64 down to 56x56.
In the same manner as f4′, a 104x104 feature map set f3′, a 200x200 feature map set f2′ and a 388x388 feature map set f1′ are generated:
f3′ = U′(f4′) + C(f3)
f2′ = U′(f3′) + C(f2)
f1′ = U′(f2′) + C(f1)
where each C(·) denotes a cropping operation, cutting the feature maps in the corresponding set from 136x136 to 104x104, from 280x280 to 200x200, and from 568x568 to 388x388, respectively.
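The cropping operation C(·) used in the fusion expressions above can be sketched as a centre crop; this is a minimal PyTorch illustration (the helper name is ours) of cutting the encoder set from 64x64 to 56x56 before adding it to the upsampled path, with the U′ convolutions and upsampling themselves omitted.

```python
import torch

def center_crop(feat, size):
    """feat: (N, C, H, W) -> centred (N, C, size, size) window."""
    _, _, h, w = feat.shape
    top, left = (h - size) // 2, (w - size) // 2
    return feat[:, :, top:top + size, left:left + size]

f4 = torch.rand(1, 512, 64, 64)    # encoder feature map set
up = torch.rand(1, 512, 56, 56)    # stand-in for U'(f5): convs + upsampling
fused = center_crop(f4, 56) + up   # the 56x56 fused feature map set
```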
the other is the improved network structure of the present invention.
3x3 hole volume for 568 x 568 feature atlasProduct (scaled conv) to increase the receptive field, first 3 × 3 hole convolution and merging the feature set f of the deep layer2Then, a 280X 280 feature atlas F is obtained1Feature map set F1The expression of (c) is as follows.
F1=D(f1)+f2
Where D represents a 3x3 hole convolution (3x3 scaled conv) with padding (padding).
For feature map set F1Performing cavity convolution again and fusing the feature layer f3Obtaining a feature atlas F with the scale of 136 x 1362
F2=D(f2)+f3
For feature map set F2Performing cavity convolution again and fusing the feature map set f4Obtaining a feature map set F with the scale of 64 multiplied by 643
F3=D(f3)+f4
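One step of this enhancement branch, F1 = D(f1) + f2, can be sketched as below. The patent does not specify how D matches channel and spatial sizes between the shallow and deeper sets, so this sketch assumes a 3x3 dilated convolution (dilation 2, padding 2) followed by bilinear resizing to the deeper set's scale; the module name and the reduced sizes are ours, for brevity only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedMerge(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.dilated = nn.Conv2d(in_ch, out_ch, kernel_size=3,
                                 padding=2, dilation=2)

    def forward(self, shallow, deeper):
        x = self.dilated(shallow)                       # D(shallow)
        x = F.interpolate(x, size=deeper.shape[-2:],    # match deeper scale
                          mode="bilinear", align_corners=False)
        return x + deeper                               # element-wise fusion

f1 = torch.rand(1, 8, 64, 64)    # stands in for the 64-channel 568x568 set
f2 = torch.rand(1, 16, 32, 32)   # stands in for the deeper 280x280 set
F1 = DilatedMerge(8, 16)(f1, f2)
```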
Having obtained the feature map set F3, which contains rich deep semantic and shallow contour and colour information, feature compression is performed on it to obtain an intermediate feature vector V, with the formula:
V_c = (1/(H·W)) · Σ_{i=1..H} Σ_{j=1..W} F3(c, i, j)
where H and W denote the height and width of F3 (both 64) and c indexes the channels.
After the feature vector is obtained, it is enlarged to size 392x392, denoted F4; the original values are replicated during enlargement, which essentially adds the feature vector to the final result as a global feature.
F4 is then added to the original T4, the corresponding 392x392 feature layer from the Unet path, to obtain the 392x392 fused feature map layer R. After two U operations (each a 3x3 convolution and the ReLU activation function) on R, a 1x1 convolution finally gives the feature map set H, with scale 388x388 and 2 channels:
R = F4 + T4
H = G(U(R))
where G(·) denotes a 1x1 convolution.
Step S4: setting the initial hyper-parameters and iteration count of the model. The iteration count is set to 8000-10000, parameters are updated with the Adam optimizer, and the batch size is set to 8. After each batch of training, the loss value is calculated with the balanced cross-entropy loss function and back-propagated to update the gradient information.
The balanced cross-entropy loss function is formulated as follows:
l_i = −β · p · log(p̂) − (1 − β) · (1 − p) · log(1 − p̂)
Loss = Σ_i l_i
where β is a weighting factor balancing the positive and negative samples, p̂ denotes the predicted probability value, and p denotes whether the current category matches the true category: 1 if cls_i = c and 0 otherwise, where cls_i is the predicted class of the i-th pixel of the final feature map and c is the true class of that pixel; l_i denotes the balanced cross-entropy loss of a single pixel.
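The balanced cross-entropy can be sketched directly from the per-pixel formula; this NumPy illustration (the function name and the clipping constant are ours) sums the per-pixel losses, with β weighting the positive samples and (1 − β) the negatives.

```python
import numpy as np

def balanced_cross_entropy(p_hat, p, beta):
    """p_hat: predicted probabilities; p: 0/1 ground truth; returns the sum
    over pixels of -beta*p*log(p_hat) - (1-beta)*(1-p)*log(1-p_hat)."""
    eps = 1e-7                              # avoid log(0)
    p_hat = np.clip(p_hat, eps, 1 - eps)
    per_pixel = (-beta * p * np.log(p_hat)
                 - (1 - beta) * (1 - p) * np.log(1 - p_hat))
    return per_pixel.sum()

loss = balanced_cross_entropy(np.array([0.9, 0.1]), np.array([1.0, 0.0]), 0.5)
```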
Step S5: inputting the infrared image training set labeled in step S1 into the constructed deep model for training.
Step S6: when the iteration count reaches 3000, 6000 and 9000 respectively, evaluating the performance of the currently trained model from step S5 with the test set labeled in step S1, measured by the MPA index.
If the accuracy on the test set exceeds 95%, the current network has good cognitive ability and the model segments infrared photovoltaic panel images well; the hyper-parameter file of that training run is saved.
Step S7: when the iteration count reaches 10000, training stops. The performances of the models obtained at 3000, 6000 and 9000 iterations are compared, the training result with the best performance is selected, and its weight file is loaded into the model to obtain the optimal photovoltaic panel infrared image segmentation model.
Step S8: a photovoltaic panel infrared image meeting the test conditions is selected as the image to be tested (as shown in fig. 3) and input into the optimal photovoltaic panel infrared image model obtained in step S7 for forward propagation; the resulting segmentation is shown in fig. 4.
In steps S1 and S4, if few infrared photovoltaic panel images can be acquired because collection is inconvenient or labeling is tedious, the idea of transfer learning is adopted: the weights of an image classifier trained on the Imagenet data set are imported into the model. Imagenet is the largest image classification data set in the world, containing 14 million images in 1000 categories; pre-trained weights give the model a certain cognitive ability, after which the photovoltaic panel infrared image data set is used for formal training. This solves the slow weight convergence caused by an undersized data set and saves training resources. Correspondingly, to prevent overfitting on a small data set, regularization and dropout are used to improve the network's resistance to overfitting, the number of iterations is reduced (training iterations are set to about 30-40), and Adam is used as the parameter update strategy. Batch normalization is adopted as the regularization method.
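The weight-import step can be sketched as below: copy pretrained weights into the model wherever parameter names and shapes match, then fine-tune on the photovoltaic data. In practice `pretrained_state` would come from an ImageNet checkpoint; here a hand-built dict stands in, and the helper name is ours.

```python
import torch

def load_matching_weights(model, pretrained_state):
    """Transfer tensors whose name and shape match; return how many."""
    own = model.state_dict()
    matched = {name: tensor for name, tensor in pretrained_state.items()
               if name in own and tensor.shape == own[name].shape}
    own.update(matched)
    model.load_state_dict(own)
    return len(matched)

layer = torch.nn.Conv2d(3, 8, kernel_size=3)
pretrained_state = {"weight": torch.ones(8, 3, 3, 3),   # matches, transferred
                    "bias": torch.zeros(99)}            # wrong shape, skipped
n = load_matching_weights(layer, pretrained_state)
```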
Compared with the Unet image segmentation method, this method mainly increases the use of shallow features in the model structure, letting the model better attend to photovoltaic panel edge information, and adopts dilated convolution in the improved branches so that the shallow features have a better receptive field. Precision improves greatly (on the test set of the invention, the MPA of Unet reaches 89.4% while the MPA of the method proposed here reaches 95.7%, an improvement of 6.3 percentage points), and edge segmentation is clearly superior to Unet-based image segmentation. The method's greatest advantage is overcoming the difficulty of distinguishing boundaries caused by continuous infrared thermal domains, and it can be applied to photovoltaic panel positioning, defect detection and other fields in infrared scenes under the UAV viewing angle.

Claims (8)

1. An improved Unet-based method for photovoltaic panel boundary segmentation in infrared images from an unmanned aerial vehicle perspective, characterized in that the method comprises the following steps:
step S1: collecting infrared photovoltaic panel images from the unmanned aerial vehicle perspective under infrared light conditions, establishing a photovoltaic panel data set under these conditions, and labeling it;
step S2: forward-propagating the photovoltaic panel data set through a super-resolution model to obtain infrared data set images at 2 times the original resolution, and dividing them into a training set and a test set after image augmentation;
step S3: constructing an improved Unet semantic segmentation deep learning model;
step S4: setting the training scheme of the improved Unet semantic segmentation deep learning model;
step S5: using the training set as model input, feeding it batch by batch into the improved Unet semantic segmentation deep learning model for iterative training, and testing the performance of the currently trained model on the test set every 3000 iterations; when the number of iterations reaches a preset threshold, stopping training and retaining the model with the minimum loss on the test set;
step S6: inputting the photovoltaic panel image to be detected under infrared light conditions into the minimum-loss model and processing it to output the segmentation result.
2. The improved Unet-based method for photovoltaic panel boundary segmentation in infrared images from an unmanned aerial vehicle perspective of claim 1, wherein step S1 further comprises preprocessing the infrared photovoltaic panel images: unifying the image size, labeling the pixels occupied by the photovoltaic panels and their position information in the images, and dividing the data into a training set and a test set at an 8:2 ratio.
3. The improved Unet-based method for photovoltaic panel boundary segmentation in infrared images from an unmanned aerial vehicle perspective of claim 1, wherein in step S2, the super-resolution network increases the width and height of the infrared images to 2 times those of the original infrared images.
4. The improved Unet-based method for photovoltaic panel boundary segmentation in infrared images from an unmanned aerial vehicle perspective of claim 1, wherein in step S3, the improved Unet semantic segmentation deep learning model is obtained from the original Unet semantic segmentation deep learning model as follows: the output of the first convolution module of the feature extraction part, after dilated convolution, is added to the output of the second convolution module of the feature extraction part to form a first enhanced feature map set; the first enhanced feature map set, after dilated convolution, is added to the output of the third convolution module of the feature extraction part to form a second enhanced feature map set; the second enhanced feature map set, after dilated convolution, is added to the output of the fourth convolution module of the feature extraction part to form a third enhanced feature map set; finally, the fourth enhanced feature map set, the output of the first convolution module of the feature extraction part, and the up-sampled output of the last convolution module of the scale restoration and feature fusion part are added together and input into the last convolution module of the feature fusion part.
5. The improved Unet-based method for photovoltaic panel boundary segmentation in infrared images from an unmanned aerial vehicle perspective of claim 1, wherein in step S5, the iteration threshold is set to 8000-10000 iterations.
6. The improved Unet-based method for photovoltaic panel boundary segmentation in infrared images from an unmanned aerial vehicle perspective of claim 1, wherein in step S5, Adam is used as the model's optimizer to iteratively update the weights; the batch size is set to 8, the total number of training cycles to 12, and the initial learning rate to 0.01, which is reduced to 1/10 of its value at the 7th and 11th cycles; the momentum is set to 0.9, and the first 500 iterations are set as a warm-up stage in which the learning rate is 1/100 of the initial learning rate; the training process searches for a global minimum along the gradient, and the model weights are updated once after each batch iteration.
7. The improved Unet-based method for photovoltaic panel boundary segmentation in infrared images from an unmanned aerial vehicle perspective of claim 1, wherein in step S5, transfer learning is adopted: the weights are obtained from a model pre-trained on the ImageNet data set, and the training set is then used for formal model training; the number of training iterations is set to about 30-40, and Adam is used as the parameter update method, which prevents overfitting caused by a data set that is too small.
8. The improved Unet-based method for photovoltaic panel boundary segmentation in infrared images from an unmanned aerial vehicle perspective of claim 1, wherein in step S5, after testing the performance of the model:
if the accuracy reaches 95%, the model has a good ability to segment photovoltaic panel infrared images; the hyperparameter file of the current training run is saved, and iteration continues to determine whether higher accuracy can be achieved, stopping when the test-set accuracy no longer rises for 6 consecutive iteration cycles;
if the accuracy is below 95%, iterative optimization continues.
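The learning-rate schedule described in claim 6 (warm-up followed by stepped decay) can be sketched in plain Python; the function name, argument layout, and the cumulative interpretation of the two 1/10 reductions are assumptions for illustration, not a normative implementation:

```python
def learning_rate(iteration, cycle, base_lr=0.01, warmup_iters=500):
    """Illustrative schedule per claim 6: the first 500 iterations are a
    warm-up stage at 1/100 of the initial rate of 0.01; the rate is then
    reduced to 1/10 of its value at the 7th and 11th of 12 cycles."""
    if iteration < warmup_iters:
        return base_lr / 100.0          # warm-up stage
    lr = base_lr
    if cycle >= 7:
        lr /= 10.0                      # first step decay
    if cycle >= 11:
        lr /= 10.0                      # second step decay
    return lr
```

Warm-up at a very low rate stabilizes the early updates of the ImageNet-pretrained weights, while the two step decays let Adam settle into a minimum during the final cycles.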
CN202111321680.3A 2021-11-09 2021-11-09 Unmanned aerial vehicle visual angle infrared image photovoltaic panel boundary segmentation method based on Unet improvement Pending CN113989261A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111321680.3A CN113989261A (en) 2021-11-09 2021-11-09 Unmanned aerial vehicle visual angle infrared image photovoltaic panel boundary segmentation method based on Unet improvement


Publications (1)

Publication Number Publication Date
CN113989261A true CN113989261A (en) 2022-01-28

Family

ID=79747421

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111321680.3A Pending CN113989261A (en) 2021-11-09 2021-11-09 Unmanned aerial vehicle visual angle infrared image photovoltaic panel boundary segmentation method based on Unet improvement

Country Status (1)

Country Link
CN (1) CN113989261A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114758002A (en) * 2022-06-15 2022-07-15 南开大学 Photovoltaic string position determining method and system based on aerial image
CN114758002B (en) * 2022-06-15 2022-09-02 南开大学 Photovoltaic string position determining method and system based on aerial image
CN115239893A (en) * 2022-09-23 2022-10-25 运易通科技有限公司 Image reconstruction method for detecting defects of solar panel of warehouse ceiling
CN116191680A (en) * 2023-04-25 2023-05-30 丰德信电力发展股份有限公司 Monitoring management system applied to photovoltaic power generation
CN116191680B (en) * 2023-04-25 2023-07-18 丰德信电力发展股份有限公司 Monitoring management system applied to photovoltaic power generation
CN117496003A (en) * 2023-11-01 2024-02-02 合肥高斯智能科技有限公司 Defect image generation method of industrial element

Similar Documents

Publication Publication Date Title
CN113065558B (en) Lightweight small target detection method combined with attention mechanism
CN108986050B (en) Image and video enhancement method based on multi-branch convolutional neural network
CN109685072B (en) Composite degraded image high-quality reconstruction method based on generation countermeasure network
CN113989261A (en) Unmanned aerial vehicle visual angle infrared image photovoltaic panel boundary segmentation method based on Unet improvement
CN110533084A (en) A kind of multiscale target detection method based on from attention mechanism
CN110263705A (en) Towards two phase of remote sensing technology field high-resolution remote sensing image change detecting method
CN113076871B (en) Fish shoal automatic detection method based on target shielding compensation
CN110796009A (en) Method and system for detecting marine vessel based on multi-scale convolution neural network model
CN112381097A (en) Scene semantic segmentation method based on deep learning
CN113807464B (en) Unmanned aerial vehicle aerial image target detection method based on improved YOLO V5
CN110852393A (en) Remote sensing image segmentation method and system
CN111832453B (en) Unmanned scene real-time semantic segmentation method based on two-way deep neural network
CN112434723B (en) Day/night image classification and object detection method based on attention network
CN111161224A (en) Casting internal defect grading evaluation system and method based on deep learning
CN112749663B (en) Agricultural fruit maturity detection system based on Internet of things and CCNN model
CN112818849B (en) Crowd density detection algorithm based on context attention convolutional neural network for countermeasure learning
CN111882620A (en) Road drivable area segmentation method based on multi-scale information
CN113298817A (en) High-accuracy semantic segmentation method for remote sensing image
CN113205103A (en) Lightweight tattoo detection method
CN112700476A (en) Infrared ship video tracking method based on convolutional neural network
CN114511710A (en) Image target detection method based on convolutional neural network
CN116071676A (en) Infrared small target detection method based on attention-directed pyramid fusion
CN112818777B (en) Remote sensing image target detection method based on dense connection and feature enhancement
CN117351363A (en) Remote sensing image building extraction method based on transducer
CN116452900A (en) Target detection method based on lightweight neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination