CN117011730A - Unmanned aerial vehicle image change detection method, electronic terminal and storage medium - Google Patents
Unmanned aerial vehicle image change detection method, electronic terminal and storage medium
- Publication number
- CN117011730A (application CN202311263279.8A)
- Authority
- CN
- China
- Prior art keywords
- change
- feature map
- pixel
- image
- aerial vehicle
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/17—Terrestrial scenes taken from planes or by drones
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The application discloses an unmanned aerial vehicle image change detection method, an electronic terminal and a storage medium in the technical field of computer vision, and aims to solve the technical problem of low image change detection accuracy in the prior art. Because unmanned aerial vehicle images are characteristically high-resolution, the method first applies a sliding-window segmentation step as preprocessing; it then converts the image change detection problem into a pixel-level classification problem, i.e., each pixel in the image is classified into a change class or a non-change class, and the image change detection result is obtained from the change-class pixels. The method can be applied to an unmanned aerial vehicle to improve the accuracy of its image change detection.
Description
Technical Field
The application belongs to the technical field of computer vision change detection, and particularly relates to a detection method for unmanned aerial vehicle image change, an electronic terminal and a storage medium.
Background
Unmanned aerial vehicle image change detection is a technology for analyzing multiple unmanned aerial vehicle images of the same geographic location acquired at different times. After eliminating factors that are not the main detection target, such as brightness, shadow and shooting angle, the technique compares the image under test against a reference image to obtain the main changed part, providing strong technical support for subsequent decision-making.
Conventional image change detection methods typically generate a difference image through operations such as image differencing and image ratioing, then extract change features from the difference image to obtain the main changed part. One representative approach generates the difference map in the wavelet domain from the complementary information of the mean-ratio and log-ratio images, and classifies the generated image with a modified local-neighborhood fuzzy C-means clustering algorithm to obtain the changed part; change detection methods based on Haar-like features and random forests are similar. Such methods are fast, but they depend heavily on hand-crafted features, have limited ability to extract complex, abstract high-level information, and their detection performance degrades severely when the features of the change and non-change classes overlap or the statistical distribution is modeled inaccurately.
With the continuous development of the technology, the current mainstream direction is deep-learning-based change detection: depth features of the reference image and the image under test are extracted by a convolutional neural network, and the resulting feature maps serve as the basis for subsequent change detection. However, convolutional neural networks such as VGGNet and AlexNet usually contain fully connected layers, so these models suffer from a limited receptive field and a fixed input-size requirement. Fully convolutional networks solve the fixed-size input problem and, through skip connections that sum corresponding pixels, classify each pixel independently, but they do not fully consider the spatial and value relationships between pixels. Although deep learning has made some progress in image change detection, further improving change detection accuracy remains a major research topic in this field.
Disclosure of Invention
The application aims to overcome the defects in the prior art and provides a detection method, an electronic terminal and a storage medium for unmanned aerial vehicle image change, so as to further improve the accuracy of image change detection.
To achieve the above purpose, the application adopts the following technical solutions:
in a first aspect, the present application provides a method for detecting image changes of an unmanned aerial vehicle, including the following steps:
preprocessing and aligning a reference image and an image under test, and then performing sliding-window segmentation on each to obtain segmentation sub-images of the reference image and segmentation sub-images of the image under test;

performing a difference operation on the segmentation sub-images of the reference image and of the image under test to obtain a difference image;

inputting the difference image into a trained improved deep learning network model, wherein the improved deep learning network model comprises an encoder, a decoder and a classifier; the encoder comprises multiple layers of encoder sub-modules connected sequentially from the upper layer to the lower layer, and the decoder comprises multiple layers of decoder sub-modules connected sequentially from the lower layer to the upper layer;

inputting the difference image into the encoder and performing convolution and pooling operations to obtain coding feature maps of different layers;

for a non-bottom-layer decoder sub-module, up-sampling the decoding feature map output by the next-lower decoder sub-module and skip-connecting it with the same-layer coding feature map, the feature map obtained by the skip connection serving as the input feature map of that non-bottom-layer decoder sub-module; for the bottom-layer decoder sub-module, convolving and then pooling the input bottom-layer coding feature map, then skip-connecting the result with the bottom-layer coding feature map, the feature map obtained by the skip connection serving as the input feature map of the bottom-layer decoder sub-module; and performing a deconvolution operation on the input feature map with each decoder sub-module to obtain shallow decoding feature maps and deep decoding feature maps;

fusing the shallow decoding feature maps and the deep decoding feature maps to obtain a change feature map;

and classifying each pixel in the change feature map into a change-class pixel or a non-change-class pixel with the classifier, and obtaining the image change detection result from the change-class pixels.
With reference to the first aspect, further, the method for constructing the improved deep learning network model includes:

taking the Unet model as the base, replacing the convolution layers in the Unet encoder with the residual structure of ResNet34, and adding a BN layer between the two convolutions to normalize the data;

fusing the FPN network into the decoder of the Unet model, where feature maps of different scales are combined during the up-sampling process.
With reference to the first aspect, further, the improved deep learning network model is trained by combining a cross entropy loss function and a Dice loss function; during training, the parameters of the improved deep learning network model are updated with an optimizer.

With reference to the first aspect, further, the optimizer is any one of an SGD optimizer, a BGD optimizer and an Adam optimizer, and a simulated annealing algorithm is introduced to adjust the learning rate.

With reference to the first aspect, further, before the sliding-window segmentation, an overlap rate is set so that adjacent segmentation sub-images share an overlapping portion.
With reference to the first aspect, further, the preprocessing and aligning of the reference image and the image under test includes:

detecting keypoints in both the reference image and the image under test, and extracting feature descriptors of the keypoints;

matching the keypoints of the reference image with those of the image under test based on the feature descriptors, and screening the matched keypoints with a RANSAC algorithm;

and aligning the image under test with the reference image according to the screened matched keypoints.
With reference to the first aspect, further, classifying each pixel in the change feature map into a change-class pixel or a non-change-class pixel with the classifier includes:

calculating, with a softmax classifier, the probability that each pixel in the change feature map belongs to the change class and the non-change class;

normalizing the probability value of each pixel in the change feature map;

comparing the normalized probability value of each pixel with a preset threshold: if the probability value of a pixel is greater than the threshold, the pixel belongs to the change class, otherwise to the non-change class; wherein the threshold is determined based on the desired changed-region confidence.
With reference to the first aspect, further, obtaining the image change detection result from the change-class pixels includes: performing dilation and erosion operations on the change feature map to filter out discrete noise points and fill hole regions; extracting the connected regions of the change feature map to fill the remaining hole regions; and outputting the circumscribed rectangular boxes containing the connected regions in the change feature map as the changed regions.
In a second aspect, the present application provides an electronic terminal comprising a processor and a memory connected to the processor, the memory storing a computer program which, when executed by the processor, performs the steps of any of the above methods.

In a third aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of any of the above methods.
Compared with the prior art, the application has the beneficial effects that:
according to the unmanned aerial vehicle image change detection method provided by the embodiments of the application, the reference image and the image under test are first preprocessed with a sliding-window segmentation step, since unmanned aerial vehicle images are characteristically high-resolution; the image change detection problem is then converted into a pixel-level classification problem, i.e., each pixel in the image is classified into a change class or a non-change class, and the image change detection result is obtained from the change-class pixels; guided by the characteristics of unmanned aerial vehicle images, the idea of semantic segmentation is introduced into image change detection, and the input image is classified with the improved deep learning model, which attends to finer details while acquiring deeper semantic information, so the image change detection result has good accuracy and generalization;

the application connects the encoding part and the decoding part through skip connections, allowing the decoding part to recover the detail information of the target more accurately;

the improved deep learning model is based on the Unet model: the convolution layers in the original Unet encoder are replaced with the residual structure of ResNet34, and the FPN network is fused into the Unet decoding part; the complementary strengths of the residual structure, Unet and FPN enhance the feature extraction of the Unet network, acquire deeper semantic information, remedy the weakness in small-target detection, and improve detection accuracy;

the encoding part adds a BN layer between the two convolutions to normalize the data, which accelerates model convergence and improves model robustness;

by fusing the FPN network structure, the decoding part combines feature maps of different scales during its up-sampling process, remedying the Unet model's limitation of detecting on only a single output feature map; more scale information can therefore be used during back-propagation and weight updating, and the features of every layer are fully exploited for detection, i.e., the shallow detail information and the deep semantic information each contribute independent predictions;

the application provides a joint loss function to replace the loss function in the Unet network, combining the classical binary cross entropy loss function with the Dice loss function to reduce the influence of class imbalance on the model and improve the model's learning of change features when the proportion of change-class pixels is small;

by setting a threshold on the change feature map output by the model, different results can be produced for different practical application scenarios, achieving better practical effect.
Drawings
Fig. 1 is a flowchart of an unmanned aerial vehicle image change detection method according to an embodiment of the present application;

Fig. 2 is a flowchart of a method for preprocessing and aligning the reference image and the image under test according to an embodiment of the present application;

Fig. 3 is a flowchart of a method for training the improved deep learning network model according to an embodiment of the present application;

Fig. 4 is a schematic diagram of an encoder according to an embodiment of the present application;

Fig. 5 is a schematic diagram of a residual unit in the encoder according to an embodiment of the present application;

Fig. 6 is a schematic diagram of a decoder according to an embodiment of the present application;

Fig. 7 is a schematic diagram of the FPN network structure of the decoder according to an embodiment of the present application.
Description of the embodiments
The following describes the technical solutions of the present application in detail with reference to the accompanying drawings and specific embodiments. It should be understood that the specific features of the embodiments are detailed explanations of the technical solutions of the present application rather than limitations on them, and that, absent conflict, the embodiments and the technical features of the embodiments may be combined with each other.
The term "and/or" is herein merely an association relationship describing an associated object, meaning that there may be three relationships, e.g., a and/or B, may represent: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
Example 1
The embodiment of the application provides an unmanned aerial vehicle image change detection method. Because unmanned aerial vehicle images are characteristically high-resolution, the reference image and the image under test are first preprocessed with a sliding-window segmentation step; the image change detection problem is then converted into a pixel-level classification problem, i.e., each pixel in the image is classified into a change class or a non-change class, and the image change detection result is obtained from the change-class pixels. Fig. 1 is a flowchart of the unmanned aerial vehicle image change detection method according to an embodiment of the present application. The flowchart merely shows the logical order of the method of this embodiment; in other possible embodiments of the application, the steps shown or described may, absent conflict, be performed in an order different from that in fig. 1.
Referring to fig. 1, the method of the present embodiment specifically includes the following steps:
Step one: preprocess and align a reference image and an image under test, then perform sliding-window segmentation on each to obtain segmentation sub-images of the reference image and segmentation sub-images of the image under test.

In the embodiments of the application, the image under test is the unmanned aerial vehicle image to be detected.
Preprocessing the reference image and the image under test effectively avoids missed detections. The window size can be preset before the sliding-window segmentation. Unmanned aerial vehicle images typically have high resolution; as an embodiment of the application, the window size can be set to 608×608 pixels, and the image is segmented left-to-right, top-to-bottom. Because sliding-window segmentation may split a target region across sub-images, an overlap rate (overlap) is set so that adjacent segmentation sub-images share an overlapping portion; as an embodiment of the application, the overlap rate can be set to 0.3, which better handles the problem of split target regions.
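As a hedged illustration of the sliding-window step above, the following Python sketch crops an H×W NumPy-style image array into overlapping 608×608 sub-images; the window size and 0.3 overlap rate come from this embodiment, while the function name and the edge-clamping behavior are illustrative assumptions rather than details from the patent.

```python
def sliding_window_crop(image, win=608, overlap=0.3):
    """Split an H x W (x C) array into overlapping sub-images, left-to-right, top-to-bottom."""
    stride = int(win * (1 - overlap))                  # overlap 0.3 -> stride of 425 px
    h, w = image.shape[:2]
    ys = list(range(0, max(h - win, 0) + 1, stride))
    xs = list(range(0, max(w - win, 0) + 1, stride))
    if h > win and ys[-1] != h - win:
        ys.append(h - win)                             # clamp the last row of windows to the edge
    if w > win and xs[-1] != w - win:
        xs.append(w - win)
    # Return each sub-image with its top-left offset so results can be stitched back later
    return [((x, y), image[y:y + win, x:x + win]) for y in ys for x in xs]
```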
As shown in fig. 2, in the embodiment of the present application, the preprocessing and alignment of the reference image and the image under test specifically include the following steps:

Step 101: image keypoint detection: detect keypoints in both the reference image and the image under test, and extract feature descriptors of the keypoints; algorithms such as SIFT, SURF or ORB can be used for keypoint detection;

Step 102: keypoint matching: match the keypoints of the reference image with those of the image under test based on the feature descriptors, and screen the matched keypoints with the RANSAC algorithm so that only qualifying match points remain; a KNN algorithm can be used for keypoint matching;

Step 103: image alignment: align the image under test with the reference image according to the screened matched keypoints (a code sketch of this pipeline follows).
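A minimal sketch of steps 101–103 using OpenCV: ORB keypoints (one of the algorithms named above), KNN matching with a ratio test, RANSAC screening, and a homography warp. The feature count, ratio threshold and reprojection threshold are illustrative assumptions, not values given in the patent.

```python
import cv2
import numpy as np

def align_to_reference(ref, test):
    """Align the image under test to the reference via ORB keypoints + RANSAC homography."""
    orb = cv2.ORB_create(5000)
    kp1, des1 = orb.detectAndCompute(ref, None)
    kp2, des2 = orb.detectAndCompute(test, None)
    # KNN matching with Lowe's ratio test keeps only distinctive matches
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    knn = matcher.knnMatch(des2, des1, k=2)
    good = [m for m, n in knn if m.distance < 0.75 * n.distance]
    src = np.float32([kp2[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # RANSAC screens out mismatched keypoints while estimating the homography
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = ref.shape[:2]
    return cv2.warpPerspective(test, H, (w, h))
```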
Step two: perform a difference operation on the segmentation sub-images of the reference image and the corresponding sub-images of the image under test to obtain a difference image.
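The embodiment does not spell out the exact difference operator; a per-pixel absolute difference over each pair of aligned sub-images, sketched below, is one common choice and is assumed here for illustration.

```python
import cv2

def difference_image(ref_patch, test_patch):
    """Per-pixel absolute difference of two aligned, same-shape segmentation sub-images.
    cv2.absdiff avoids the wrap-around that plain uint8 subtraction would produce."""
    return cv2.absdiff(ref_patch, test_patch)
```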
Step three: input the difference image into the trained improved deep learning network model, wherein the improved deep learning network model comprises an encoder, a decoder and a classifier.
The encoder extracts image features and reduces the spatial dimension; the decoder recovers the detail and spatial information of the image. The encoder comprises multiple layers of encoder sub-modules connected sequentially from the upper layer to the lower layer, and the decoder comprises multiple layers of decoder sub-modules connected sequentially from the lower layer to the upper layer. The decoder has the same number of layers as the encoder; referring to fig. 3, each decoder sub-module is skip-connected with the encoder sub-module of the same layer, and the skip connection reduces the information loss caused by the pooling layers in the encoder sub-modules, so that the decoder can recover the detail information of the target more accurately.
In the embodiment of the application, the improved deep learning network model is based on a Unet model;
For the encoding part, referring to fig. 3 and fig. 4, the ResNet34 network serves as the feature extraction layer (i.e., the encoder) of the Unet model: specifically, the convolution layers in the Unet encoder are replaced with the ResNet34 residual structure, and a BN layer is added between the two convolutions to normalize the data, which accelerates model convergence and improves model robustness.
Referring to fig. 5, a schematic structural diagram of a ResNet34 residual unit according to an embodiment of the present application is provided. It comprises multiple convolutional weight layers and, to mitigate the vanishing/exploding gradient problem, a skip (shortcut) connection that adds the input x to the output of the convolutional weight layers, giving the output H(x) = F(x) + x; the convolutional weight layers therefore actually learn a residual mapping F(x) = H(x) − x. For the ResNet34 residual unit provided by the embodiment of the application, even if the gradients through the convolutional weight layers vanish, the identity term x still allows the signal to be propagated back to earlier layers.
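A minimal PyTorch sketch of the identity-shortcut residual unit described above (equal channels, stride 1; ResNet34's downsampling variant with a 1×1 projection shortcut is omitted). The BN placement after each convolution mirrors the BN layers added in the encoder.

```python
import torch.nn as nn

class BasicBlock(nn.Module):
    """ResNet-34-style residual unit: output H(x) = F(x) + x, with BN after each convolution."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        residual = x                               # identity shortcut carries x to the output
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + residual)           # H(x) = F(x) + x
```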
For the decoding part, as shown in fig. 3 and fig. 6, the FPN network can be fused into the decoder of the Unet model, combining feature maps of different scales during the up-sampling process and thereby remedying the Unet model's limitation of detecting on only a single output feature map. More scale information can therefore be used during back-propagation and weight updating, and the features of every layer are fully exploited for detection, i.e., the shallow detail information and the deep semantic information each contribute independent predictions.
The FPN, i.e., the feature pyramid network, extracts feature maps of different scales to supply the subsequent prediction task. Fig. 7 shows a schematic FPN network structure of the decoder according to an embodiment of the present application; its calculation process comprises the following steps (a minimal code sketch follows the list):

Step 1: the bottom-up pathway;

Step 2: the top-down pathway;

Step 3: fuse the results of steps 1 and 2 together using lateral connections;

Step 4: apply a convolution after the fusion to mitigate the aliasing effect of up-sampling.
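A hedged sketch of steps 1–4: the bottom-up maps are assumed given (e.g., the four ResNet34 stage outputs), 1×1 lateral convolutions align channels, the top-down pass up-samples and adds, and a 3×3 convolution smooths each fused map to mitigate aliasing. The channel counts are illustrative assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

class FPNDecoderSketch(nn.Module):
    """Top-down pathway with 1x1 lateral connections and 3x3 smoothing convolutions."""
    def __init__(self, in_channels=(64, 128, 256, 512), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_channels, 1) for c in in_channels)
        self.smooth = nn.ModuleList(nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                    for _ in in_channels)

    def forward(self, feats):
        # feats: bottom-up coding feature maps, shallow -> deep, each half the size of the last
        laterals = [lat(f) for lat, f in zip(self.lateral, feats)]
        for i in range(len(laterals) - 2, -1, -1):
            # top-down: up-sample the deeper map by 2x and add it to the lateral connection
            laterals[i] = laterals[i] + F.interpolate(laterals[i + 1], scale_factor=2,
                                                      mode="nearest")
        # a 3x3 convolution after each fusion mitigates the aliasing effect of up-sampling
        return [sm(p) for sm, p in zip(self.smooth, laterals)]
```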
The improved deep learning network model provided by the embodiment of the application complements the strengths of the residual structure, Unet and FPN, thereby enhancing the feature extraction capability of the Unet model, acquiring deeper semantic information, remedying the weakness in small-target detection, and improving detection accuracy.
Step four: input the difference image into the encoder and perform convolution and pooling operations to obtain coding feature maps of different layers.

Referring to fig. 3 and fig. 4, suppose the encoder comprises 4 encoder sub-modules from the upper layer to the lower layer, namely encoder1, encoder2, encoder3 and encoder4, with the output of each encoder sub-module connected as the input of the next. Each encoder sub-module performs convolution and pooling operations on its input to obtain coding feature maps of different layers: the convolution operation extracts feature information, and the pooling operation filters out unimportant high-frequency information. The encoder can employ various CNN backbones, such as VGG16, Darknet53 or ResNet101; the specific choice can be weighed against the data set and the required performance.
Step five: for a non-bottom-layer decoder sub-module, up-sample the decoding feature map output by the next-lower decoder sub-module and skip-connect it with the same-layer coding feature map, the feature map obtained by the skip connection serving as the input feature map of that non-bottom-layer decoder sub-module; for the bottom-layer decoder sub-module, convolve and then pool the input bottom-layer coding feature map, then skip-connect the result with the bottom-layer coding feature map, the resulting feature map serving as its input feature map; and perform a deconvolution operation on the input feature map with each decoder sub-module to obtain shallow decoding feature maps and deep decoding feature maps.
As shown in fig. 6, the decoder provided in this embodiment has the same number of layers as the encoder and comprises 4 decoder sub-modules from the upper layer to the lower layer, namely decoder1, decoder2, decoder3 and decoder4, corresponding to 4 deconvolution layers. Each decoder sub-module deconvolves its input feature map, doubling its spatial scale and halving its channel dimension.
Referring to fig. 3, suppose the coding feature maps output by the four encoder sub-modules encoder1, encoder2, encoder3 and encoder4 (upper layer to lower layer) are f1, f2, f3 and f4 respectively. For the non-bottom-layer decoder sub-module decoder1, the input feature map is obtained by skip-connecting the coding feature map f1 output by encoder1 with the decoding feature map y2 output by the non-bottom-layer decoder sub-module decoder2, and the decoding feature map y1 is output after the deconvolution operation. Correspondingly, the input of decoder2 is the feature map obtained by skip-connecting f2 with the decoding feature map y3, and y2 is output after deconvolution; the input of decoder3 is the feature map obtained by skip-connecting f3 with y4, and y3 is output after deconvolution. Note that decoder4, located at the bottom of the decoder (i.e., the bottom-layer decoder sub-module), first convolves the coding feature map f4, then pools it, and then skip-connects the result with f4 to obtain its input feature map; decoder4 deconvolves this input and outputs the decoding feature map y4. Decoding feature maps y1 and y2 are the shallow decoding feature maps, and y3 and y4 are the deep decoding feature maps.
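The wiring just described can be summarized in the following structural sketch. The deconvolution sub-modules, channel counts and exact up-sampling sizes are assumptions: the `F.interpolate` calls simply bring each deeper map to the same spatial size as the coding feature map it is skip-connected with, and `torch.cat` plays the role of the skip connection.

```python
import torch
import torch.nn.functional as F

def decode(f1, f2, f3, f4, deconv, bottom_conv, pool):
    """Structural sketch of step five. deconv is a list of four deconvolution sub-modules
    (decoder1..decoder4) assumed to be built with channels matching the concatenations."""
    # Bottom layer (decoder4): convolve f4, pool it, then skip-connect with f4 itself
    b = F.interpolate(pool(bottom_conv(f4)), size=f4.shape[2:])
    y4 = deconv[3](torch.cat([b, f4], dim=1))
    # Non-bottom layers: up-sample the deeper decoding map, skip-connect the same-layer coding map
    y3 = deconv[2](torch.cat([F.interpolate(y4, size=f3.shape[2:]), f3], dim=1))
    y2 = deconv[1](torch.cat([F.interpolate(y3, size=f2.shape[2:]), f2], dim=1))
    y1 = deconv[0](torch.cat([F.interpolate(y2, size=f1.shape[2:]), f1], dim=1))
    return (y1, y2), (y3, y4)    # shallow decoding maps, deep decoding maps
```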
Step six: fuse the shallow decoding feature maps with the deep decoding feature maps to obtain the change feature map.

In some embodiments of the application, a change feature map of the same size as the input image can be obtained by up-sampling.
Step seven: classify each pixel in the change feature map into a change-class pixel or a non-change-class pixel with the classifier, and obtain the image change detection result from the change-class pixels.
As an embodiment of the application, the classifier can be a softmax classifier. The softmax classifier computes, for each pixel in the change feature map, the probability that it belongs to the change class and the non-change class; the probability gradually decreases from the interior of a changed region toward its edge. The probability value of each pixel in the change feature map is then normalized. A threshold can be set for computing the changed-region confidence: if a pixel's probability exceeds the threshold, the pixel belongs to the change class, otherwise to the non-change class. The threshold can be raised when higher confidence is desired and lowered when a low missed-detection rate is desired. The confidence of a changed region is obtained by averaging the probabilities of its pixels. Mapping the pixel probability values to pixels yields the change map of the changed region, i.e., the change detection result.
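A short sketch of the pixel-level classification and confidence computation just described; the two-channel logit layout and the 0.5 default threshold are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def classify_pixels(change_logits, threshold=0.5):
    """change_logits: (N, 2, H, W) scores for [non-change, change] per pixel."""
    probs = F.softmax(change_logits, dim=1)   # normalize scores into per-class probabilities
    p_change = probs[:, 1]                    # probability each pixel belongs to the change class
    mask = p_change > threshold               # raise threshold for confidence, lower it to cut misses
    confidence = p_change[mask].mean() if mask.any() else torch.tensor(0.0)
    return mask, confidence                   # binary change mask + mean-probability confidence
```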
Obtaining the image change detection result from the change-class pixels includes determining the changed regions and their circumscribed rectangles, as follows (a code sketch follows the steps):

Step 701: perform dilation and erosion operations on the change feature map to filter out discrete noise points and fill hole regions;

Step 702: extract the connected regions of the change feature map to fill the hole regions left after step 701;

Step 703: output the circumscribed rectangular boxes containing the connected regions in the change feature map as the changed regions.
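Steps 701–703 sketched with OpenCV morphology and connected-component extraction; the 3×3 structuring element is an illustrative choice, not a value from the patent.

```python
import cv2
import numpy as np

def change_regions(mask):
    """mask: uint8 binary change map (255 = change class). Returns circumscribed boxes (x, y, w, h)."""
    kernel = np.ones((3, 3), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # dilation then erosion: fills hole regions
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # erosion then dilation: drops discrete noise
    n, _, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
    # stats rows are [x, y, w, h, area]; row 0 is the background component, so skip it
    return [tuple(stats[i, :4]) for i in range(1, n)]
```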
In addition, since the changed part occupies a small proportion of the whole image (or may be absent entirely), change detection suffers from an imbalanced sample distribution: the network recognizes the majority class well while recognizing the minority class poorly, and training easily falls into a local minimum instead of reaching the global optimum. Therefore, the embodiment of the application provides a joint loss function to replace the loss function in the Unet model: the classical binary cross entropy loss function is combined with the Dice loss function to reduce the influence of class imbalance on the model and improve the model's learning of change features when the proportion of change-class pixels is small. That is, the improved deep learning network model is trained by combining a cross entropy loss function and a Dice loss function.
To address the imbalanced sample distribution in image change detection, using only the classical binary cross entropy loss would focus the model's feature learning on the comparatively flat non-change part; the Dice loss, which cares only about whether pixel points are classified correctly, is therefore added to complement the cross entropy loss and reduce the adverse effect of class imbalance on model accuracy. The classical binary cross entropy loss and the Dice loss take their standard forms:

L_cross = −(1/N) · Σ_n [ t_n · log(p_n) + (1 − t_n) · log(1 − p_n) ]

L_Dice = 1 − (2 · Σ_n t_n · p_n) / (Σ_n t_n + Σ_n p_n)

L = L_cross + L_Dice

where t_n denotes the true label class of pixel n: t_n = 0 denotes the non-change class and t_n = 1 the change class; n indexes the pixels and N is the total pixel count (e.g., for a 5×5-pixel image, N = 25); p_n is the predicted probability that pixel n belongs to the change class; y_n denotes the prediction class (change or non-change); L is the joint loss function used to train the improved deep learning network model; L_cross is the cross entropy loss function; and L_Dice is the Dice loss function.
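A PyTorch sketch of the joint loss under the formulas above; the unweighted sum of the two terms and the smoothing constant are assumptions, as the patent does not state a weighting.

```python
import torch.nn as nn

class JointLoss(nn.Module):
    """Joint loss L = L_cross + L_Dice for binary change detection (sketch; unweighted sum assumed)."""
    def __init__(self, eps=1e-6):
        super().__init__()
        self.bce = nn.BCELoss()   # classical binary cross entropy on probabilities
        self.eps = eps

    def forward(self, p, t):
        # p: predicted change-class probabilities in [0, 1]; t: ground-truth labels {0, 1}
        l_cross = self.bce(p, t.float())
        inter = (p * t).sum()
        l_dice = 1 - (2 * inter + self.eps) / (p.sum() + t.sum() + self.eps)
        return l_cross + l_dice
```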
In the embodiment of the application, during training of the improved deep learning network model, an optimizer must update the model parameters so as to minimize the loss function. Common optimizers include SGD, BGD and Adam. In this embodiment, the Adam optimizer can be selected for its fast convergence and good learning effect; and to prevent the model from falling into a locally optimal solution and to make it converge faster, the embodiment of the application also introduces a simulated annealing algorithm to adjust the learning rate.
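A schematic training step under the setup above. PyTorch's cosine annealing schedule stands in here for the simulated-annealing-style learning-rate adjustment named in the embodiment (an assumption, since the patent does not give the exact schedule); the learning rate and epoch count are likewise illustrative. `criterion` can be, e.g., the JointLoss sketched earlier.

```python
import torch
import torch.nn as nn

def train(model: nn.Module, loader, criterion, epochs: int = 50):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # lr is an illustrative default
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    for _ in range(epochs):
        for diff_img, label in loader:
            optimizer.zero_grad()
            loss = criterion(model(diff_img), label)
            loss.backward()                  # gradients for the optimizer update
            optimizer.step()
        scheduler.step()                     # anneal the learning rate once per epoch
    return model
```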
The unmanned aerial vehicle image change detection method provided by this embodiment can be applied to a terminal and executed by an unmanned aerial vehicle image change detection device, which can be implemented in software and/or hardware and integrated into the terminal, for example any smart phone, tablet computer or other computer device with a communication function.
Example two
The embodiment of the application provides a detection device for unmanned aerial vehicle image change, which comprises:
a preprocessing and segmentation module: for preprocessing and aligning a reference image and an image under test, and then performing sliding-window segmentation on each to obtain segmentation sub-images of the reference image and segmentation sub-images of the image under test;

a difference operation module: for performing a difference operation on the segmentation sub-images of the reference image and of the image under test to obtain a difference image;

a network model module: for inputting the difference image into a trained improved deep learning network model, wherein the improved deep learning network model comprises an encoder, a decoder and a classifier; the encoder comprises multiple layers of encoder sub-modules connected sequentially from the upper layer to the lower layer, and the decoder comprises multiple layers of decoder sub-modules connected sequentially from the lower layer to the upper layer;

an encoding module: for performing convolution and pooling operations on the difference image to obtain coding feature maps of different layers;

a decoding module: for, at each non-bottom-layer decoder sub-module, up-sampling the decoding feature map output by the next-lower decoder sub-module and skip-connecting it with the same-layer coding feature map, the skip-connected feature map serving as that sub-module's input feature map; for, at the bottom-layer decoder sub-module, convolving and then pooling the input bottom-layer coding feature map, then skip-connecting the result with the bottom-layer coding feature map, the skip-connected feature map serving as its input feature map; and for performing a deconvolution operation on the input feature map with each decoder sub-module to obtain shallow decoding feature maps and deep decoding feature maps;

a fusion module: for fusing the shallow decoding feature maps with the deep decoding feature maps to obtain a change feature map;

a classification module: for classifying each pixel in the change feature map into a change-class pixel or a non-change-class pixel with the classifier, and obtaining the image change detection result from the change-class pixels.
The device provided by the embodiment of the application can be used for executing the method provided by the first embodiment of the application, and has the corresponding functional modules and beneficial effects of executing the method. The detailed process of implementing the corresponding functions by each module may be referred to in the first embodiment, and will not be described herein for simplicity.
Example III
The embodiment of the application also provides an electronic terminal, which comprises a processor and a storage medium;
the storage medium is used for storing instructions;
the processor is configured to operate in accordance with the instructions to perform the steps of the method as described in embodiment one.
The embodiment of the application can execute the method provided by the first embodiment of the application, so the embodiment of the application has the corresponding functional modules and beneficial effects of executing the method. For simplicity of description, details are not described here.
Example IV
The embodiment of the present application also provides a computer-readable storage medium, on which a computer program is stored, which when being executed by a processor, implements the steps of the method of the embodiment one.
Because the embodiment of the application can execute the method provided by the first embodiment, the embodiment of the application has the corresponding functional modules and beneficial effects of executing the method of the embodiment. For simplicity of description, details are not described here.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is merely a preferred embodiment of the present application, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present application, and such modifications and variations should also be regarded as being within the scope of the application.
Claims (10)
1. An unmanned aerial vehicle image change detection method, characterized by comprising the following steps:

preprocessing and aligning a reference image and an image under test, and then performing sliding-window segmentation on each to obtain segmentation sub-images of the reference image and segmentation sub-images of the image under test;

performing a difference operation on the segmentation sub-images of the reference image and of the image under test to obtain a difference image;

inputting the difference image into a trained improved deep learning network model, wherein the improved deep learning network model comprises an encoder, a decoder and a classifier; the encoder comprises multiple layers of encoder sub-modules connected sequentially from the upper layer to the lower layer, and the decoder comprises multiple layers of decoder sub-modules connected sequentially from the lower layer to the upper layer;

inputting the difference image into the encoder and performing convolution and pooling operations to obtain coding feature maps of different layers;

for a non-bottom-layer decoder sub-module, up-sampling the decoding feature map output by the next-lower decoder sub-module and skip-connecting it with the same-layer coding feature map, the feature map obtained by the skip connection serving as the input feature map of that non-bottom-layer decoder sub-module; for the bottom-layer decoder sub-module, convolving and then pooling the input bottom-layer coding feature map, then skip-connecting the result with the bottom-layer coding feature map, the feature map obtained by the skip connection serving as the input feature map of the bottom-layer decoder sub-module; and performing a deconvolution operation on the input feature map with each decoder sub-module to obtain shallow decoding feature maps and deep decoding feature maps;

fusing the shallow decoding feature maps and the deep decoding feature maps to obtain a change feature map;

and classifying each pixel in the change feature map into a change-class pixel or a non-change-class pixel with the classifier, and obtaining the image change detection result from the change-class pixels.
2. The unmanned aerial vehicle image change detection method according to claim 1, wherein the method for constructing the improved deep learning network model comprises:

taking the Unet model as the base, replacing the convolution layers in the Unet encoder with the residual structure of ResNet34, and adding a BN layer between the two convolutions to normalize the data;

fusing the FPN network into the decoder of the Unet model, where feature maps of different scales are combined during the up-sampling process.
3. The unmanned aerial vehicle image change detection method according to claim 1 or 2, wherein the improved deep learning network model is trained by combining a cross entropy loss function and a Dice loss function; during training, the parameters of the improved deep learning network model are updated with an optimizer.

4. The unmanned aerial vehicle image change detection method according to claim 3, wherein the optimizer is any one of an SGD optimizer, a BGD optimizer and an Adam optimizer, and a simulated annealing algorithm is introduced to adjust the learning rate.
5. The unmanned aerial vehicle image change detection method according to claim 1, wherein, before the sliding-window segmentation, an overlap rate is set so that adjacent segmentation sub-images share an overlapping portion.
6. The unmanned aerial vehicle image change detection method according to claim 1, wherein the preprocessing and aligning of the reference image and the image under test comprises:

detecting keypoints in both the reference image and the image under test, and extracting feature descriptors of the keypoints;

matching the keypoints of the reference image with those of the image under test based on the feature descriptors, and screening the matched keypoints with a RANSAC algorithm;

and aligning the image under test with the reference image according to the screened matched keypoints.
7. The unmanned aerial vehicle image change detection method according to claim 1, wherein classifying each pixel in the change feature map into a change-class pixel or a non-change-class pixel with the classifier comprises:

calculating, with a softmax classifier, the probability that each pixel in the change feature map belongs to the change class and the non-change class;

normalizing the probability value of each pixel in the change feature map;

comparing the normalized probability value of each pixel with a preset threshold: if the probability value of a pixel is greater than the threshold, the pixel belongs to the change class, otherwise to the non-change class; wherein the threshold is determined based on the desired changed-region confidence.
8. The unmanned aerial vehicle image change detection method according to claim 1, wherein obtaining the image change detection result from the change-class pixels comprises: performing dilation and erosion operations on the change feature map to filter out discrete noise points and fill hole regions; extracting the connected regions of the change feature map to fill the remaining hole regions; and outputting the circumscribed rectangular boxes containing the connected regions in the change feature map as the changed regions.
9. An electronic terminal comprising a processor and a memory coupled to the processor, wherein a computer program is stored in the memory, which, when executed by the processor, performs the steps of the method according to any of claims 1-8.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202311263279.8A | 2023-09-27 | 2023-09-27 | Unmanned aerial vehicle image change detection method, electronic terminal and storage medium
Publications (1)

Publication Number | Publication Date
---|---
CN117011730A | 2023-11-07
Family
ID=88562090

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202311263279.8A | Unmanned aerial vehicle image change detection method, electronic terminal and storage medium | 2023-09-27 | 2023-09-27

Country Status (1)

Country | Link
---|---
CN | CN117011730A (en)
Patent Citations (2)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN113449690A * | 2021-07-21 | 2021-09-28 | 华雁智科(杭州)信息技术有限公司 | Method and system for detecting image scene change and electronic equipment
CN116091497A * | 2023-04-07 | 2023-05-09 | 航天宏图信息技术股份有限公司 | Remote sensing change detection method, device, electronic equipment and storage medium

Application event: 2023-09-27 — CN application CN202311263279.8A filed; published as CN117011730A (status: Pending)
Legal Events

Code | Title | Description
---|---|---
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
RJ01 | Rejection of invention patent application after publication | Application publication date: 2023-11-07