CN115497006A - Urban remote sensing image change depth monitoring method and system based on dynamic hybrid strategy - Google Patents
Urban remote sensing image change depth monitoring method and system based on dynamic hybrid strategy
- Publication number
- CN115497006A CN115497006A CN202211138291.1A CN202211138291A CN115497006A CN 115497006 A CN115497006 A CN 115497006A CN 202211138291 A CN202211138291 A CN 202211138291A CN 115497006 A CN115497006 A CN 115497006A
- Authority
- CN
- China
- Prior art keywords
- remote sensing
- pooling
- sensing image
- urban
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
The invention discloses a method and a system for monitoring the change depth of an urban remote sensing image based on a dynamic hybrid strategy. The method comprises the following steps: S1, preprocessing an urban remote sensing image and labeling urban areas of different categories to obtain a data set; S2, training a DeepLabV3+ network model that adopts a dynamic mixed pooling strategy and uses Xception as the backbone network with the data set from step S1; S3, cutting remote sensing images of the same urban area at different times in the same proportion, inputting them into the trained network model, and segmenting the images; and S4, after the region classification result of the urban remote sensing image is obtained, calculating the degree of change of each region class within a period of time and marking the change on the image. The invention dynamically selects a different pooling mode for each layer of the feature maps of the remote sensing image, better captures both the global and the local information of the image, and improves the segmentation precision.
Description
Technical Field
The invention belongs to the technical field of satellite remote sensing image processing, and particularly relates to a method and a system for monitoring the change depth of an urban remote sensing image based on a dynamic mixing strategy.
Background
The semantic segmentation of high-resolution remote sensing images is a basic task in the remote sensing field. Its main goal is to use a computer to analyze the color, spectral information and spatial information of the various targets in an observed remote sensing image, select characteristic information, classify each pixel in the image, and segment the regional contours between target objects. Urban remote sensing images mainly contain buildings, roads, green-plant coverage areas and the like. Accurate segmentation and change detection of urban remote sensing images can be used in urban management planning to analyze details such as seasonal changes of buildings and vegetation, disaster detection, and changes in vegetation distribution, and thus provide a basis and help for comprehensively grasping the urban layout and its real-time dynamic changes.
In recent decades, with the development of computer vision and artificial intelligence, the resolution of satellite remote sensing images has become higher and higher and the capability of processing such images has greatly improved, which is important both for academic research and for guiding production practice. Compared with ordinary images, remote sensing images offer high precision, rich context information, a wider field of view and real-time dynamic monitoring; they are applied on a large scale in many fields and show a trend toward increasingly refined use. How to accurately extract key information from remote sensing images is therefore of great importance. Traditional remote sensing image segmentation techniques, such as region-based segmentation, edge-detection segmentation and shadow analysis, generally depend on manually designed features and generalize poorly when segmenting complex scenes.
Applying deep learning algorithms based on edge detection to the semantic segmentation of high-altitude remote sensing images can better extract valuable information from the image and improve the segmentation precision of small targets, while also providing data support for urban planning, desert monitoring and management, urban greening control, water-area supervision and the like. However, in real remote sensing images, complex conditions such as varied terrain, mutual occlusion of target objects and the rich variety of urban buildings, together with factors such as illumination and cloud cover, greatly reduce the precision of object edge details and blur segmentation boundaries.
It is therefore very important to design a method for detecting changes in urban-area remote sensing images based on a dynamic hybrid strategy that can reduce the influence of terrain and environmental factors and improve the accuracy of region edge segmentation.
Disclosure of Invention
The invention aims to remedy the above deficiencies of the prior art and provides a method for monitoring the change depth of an urban remote sensing image based on a DeepLabV3+ multi-scale network model combined with several mixed pooling strategies.
The invention adopts the following technical scheme:
the method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy comprises the following steps:
s1, preprocessing a city remote sensing image, and labeling areas of different categories in a city to obtain a data set;
s2, training the network by using the data set in the step S1, based on a network model adopting a dynamic mixed pooling strategy and taking Xception as a backbone network;
s3, cutting the remote sensing images of the same city area at different times in the same proportion, inputting the remote sensing images into the trained network model, and segmenting the images;
and S4, after the regional classification result of the urban remote sensing image is obtained, calculating the change degree of each regional category within a period of time and marking the change on the image.
Further, the original remote sensing image of the urban area adopts an RSSCN7DataSet remote sensing image data set.
Further, in step S1, the original remote sensing image is cut and semantically segmented as data preprocessing: the original remote sensing image is cut to a size of 256 × 256, and the cut images are labeled with the Labelme semantic segmentation labeling tool and divided into five categories: roads, buildings, water areas, green plants and open spaces.
Furthermore, Labelme is an open-source image annotation tool, mainly used for data set labeling in instance segmentation, semantic segmentation, object detection and classification tasks.
Furthermore, because the distinguishing characteristics of each region category in urban areas are generally obvious, the preprocessed remote sensing images are directly segmented and labeled with different colors, and the training data set is then selected at random.
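The preprocessing above (cutting the imagery into 256 × 256 patches before labeling) can be sketched as follows; this is a minimal NumPy sketch, and the helper name, the demo array size and the discard-partial-tiles policy are illustrative assumptions, not taken from the patent:

```python
import numpy as np

# Five label categories named in the patent; the list order is an assumption.
CLASSES = ["road", "building", "water", "green_plant", "open_space"]

def tile_image(img: np.ndarray, size: int = 256) -> list:
    """Cut an H x W x C remote sensing image into non-overlapping
    size x size patches, discarding partial border tiles."""
    h, w = img.shape[:2]
    patches = []
    for top in range(0, h - size + 1, size):
        for left in range(0, w - size + 1, size):
            patches.append(img[top:top + size, left:left + size])
    return patches

demo = np.zeros((600, 520, 3), dtype=np.uint8)  # stand-in for one scene
patches = tile_image(demo)
print(len(patches))  # → 4 (2 rows x 2 cols of full 256-px tiles)
```

Each patch would then be paired with a Labelme annotation mask before being added to the data set.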
Further, in step S2, the data set is trained using the mixed pooling DeepLabV3+ network model. The DeepLabV3+ network model fuses multi-scale information; the model structure is divided into an encoding layer and a decoding layer, and Xception is introduced as the backbone network. The backbone network applies serial atrous (dilated) convolutions and splits its output into two branches: one is passed directly into the decoding layer, and the other passes through the ASPP module. The method specifically comprises the following steps:
s21, at the encoding layer, extracting characteristic information from the monitored urban remote sensing image through the backbone network, sequentially reducing the image size to 1/4, 1/8 and 1/16 of the original size.
S22, the 60 × 60 × 2048 feature map obtained from the backbone network enters the atrous spatial pyramid pooling (ASPP) module. A standard ASPP module consists of one 1 × 1 convolution, three 3 × 3 dilated convolution layers with dilation rates of 6, 12 and 18 respectively, and a global average pooling layer. To better improve the accuracy of the network model's edge segmentation, on the basis of this structure the global average pooling in the ASPP module is replaced with the mixed pooling strategy adopted herein: the input feature map is mixed-pooled to obtain a feature map of size 20 × 20, channel compression is performed by a 1 × 1 convolution, and the feature map is finally restored to the height and width of the input feature map by deconvolution; the results obtained by all branches are then concatenated and fused.
S23, at the decoding end, the low-level features output by the corresponding level of the backbone network are merged with the mixed result from the encoding end, and after upsampling with a 3 × 3 convolution kernel the prediction is restored to the resolution of the input image, yielding a result image of the urban remote sensing image classification and ultimately improving the segmentation accuracy of the network model.
S24, in the DeepLabV3+ network model adopting the dynamic mixed pooling strategy, the network model is trained with part of the RSSCN7DataSet remote sensing image data set, and the remaining samples serve as the test data set for testing the network model.
Further, the mixed pooling strategy in step S22 needs to optimize the pooling choice for each of the 2048 layers of the feature map:
The frequency α_k of selecting maximum pooling, the frequency β_k of selecting average pooling and the frequency γ_k of selecting random (stochastic) pooling for the k-th layer feature map are:

α_k = i_max / i_total,  β_k = i_avg / i_total,  γ_k = i_sto / i_total
wherein i_max is the number of times the maximum pooling method is selected for the k-th layer feature map, i_avg the number of times the average pooling method is selected, i_sto the number of times the random pooling method is selected, and i_total the size of the training set used to optimize the k-th layer feature map.
Finally, the output output_k of the k-th layer feature map after mixed pooling in the network model is:

output_k = α_k · x_k_max + β_k · x_k_avg + γ_k · x_k_sto
wherein x_k_max is the result of applying the maximum pooling method to the k-th layer feature map, x_k_avg the result of applying the average pooling method, and x_k_sto the result of applying the random pooling method.
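The mixed pooling just described (a weighted sum of the maximum, average and random pooling results with weights α_k, β_k, γ_k) can be sketched in NumPy as follows. This is a minimal sketch under assumptions not stated in the patent: non-overlapping 2 × 2 windows and non-negative (post-ReLU) activations, which stochastic pooling requires:

```python
import numpy as np

rng = np.random.default_rng(0)

def pool2x2(x, mode, rng=rng):
    """Pool a (H, W) map with non-overlapping 2x2 windows.
    mode: 'max', 'avg', or 'sto' (stochastic: sample one value per window
    with probability proportional to its activation)."""
    h, w = x.shape
    blocks = (x[:h - h % 2, :w - w % 2]
              .reshape(h // 2, 2, w // 2, 2)
              .transpose(0, 2, 1, 3)
              .reshape(h // 2, w // 2, 4))
    if mode == "max":
        return blocks.max(axis=-1)
    if mode == "avg":
        return blocks.mean(axis=-1)
    # stochastic pooling: multinomial draw inside each window
    p = blocks / np.maximum(blocks.sum(axis=-1, keepdims=True), 1e-12)
    idx = np.array([[rng.choice(4, p=p[i, j]) for j in range(p.shape[1])]
                    for i in range(p.shape[0])])
    return np.take_along_axis(blocks, idx[..., None], axis=-1)[..., 0]

def mixed_pool(x, alpha, beta, gamma):
    """output_k = alpha*max + beta*avg + gamma*stochastic (weights sum to 1)."""
    return (alpha * pool2x2(x, "max")
            + beta * pool2x2(x, "avg")
            + gamma * pool2x2(x, "sto"))

x = np.abs(rng.standard_normal((4, 4)))  # non-negative activations
y = mixed_pool(x, 1/3, 1/3, 1/3)
print(y.shape)  # → (2, 2)
```

With weights (1, 0, 0) the result reduces to plain maximum pooling, which is a convenient sanity check.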
Further, the step of improving the global average pooling in the ASPP module to dynamic hybrid pooling comprises:
a. Initialization: before training the DeepLabV3+ network model, the initial weights α_k, β_k and γ_k of each pooling strategy for each layer of the feature map are all set to 1/3;
b. Pool the layer-1 feature map with the maximum pooling, average pooling and random pooling methods in turn, while pooling the remaining feature maps with the mixed pooling method; evaluate the merits of the different pooling strategies by computing the mean intersection-over-union (mIoU) of the predicted and true values, and take the pooling mode with the highest mIoU as the selected pooling mode of the layer-1 feature map;
c. Optimize the pooling strategies of the layer-2 to layer-2048 feature maps of the same input with the method of step b;
d. Perform the operations of steps b and c on all training set samples;
e. Obtain α_k, β_k and γ_k, the frequencies with which each pooling strategy is selected for each layer of the feature map over the training set.
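The bookkeeping of steps a to e, counting which pooling mode wins per layer across the training set and normalizing the counts into α_k, β_k and γ_k, can be sketched as follows; the function name and the mode labels are hypothetical:

```python
from collections import Counter

def pooling_frequencies(winners_per_sample):
    """winners_per_sample: for one feature-map layer k, the pooling mode
    ('max', 'avg' or 'sto') that gave the highest mIoU on each training
    sample. Returns (alpha_k, beta_k, gamma_k), the relative frequencies."""
    counts = Counter(winners_per_sample)
    total = len(winners_per_sample)   # i_total
    alpha_k = counts["max"] / total   # i_max / i_total
    beta_k = counts["avg"] / total    # i_avg / i_total
    gamma_k = counts["sto"] / total   # i_sto / i_total
    return alpha_k, beta_k, gamma_k

# e.g. on 8 training samples, max pooling won 4 times, avg 3, stochastic 1
a, b, g = pooling_frequencies(["max", "avg", "max", "sto",
                               "max", "avg", "avg", "max"])
print(a, b, g)  # → 0.5 0.375 0.125
```

By construction the three weights sum to 1, matching the initialization of 1/3 each before any counts are collected.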
Further, in step S22, the mean intersection-over-union (mIoU) is used to evaluate the merits of the different pooling strategies; mIoU sums, over all categories, the ratio of the intersection to the union of the predicted and true values, and divides by the number of categories. The formula can be expressed as:

mIoU = (1 / (k + 1)) · Σ_{i=0..k} TP_i / (TP_i + FP_i + FN_i)
wherein k represents the number of non-empty classes (class 0 being the background), TP the number of true positives, FP the number of false positives, and FN the number of false negatives.
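The mIoU evaluation described above can be sketched on label maps as follows; skipping classes absent from both prediction and ground truth is an assumption of this sketch:

```python
import numpy as np

def mean_iou(pred, true, num_classes):
    """Mean intersection-over-union: average of TP/(TP+FP+FN) over
    classes that appear in the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (true == c))
        fp = np.sum((pred == c) & (true != c))
        fn = np.sum((pred != c) & (true == c))
        if tp + fp + fn == 0:  # class absent everywhere: skip it
            continue
        ious.append(tp / (tp + fp + fn))
    return float(np.mean(ious))

true = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
# class 0: IoU = 1/2, class 1: IoU = 2/3, mean = 7/12
print(mean_iou(pred, true, num_classes=2))
```

In the dynamic strategy, this score is computed once per candidate pooling mode and the mode with the highest mIoU is retained for that layer.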
Further, in step S2, the DeepLabV3+ network model handles the multi-class problem with a pixel-wise cross-entropy (softmax) loss: each pixel is taken as a sample, and the cross-entropy between the predicted category and the true category is computed for every pixel. In the multi-class setting, the softmax function converts the multiple outputs into probability values mapped into the interval (0, 1) for classification.
The output after the softmax regression process is:

p_i,j = e^{l_i,j} / Σ_{c=1..C} e^{l_i,c}
wherein e is the natural constant (≈ 2.718), p_i,j is the predicted probability that the i-th sample belongs to class j, l_i,j is the output of the neural network for the i-th sample on class j, and C is the number of original input classes.
The above equation turns the outputs into a probability distribution over (0, 1), and the distance between the predicted probability distribution and the probability distribution of the true values is computed by the cross-entropy loss function, specifically:

Loss = − (1/N) · Σ_{i=1..N} w · log p_i,Y(i)
where N is the number of training set samples, C is the number of original input classes, Y (i) is the class to which the ith sample belongs, and w is the weight of the sample data.
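The pixel-wise softmax cross-entropy described above can be sketched in NumPy as follows; the uniform default weights and the numerical-stability shift are assumptions of this sketch:

```python
import numpy as np

def softmax(logits):
    """p_ij = e^{l_ij} / sum_c e^{l_ic}, with the max-shift trick
    to avoid overflow in exp."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def pixel_cross_entropy(logits, labels, weights=None):
    """Pixel-wise cross entropy: -(1/N) * sum_i w * log p_{i, Y(i)}.
    logits: (N, C), one row per pixel; labels: (N,) class indices Y(i)."""
    n = logits.shape[0]
    p = softmax(logits)
    w = np.ones(n) if weights is None else np.asarray(weights)
    return float(-(w * np.log(p[np.arange(n), labels] + 1e-12)).sum() / n)

logits = np.array([[4.0, 0.0, 0.0], [0.0, 4.0, 0.0]])  # two confident pixels
labels = np.array([0, 1])
loss = pixel_cross_entropy(logits, labels)
print(round(loss, 4))  # → 0.036
```

A confidently correct prediction yields a small loss, while flipping the labels would drive it up sharply, which is the behavior the training relies on.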
Further, in step S3, the remote sensing images of the same city area at different times are cut in the same proportion and input into the trained model to obtain a segmented image.
Further, step S4 specifically includes:
s41, calculating the intersection-over-union loss between the predicted values of the i-th category in the urban remote sensing images of the same area at times T1 and T2, which represents the magnitude of change of the i-th category in the area. The intersection-over-union loss formula is:

L_i = 1 − TP_i / (TP_i + FP_i + FN_i)
wherein L_i indicates the magnitude of change of the i-th region class (the larger L_i, the larger the change of the region), TP_i represents the number of true positives of the i-th region class, FP_i the number of false positives, and FN_i the number of false negatives.
And S42, splicing the segmented images in sequence.
And S43, subtracting the urban remote sensing images stitched at times T1 and T2, eliminating pixel points of the same region category to obtain the part containing only changed regions, marking it, and overlaying the marked image on the original image so as to display the range of urban change on the remote sensing image.
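Steps S41 to S43 can be sketched on segmented label maps as follows; the function name and the choice of the T1 map as the reference are illustrative assumptions:

```python
import numpy as np

def change_map_and_scores(seg_t1, seg_t2, num_classes):
    """Compare two label maps of the same area at times T1 and T2.
    Returns a boolean change mask (pixels whose class changed) and,
    per class, L_i = 1 - TP_i/(TP_i + FP_i + FN_i), with the T1 map
    as reference."""
    changed = seg_t1 != seg_t2
    scores = {}
    for c in range(num_classes):
        tp = np.sum((seg_t1 == c) & (seg_t2 == c))
        fp = np.sum((seg_t1 != c) & (seg_t2 == c))
        fn = np.sum((seg_t1 == c) & (seg_t2 != c))
        denom = tp + fp + fn
        scores[c] = (1.0 - tp / denom) if denom else 0.0
    return changed, scores

t1 = np.array([[0, 0], [1, 1]])
t2 = np.array([[0, 2], [1, 1]])  # one pixel changed from class 0 to class 2
changed, scores = change_map_and_scores(t1, t2, num_classes=3)
print(changed.sum(), scores[1])  # → 1 0.0 (class 1 is unchanged)
```

The boolean mask corresponds to the marked overlay of S43, while the per-class L_i scores quantify how much each region category changed between the two dates.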
The invention also discloses a system based on the method for monitoring the change depth of the urban remote sensing image, which comprises the following modules:
a dataset acquisition module: preprocessing the urban remote sensing image, and labeling urban areas of different categories to obtain a data set;
a training module: adopting a DeepLabV3+ network model with a dynamic mixed pooling strategy, based on Xception as the backbone network, and training the network with the data set;
a segmentation module: cutting remote sensing images of the same city area at different times in the same proportion, inputting the remote sensing images into a trained model, and segmenting the images;
a labeling module: and after the regional classification result of the urban remote sensing image is obtained, calculating the change degree of each regional category within a period of time and marking the change on a graph.
Compared with the prior art, the invention has the beneficial effects that:
1. the invention dynamically selects different pooling modes for pooling the characteristic diagrams of each layer of the remote sensing image, can better grasp the global information and the local information of the image and improves the segmentation precision.
2. The invention takes random pooling as one of the pooling strategies, effectively reduces the risk of overfitting and improves the generalization capability of the model.
Drawings
FIG. 1 is a flow chart of the method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy.
FIG. 2 is a flow diagram of a hybrid pooling process.
FIG. 3 is a DeepLabV3+ network model diagram.
FIG. 4 is a block diagram of the system for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy.
Detailed Description
The preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
As shown in fig. 1 to 3, the method for monitoring the depth of change of the remote sensing image in the urban area based on the dynamic mixing strategy in the embodiment is performed according to the following steps:
s1, preprocessing the images and labeling different types of regions in the city, based on the RSSCN7DataSet remote sensing images released by Wuhan University in 2015, to obtain a data set;
s2, training the network by using the data set, adopting a DeepLabV3+ network model with a dynamic mixed pooling strategy based on Xception as the backbone network;
and S3, cutting the remote sensing images of the same city region at different times in the same proportion, inputting the remote sensing images into the trained model, and segmenting the images.
And S4, after the region classification result of the urban remote sensing image is obtained, calculating the change degree of each region class within a period of time and marking the change on the image.
Specifically, in this embodiment, the original remote sensing image of the urban area adopts RSSCN7DataSet remote sensing image data set.
In this embodiment, in step S1, the original remote sensing image is cut and semantically segmented as data preprocessing: the original remote sensing image is cut to a size of 256 × 256, and the cut images are labeled with the Labelme semantic segmentation tool and divided into five categories, i.e., road, building, water area, green plant and open space.
Furthermore, Labelme is an open-source image annotation tool, mainly used for data set labeling in instance segmentation, semantic segmentation, object detection and classification tasks.
Since the mark characteristics of each region category in urban areas are generally significant, the remote sensing image after preprocessing is directly segmented and labeled by using different colors in the embodiment, and then a training data set is selected by adopting a random selection mode.
In step S2 of this embodiment, the data set is trained using the mixed pooling DeepLabV3+ network model. The DeepLabV3+ network model fuses multi-scale information; the model structure is divided into an encoding layer and a decoding layer, and Xception is introduced as the backbone network. The backbone network applies serial atrous (dilated) convolutions and splits its output into two branches: one is passed directly into the decoding layer, and the other passes through the ASPP module. The method specifically comprises the following steps:
s21, at the encoding layer, extracting characteristic information from the monitored urban remote sensing image through the backbone network, sequentially reducing the image size to 1/4, 1/8 and 1/16 of the original size.
S22, the 60 × 60 × 2048 feature map obtained from the backbone network enters the atrous spatial pyramid pooling (ASPP) module. A standard ASPP module consists of one 1 × 1 convolution, three 3 × 3 dilated convolution layers with dilation rates of 6, 12 and 18 respectively, and a global average pooling layer. To better improve the accuracy of the network's edge segmentation, on the basis of this structure the global average pooling in the ASPP module is replaced with the mixed pooling strategy adopted herein: the input feature map is mixed-pooled to obtain a feature map of size 20 × 20, channel compression is performed by a 1 × 1 convolution, and the feature map is finally restored to the height and width of the input feature map by deconvolution; the results obtained by all branches are then concatenated and fused.
S23, at the decoding end, the low-level features output by the corresponding level of the backbone network are merged with the mixed result from the encoding end, and after upsampling with a 3 × 3 convolution kernel the prediction is restored to the resolution of the input image, yielding a result image of the urban remote sensing image classification and ultimately improving the segmentation accuracy of the network.
S24, to better improve the accuracy of network edge segmentation, in the DeepLabV3+ network structure the global average pooling in the ASPP is replaced with the mixed pooling strategy adopted herein. The network is trained on part of the RSSCN7DataSet remote sensing image data set, and the remaining samples serve as the test data set for testing the network.
In this embodiment, the mixed pooling strategy in step S22 needs to optimize the pooling choice for each of the 2048 layers of the feature map:
The frequency α_k of selecting maximum pooling, the frequency β_k of selecting average pooling and the frequency γ_k of selecting random (stochastic) pooling for the k-th layer feature map are:

α_k = i_max / i_total,  β_k = i_avg / i_total,  γ_k = i_sto / i_total
wherein i_max is the number of times the maximum pooling method is selected for the k-th layer feature map, i_avg the number of times the average pooling method is selected, i_sto the number of times the random pooling method is selected, and i_total the size of the training set used to optimize the k-th layer feature map.
In the final model, the output output_k of the k-th layer feature map after mixed pooling is:

output_k = α_k · x_k_max + β_k · x_k_avg + γ_k · x_k_sto
wherein x_k_max represents the result of applying the maximum pooling method to the k-th layer feature map, x_k_avg the result of applying the average pooling method, and x_k_sto the result of applying the random pooling method.
The step of improving the global average pooling in the ASPP module to dynamic hybrid pooling comprises:
a. Initialization: before training the DeepLabV3+ network model, the initial weights α_k, β_k and γ_k of each pooling strategy for each layer of the feature map are all set to 1/3;
b. Pool the layer-1 feature map with the maximum pooling, average pooling and random pooling methods in turn, while pooling the remaining feature maps with the mixed pooling method; evaluate the merits of the different pooling strategies by computing the mean intersection-over-union (mIoU) of the predicted and true values, and take the pooling mode with the highest mIoU as the selected pooling mode of the layer-1 feature map;
c. Optimize the pooling strategies of the layer-2 to layer-2048 feature maps of the same input with the method of step b;
d. Perform the operations of steps b and c on all training set samples;
e. Obtain α_k, β_k and γ_k, the frequencies with which each pooling strategy is selected for each layer of the feature map over the training set.
In step S22 of this embodiment, the mean intersection-over-union (mIoU) is used to evaluate the merits of the different pooling strategies; mIoU sums, over all categories, the ratio of the intersection to the union of the predicted and true values, and divides by the number of categories. The formula can be expressed as:

mIoU = (1 / (k + 1)) · Σ_{i=0..k} TP_i / (TP_i + FP_i + FN_i)
wherein k represents the number of non-empty classes (class 0 being the background), TP the number of true positives, FP the number of false positives, and FN the number of false negatives.
In step S2 of this embodiment, the DeepLabV3+ network model handles the multi-class problem with a pixel-wise cross-entropy loss: each pixel is taken as a sample, and the cross-entropy between the predicted category and the true category is computed for every pixel. The softmax function converts the outputs into probability values mapped into the interval (0, 1) for classification in the multi-class setting.
The output after the softmax regression process is:

p_i,j = e^{l_i,j} / Σ_{c=1..C} e^{l_i,c}

wherein p_i,j is the predicted probability that the i-th sample belongs to class j, and l_i,j is the output of the network for the i-th sample on class j. The above equation turns the output into a probability distribution over (0, 1), and the distance between the predicted probability distribution and the probability distribution of the true values is computed by the cross-entropy loss function, specifically:

Loss = − (1/N) · Σ_{i=1..N} w · log p_i,Y(i)
wherein N is the number of training set samples, C is the number of original input classes, Y (i) is the class to which the ith sample belongs, and w is the weight of sample data.
In step S3 of this embodiment, remote sensing images in the same city area at different times are cut in the same proportion, and input into the trained model to obtain a segmented image.
Step S4 of this embodiment specifically includes:
s41, calculating the intersection-over-union loss between the predicted values of the i-th category in the urban remote sensing images of the same area at times T1 and T2, which represents the magnitude of change of the i-th category in the area. The intersection-over-union loss formula is:

L_i = 1 − TP_i / (TP_i + FP_i + FN_i)
wherein L_i indicates the magnitude of change of the i-th region class (the larger L_i, the larger the change of the region), TP_i represents the number of true positives of the i-th region class, FP_i the number of false positives, and FN_i the number of false negatives.
And S42, splicing the segmented images in sequence.
And S43, subtracting the urban remote sensing images stitched at times T1 and T2, eliminating pixel points of the same region category to obtain the part containing only changed regions, marking it, and overlaying the marked image on the original image so as to display the range of urban change on the remote sensing image.
As shown in fig. 4, the present embodiment discloses a system for monitoring the depth of change of remote sensing images in urban areas based on the above embodiments, which includes the following modules:
a dataset acquisition module: preprocessing the urban remote sensing image, and labeling urban areas of different categories to obtain a data set;
a training module: adopting a DeepLabV3+ network model with a dynamic mixed pooling strategy, based on Xception as the backbone network, and training the network model with the data set;
a segmentation module: cutting remote sensing images of the same city area at different times in the same proportion, inputting the remote sensing images into a trained model, and segmenting the images;
a labeling module: and after the regional classification result of the urban remote sensing image is obtained, calculating the change degree of each regional category within a period of time and marking the change on a graph.
The invention dynamically selects different pooling modes for pooling the characteristic diagrams of each layer of the remote sensing image, can better grasp the global information and the local information of the image and improves the segmentation precision. The invention takes random pooling as one of the pooling strategies, effectively reduces the risk of overfitting and improves the generalization capability of the model.
Claims (10)
1. The method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy is characterized by comprising the following steps of:
s1, preprocessing a city remote sensing image, and labeling areas of different categories in a city to obtain a data set;
s2, training the network by using the data set in the step S1, based on a network model adopting a dynamic mixed pooling strategy and taking Xception as a backbone network;
s3, cutting the remote sensing images of the same city area at different times in the same proportion, inputting the remote sensing images into the trained model, and segmenting the images;
and S4, after the regional classification result of the urban remote sensing image is obtained, calculating the change degree of each regional category within a period of time and marking the change on the image.
2. The method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy as claimed in claim 1, wherein: in the step S1, the remote sensing image adopts the RSSCN7DataSet remote sensing image data set; or, in the step S1, the city remote sensing image is cut and semantically segmented as data preprocessing: it is cut to a size of 256 × 256, and the cut images are labeled with the Labelme semantic segmentation labeling tool and divided into five categories: roads, buildings, water areas, green plants and open spaces.
3. The method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy as claimed in claim 2, wherein: in the step S1, the preprocessed remote sensing image is segmented and labeled by adopting different colors, and then a training data set is selected in a random selection mode.
4. The method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy as claimed in claim 2 or 3, wherein: step S2 is specifically as follows:
s21, at the encoding layer, extracting characteristic information from the monitored urban remote sensing image through the backbone network, sequentially reducing the image size to 1/4, 1/8 and 1/16 of the original size;
s22, the 60 × 60 × 2048 feature map obtained from the backbone network enters the atrous spatial pyramid pooling (ASPP) module, wherein the ASPP module consists of one 1 × 1 convolution, three 3 × 3 dilated convolution layers with dilation rates of 6, 12 and 18 respectively, and a global average pooling layer; on the basis of this structure, the global average pooling in the ASPP module adopts a mixed pooling strategy: the input feature map is mixed-pooled to obtain a feature map of size 20 × 20, channel compression is performed by a 1 × 1 convolution, the feature map is restored to the height and width of the input feature map by deconvolution, and the results obtained by all branches are concatenated and fused;
s23, the result obtained by fusing, at the decoding layer, the low-level features output by the corresponding level of the backbone network with the mixed result of the encoding layer is upsampled with a 3 × 3 convolution kernel to restore a prediction at the resolution of the input image, obtaining a result image of the urban remote sensing image classification;
s24, in the DeepLabV3+ network model adopting the dynamic mixed pooling strategy, training the network model by using the RSSCN7DataSet remote sensing image data set, and taking the rest samples as test data sets for testing the network model.
5. The method for deep monitoring of urban remote sensing image changes based on a dynamic hybrid strategy as claimed in claim 4, wherein: in step S22, the 2048 feature-map layers are optimized by the mixed pooling strategy:
the frequency α_k of selecting maximum pooling, the frequency β_k of selecting average pooling, and the frequency γ_k of selecting stochastic pooling for the k-th feature-map layer are:

α_k = i_max / i_total, β_k = i_avg / i_total, γ_k = i_sto / i_total

where i_max is the number of times the maximum pooling method is selected for the k-th feature-map layer, i_avg is the number of times the average pooling method is selected, i_sto is the number of times the stochastic pooling method is selected, and i_total is the size of the training set on which the k-th feature-map layer is optimized;
the output output_k of the k-th feature-map layer after mixed pooling in the final model is:

output_k = α_k · x_k_max + β_k · x_k_avg + γ_k · x_k_sto

where x_k_max is the result of applying the maximum pooling method to the k-th feature-map layer, x_k_avg is the result of applying the average pooling method, and x_k_sto is the result of applying the stochastic pooling method.
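The mixed pooling output defined above can be sketched for a single pooling window as follows (a minimal illustration; the stochastic-pooling sampling rule is assumed to be the usual magnitude-proportional scheme, which the claim does not spell out):

```python
import numpy as np

def mixed_pool(window, alpha, beta, gamma, rng):
    """Weighted blend of max, average, and stochastic pooling over one window,
    following output_k = alpha * x_max + beta * x_avg + gamma * x_sto."""
    x_max = window.max()
    x_avg = window.mean()
    flat = window.ravel().astype(float)
    total = flat.sum()
    # stochastic pooling: sample one activation with probability proportional
    # to its magnitude (uniform if the window is all zeros) -- an assumption
    p = flat / total if total > 0 else np.full(flat.size, 1.0 / flat.size)
    x_sto = rng.choice(flat, p=p)
    return alpha * x_max + beta * x_avg + gamma * x_sto
```

With weights (1, 0, 0) the blend degenerates to plain max pooling, and with (0, 1, 0) to average pooling, which makes the formula easy to sanity-check.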
6. The method for deep monitoring of urban remote sensing image changes based on a dynamic hybrid strategy as claimed in claim 5, wherein the global average pooling in the ASPP module is improved into dynamic mixed pooling, specifically as follows:
a. initialization: before training the DeepLabV3+ network model, the initial weights α_k, β_k, and γ_k of each pooling strategy for each feature-map layer are all set to 1/3;
b. the layer-1 feature map is pooled with the maximum pooling, average pooling, and stochastic pooling methods respectively, while the remaining feature maps are pooled with the mixed pooling method; the merits of the different pooling strategies are evaluated by computing the mean intersection over union (mIoU) between the predicted value and the true value, and the pooling mode with the largest mIoU is taken as the selected pooling mode of the layer-1 feature map;
c. the pooling strategies of feature-map layers 2 to 2048 of the same input are optimized by the method of step b;
d. steps b and c are performed on all training set samples;
e. α_k, β_k, and γ_k for each feature-map layer under the different pooling strategies are obtained over the training set.
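Steps a–e amount to tallying, per feature-map layer, how often each pooling method wins on the training set; a minimal sketch (the function name and the string labels are hypothetical):

```python
from collections import Counter

def pooling_frequencies(winners):
    """winners: per-training-sample record of which pooling method
    ('max', 'avg' or 'sto') achieved the highest mIoU for one layer.

    Returns (alpha_k, beta_k, gamma_k) as selection frequencies."""
    n = len(winners)
    counts = Counter(winners)
    return (counts['max'] / n, counts['avg'] / n, counts['sto'] / n)
```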
7. The method for deep monitoring of urban remote sensing image changes based on a dynamic hybrid strategy as claimed in claim 6, wherein: in step S22, the merits of the different pooling strategies are evaluated by the mean intersection over union (mIoU), which is the sum over all categories of the ratio between the intersection and the union of the predicted and true values, divided by the number of categories:

mIoU = (1/k) · Σ_{i=1}^{k} TP_i / (TP_i + FP_i + FN_i)

where k is the number of non-empty classes, TP is the number of true positives, FP is the number of false positives, and FN is the number of false negatives.
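The mIoU criterion above can be computed as follows (a sketch; skipping classes absent from both prediction and ground truth is an assumption the claim does not make explicit):

```python
import numpy as np

def mean_iou(pred, true, num_classes):
    """mIoU = mean over non-empty classes of TP / (TP + FP + FN)."""
    ious = []
    for c in range(num_classes):
        tp = int(np.sum((pred == c) & (true == c)))
        fp = int(np.sum((pred == c) & (true != c)))
        fn = int(np.sum((pred != c) & (true == c)))
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both maps
            ious.append(tp / denom)
    return sum(ious) / len(ious)
```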
8. The method for deep monitoring of urban remote sensing image changes based on a dynamic hybrid strategy as claimed in claim 7, wherein:
in step S2, the DeepLabV3+ network model uses a pixel-wise cross-entropy loss (softmax loss) to handle the multi-class problem: each pixel is taken as a sample, and the cross-entropy between the predicted category and the true category of each pixel is computed; during multi-class classification, the softmax function converts the multiple outputs into probability values mapped into the interval (0, 1) for classification;
the output after softmax regression is:

p_{i,j} = e^{l_{i,j}} / Σ_{c=1}^{C} e^{l_{i,c}}

where e is the natural constant (≈ 2.718), p_{i,j} is the predicted probability that the i-th sample belongs to class j, l_{i,j} is the output of the neural network for the i-th sample on class j, and C is the number of input classes;
the above formula turns the outputs into a probability distribution on (0, 1); the distance between the predicted probability distribution and the probability distribution of the true value is then computed by the cross-entropy loss function:

Loss = -(1/N) · Σ_{i=1}^{N} Σ_{j=1}^{C} w · 1{j = Y(i)} · log(p_{i,j})

where N is the number of training set samples, C is the number of input classes, Y(i) is the class to which the i-th sample belongs, and w is the weight of the sample data.
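The softmax and pixel-wise cross-entropy described above can be sketched as follows (a minimal version with a scalar weight w, since the claim does not fully specify the weighting scheme; the max-subtraction is a standard numerical-stability trick, not part of the claim):

```python
import math
import numpy as np

def softmax(logits):
    """Row-wise softmax: p_ij = exp(l_ij) / sum_c exp(l_ic)."""
    e = np.exp(logits - logits.max(axis=1, keepdims=True))  # stabilised
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels, weight=1.0):
    """Mean weighted negative log-probability of the true class."""
    p = softmax(logits)
    n = logits.shape[0]
    return float(np.mean(-weight * np.log(p[np.arange(n), labels])))
```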
9. The method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy as claimed in claim 8, wherein: step S4 specifically includes:
S41, computing the intersection-over-union loss between the predicted values of the i-th category in the urban remote sensing images of the same area at times T1 and T2, which characterizes the magnitude of change of the i-th category in that area; the intersection-over-union loss is:

L_i = 1 − TP_i / (TP_i + FP_i + FN_i)

where L_i denotes the magnitude of change of the i-th area category (the larger L_i, the larger the change in the area), TP_i is the number of true positives of the i-th area category, FP_i is the number of false positives of the i-th area category, and FN_i is the number of false negatives of the i-th area category;
S42, stitching the segmented images together in order;
and S43, subtracting the stitched urban remote sensing images of times T1 and T2, eliminating the pixels whose area category is unchanged, obtaining the part that contains only the changed areas, marking it, and overlaying the marked image on the original image, so that the range of urban change is displayed on the remote sensing image.
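Steps S41–S43 can be sketched as follows (assuming the intersection-ratio loss is the standard 1 − IoU form, and that "subtracting" the two label maps means comparing per-pixel categories; all names are hypothetical):

```python
import numpy as np

def change_degree(tp_i, fp_i, fn_i):
    """L_i = 1 - TP_i / (TP_i + FP_i + FN_i); larger means more change."""
    return 1.0 - tp_i / (tp_i + fp_i + fn_i)

def change_mask(labels_t1, labels_t2):
    """Pixels whose region category differs between times T1 and T2."""
    return labels_t1 != labels_t2

def overlay_changes(image, mask, color=(255, 0, 0)):
    """Paint the changed pixels onto a copy of the original RGB image."""
    out = image.copy()
    out[mask] = color
    return out
```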
10. A system based on the urban remote sensing image change deep monitoring method of any one of claims 1-9, characterized by comprising the following modules:
a data set acquisition module: preprocessing the urban remote sensing image and labeling areas of different categories in the city to obtain a data set;
a training module: adopting a DeepLabV3+ network model that uses a dynamic mixed pooling strategy and Xception as its backbone network, and training the network with the data set;
a segmentation module: cropping remote sensing images of the same urban area taken at different times to the same proportion, inputting them into the trained network model, and segmenting the images;
a labeling module: after the regional classification result of the urban remote sensing image is obtained, calculating the degree of change of each regional category over a period of time and marking the changes on the image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211138291.1A CN115497006B (en) | 2022-09-19 | 2022-09-19 | Urban remote sensing image change depth monitoring method and system based on dynamic mixing strategy |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115497006A true CN115497006A (en) | 2022-12-20 |
CN115497006B CN115497006B (en) | 2023-08-01 |
Family
ID=84471406
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211138291.1A Active CN115497006B (en) | 2022-09-19 | 2022-09-19 | Urban remote sensing image change depth monitoring method and system based on dynamic mixing strategy |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115497006B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116343113A (en) * | 2023-03-09 | 2023-06-27 | 中国石油大学(华东) | Method and system for detecting oil spill based on polarized SAR characteristics and coding and decoding network |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016141282A1 (en) * | 2015-03-04 | 2016-09-09 | The Regents Of The University Of California | Convolutional neural network with tree pooling and tree feature map selection |
CN108053420A (en) * | 2018-01-05 | 2018-05-18 | 昆明理工大学 | A kind of dividing method based on the unrelated attribute dynamic scene of limited spatial and temporal resolution class |
CN112069831A (en) * | 2020-08-21 | 2020-12-11 | 三峡大学 | Unreal information detection method based on BERT model and enhanced hybrid neural network |
CN112233038A (en) * | 2020-10-23 | 2021-01-15 | 广东启迪图卫科技股份有限公司 | True image denoising method based on multi-scale fusion and edge enhancement |
CN112308402A (en) * | 2020-10-29 | 2021-02-02 | 复旦大学 | Power time series data abnormity detection method based on long and short term memory network |
CN114663759A (en) * | 2022-03-24 | 2022-06-24 | 东南大学 | Remote sensing image building extraction method based on improved deep LabV3+ |
Non-Patent Citations (2)
Title |
---|
DINGJUN YU ET AL.: "Mixed Pooling for Convolutional Neural Networks", ResearchGate, pages 1 - 13 * 
DUAN Zhongxing et al.: "Research on Obstacle Detection Algorithms for Blind Sidewalks Based on Deep Learning", Computer Measurement & Control, pages 27 - 32 *
Also Published As
Publication number | Publication date |
---|---|
CN115497006B (en) | 2023-08-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110136170B (en) | Remote sensing image building change detection method based on convolutional neural network | |
CN109033998B (en) | Remote sensing image ground object labeling method based on attention mechanism convolutional neural network | |
CN108509954A (en) | A kind of more car plate dynamic identifying methods of real-time traffic scene | |
CN110532961B (en) | Semantic traffic light detection method based on multi-scale attention mechanism network model | |
CN111738113B (en) | Road extraction method of high-resolution remote sensing image based on double-attention mechanism and semantic constraint | |
WO2023168781A1 (en) | Soil cadmium risk prediction method based on spatial-temporal interaction relationship | |
CN111191628B (en) | Remote sensing image earthquake damage building identification method based on decision tree and feature optimization | |
CN113344050B (en) | Lithology intelligent recognition method and system based on deep learning | |
CN110766690B (en) | Wheat ear detection and counting method based on deep learning point supervision thought | |
CN113449594A (en) | Multilayer network combined remote sensing image ground semantic segmentation and area calculation method | |
CN108898096B (en) | High-resolution image-oriented information rapid and accurate extraction method | |
CN114092697B (en) | Building facade semantic segmentation method with attention fused with global and local depth features | |
CN112232328A (en) | Remote sensing image building area extraction method and device based on convolutional neural network | |
CN104239890A (en) | Method for automatically extracting coastal land and earth cover information through GF-1 satellite | |
CN112949612A (en) | High-resolution remote sensing image coastal zone ground object classification method based on unmanned aerial vehicle | |
CN115497006B (en) | Urban remote sensing image change depth monitoring method and system based on dynamic mixing strategy | |
CN110956207B (en) | Method for detecting full-element change of optical remote sensing image | |
CN113128335A (en) | Method, system and application for detecting, classifying and discovering micro-body paleontological fossil image | |
CN111738052A (en) | Multi-feature fusion hyperspectral remote sensing ground object classification method based on deep learning | |
CN113033386B (en) | High-resolution remote sensing image-based transmission line channel hidden danger identification method and system | |
CN113420619A (en) | Remote sensing image building extraction method | |
CN112016845A (en) | DNN and CIM based regional economic benefit evaluation method and system | |
CN101876993A (en) | Method for extracting and retrieving textural features from ground digital nephograms | |
CN115984603A (en) | Fine classification method and system for urban green land based on GF-2 and open map data | |
CN113077438B (en) | Cell nucleus region extraction method and imaging method for multi-cell nucleus color image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||