CN115497006A - Urban remote sensing image change depth monitoring method and system based on dynamic hybrid strategy - Google Patents

Urban remote sensing image change depth monitoring method and system based on dynamic hybrid strategy

Info

Publication number
CN115497006A
CN115497006A
Authority
CN
China
Prior art keywords
remote sensing
pooling
sensing image
urban
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211138291.1A
Other languages
Chinese (zh)
Other versions
CN115497006B (en)
Inventor
Teng Xuyang (滕旭阳)
Lin Yukai (林煜凯)
Feng Jiayi (冯嘉旖)
Cai Lu (蔡璐)
Gao Yongsheng (高永盛)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202211138291.1A priority Critical patent/CN115497006B/en
Publication of CN115497006A publication Critical patent/CN115497006A/en
Application granted granted Critical
Publication of CN115497006B publication Critical patent/CN115497006B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Abstract

The invention discloses a method and system for urban remote sensing image change depth monitoring based on a dynamic hybrid strategy. The method comprises the following steps: S1, preprocessing urban remote sensing images and labeling different categories of urban areas to obtain a data set; S2, training a DeepLabV3+ network model that adopts a dynamic mixed pooling strategy and uses Xception as its backbone network on the data set of step S1; S3, cropping remote sensing images of the same urban area taken at different times to the same proportion, inputting them into the trained network model, and segmenting the images; S4, after the region classification results of the urban remote sensing images are obtained, computing the degree of change of each region category over a period of time and marking the changes on the image. The invention dynamically selects a different pooling mode for each layer of feature maps of the remote sensing image, better captures both the global and local information of the image, and improves segmentation accuracy.

Description

Urban remote sensing image change depth monitoring method and system based on dynamic hybrid strategy
Technical Field
The invention belongs to the technical field of satellite remote sensing image processing, and in particular relates to a method and system for monitoring the change depth of urban remote sensing images based on a dynamic hybrid strategy.
Background
Semantic segmentation of high-resolution remote sensing images is a basic task in the remote sensing field. Its main goal is to use a computer to analyze the color, spectral information, and spatial information of the various targets in an observed remote sensing image, select feature information, classify every pixel in the image, and delineate the region contours between target objects. Urban remote sensing images mainly contain buildings, roads, green vegetation cover, and so on. Accurate segmentation and change detection of urban remote sensing images can be used in urban management planning to analyze details such as seasonal changes of buildings and vegetation, disaster detection, and changes in vegetation distribution, thereby providing a basis and support for comprehensively grasping the urban layout and its real-time dynamic changes.
In recent decades, with the development of computer vision and artificial intelligence, the resolution of satellite remote sensing images has risen steadily and the capability to process them has improved greatly, which is of great significance for academic research and production practice. Compared with ordinary images, remote sensing images offer high precision, rich contextual information, a wide field of view, and real-time dynamic monitoring; they are applied at scale in many fields and show a trend toward increasingly refined use. Accurately extracting key information from remote sensing images is therefore essential. Traditional remote sensing image segmentation techniques, such as region-based segmentation, edge-detection segmentation, and shadow analysis, generally rely on hand-designed features and generalize poorly when segmenting complex scenes.
Deep learning algorithms based on edge detection can semantically segment high-altitude remote sensing images, extract valuable information from the images more effectively, and improve segmentation accuracy for small targets, while providing data support for urban planning, desertification monitoring, urban greening management, water area supervision, and so on. In practice, however, complex conditions such as varied terrain, mutual occlusion of target objects, and rich urban building types, together with factors such as illumination and cloud cover, greatly reduce the precision of object edge details and blur segmentation boundaries.
It is therefore important to design an urban remote sensing image change detection method based on a dynamic hybrid strategy that can reduce the influence of terrain and environmental factors and improve the accuracy of region edge segmentation.
Disclosure of Invention
The invention aims to remedy the above shortcomings and provides a method for monitoring the change depth of urban remote sensing images based on a DeepLabV3+ multi-scale network model combined with several mixed pooling strategies.
The invention adopts the following technical scheme:
the method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy comprises the following steps:
S1, preprocessing the urban remote sensing images and labeling the different categories of areas in the city to obtain a data set;
S2, training the network with the data set of step S1, based on a network model that adopts a dynamic mixed pooling strategy and uses Xception as its backbone network;
S3, cropping remote sensing images of the same urban area taken at different times to the same proportion, inputting them into the trained network model, and segmenting the images;
S4, after the region classification results of the urban remote sensing images are obtained, computing the degree of change of each region category over a period of time and marking the changes on the image.
Further, the original remote sensing image of the urban area adopts an RSSCN7DataSet remote sensing image data set.
Further, in step S1, the original remote sensing images are cropped and semantically segmented as data preprocessing: the original remote sensing images are cropped to 256 × 256, and the cropped images are labeled with the Labelme semantic segmentation labeling tool into five categories: roads, buildings, water areas, green vegetation, and open space.
Furthermore, the Labelme technology is an open source image labeling tool and is mainly used for data set labeling work of instance segmentation, semantic segmentation, target detection and classification tasks.
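The preprocessing step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation; the class names and index order are illustrative only.

```python
import numpy as np

# The five region categories labeled with Labelme (index order is hypothetical).
CLASSES = ("road", "building", "water", "green", "open_space")

def tile_image(image: np.ndarray, tile: int = 256) -> list[np.ndarray]:
    """Cut an (H, W, C) remote sensing image into non-overlapping tile x tile patches.

    Rows and columns that do not fill a whole tile are dropped for simplicity;
    padding or overlapped tiling are common alternatives.
    """
    h, w = image.shape[:2]
    return [image[y:y + tile, x:x + tile]
            for y in range(0, h - tile + 1, tile)
            for x in range(0, w - tile + 1, tile)]
```

A 512 × 640 scene, for example, yields four full 256 × 256 tiles, with the 128-pixel remainder column discarded.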
Furthermore, because the mark characteristics of each region category in the urban area are generally obvious, the remote sensing image after preprocessing is directly segmented and labeled by using different colors, and then a training data set is selected by adopting a random selection mode.
Further, in step S2, the data set is trained with the mixed pooling DeepLabV3+ network model. The DeepLabV3+ network model fuses multi-scale information; the model structure is divided into an encoding layer and a decoding layer, and Xception is introduced as the backbone network. The output of the backbone network, which applies cascaded atrous (dilated) convolutions, is split into two branches: one is passed directly into the decoding layer, and the other goes through the ASPP module. Specifically:
S21. In the encoding layer, feature information of the monitored urban remote sensing image is extracted by the backbone network, and the image size becomes 1/4, 1/8, and 1/16 of the original size in turn.
S22. The 60 × 60 feature map with 2048 channels obtained from the backbone network enters the atrous spatial pyramid pooling (ASPP) module. A standard ASPP module consists of one 1 × 1 convolution, three 3 × 3 dilated convolution layers with dilation rates of 6, 12, and 18, and a global average pooling layer. To further improve the edge segmentation accuracy of the network model, the global average pooling in the ASPP module is replaced, on top of this structure, by the mixed pooling strategy adopted herein: the input feature map is mixed-pooled into a 20 × 20 feature map, channel-compressed by a 1 × 1 convolution, and finally restored to the height and width of the input feature map by deconvolution; the results obtained by all branches are then concatenated and fused.
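The shape bookkeeping of this branch (60 × 60 feature map with 2048 channels, pooled to 20 × 20, channel-compressed by a 1 × 1 convolution, restored to the input height and width) can be sketched as follows. Average pooling stands in for the full mixed strategy and nearest-neighbour upsampling stands in for deconvolution, so this is a dimensional sketch only, not the patent's network.

```python
import numpy as np

def mixed_pool_branch(feat: np.ndarray, w1x1: np.ndarray) -> np.ndarray:
    """feat: (H, W, C_in); w1x1: (C_in, C_out). Returns (H, W, C_out)."""
    h, w, c = feat.shape
    s = 3                                             # 60 // 20 = 3: pooling stride
    # pool each non-overlapping 3x3 window (average pooling as a stand-in)
    pooled = feat.reshape(h // s, s, w // s, s, c).mean(axis=(1, 3))  # (20, 20, C_in)
    compressed = pooled @ w1x1                        # 1x1 convolution == per-pixel matmul
    # nearest-neighbour upsampling back to the input height and width
    return compressed.repeat(s, axis=0).repeat(s, axis=1)             # (60, 60, C_out)

feat = np.random.rand(60, 60, 2048).astype(np.float32)
w1x1 = np.random.rand(2048, 256).astype(np.float32)   # hypothetical compression width
out = mixed_pool_branch(feat, w1x1)
```

The 256-channel output width is an assumption for illustration; the patent only specifies the spatial sizes.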
S23. At the decoding end, the low-level features output by the corresponding level of the backbone network are merged with the fused result from the encoding end; after 3 × 3 convolution and upsampling, the prediction is restored to the resolution of the input image, yielding the classification result image of the urban remote sensing image and ultimately improving the segmentation accuracy of the network model.
S24. The DeepLabV3+ network model with the dynamic mixed pooling strategy is trained on the RSSCN7 DataSet remote sensing image data set, with the remaining samples reserved as the test set for evaluating the network model.
Further, the mixed pooling strategy in step S22 optimizes the pooling choice for each of the 2048 channels of the feature map:
The frequency $\alpha_k$ of selecting max pooling for the k-th channel, the frequency $\beta_k$ of selecting average pooling, and the frequency $\gamma_k$ of selecting stochastic pooling are:

$$\alpha_k = \frac{i_{\mathrm{max}}}{i_{\mathrm{total}}}, \qquad \beta_k = \frac{i_{\mathrm{avg}}}{i_{\mathrm{total}}}, \qquad \gamma_k = \frac{i_{\mathrm{sto}}}{i_{\mathrm{total}}}$$

where $i_{\mathrm{max}}$ is the number of times max pooling is selected for the k-th channel, $i_{\mathrm{avg}}$ the number of times average pooling is selected, $i_{\mathrm{sto}}$ the number of times stochastic pooling is selected, and $i_{\mathrm{total}}$ the size of the training set used to optimize the k-th channel.
Finally, the mixed pooling output $\mathrm{output}_k$ of the k-th channel of the feature map in the network model is:

$$\mathrm{output}_k = \alpha_k\, x_{k\_max} + \beta_k\, x_{k\_avg} + \gamma_k\, x_{k\_sto}$$

where $x_{k\_max}$ is the result of applying max pooling to the k-th channel, $x_{k\_avg}$ the result of average pooling, and $x_{k\_sto}$ the result of stochastic pooling.
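A sketch of mixed pooling for one feature-map channel under these definitions. The 3 × 3 window size and the stochastic-pooling sampling rule (one activation per window, probability proportional to its non-negative magnitude) are common choices assumed here, not stated in the patent.

```python
import numpy as np

def _windows(x: np.ndarray, k: int) -> np.ndarray:
    """(H, W) -> (H//k, W//k, k*k) non-overlapping pooling windows."""
    h, w = x.shape
    return x.reshape(h // k, k, w // k, k).swapaxes(1, 2).reshape(h // k, w // k, k * k)

def mixed_pool(x: np.ndarray, alpha: float, beta: float, gamma: float,
               k: int = 3, rng=None) -> np.ndarray:
    """output_k = alpha * max-pool + beta * avg-pool + gamma * stochastic-pool."""
    rng = rng or np.random.default_rng(0)
    win = _windows(x, k)
    x_max = win.max(-1)
    x_avg = win.mean(-1)
    # stochastic pooling: sample one activation per window, probability
    # proportional to its (clipped non-negative) magnitude
    p = np.clip(win, 0.0, None) + 1e-12
    p = p / p.sum(-1, keepdims=True)
    cum = p.cumsum(-1)
    u = rng.random(cum.shape[:2] + (1,))
    idx = (u > cum).sum(-1)                          # inverse-CDF sampled index
    x_sto = np.take_along_axis(win, idx[..., None], -1)[..., 0]
    return alpha * x_max + beta * x_avg + gamma * x_sto
```

With alpha = 1 the result reduces to plain max pooling, which gives a quick sanity check.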
Further, the step of improving the global average pooling in the ASPP module to dynamic hybrid pooling comprises:
a. Initialization: before the DeepLabV3+ network model is trained, the initial weights $\alpha_k$, $\beta_k$ and $\gamma_k$ of every pooling strategy for each channel of the feature map are all set to

$$\alpha_k = \beta_k = \gamma_k = \frac{1}{3}$$
b. Pool the 1st channel of the feature map with the max pooling, average pooling, and stochastic pooling methods respectively, while pooling the remaining channels with the mixed pooling method; evaluate the different pooling strategies by computing the mean intersection over union (mIoU) between prediction and ground truth, and take the pooling mode with the largest mIoU as the selected pooling mode for the 1st channel;
c. Optimize the pooling strategies of channels 2 through 2048 of the same input by the method of step b;
d. Perform the operations of steps b and c on all training set samples;
e. Obtain the $\alpha_k$, $\beta_k$ and $\gamma_k$ of each channel of the feature map under the different pooling strategies over the training set.
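Steps a to e reduce to counting, per channel, how often each pooling mode wins over the training set and normalizing the counts into the frequencies. A sketch, with the per-sample mIoU evaluation abstracted into a list of winning modes:

```python
from collections import Counter

MODES = ("max", "avg", "sto")

def pooling_frequencies(best_mode_per_sample: list[str]) -> tuple[float, float, float]:
    """Given the pooling mode with the highest mIoU for each training sample
    (for one feature-map channel), return (alpha_k, beta_k, gamma_k)."""
    i_total = len(best_mode_per_sample)
    if i_total == 0:                   # initialization before training: 1/3 each
        return (1 / 3, 1 / 3, 1 / 3)
    counts = Counter(best_mode_per_sample)
    return tuple(counts[m] / i_total for m in MODES)
```

By construction the three frequencies always sum to 1, matching the weighted combination in the mixed pooling output.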
Further, in step S22, the mean intersection over union (mIoU) is used to evaluate the different pooling strategies; mIoU sums the intersection-over-union ratio between prediction and ground truth over the classes and divides by the number of classes:

$$mIoU = \frac{1}{k}\sum_{i=1}^{k}\frac{TP_i}{TP_i + FP_i + FN_i}$$

where k is the number of non-empty classes, $TP_i$ the number of true positives, $FP_i$ the number of false positives, and $FN_i$ the number of false negatives for class i.
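The mIoU formula can be computed directly from per-class confusion counts, for example:

```python
import numpy as np

def mean_iou(pred: np.ndarray, truth: np.ndarray, num_classes: int) -> float:
    """mIoU = mean over classes of TP / (TP + FP + FN); empty classes are skipped."""
    ious = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (truth == c))
        fp = np.sum((pred == c) & (truth != c))
        fn = np.sum((pred != c) & (truth == c))
        if tp + fp + fn > 0:           # only non-empty classes contribute
            ious.append(tp / (tp + fp + fn))
    return float(np.mean(ious))
```

For `pred = [0, 0, 1, 1]` against `truth = [0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so mIoU is 7/12.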
Further, in step S2, the DeepLabV3+ network model uses a pixel-wise cross-entropy loss function (softmax loss) for the multi-class problem: each pixel is taken as a sample, and the cross-entropy between its predicted class and its true class is computed. In the multi-class setting, the softmax function converts the multiple outputs into probability values mapped into the interval (0, 1) for classification.
The output after softmax regression is:

$$p_{i,j} = \frac{e^{l_{i,j}}}{\sum_{j=1}^{C} e^{l_{i,j}}}$$

where e is the natural constant (approximately 2.718), $p_{i,j}$ is the predicted probability that the i-th sample belongs to class j, $l_{i,j}$ is the output of the neural network for the i-th sample on class j, and C is the number of input classes.
The above equation turns the outputs into a probability distribution over (0, 1); the distance between the predicted probability distribution and the true distribution is then computed with the cross-entropy loss. Specifically:

$$Loss = -\frac{1}{N}\sum_{i=1}^{N} w \log p_{i,Y(i)}$$

where N is the number of training set samples, C is the number of input classes, Y(i) is the class to which the i-th sample belongs, and w is the weight of the sample data.
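A sketch of the pixel-wise softmax cross-entropy described above, with pixels flattened into N samples and a single scalar weight w as in the formula (per-class weights are a common variant not shown here):

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Row-wise softmax: (N, C) logits -> (N, C) probabilities in (0, 1)."""
    z = logits - logits.max(axis=1, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits: np.ndarray, labels: np.ndarray, w: float = 1.0) -> float:
    """Loss = -(w / N) * sum_i log p_{i, Y(i)}; each pixel is one sample."""
    p = softmax(logits)
    n = logits.shape[0]
    return float(-w * np.log(p[np.arange(n), labels] + 1e-12).sum() / n)
```

A confident, correct prediction drives the loss toward zero, which is the sanity check used below.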
Further, in step S3, the remote sensing images of the same city area at different times are cut in the same proportion and input into the trained model to obtain a segmented image.
Further, step S4 specifically includes:
S41. Compute the intersection-over-union loss between the predictions for the i-th category in urban remote sensing images of the same area at times T1 and T2; it represents the amount of change of the i-th category in the region:

$$L_i = 1 - \frac{TP_i}{TP_i + FP_i + FN_i}$$

where $L_i$ denotes the degree of change of the i-th region class (the larger $L_i$, the larger the change of the region), $TP_i$ is the number of true positives of the i-th region class, $FP_i$ the number of false positives, and $FN_i$ the number of false negatives.
S42. Stitch the segmented images back together in order.
S43. Subtract the stitched urban remote sensing images of times T1 and T2, remove the pixels whose region category is unchanged to obtain only the changed regions, mark them, and overlay the marked image on the original image so that the extent of urban change is displayed on the remote sensing image.
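Steps S41 to S43 can be sketched as follows: `change_degree` implements the per-class change loss $L_i$ between the two epochs, and `change_mask` is a minimal stand-in for the image differencing of S43; both operate on class-index segmentation maps and are illustrative, not the patent's exact procedure.

```python
import numpy as np

def change_degree(seg_t1: np.ndarray, seg_t2: np.ndarray, cls: int) -> float:
    """L_i = 1 - TP_i / (TP_i + FP_i + FN_i) between the two epochs for class cls."""
    a, b = seg_t1 == cls, seg_t2 == cls
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0                     # class absent at both epochs: no change
    return 1.0 - np.logical_and(a, b).sum() / union

def change_mask(seg_t1: np.ndarray, seg_t2: np.ndarray) -> np.ndarray:
    """Pixels whose region class differs between T1 and T2 (the S43 difference map)."""
    return seg_t1 != seg_t2
```

A large $L_i$ flags classes whose footprint shifted between the two acquisition dates; the boolean mask can then be overlaid on the original image for display.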
The invention also discloses a system based on the method for monitoring the change depth of the urban remote sensing image, which comprises the following modules:
a dataset acquisition module: preprocessing the urban remote sensing image, and labeling urban areas of different categories to obtain a data set;
a training module: training the network with the data set, using a DeepLabV3+ network model that adopts a dynamic mixed pooling strategy with Xception as the backbone network;
a segmentation module: cutting remote sensing images of the same city area at different times in the same proportion, inputting the remote sensing images into a trained model, and segmenting the images;
a labeling module: after the region classification results of the urban remote sensing images are obtained, computing the degree of change of each region category over a period of time and marking the changes on the image.
Compared with the prior art, the invention has the beneficial effects that:
1. the invention dynamically selects different pooling modes for pooling the characteristic diagrams of each layer of the remote sensing image, can better grasp the global information and the local information of the image and improves the segmentation precision.
2. The invention takes random pooling as one of the pooling strategies, effectively reduces the risk of overfitting and improves the generalization capability of the model.
Drawings
FIG. 1 is a flow chart of the method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy.
FIG. 2 is a flow diagram of a hybrid pooling process.
FIG. 3 is a DeepLabV3+ network model diagram.
FIG. 4 is a block diagram of the system for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy.
Detailed Description
The preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
As shown in FIGS. 1 to 3, the method of this embodiment for monitoring the change depth of urban remote sensing images based on the dynamic hybrid strategy proceeds according to the following steps:
S1, preprocessing the images and labeling the different categories of regions in the city, based on the RSSCN7 DataSet remote sensing images released by Wuhan University in 2015, to obtain a data set;
S2, training the network with the data set, using a DeepLabV3+ network model that adopts a dynamic mixed pooling strategy with Xception as the backbone network;
S3, cropping remote sensing images of the same urban area taken at different times to the same proportion, inputting them into the trained model, and segmenting the images;
S4, after the region classification results of the urban remote sensing images are obtained, computing the degree of change of each region category over a period of time and marking the changes on the image.
Specifically, in this embodiment, the original remote sensing image of the urban area adopts RSSCN7DataSet remote sensing image data set.
In this embodiment, in step S1, the original remote sensing images are cropped and semantically segmented as data preprocessing: they are cropped to 256 × 256, and the cropped images are labeled with the Labelme semantic segmentation labeling tool into five categories: roads, buildings, water areas, green vegetation, and open space.
Furthermore, the Labelme technology is an open source image labeling tool and is mainly used for data set labeling work of example segmentation, semantic segmentation, target detection and classification tasks.
Since the mark characteristics of each region category in urban areas are generally significant, the remote sensing image after preprocessing is directly segmented and labeled by using different colors in the embodiment, and then a training data set is selected by adopting a random selection mode.
In S2 of this embodiment, the data set is trained with the mixed pooling DeepLabV3+ network model. The DeepLabV3+ network model fuses multi-scale information; the model structure is divided into an encoding layer and a decoding layer, and Xception is introduced as the backbone network. The output of the backbone network, which applies cascaded atrous (dilated) convolutions, is split into two branches: one is passed directly into the decoding layer, and the other goes through the ASPP module. Specifically:
S21. In the encoding layer, feature information is extracted from the monitored urban remote sensing image by the backbone network, and the image size becomes 1/4, 1/8, and 1/16 of the original size in turn.
S22. The 60 × 60 feature map with 2048 channels obtained from the backbone network enters the atrous spatial pyramid pooling (ASPP) module. A standard ASPP module consists of one 1 × 1 convolution, three 3 × 3 dilated convolution layers with dilation rates of 6, 12, and 18, and a global average pooling layer. To improve the edge segmentation accuracy of the network, the global average pooling in the ASPP module is replaced, on top of this structure, by the mixed pooling strategy adopted herein: the input feature map is mixed-pooled into a 20 × 20 feature map, channel-compressed by a 1 × 1 convolution, and finally restored to the height and width of the input feature map by deconvolution; the results obtained by all branches are then concatenated and fused.
S23. At the decoding end, the low-level features output by the corresponding level of the backbone network are merged with the fused result from the encoding end; after 3 × 3 convolution and upsampling, the prediction is restored to the resolution of the input image, yielding the classification result image of the urban remote sensing image and ultimately improving the segmentation accuracy of the network.
S24. To further improve the edge segmentation accuracy of the network, in the DeepLabV3+ network structure the global average pooling in ASPP is replaced by the mixed pooling strategy adopted herein. The RSSCN7 DataSet remote sensing image data set is used for training, with the remaining samples reserved as the test set for evaluating the network.
In this embodiment, the mixed pooling strategy in step S22 optimizes the pooling choice for each of the 2048 channels of the feature map:
The frequency $\alpha_k$ of selecting max pooling for the k-th channel, the frequency $\beta_k$ of selecting average pooling, and the frequency $\gamma_k$ of selecting stochastic pooling are:

$$\alpha_k = \frac{i_{\mathrm{max}}}{i_{\mathrm{total}}}, \qquad \beta_k = \frac{i_{\mathrm{avg}}}{i_{\mathrm{total}}}, \qquad \gamma_k = \frac{i_{\mathrm{sto}}}{i_{\mathrm{total}}}$$

where $i_{\mathrm{max}}$ is the number of times max pooling is selected for the k-th channel, $i_{\mathrm{avg}}$ the number of times average pooling is selected, $i_{\mathrm{sto}}$ the number of times stochastic pooling is selected, and $i_{\mathrm{total}}$ the size of the training set used to optimize the k-th channel.
The mixed pooling output $\mathrm{output}_k$ of the k-th channel of the feature map in the final model is:

$$\mathrm{output}_k = \alpha_k\, x_{k\_max} + \beta_k\, x_{k\_avg} + \gamma_k\, x_{k\_sto}$$

where $x_{k\_max}$ is the result of applying max pooling to the k-th channel, $x_{k\_avg}$ the result of average pooling, and $x_{k\_sto}$ the result of stochastic pooling.
The step of improving the global average pooling in the ASPP module to dynamic hybrid pooling comprises:
a. Initialization: before the DeepLabV3+ network model is trained, the initial weights $\alpha_k$, $\beta_k$ and $\gamma_k$ of every pooling strategy for each channel of the feature map are all set to

$$\alpha_k = \beta_k = \gamma_k = \frac{1}{3}$$
b. Pool the 1st channel of the feature map with the max pooling, average pooling, and stochastic pooling methods respectively, while pooling the remaining channels with the mixed pooling method; evaluate the different pooling strategies by computing the mean intersection over union (mIoU) between prediction and ground truth, and take the pooling mode with the largest mIoU as the selected pooling mode for the 1st channel;
c. Optimize the pooling strategies of channels 2 through 2048 of the same input by the method of step b;
d. Perform the operations of steps b and c on all training set samples;
e. Obtain the $\alpha_k$, $\beta_k$ and $\gamma_k$ of each channel of the feature map under the different pooling strategies over the training set.
In step S22 of this embodiment, the mean intersection over union (mIoU) is used to evaluate the different pooling strategies; mIoU sums the intersection-over-union ratio between prediction and ground truth over the classes and divides by the number of classes:

$$mIoU = \frac{1}{k}\sum_{i=1}^{k}\frac{TP_i}{TP_i + FP_i + FN_i}$$

where k is the number of non-empty classes, $TP_i$ the number of true positives, $FP_i$ the number of false positives, and $FN_i$ the number of false negatives for class i.
In step S2 of this embodiment, the DeepLabV3+ network model uses a pixel-wise cross-entropy loss function for the multi-class problem: each pixel is taken as a sample, and the cross-entropy between its predicted class and its true class is computed. Softmax converts the outputs into probability values mapped into the interval (0, 1) for classification in the multi-class setting.
The output after softmax regression is:

$$p_{i,j} = \frac{e^{l_{i,j}}}{\sum_{j=1}^{C} e^{l_{i,j}}}$$

where $p_{i,j}$ is the predicted probability that the i-th sample belongs to class j and $l_{i,j}$ is the network output of the i-th sample on class j. The above equation turns the outputs into a probability distribution over (0, 1), and the distance between the predicted probability distribution and the true distribution is computed with the cross-entropy loss. Specifically:
$$Loss = -\frac{1}{N}\sum_{i=1}^{N} w \log p_{i,Y(i)}$$

where N is the number of training set samples, C is the number of input classes, Y(i) is the class to which the i-th sample belongs, and w is the weight of the sample data.
In step S3 of this embodiment, remote sensing images in the same city area at different times are cut in the same proportion, and input into the trained model to obtain a segmented image.
Step S4 of this embodiment specifically includes:
S41. Compute the intersection-over-union loss between the predictions for the i-th category in urban remote sensing images of the same area at times T1 and T2; it represents the amount of change of the i-th category in the region:

$$L_i = 1 - \frac{TP_i}{TP_i + FP_i + FN_i}$$

where $L_i$ denotes the degree of change of the i-th region class (the larger $L_i$, the larger the change of the region), $TP_i$ is the number of true positives of the i-th region class, $FP_i$ the number of false positives, and $FN_i$ the number of false negatives.
S42. Stitch the segmented images back together in order.
S43. Subtract the stitched urban remote sensing images of times T1 and T2, remove the pixels whose region category is unchanged to obtain only the changed regions, mark them, and overlay the marked image on the original image so that the extent of urban change is displayed on the remote sensing image.
As shown in FIG. 4, this embodiment discloses a system for monitoring the change depth of urban remote sensing images based on the above embodiments, which comprises the following modules:
a dataset acquisition module: preprocessing the urban remote sensing image, and labeling urban areas of different categories to obtain a data set;
a training module: training the network model with the data set, using a DeepLabV3+ network model that adopts a dynamic mixed pooling strategy with Xception as the backbone network;
a segmentation module: cutting remote sensing images of the same city area at different times in the same proportion, inputting the remote sensing images into a trained model, and segmenting the images;
a labeling module: after the region classification results of the urban remote sensing images are obtained, computing the degree of change of each region category over a period of time and marking the changes on the image.
The invention dynamically selects different pooling modes for pooling the characteristic diagrams of each layer of the remote sensing image, can better grasp the global information and the local information of the image and improves the segmentation precision. The invention takes random pooling as one of the pooling strategies, effectively reduces the risk of overfitting and improves the generalization capability of the model.

Claims (10)

1. The method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy is characterized by comprising the following steps of:
s1, preprocessing a city remote sensing image, and labeling areas of different categories in a city to obtain a data set;
s2, training the network by using the data set in the step S1 based on a network model adopting a dynamic mixed pooling strategy and taking Xconcept as a backbone network;
s3, cutting the remote sensing images of the same city area at different times in the same proportion, inputting them into the trained model, and segmenting the images;
and S4, after the regional classification result of the urban remote sensing image is obtained, calculating the change degree of each regional category within a period of time and marking the change on the image.
2. The method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy as claimed in claim 1, wherein: in the step S1, the remote sensing images adopt the RSSCN7DataSet remote sensing image data set; or, in the step S1, the urban remote sensing image is cropped and semantically segmented as data preprocessing, the images are cropped to a size of 256 × 256, and the cropped images are annotated with a semantic segmentation labeling tool, dividing each image into five categories: roads, buildings, water areas, green plants and open spaces.
3. The method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy as claimed in claim 2, wherein: in the step S1, the preprocessed remote sensing image is segmented and labeled by adopting different colors, and then a training data set is selected in a random selection mode.
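A minimal sketch of the 256 × 256 cropping step described in claims 2-3 (the function name `tile_image` and the choices of non-overlapping crops with partial border tiles discarded are assumptions for illustration):

```python
import numpy as np

def tile_image(img, size=256):
    """Crop an image array (H, W, C) into non-overlapping size x size
    tiles, discarding any partial tiles at the right/bottom borders."""
    h, w = img.shape[:2]
    return [img[i:i + size, j:j + size]
            for i in range(0, h - size + 1, size)
            for j in range(0, w - size + 1, size)]

# A 600 x 520 image yields a 2 x 2 grid of full 256 x 256 tiles.
img = np.zeros((600, 520, 3), dtype=np.uint8)
tiles = tile_image(img)
print(len(tiles), tiles[0].shape)  # 4 (256, 256, 3)
```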
4. The method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy as claimed in claim 2 or 3, wherein: step S2 is specifically as follows:
s21, at the coding layer, extracting feature information of the monitored urban remote sensing image through the backbone network, successively reducing the image to 1/4, 1/8 and 1/16 of its original size;
s22, feeding the 60 × 2048 feature maps obtained from the backbone network into the Atrous Spatial Pyramid Pooling (ASPP) module, wherein the ASPP module consists of a 1 × 1 convolution, three 3 × 3 dilated convolution layers with dilation rates of 6, 12 and 18 respectively, and a global average pooling layer; on this basis, the global average pooling in the ASPP module is replaced by the mixed pooling strategy: the input feature maps are mixed-pooled to obtain feature maps of size 20, compressed along the channel dimension by a 1 × 1 convolution, restored to the height and width of the input feature maps by deconvolution, and the results of all branches are concatenated and fused;
s23, at the decoding layer, fusing the low-level features output by the corresponding level of the backbone network with the output of the coding layer; after a 3 × 3 convolution and upsampling, the prediction is restored to the resolution of the input image, yielding the classified result map of the urban remote sensing image;
s24, training the DeepLabV3+ network model adopting the dynamic mixed pooling strategy with the RSSCN7DataSet remote sensing image data set, and taking the remaining samples as the test data set for testing the network model.
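For intuition about the ASPP branches in step S22: a 3 × 3 dilated (atrous) convolution with rate r samples its taps r pixels apart, giving a (2r+1)-pixel receptive field — 13, 25 and 37 pixels for rates 6, 12 and 18. A minimal single-channel NumPy sketch with 'same' padding (illustrative only, not the patented implementation):

```python
import numpy as np

def dilated_conv3x3(x, k, rate):
    """'Same'-padded 2D convolution of a single-channel map x with a
    3x3 kernel k whose taps are spaced `rate` pixels apart."""
    pad = rate  # a 3x3 kernel dilated by `rate` reaches `rate` px out
    xp = np.pad(x, pad)
    h, w = x.shape
    out = np.zeros((h, w))
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            out += k[di + 1, dj + 1] * xp[pad + di * rate: pad + di * rate + h,
                                          pad + dj * rate: pad + dj * rate + w]
    return out

# Sanity check: an identity kernel reproduces the input at any rate.
identity = np.zeros((3, 3))
identity[1, 1] = 1.0
x = np.arange(64, dtype=float).reshape(8, 8)
print(np.allclose(dilated_conv3x3(x, identity, 6), x))  # True
```

Because all three rates share the same 3 × 3 kernel size, the ASPP branches cost the same as ordinary convolutions while covering very different context scales.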
5. The method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy as claimed in claim 4, wherein: in step S22, the 2048 layers of feature maps are optimized by the mixed pooling strategy as follows:
the frequency α_k with which maximum pooling is selected for the k-th layer feature map, the frequency β_k with which average pooling is selected, and the frequency γ_k with which stochastic pooling is selected are:
α_k = i_max / i_total,  β_k = i_avg / i_total,  γ_k = i_sto / i_total
wherein i_max is the number of times the maximum pooling method is selected for the k-th layer feature map, i_avg is the number of times the average pooling method is selected, i_sto is the number of times the stochastic pooling method is selected, and i_total is the size of the training set on which the k-th layer feature map is optimized;
the output output_k of the k-th layer feature map after mixed pooling in the final model is:
output_k = α_k · x_k_max + β_k · x_k_avg + γ_k · x_k_sto
wherein x_k_max denotes the result of applying the maximum pooling method to the k-th layer feature map, x_k_avg denotes the result of applying the average pooling method, and x_k_sto denotes the result of applying the stochastic pooling method.
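Under the reading of the formulas above (α_k, β_k, γ_k as selection frequencies normalized by i_total, and output_k as their weighted combination), the computation reduces to a few lines. The helper names below are illustrative, not from the patent:

```python
def mixed_pool_weights(i_max, i_avg, i_sto):
    """Normalize the per-layer selection counts of the three pooling
    modes into the mixing weights (alpha_k, beta_k, gamma_k)."""
    i_total = i_max + i_avg + i_sto
    return i_max / i_total, i_avg / i_total, i_sto / i_total

def mixed_pool_output(alpha, beta, gamma, x_max, x_avg, x_sto):
    """Weighted combination of the three pooling results for layer k."""
    return alpha * x_max + beta * x_avg + gamma * x_sto

# Example: max pooling won 6 of 10 samples, average 3, stochastic 1.
a, b, g = mixed_pool_weights(6, 3, 1)        # 0.6, 0.3, 0.1
print(mixed_pool_output(a, b, g, 4.0, 2.5, 3.0))  # 3.45
```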
6. The method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy as claimed in claim 5, wherein: the method improves the global average pooling in the ASPP module into dynamic mixed pooling, with the following specific steps:
a. Initialization: before training the DeepLabV3+ network model, the weights α_k, β_k and γ_k of each pooling strategy for each layer's feature map are all initialized to
α_k = β_k = γ_k = 1/3;
b. The layer-1 feature map is pooled with the maximum pooling, average pooling and stochastic pooling methods respectively, while the remaining feature maps are pooled with the mixed pooling method; the merits of the different pooling strategies are evaluated by computing the mean intersection over union (mIoU) between the predicted values and the true values, and the pooling mode with the largest mIoU value is taken as the selected pooling mode for the layer-1 feature map;
c. The pooling strategies of feature-map layers 2 to 2048 of the same input are optimized by the method of step b;
d. Steps b and c are performed on all training set samples;
e. The α_k, β_k and γ_k of each layer's feature map under the different pooling strategies are thereby obtained over the training set.
7. The method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy as claimed in claim 6, wherein: in step S22, the quality of the different pooling strategies is evaluated with the mean intersection over union mIoU, which is the sum over all classes of the ratio of the intersection to the union of the predicted and true values, divided by the number of classes:
mIoU = (1/k) · Σ_{i=1}^{k} TP_i / (TP_i + FP_i + FN_i)
wherein k denotes the number of non-empty classes, TP_i denotes the number of true positives of class i, FP_i the number of false positives, and FN_i the number of false negatives.
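The mIoU formula of this claim can be checked with a short sketch; `miou`, which takes per-class (TP, FP, FN) counts, is an illustrative helper, not part of the claim:

```python
def miou(per_class_counts):
    """Mean IoU: average of TP/(TP+FP+FN) over the k non-empty classes.

    per_class_counts: iterable of (TP, FP, FN) triples, one per class.
    """
    ious = [tp / (tp + fp + fn) for tp, fp, fn in per_class_counts]
    return sum(ious) / len(ious)

# Three classes with IoUs 0.5, 0.75 and 1/3 -> mIoU = 19/36 ~ 0.528
print(miou([(50, 10, 40), (30, 0, 10), (20, 20, 20)]))
```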
8. The method for monitoring the depth of change of the urban remote sensing image based on the dynamic mixing strategy as claimed in claim 7, wherein:
in the step S2, the DeepLabV3+ network model uses a pixel-wise cross-entropy loss function (softmax loss) to handle the multi-class problem: each pixel is taken as a sample, and the cross-entropy between the predicted class and the true class of each pixel is computed; in the multi-class case, the softmax function maps the multiple outputs to probability values in the interval (0, 1) for classification;
the output after softmax regression is:
p_{i,j} = e^{l_{i,j}} / Σ_{c=1}^{C} e^{l_{i,c}}
wherein e is the natural constant (≈2.718), p_{i,j} is the predicted probability that the i-th sample belongs to the j-th class, l_{i,j} is the output of the neural network for the i-th sample on the j-th class, and C is the number of original input classes;
the above formula turns the outputs into a probability distribution over (0, 1), and the distance between the predicted probability distribution and the true probability distribution is then measured by the cross-entropy loss function:
Loss = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{C} w · 1{Y(i)=j} · log p_{i,j}
wherein N is the number of training set samples, C is the number of original input classes, Y(i) is the class to which the i-th sample belongs, 1{·} is the indicator function, and w is the weight of the sample data.
9. The method for monitoring the change depth of the urban remote sensing image based on the dynamic mixing strategy as claimed in claim 8, wherein: step S4 specifically includes:
s41, computing the intersection-over-union loss between the predicted values of the i-th category in the urban remote sensing images of the same area at times T1 and T2, which characterizes the magnitude of change of the i-th category in that area:
L_i = 1 - TP_i / (TP_i + FP_i + FN_i)
wherein L_i denotes the magnitude of change of the i-th area category (the larger L_i, the greater the change of the area), TP_i denotes the number of true positives of the i-th area category, FP_i the number of false positives, and FN_i the number of false negatives;
s42, splicing the segmented images in sequence;
and S43, subtracting the urban remote sensing images spliced at the moment of T1 and T2, eliminating pixel points of the same region category, obtaining a part only containing a change region, marking, and overlapping the marked image with the original image so as to display the urban change range on the remote sensing images.
10. The system based on the urban remote sensing image change depth monitoring method of any one of claims 1-9 is characterized by comprising the following modules:
a dataset acquisition module: preprocessing the urban remote sensing image, and labeling urban areas of different categories to obtain a data set;
a training module: adopting a DeepLabV3+ network model with a dynamic mixed pooling strategy and Xception as the backbone network, and training the network using the data set;
a segmentation module: cutting the remote sensing images of the same city area at different times in the same proportion, inputting them into the trained network model, and segmenting the images;
a labeling module: after the regional classification result of the urban remote sensing image is obtained, calculating the degree of change of each regional category within a period of time and marking the changes on the image.
CN202211138291.1A 2022-09-19 2022-09-19 Urban remote sensing image change depth monitoring method and system based on dynamic mixing strategy Active CN115497006B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211138291.1A CN115497006B (en) 2022-09-19 2022-09-19 Urban remote sensing image change depth monitoring method and system based on dynamic mixing strategy

Publications (2)

Publication Number Publication Date
CN115497006A true CN115497006A (en) 2022-12-20
CN115497006B CN115497006B (en) 2023-08-01

Family

ID=84471406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211138291.1A Active CN115497006B (en) 2022-09-19 2022-09-19 Urban remote sensing image change depth monitoring method and system based on dynamic mixing strategy

Country Status (1)

Country Link
CN (1) CN115497006B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116343113A (en) * 2023-03-09 2023-06-27 中国石油大学(华东) Method and system for detecting oil spill based on polarized SAR characteristics and coding and decoding network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016141282A1 (en) * 2015-03-04 2016-09-09 The Regents Of The University Of California Convolutional neural network with tree pooling and tree feature map selection
CN108053420A (en) * 2018-01-05 2018-05-18 昆明理工大学 A kind of dividing method based on the unrelated attribute dynamic scene of limited spatial and temporal resolution class
CN112069831A (en) * 2020-08-21 2020-12-11 三峡大学 Unreal information detection method based on BERT model and enhanced hybrid neural network
CN112233038A (en) * 2020-10-23 2021-01-15 广东启迪图卫科技股份有限公司 True image denoising method based on multi-scale fusion and edge enhancement
CN112308402A (en) * 2020-10-29 2021-02-02 复旦大学 Power time series data abnormity detection method based on long and short term memory network
CN114663759A (en) * 2022-03-24 2022-06-24 东南大学 Remote sensing image building extraction method based on improved deep LabV3+

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DINGJUN YU ET AL.: "Mixed Pooling for Convolutional Neural Networks", ResearchGate, pages 1 - 13 *
DUAN ZHONGXING ET AL.: "Research on Obstacle Detection Algorithms for Blind Sidewalks Based on Deep Learning", Computer Measurement & Control, pages 27 - 32 *

Also Published As

Publication number Publication date
CN115497006B (en) 2023-08-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant