CN115497006B - Urban remote sensing image change depth monitoring method and system based on dynamic mixing strategy - Google Patents


Info

Publication number
CN115497006B
Authority
CN
China
Prior art keywords
pooling
remote sensing
sensing image
urban
layer
Prior art date
Legal status
Active
Application number
CN202211138291.1A
Other languages
Chinese (zh)
Other versions
CN115497006A (en)
Inventor
滕旭阳
林煜凯
冯嘉旖
蔡璐
高永盛
Current Assignee
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority: CN202211138291.1A
Publication of CN115497006A
Application granted
Publication of CN115497006B


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/10: Terrestrial scenes
    • G06V20/13: Satellite images
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/26: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764: Classification, e.g. of video objects
    • G06V10/82: Neural networks

Abstract

The invention discloses a method and a system for deep monitoring of urban remote sensing image changes based on a dynamic mixing strategy, wherein the method comprises the following steps: S1, preprocessing urban remote sensing images and labeling urban areas of different categories to obtain a data set; S2, adopting a DeepLabV3+ network model with a dynamic mixed pooling strategy and Xception as the backbone network, and training the network with the data set of step S1; S3, cropping remote sensing images of the same urban region at different times in the same proportion, inputting them into the trained network model, and segmenting the images; and S4, after obtaining the regional classification result of the urban remote sensing image, calculating the degree of change of each regional category over a period of time and labeling the changes on the map. By dynamically selecting different pooling modes for the feature maps of each layer of the remote sensing image, the invention better captures the global and local information of the image and improves segmentation accuracy.

Description

Urban remote sensing image change depth monitoring method and system based on dynamic mixing strategy
Technical Field
The invention belongs to the technical field of satellite remote sensing image processing, and particularly relates to a method and a system for monitoring urban remote sensing image change depth based on a dynamic mixing strategy.
Background
Semantic segmentation of high-resolution remote sensing images is a basic task in the remote sensing field. Its main goal is to use a computer to analyze the color, spectral information and spatial information of the various targets observed in a remote sensing image, select feature information, classify each pixel in the image, and segment the regional contours between targets. Urban remote sensing images mainly contain buildings, roads, green coverage areas and the like. Accurate segmentation and change detection of urban remote sensing images support the analysis of building indices, seasonal vegetation change, disaster detection and vegetation distribution change in urban management and planning, and thereby provide a basis for comprehensively grasping the urban layout and its real-time dynamic changes.
In recent decades, with the development of computer vision and artificial intelligence, the resolution of satellite remote sensing images has become higher and higher, and the ability to process such images has greatly improved, which is significant both for academic research and for guiding production practice. Compared with ordinary images, remote sensing images offer high precision, rich context information, a wider field of view and real-time dynamic monitoring; they are applied at scale in many fields and show an increasingly refined development trend. Accurately extracting key information from remote sensing images is therefore important. Conventional remote sensing image segmentation techniques, such as region-based segmentation, edge-detection segmentation and shadow analysis, generally rely on manually designed features and generalize poorly when segmenting complex scenes.
Deep learning algorithms can perform semantic segmentation on high-altitude remote sensing images, extract valuable information from the image more effectively, improve the segmentation accuracy of small targets, and provide data support for urban planning, desertification monitoring, urban greening management, water area supervision and the like. However, in real remote sensing images, because of complex and varied conditions such as diverse terrain, mutual occlusion between targets, rich urban building types, and factors such as illumination and cloud cover, the accuracy of object edge segmentation is greatly reduced and segmentation boundaries become blurred.
Therefore, it is important to design an urban remote sensing image change detection method based on a dynamic mixing strategy that reduces the influence of terrain and environmental factors and thereby improves the precision of regional edge segmentation.
Disclosure of Invention
The invention aims to remedy the above shortcomings and provides an urban remote sensing image change depth detection method based on a DeepLabV3+ multi-scale network model combined with a mixed pooling strategy.
The invention adopts the following technical scheme:
the urban remote sensing image change depth monitoring method based on the dynamic mixing strategy comprises the following steps:
s1, preprocessing an urban remote sensing image, and marking areas of different categories in the city to obtain a data set;
S2, based on Xception as the backbone network, adopting a network model with a dynamic mixed pooling strategy, and training the network with the data set of step S1;
s3, carrying out same-proportion clipping on remote sensing images of the same city region at different times, inputting the remote sensing images into a trained network model, and dividing the images;
and S4, after obtaining the regional classification result of the urban remote sensing image, calculating the change degree of each regional category in a period of time and labeling the change on the map.
Furthermore, the original remote sensing images of the urban areas are taken from the RSSCN7 remote sensing image dataset.
In step S1, as data preprocessing, the original remote sensing images are cropped into 256 × 256 patches for semantic segmentation; the cropped images are annotated with the Labelme semantic segmentation tool and divided into five categories: roads, buildings, water areas, green plants and open spaces.
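The cropping step above can be sketched as follows. This is an illustrative sketch only: the function name and the non-overlapping tiling choice are assumptions, since the patent specifies only the 256 × 256 patch size.

```python
import numpy as np

def crop_tiles(image: np.ndarray, tile: int = 256) -> list:
    """Crop an H x W x C remote sensing image into non-overlapping
    tile x tile patches. Edge regions smaller than the tile size are
    discarded here; padding or overlapped cropping are common alternatives.
    """
    h, w = image.shape[:2]
    patches = []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            patches.append(image[y:y + tile, x:x + tile])
    return patches

# A 512 x 768 image yields 2 x 3 = 6 tiles of 256 x 256.
img = np.zeros((512, 768, 3), dtype=np.uint8)
tiles = crop_tiles(img)
print(len(tiles), tiles[0].shape)  # 6 (256, 256, 3)
```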
Furthermore, Labelme is an open-source image annotation tool, mainly used for dataset labeling in instance segmentation, semantic segmentation, object detection and classification tasks.
Further, since the characteristic features of each regional category in an urban area are generally distinct, the preprocessed remote sensing images are directly segmented and labeled with different colors, and the training data set is then selected at random.
Further, in step S2, the dataset is used to train the mixed-pooling DeepLabV3+ network model. The DeepLabV3+ network model fuses multi-scale information, divides the model structure into an encoding layer and a decoding layer, and introduces Xception as the backbone network. The backbone network output is split into two branches: one is passed directly into the decoding layer, and the other passes through the ASPP module with serial atrous convolutions. The method specifically comprises the following steps:
S21. At the encoding layer, feature information of the monitored urban remote sensing image is extracted through the backbone network, and the image size is successively reduced to 1/4, 1/8 and 1/16 of the original size.
S22. The 60 × 60 × 2048 feature map obtained from the backbone network enters the atrous spatial pyramid pooling (ASPP) module. A standard ASPP module consists of a 1×1 convolution layer, three 3×3 dilated convolution layers with dilation rates of 6, 12 and 18, and a global average pooling layer. To better improve the accuracy of the network model's edge segmentation, the global average pooling in the ASPP module is here replaced by the mixed pooling strategy: the input feature map is mixed-pooled to obtain a feature map of size 20 × 20, the channels are compressed by 1×1 convolution, the result is restored to the height and width of the input feature map by deconvolution, and the outputs of all branches are concatenated and fused.
S23. At the decoding layer, the low-level features output by the corresponding backbone level are fused with the mixed result from the encoding layer; after upsampling with a 3×3 convolution kernel, the prediction is restored to the resolution of the input image, yielding the classified result image of the urban remote sensing image and ultimately improving the segmentation accuracy of the network model.
S24. The DeepLabV3+ network model with the dynamic mixed pooling strategy is trained on the RSSCN7 remote sensing image dataset, and the remaining samples are used as the test set for evaluating the network model.
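The train/test division of step S24 can be sketched as below; the 80/20 ratio and the helper name are assumptions, since the patent only states that the remaining samples form the test set.

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=42):
    """Randomly split labelled tiles into training and test sets.
    The 80/20 ratio here is an assumption; the patent does not state one.
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

train, test = split_dataset(list(range(100)))
print(len(train), len(test))  # 80 20
```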
Further, the mixed pooling strategy in step S22 optimizes each of the 2048 feature map layers:
The frequency $\alpha_k$ of selecting max pooling, the frequency $\beta_k$ of selecting average pooling and the frequency $\gamma_k$ of selecting stochastic pooling for the k-th layer feature map are
$$\alpha_k = \frac{i_{max}}{i_{total}},\qquad \beta_k = \frac{i_{avg}}{i_{total}},\qquad \gamma_k = \frac{i_{sto}}{i_{total}}$$
where $i_{max}$, $i_{avg}$ and $i_{sto}$ are the numbers of times the max pooling, average pooling and stochastic pooling methods are selected for the k-th layer feature map, and $i_{total}$ is the size of the training set used to optimize the k-th layer feature map.
The mixed pooling result $output_k$ of the k-th layer feature map in the final network model is
$$output_k = \alpha_k\, x_{k\_max} + \beta_k\, x_{k\_avg} + \gamma_k\, x_{k\_sto}$$
where $x_{k\_max}$, $x_{k\_avg}$ and $x_{k\_sto}$ denote the results of the k-th layer feature map after max pooling, average pooling and stochastic pooling respectively.
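A minimal NumPy sketch of the mixed pooling computation, for a single feature channel. The 3 × 3 non-overlapping window is inferred from the 60 × 60 → 20 × 20 sizes in step S22; the stochastic-pooling formulation (sampling one activation per window with probability proportional to its value) and all function names are assumptions.

```python
import numpy as np

def pool2d(x, k, mode, rng=None):
    """Pool a square feature map x with non-overlapping k x k windows."""
    n = x.shape[0] // k
    w = x[:n * k, :n * k].reshape(n, k, n, k).swapaxes(1, 2).reshape(n, n, k * k)
    if mode == "max":
        return w.max(-1)
    if mode == "avg":
        return w.mean(-1)
    # stochastic pooling: sample one activation per window with
    # probability proportional to its (non-negative) value
    rng = rng or np.random.default_rng(0)
    p = w / np.clip(w.sum(-1, keepdims=True), 1e-12, None)
    idx = np.array([[rng.choice(k * k, p=p[i, j]) for j in range(n)]
                    for i in range(n)])
    return np.take_along_axis(w, idx[..., None], -1)[..., 0]

def mixed_pool(x, alpha, beta, gamma, k=3):
    """output_k = alpha * max-pool + beta * avg-pool + gamma * stochastic-pool."""
    return (alpha * pool2d(x, k, "max")
            + beta * pool2d(x, k, "avg")
            + gamma * pool2d(x, k, "sto"))

x = np.random.default_rng(1).random((60, 60))  # one 60 x 60 feature channel
y = mixed_pool(x, 0.5, 0.3, 0.2)               # per-layer weights from training
print(y.shape)  # (20, 20)
```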
Further, the steps for upgrading the global average pooling in the ASPP module to dynamic mixed pooling are:
a. Initialization: before training the DeepLabV3+ network model, the weights $\alpha_k$, $\beta_k$ and $\gamma_k$ of each pooling strategy of each feature map layer are all initialized to 1/3;
b. The layer-1 feature map is pooled with max pooling, average pooling and stochastic pooling respectively, while the remaining feature maps are pooled with the mixed pooling method; the merits of the different pooling strategies are evaluated by computing the mean intersection-over-union (mIoU) between the predicted and true values, and the pooling mode with the largest mIoU value is taken as the selected pooling mode of the layer-1 feature map;
c. The pooling strategies of feature map layers 2 to 2048 of the same input are optimized by the method of step b;
d. The operations of steps b and c are performed on all training set samples;
e. The frequencies $\alpha_k$, $\beta_k$ and $\gamma_k$ with which each pooling strategy is selected for each feature map layer over the training set are obtained.
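Step e above reduces to counting how often each pooling method wins for a layer across the training set; a small sketch (the function name is an assumption):

```python
from collections import Counter

def pooling_frequencies(selections):
    """Given the pooling method chosen for one layer on each training
    sample, return (alpha, beta, gamma): the selection frequencies of
    max, average and stochastic pooling."""
    counts = Counter(selections)
    total = len(selections)
    return (counts["max"] / total, counts["avg"] / total, counts["sto"] / total)

# e.g. over 10 training samples, layer k favoured max pooling 6 times:
alpha, beta, gamma = pooling_frequencies(["max"] * 6 + ["avg"] * 3 + ["sto"] * 1)
print(alpha, beta, gamma)  # 0.6 0.3 0.1
```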
Further, in step S22, the mean intersection-over-union (mIoU) is used to evaluate the merits of the different pooling strategies; the mIoU is the per-class ratio of the intersection to the union of the predicted and true values, averaged over the number of classes:
$$mIoU = \frac{1}{k}\sum_{i=1}^{k}\frac{TP}{TP+FP+FN}$$
where k is the number of non-empty classes, TP the number of true positives, FP the number of false positives and FN the number of false negatives.
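A sketch of the mIoU computation defined above, skipping empty classes as the formula's k (number of non-empty categories) implies; the function name is an assumption.

```python
import numpy as np

def mean_iou(pred, true, num_classes):
    """mIoU: average over classes of TP / (TP + FP + FN),
    counting only classes present in prediction or ground truth."""
    ious = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (true == c))
        fp = np.sum((pred == c) & (true != c))
        fn = np.sum((pred != c) & (true == c))
        denom = tp + fp + fn
        if denom > 0:  # only count non-empty classes
            ious.append(tp / denom)
    return float(np.mean(ious))

pred = np.array([0, 0, 1, 1, 2, 2])
true = np.array([0, 1, 1, 1, 2, 0])
miou = mean_iou(pred, true, 3)  # IoUs: 1/3, 2/3, 1/2
```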
Further, in step S2, the DeepLabV3+ network model uses a pixel-wise cross-entropy loss (softmax loss) to handle the multi-class problem: each pixel is taken as a sample, and the cross-entropy between its predicted class and its true class is computed. In the multi-class setting, the softmax function maps the multiple outputs to probability values in the (0, 1) interval for classification.
The output after softmax regression is
$$p_{i,j} = \frac{e^{l_{i,j}}}{\sum_{c=1}^{C} e^{l_{i,c}}}$$
where e is the base of the natural logarithm (approximately 2.718), $p_{i,j}$ is the predicted probability of the i-th sample for the j-th class, $l_{i,j}$ is the output of the neural network for the i-th sample on the j-th class, and C is the number of input classes.
The above equation turns the outputs into a probability distribution over (0, 1), and the distance between the predicted probability distribution and the distribution of the true value is then measured by the cross-entropy loss:
$$Loss = -\frac{1}{N}\sum_{i=1}^{N} w\,\log\bigl(p_{i,Y(i)}\bigr)$$
where N is the number of training set samples, Y(i) is the class to which the i-th sample belongs, and w is the weight of the sample data.
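The softmax and pixel-wise cross-entropy above can be sketched as follows; the function names are assumptions, and a uniform class weight w is used by default.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def pixel_cross_entropy(logits, labels, weights=None):
    """Mean per-pixel cross entropy: -1/N * sum_i w_{Y(i)} log p_{i,Y(i)}."""
    p = softmax(logits)
    n = len(labels)
    w = np.ones(p.shape[1]) if weights is None else np.asarray(weights)
    return float(-np.mean(w[labels] * np.log(p[np.arange(n), labels])))

# Three pixels, five classes (roads/buildings/water/green/open, as in S1):
logits = np.array([[2.0, 0.1, 0.1, 0.1, 0.1],
                   [0.2, 1.5, 0.3, 0.1, 0.0],
                   [0.0, 0.0, 3.0, 0.0, 0.0]])
labels = np.array([0, 1, 2])
loss = pixel_cross_entropy(logits, labels)
```

More confident logits for the correct classes drive the loss toward zero, which is what training minimizes.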
In step S3, the remote sensing images of the same city area at different times are cut in the same proportion, and the cut images are input into the trained model.
Further, the step S4 specifically includes:
S41. The intersection-over-union loss between the predictions for the i-th category in the urban remote sensing images of the same area at times T1 and T2 is computed to represent the change of the i-th category in the area:
$$L_i = 1 - \frac{TP_i}{TP_i + FP_i + FN_i}$$
where $L_i$ indicates the magnitude of change of the i-th region category (the larger $L_i$, the greater the change of the area), $TP_i$ is the number of true positives, $FP_i$ the number of false positives and $FN_i$ the number of false negatives for the i-th region category.
S42, splicing the segmented images in sequence.
S43. The urban remote sensing images stitched at times T1 and T2 are differenced, pixels belonging to the same region category are eliminated, and the remaining part contains only the changed regions, which are marked; the marked image is then superimposed on the original image, displaying the range of urban change on the remote sensing image.
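Steps S41 and S43 can be sketched together: the per-class change degree $L_i$ computed from the IoU between the T1 and T2 segmentation maps, and the binary change mask obtained by differencing them. Function names are assumptions; the T1 map plays the role of the reference when counting $TP_i$, $FP_i$ and $FN_i$.

```python
import numpy as np

def change_degree(seg_t1, seg_t2, num_classes):
    """Per-class change L_i = 1 - IoU between the T1 and T2 segmentation
    maps; larger L_i means the class changed more over the period."""
    degrees = {}
    for c in range(num_classes):
        tp = np.sum((seg_t1 == c) & (seg_t2 == c))
        fp = np.sum((seg_t1 != c) & (seg_t2 == c))
        fn = np.sum((seg_t1 == c) & (seg_t2 != c))
        denom = tp + fp + fn
        degrees[c] = (1.0 - tp / denom) if denom else 0.0
    return degrees

def change_mask(seg_t1, seg_t2):
    """Binary mask of pixels whose class changed between T1 and T2,
    i.e. the difference image with same-class pixels eliminated."""
    return seg_t1 != seg_t2

seg_t1 = np.array([[0, 0], [1, 1]])  # T1 class map of a 2 x 2 region
seg_t2 = np.array([[0, 1], [1, 1]])  # T2: one pixel turned from 0 to 1
deg = change_degree(seg_t1, seg_t2, 2)
mask = change_mask(seg_t1, seg_t2)
```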
The invention also discloses a system based on the above urban remote sensing image change depth monitoring method, comprising the following modules:
Data set acquisition module: preprocesses the urban remote sensing images and labels urban areas of different categories to obtain a data set;
Training module: adopts the DeepLabV3+ network model with the dynamic mixed pooling strategy and Xception as the backbone network, and trains the network with the data set;
Segmentation module: crops remote sensing images of the same urban area at different times in the same proportion, inputs them into the trained model and segments them;
Marking module: after obtaining the regional classification result of the urban remote sensing image, calculates the degree of change of each regional category over a period of time and labels the changes on the map.
Compared with the prior art, the invention has the beneficial effects that:
1. according to the invention, different pooling modes are dynamically selected for pooling the characteristic images of each layer of the remote sensing image, so that the global information and the local information of the image can be better grasped, and the segmentation precision is improved.
2. According to the invention, random pooling is used as one of pooling strategies, so that the risk of overfitting is effectively reduced, and the generalization capability of the model is improved.
Drawings
FIG. 1 is a flow chart of a method for monitoring urban remote sensing image change depth based on a dynamic mixing strategy.
Fig. 2 is a mixed pooling flow diagram.
FIG. 3 is a DeepLabV3+ network model diagram.
Fig. 4 is a block diagram of a urban remote sensing image change depth monitoring system based on a dynamic mixing strategy.
Detailed Description
The preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.
As shown in fig. 1-3, the urban remote sensing image change depth monitoring method based on the dynamic mixing strategy according to the embodiment is performed according to the following steps:
S1, preprocessing images from the RSSCN7 remote sensing image dataset published by Wuhan University in 2015, and labeling different categories of areas in the city to obtain a data set;
S2, adopting the DeepLabV3+ network model with a dynamic mixed pooling strategy and Xception as the backbone network, and training the network with the data set;
s3, carrying out same-proportion clipping on remote sensing images of the same city region at different times, inputting the remote sensing images into a trained model, and dividing the images.
And S4, after obtaining the regional classification result of the urban remote sensing image, calculating the change degree of each regional category in a period of time and labeling the change on the map.
Specifically, in this embodiment, the original remote sensing images of the urban areas are taken from the RSSCN7 remote sensing image dataset.
In this embodiment, in step S1, as data preprocessing, the original remote sensing images are cropped into 256 × 256 patches for semantic segmentation; the cropped images are annotated with the Labelme semantic segmentation tool and divided into five categories: roads, buildings, water areas, green plants and open spaces.
Furthermore, Labelme is an open-source image annotation tool, mainly used for dataset labeling in instance segmentation, semantic segmentation, object detection and classification tasks.
Since the characteristic features of each regional category in an urban area are generally distinct, in this embodiment the preprocessed remote sensing images are directly segmented and labeled with different colors, and the training data set is then selected at random.
In S2 of this embodiment, the dataset is used to train the mixed-pooling DeepLabV3+ network model. The DeepLabV3+ network model fuses multi-scale information, divides the model structure into an encoding layer and a decoding layer, and introduces Xception as the backbone network. The backbone network output is split into two branches: one is passed directly into the decoding layer, and the other passes through the ASPP module with serial atrous convolutions. The method specifically comprises the following steps:
S21. At the encoding layer, feature information of the monitored urban remote sensing image is extracted through the backbone network, and the image size is successively reduced to 1/4, 1/8 and 1/16 of the original size.
S22. The 60 × 60 × 2048 feature map obtained from the backbone network enters the atrous spatial pyramid pooling (ASPP) module. A standard ASPP module consists of a 1×1 convolution layer, three 3×3 dilated convolution layers with dilation rates of 6, 12 and 18, and a global average pooling layer. To better improve the accuracy of the network's edge segmentation, the global average pooling in the ASPP module is here replaced by the mixed pooling strategy: the input feature map is mixed-pooled to obtain a feature map of size 20 × 20, the channels are compressed by 1×1 convolution, the result is restored to the height and width of the input feature map by deconvolution, and the outputs of all branches are concatenated and fused.
S23. At the decoding layer, the low-level features output by the corresponding backbone level are fused with the mixed result from the encoding layer; after upsampling with a 3×3 convolution kernel, the prediction is restored to the resolution of the input image, yielding the classified result image of the urban remote sensing image and ultimately improving the segmentation accuracy of the network.
S24. To better improve the accuracy of network edge segmentation, the global average pooling in the ASPP of the DeepLabV3+ network structure is replaced by the mixed pooling strategy described above. The network is trained on the RSSCN7 remote sensing image dataset, and the remaining samples are used as the test set for evaluating the network.
In this embodiment, the mixed pooling strategy in step S22 optimizes each of the 2048 feature map layers:
The frequency $\alpha_k$ of selecting max pooling, the frequency $\beta_k$ of selecting average pooling and the frequency $\gamma_k$ of selecting stochastic pooling for the k-th layer feature map are
$$\alpha_k = \frac{i_{max}}{i_{total}},\qquad \beta_k = \frac{i_{avg}}{i_{total}},\qquad \gamma_k = \frac{i_{sto}}{i_{total}}$$
where $i_{max}$, $i_{avg}$ and $i_{sto}$ are the numbers of times the max pooling, average pooling and stochastic pooling methods are selected for the k-th layer feature map, and $i_{total}$ is the size of the training set used to optimize the k-th layer feature map.
The mixed pooling result $output_k$ of the k-th layer feature map in the final model is
$$output_k = \alpha_k\, x_{k\_max} + \beta_k\, x_{k\_avg} + \gamma_k\, x_{k\_sto}$$
where $x_{k\_max}$, $x_{k\_avg}$ and $x_{k\_sto}$ denote the results of the k-th layer feature map after max pooling, average pooling and stochastic pooling respectively.
The steps for upgrading the global average pooling in the ASPP module to dynamic mixed pooling are:
a. Initialization: before training the DeepLabV3+ network model, the weights $\alpha_k$, $\beta_k$ and $\gamma_k$ of each pooling strategy of each feature map layer are all initialized to 1/3;
b. The layer-1 feature map is pooled with max pooling, average pooling and stochastic pooling respectively, while the remaining feature maps are pooled with the mixed pooling method; the merits of the different pooling strategies are evaluated by computing the mean intersection-over-union (mIoU) between the predicted and true values, and the pooling mode with the largest mIoU value is taken as the selected pooling mode of the layer-1 feature map;
c. The pooling strategies of feature map layers 2 to 2048 of the same input are optimized by the method of step b;
d. The operations of steps b and c are performed on all training set samples;
e. The frequencies $\alpha_k$, $\beta_k$ and $\gamma_k$ with which each pooling strategy is selected for each feature map layer over the training set are obtained.
In step S22 of this embodiment, the mean intersection-over-union (mIoU) is used to evaluate the merits of the different pooling strategies; the mIoU is the per-class ratio of the intersection to the union of the predicted and true values, averaged over the number of classes:
$$mIoU = \frac{1}{k}\sum_{i=1}^{k}\frac{TP}{TP+FP+FN}$$
where k is the number of non-empty classes, TP the number of true positives, FP the number of false positives and FN the number of false negatives.
In step S2 of this embodiment, the DeepLabV3+ network model uses a pixel-wise cross-entropy loss to handle the multi-class problem: each pixel is taken as a sample, and the cross-entropy between its predicted class and its true class is computed. The softmax function maps the multiple outputs to probability values in the (0, 1) interval for classification in the multi-class process.
The output after softmax regression is
$$p_{i,j} = \frac{e^{l_{i,j}}}{\sum_{c=1}^{C} e^{l_{i,c}}}$$
where $p_{i,j}$ is the predicted probability of the i-th sample for the j-th class, and $l_{i,j}$ is the output of the network for the i-th sample on the j-th class. The above equation turns the outputs into a probability distribution over (0, 1), and the distance between the predicted probability distribution and the distribution of the true value is then measured by the cross-entropy loss:
$$Loss = -\frac{1}{N}\sum_{i=1}^{N} w\,\log\bigl(p_{i,Y(i)}\bigr)$$
where N is the number of training set samples, C is the number of input classes, Y(i) is the class to which the i-th sample belongs, and w is the weight of the sample data.
In step S3 of this embodiment, remote sensing images of the same city region at different times are cut in the same proportion, and input into a trained model to obtain a segmented image.
The step S4 of this embodiment specifically includes:
S41. The intersection-over-union loss between the predictions for the i-th category in the urban remote sensing images of the same area at times T1 and T2 is computed to represent the change of the i-th category in the area:
$$L_i = 1 - \frac{TP_i}{TP_i + FP_i + FN_i}$$
where $L_i$ indicates the magnitude of change of the i-th region category (the larger $L_i$, the greater the change of the area), $TP_i$ is the number of true positives, $FP_i$ the number of false positives and $FN_i$ the number of false negatives for the i-th region category.
S42, splicing the segmented images in sequence.
S43. The urban remote sensing images stitched at times T1 and T2 are differenced, pixels belonging to the same region category are eliminated, and the remaining part contains only the changed regions, which are marked; the marked image is then superimposed on the original image, displaying the range of urban change on the remote sensing image.
As shown in fig. 4, this embodiment discloses a system based on the urban remote sensing image change depth monitoring method of the above embodiment, comprising the following modules:
Data set acquisition module: preprocesses the urban remote sensing images and labels urban areas of different categories to obtain a data set;
Training module: adopts the DeepLabV3+ network model with the dynamic mixed pooling strategy and Xception as the backbone network, and trains the network model with the data set;
Segmentation module: crops remote sensing images of the same urban area at different times in the same proportion, inputs them into the trained model and segments them;
Marking module: after obtaining the regional classification result of the urban remote sensing image, calculates the degree of change of each regional category over a period of time and labels the changes on the map.
According to the invention, different pooling modes are dynamically selected for pooling the characteristic images of each layer of the remote sensing image, so that the global information and the local information of the image can be better grasped, and the segmentation precision is improved. According to the invention, random pooling is used as one of pooling strategies, so that the risk of overfitting is effectively reduced, and the generalization capability of the model is improved.

Claims (7)

1. The urban remote sensing image change depth monitoring method based on the dynamic mixing strategy is characterized by comprising the following steps of:
s1, preprocessing an urban remote sensing image, and marking areas of different categories in the city to obtain a data set;
S2, based on Xception as the backbone network, adopting a network model with a dynamic mixed pooling strategy, and training the network with the data set of step S1;
s3, carrying out same-proportion clipping on remote sensing images of the same city region at different times, inputting the remote sensing images into a trained model, and dividing the images;
s4, after obtaining the regional classification result of the urban remote sensing image, calculating the change degree of each regional category in a period of time and labeling the change on the map;
the step S2 is specifically as follows:
s21, extracting characteristic information of the detected urban remote sensing image through a backbone network at a coding layer, and sequentially changing the image size into 1/4,1/8 and 1/16 of the original image size;
S22, the 60 × 60 × 2048 feature map obtained by the backbone network enters the atrous spatial pyramid pooling ASPP; the ASPP module consists of a 1×1 convolution, three 3×3 dilated convolution layers with dilation rates of 6, 12 and 18, and a global average pooling layer; on the basis of this structure, the global average pooling in the ASPP module adopts a mixed pooling strategy: the input feature map is mixed-pooled to obtain a feature map of size 20 × 20, the channels are compressed by 1×1 convolution, the result is finally restored to the height and width of the input feature map by deconvolution, and the results obtained by all parts are concatenated and fused;
s23, at the decoding layer, merging the low-level features output by the corresponding layer of the backbone network with the mixed result of the coding layer, and, after up-sampling through a 3×3 convolution kernel, recovering a prediction result at the resolution of the input image to obtain the classified result image of the urban remote sensing image;
s24, training the DeepLabV3+ network model with the dynamic mixed pooling strategy on the RSSCN7 DataSet remote sensing image data set, and using the remaining samples as the test set for testing the network model;
in step S22, the mixed pooling strategy is optimized for each of the 2048 feature-map layers:
the frequency α_k of selecting maximum pooling for the k-th layer feature map, the frequency β_k of selecting average pooling, and the frequency γ_k of selecting random pooling are:

α_k = i_max / i_total, β_k = i_avg / i_total, γ_k = i_sto / i_total

where i_max is the number of times the maximum pooling method is selected for the k-th layer feature map, i_avg is the number of times the average pooling method is selected, i_sto is the number of times the random pooling method is selected, and i_total is the total number of training set samples used to optimize the k-th layer feature map;
the mixed pooling result output_k of the k-th layer feature map in the final model is:

output_k = α_k · x_k_max + β_k · x_k_avg + γ_k · x_k_sto

where x_k_max represents the result of the k-th layer feature map after maximum pooling, x_k_avg represents the result after average pooling, and x_k_sto represents the result after random pooling;
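As an illustration only (not part of the claimed method), the three pooling strategies and the weighted combination output_k above can be sketched in Python for a single pooling window; the window contents, weights, and function names are hypothetical:

```python
import random

def pool_window(window, mode, rng=random):
    """Pool a flat list of activations with one of the three strategies."""
    if mode == "max":
        return max(window)
    if mode == "avg":
        return sum(window) / len(window)
    if mode == "sto":
        # stochastic pooling: sample one activation with probability
        # proportional to its magnitude
        total = sum(window)
        if total == 0:
            return 0.0
        r = rng.uniform(0, total)
        acc = 0.0
        for v in window:
            acc += v
            if acc >= r:
                return v
        return window[-1]
    raise ValueError(mode)

def mixed_pool(window, alpha, beta, gamma, rng=random):
    """output_k = alpha*x_max + beta*x_avg + gamma*x_sto (weights sum to 1)."""
    return (alpha * pool_window(window, "max")
            + beta * pool_window(window, "avg")
            + gamma * pool_window(window, "sto", rng))
```

In the patent, this mix is applied per layer with the learned frequencies (α_k, β_k, γ_k) as the weights.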
the global average pooling in the ASPP module is improved to dynamic mixing pooling, and the method comprises the following specific steps:
a. Initialization: before training the DeepLabV3+ network model, the initial weights α_k, β_k and γ_k of each pooling strategy for each layer feature map are all set to 1/3;
b. Pool the layer-1 feature map separately with maximum pooling, average pooling and random pooling, while pooling the remaining feature maps with the mixed pooling method; evaluate the merits of the different pooling strategies by computing the mean intersection over union (mIoU) between the predicted and true values, and take the pooling mode with the largest mIoU as the selected pooling mode for the layer-1 feature map;
c. Optimize the pooling strategy of the layer-2 to layer-2048 feature maps of the same input by the method of step b;
d. Perform the operations of steps b and c on all training set samples;
e. Obtain α_k, β_k and γ_k for each layer feature map under the different pooling strategies over the training set.
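The selection steps a–e above amount to counting, per layer, how often each strategy wins on the training set. A minimal sketch, assuming the per-sample mIoU score of each strategy for one layer is already available (the `per_sample_scores` structure and function name are hypothetical):

```python
from collections import Counter

STRATEGIES = ("max", "avg", "sto")

def layer_pool_weights(per_sample_scores):
    """Steps a-e for one feature-map layer.

    per_sample_scores: list of dicts mapping strategy -> mIoU, one dict per
    training sample. The best-scoring strategy is tallied per sample (step b),
    and the selection frequencies give (alpha_k, beta_k, gamma_k)."""
    counts = Counter()
    for scores in per_sample_scores:
        best = max(STRATEGIES, key=lambda s: scores[s])  # highest mIoU wins
        counts[best] += 1
    i_total = len(per_sample_scores)
    alpha = counts["max"] / i_total  # i_max / i_total
    beta = counts["avg"] / i_total   # i_avg / i_total
    gamma = counts["sto"] / i_total  # i_sto / i_total
    return alpha, beta, gamma
```

Repeating this for layers 1 to 2048 (step c) over all training samples (step d) yields the full set of weights of step e.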
2. The urban remote sensing image change depth monitoring method based on the dynamic mixing strategy according to claim 1, characterized in that: in step S1, the remote sensing images use the RSSCN7 DataSet remote sensing image data set; or, in step S1, the urban remote sensing image is cropped and semantically segmented as data preprocessing, cropped to 256 × 256, and the cropped images are annotated with the Labelme semantic segmentation annotation tool, classifying them into five categories: roads, buildings, water areas, green plants and open spaces.
3. The urban remote sensing image change depth monitoring method based on the dynamic mixing strategy according to claim 2, characterized in that: in step S1, the preprocessed remote sensing image is segmented and marked with different colors, and the training data set is then selected by random selection.
4. The urban remote sensing image change depth monitoring method based on the dynamic mixing strategy according to claim 1, characterized in that: in step S22, the mean intersection over union mIoU is used to evaluate the merits of the different pooling strategies, where mIoU is the sum, over all categories, of the ratio of the intersection to the union of the predicted and true values, divided by the number of categories, expressed as:

mIoU = (1/k) · Σ_{i=1..k} TP_i / (TP_i + FP_i + FN_i)

where k represents the number of non-empty categories, TP represents the number of true positives, FP the number of false positives, and FN the number of false negatives.
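A minimal sketch of this mIoU computation from per-class (TP, FP, FN) counts; the `stats` input format is an assumption for illustration:

```python
def mean_iou(stats):
    """mIoU = (1/k) * sum_i TP_i / (TP_i + FP_i + FN_i).

    stats: list of (TP, FP, FN) tuples, one per class; empty classes
    (all counts zero) are skipped so k counts only non-empty categories."""
    ious = [tp / (tp + fp + fn) for tp, fp, fn in stats if tp + fp + fn > 0]
    return sum(ious) / len(ious)
```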
5. The urban remote sensing image change depth monitoring method based on the dynamic mixing strategy according to claim 4, characterized in that:
in step S2, the DeepLabV3+ network model adopts a pixel-wise cross entropy loss function (softmax loss) to handle the multi-classification problem; each pixel is taken as a sample, and the cross entropy loss is calculated between the predicted category and the true category of each pixel; softmax converts the multiple outputs into probability values mapped into the (0, 1) interval for multi-class classification;
the output after softmax regression is:

p_{i,j} = e^{l_{i,j}} / Σ_{c=1..C} e^{l_{i,c}}

where e is the natural constant (≈2.718), p_{i,j} is the predicted probability of the i-th sample for the j-th class, l_{i,j} is the neural network output of the i-th sample for the j-th class, and C is the number of input classes;
this maps the outputs into a probability distribution over (0, 1); the cross entropy loss function then measures the distance between the predicted probability distribution and the probability distribution of the true values, specifically:

Loss = -(1/N) · Σ_{i=1..N} w · log p_{i,Y(i)}

where N is the number of training set samples, C is the number of input classes, Y(i) is the class to which the i-th sample belongs, and w is the weight of the sample data.
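The softmax and pixel-wise weighted cross entropy described above can be sketched as follows; the list-based per-pixel interface and the scalar `weight` are illustrative assumptions (a real implementation would operate on tensors):

```python
import math

def softmax(logits):
    """p_j = e^{l_j} / sum_c e^{l_c}, mapping logits into (0, 1)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(per_pixel_logits, labels, weight=1.0):
    """Pixel-wise weighted cross entropy:
    Loss = -(1/N) * sum_i w * log p_{i, Y(i)}."""
    n = len(per_pixel_logits)
    loss = 0.0
    for logits, y in zip(per_pixel_logits, labels):
        p = softmax(logits)
        loss -= weight * math.log(p[y])
    return loss / n
```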
6. The urban remote sensing image change depth monitoring method based on the dynamic mixing strategy according to claim 5, characterized in that step S4 specifically comprises:
s41, calculating the intersection-over-union loss between the predictions of the i-th category in the urban remote sensing images of the same area at times T1 and T2, which characterizes the change of the i-th category in the area:

L_i = 1 − TP_i / (TP_i + FP_i + FN_i)

where L_i indicates the magnitude of change of the i-th region class (the larger L_i, the larger the change of that region class), TP_i represents the number of true positives of the i-th region class, FP_i the number of false positives, and FN_i the number of false negatives;
s42, splicing the segmented images in sequence;
s43, subtracting the urban remote sensing image mosaics of times T1 and T2, eliminating the pixels whose region category is unchanged to obtain the part containing only the changed regions, marking it, and overlaying the marked image on the original image to display the range of urban change on the remote sensing image.
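Steps S41 and S43 can be sketched as follows; `change_degree` computes L_i from per-class counts and `change_mask` marks pixels whose category differs between the two mosaics (both function names and the nested-list representation are hypothetical):

```python
def change_degree(tp, fp, fn):
    """L_i = 1 - TP_i / (TP_i + FP_i + FN_i): IoU loss between the T1 and T2
    predictions of region class i; a larger L_i means a larger change."""
    return 1.0 - tp / (tp + fp + fn)

def change_mask(seg_t1, seg_t2):
    """Step S43: flag pixels whose class label differs between the two
    stitched segmentation maps (same-category pixels cancel out)."""
    return [[a != b for a, b in zip(row1, row2)]
            for row1, row2 in zip(seg_t1, seg_t2)]
```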
7. An urban remote sensing image change depth monitoring system based on a dynamic mixing strategy, for executing the urban remote sensing image change depth monitoring method according to any one of claims 1-6, characterized in that the system comprises the following modules:
a data set acquisition module: preprocessing the urban remote sensing image, and labeling urban areas of different categories to obtain a data set;
training module: using Xception as the backbone network, adopting a DeepLabV3+ network model with a dynamic mixed pooling strategy, and training the network with the data set;
segmentation module: cropping remote sensing images of the same urban region at different times in the same proportion, and inputting them into the trained network model for segmentation;
marking module: after obtaining the regional classification result of the urban remote sensing image, calculating the degree of change of each region category over a period of time and labeling the change on the map.
CN202211138291.1A 2022-09-19 2022-09-19 Urban remote sensing image change depth monitoring method and system based on dynamic mixing strategy Active CN115497006B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211138291.1A CN115497006B (en) 2022-09-19 2022-09-19 Urban remote sensing image change depth monitoring method and system based on dynamic mixing strategy

Publications (2)

Publication Number Publication Date
CN115497006A CN115497006A (en) 2022-12-20
CN115497006B true CN115497006B (en) 2023-08-01

Family

ID=84471406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211138291.1A Active CN115497006B (en) 2022-09-19 2022-09-19 Urban remote sensing image change depth monitoring method and system based on dynamic mixing strategy

Country Status (1)

Country Link
CN (1) CN115497006B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116343113A (en) * 2023-03-09 2023-06-27 中国石油大学(华东) Method and system for detecting oil spill based on polarized SAR characteristics and coding and decoding network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016141282A1 (en) * 2015-03-04 2016-09-09 The Regents Of The University Of California Convolutional neural network with tree pooling and tree feature map selection
CN108053420A (en) * 2018-01-05 2018-05-18 昆明理工大学 A kind of dividing method based on the unrelated attribute dynamic scene of limited spatial and temporal resolution class
CN112069831A (en) * 2020-08-21 2020-12-11 三峡大学 Unreal information detection method based on BERT model and enhanced hybrid neural network
CN112233038A (en) * 2020-10-23 2021-01-15 广东启迪图卫科技股份有限公司 True image denoising method based on multi-scale fusion and edge enhancement
CN112308402A (en) * 2020-10-29 2021-02-02 复旦大学 Power time series data abnormity detection method based on long and short term memory network
CN114663759A (en) * 2022-03-24 2022-06-24 东南大学 Remote sensing image building extraction method based on improved deep LabV3+

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Mixed Pooling for Convolutional Neural Networks; Dingjun Yu et al.; ResearchGate; pp. 1-13 *
Research on obstacle detection algorithms for blind roads based on deep learning; Duan Zhongxing et al.; Computer Measurement & Control; pp. 27-32 *

Also Published As

Publication number Publication date
CN115497006A (en) 2022-12-20

Similar Documents

Publication Publication Date Title
CN109033998B (en) Remote sensing image ground object labeling method based on attention mechanism convolutional neural network
CN110136170B (en) Remote sensing image building change detection method based on convolutional neural network
CN108830285B (en) Target detection method for reinforcement learning based on fast-RCNN
CN109871875B (en) Building change detection method based on deep learning
CN109410171B (en) Target significance detection method for rainy image
CN110532961B (en) Semantic traffic light detection method based on multi-scale attention mechanism network model
CN110598564B (en) OpenStreetMap-based high-spatial-resolution remote sensing image transfer learning classification method
CN110766690B (en) Wheat ear detection and counting method based on deep learning point supervision thought
WO2023168781A1 (en) Soil cadmium risk prediction method based on spatial-temporal interaction relationship
CN113449594A (en) Multilayer network combined remote sensing image ground semantic segmentation and area calculation method
CN112347970A (en) Remote sensing image ground object identification method based on graph convolution neural network
CN112232328A (en) Remote sensing image building area extraction method and device based on convolutional neural network
CN115497006B (en) Urban remote sensing image change depth monitoring method and system based on dynamic mixing strategy
CN114419468A (en) Paddy field segmentation method combining attention mechanism and spatial feature fusion algorithm
CN111563408B (en) High-resolution image landslide automatic detection method with multi-level perception characteristics and progressive self-learning
CN113988147A (en) Multi-label classification method and device for remote sensing image scene based on graph network, and multi-label retrieval method and device
CN113435254A (en) Sentinel second image-based farmland deep learning extraction method
Yan et al. Glacier classification from Sentinel-2 imagery using spatial-spectral attention convolutional model
CN111242028A (en) Remote sensing image ground object segmentation method based on U-Net
CN113077438B (en) Cell nucleus region extraction method and imaging method for multi-cell nucleus color image
CN113128335A (en) Method, system and application for detecting, classifying and discovering micro-body paleontological fossil image
Engstrom et al. Evaluating the Relationship between Contextual Features Derived from Very High Spatial Resolution Imagery and Urban Attributes: A Case Study in Sri Lanka
CN112016845A (en) DNN and CIM based regional economic benefit evaluation method and system
Luo et al. RBD-Net: robust breakage detection algorithm for industrial leather
CN115797904A (en) Active learning method for multiple scenes and multiple tasks in intelligent driving visual perception

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant