CN113191213B - High-resolution remote sensing image newly-added building detection method - Google Patents


Info

Publication number
CN113191213B
CN113191213B (application CN202110389794.5A)
Authority
CN
China
Prior art keywords
building
image
resolution
phase
time
Prior art date
Legal status
Active
Application number
CN202110389794.5A
Other languages
Chinese (zh)
Other versions
CN113191213A (en)
Inventor
孙希延
肖钰
纪元法
黄建华
付文涛
白杨
郭宁
Current Assignee
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date
Filing date
Publication date
Application filed by Guilin University of Electronic Technology filed Critical Guilin University of Electronic Technology
Priority to CN202110389794.5A priority Critical patent/CN113191213B/en
Publication of CN113191213A publication Critical patent/CN113191213A/en
Application granted granted Critical
Publication of CN113191213B publication Critical patent/CN113191213B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V 20/176 — Scenes; terrestrial scenes; urban or other man-made structures
    • G06F 18/24 — Pattern recognition; classification techniques
    • G06N 3/045 — Neural networks; combinations of networks
    • G06N 3/048 — Neural networks; activation functions
    • G06N 3/084 — Learning methods; backpropagation, e.g. using gradient descent
    • G06V 10/267 — Image preprocessing; segmentation of patterns by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V 10/44 — Feature extraction; local feature extraction, e.g. edges, contours, corners; connectivity analysis
    • G06V 10/462 — Descriptors; salient features, e.g. scale-invariant feature transforms [SIFT]

Abstract

The invention discloses a method for detecting newly added buildings in high-resolution remote sensing images. Superpixel segmentation is first performed on the second time-phase GF2 image to obtain superpixel objects, and a building data set is constructed from that image. The building extraction data set is fed into a multi-scale constrained codec network for training to obtain a second time-phase building binary map; the network adopts a dual-path architecture that captures global and local information separately, and combining the two better distinguishes buildings from complex backgrounds and refines building details. The superpixel objects are then combined with the building binary map to obtain second time-phase building target objects. Finally, a pixel-level change detection result is obtained with the IRMAD algorithm, and a spatial-position overlay analysis based on the building target objects and the pixel-level change detection result realizes the detection of newly added buildings.

Description

High-resolution remote sensing image newly-added building detection method
Technical Field
The invention relates to the technical field of remote sensing image processing, in particular to a method for detecting a newly added building of a high-resolution remote sensing image.
Background
Buildings are important man-made targets in basic geographic databases, and automatic building change detection has long been a research hotspot and difficulty in artificial intelligence, photogrammetry, remote sensing, and related fields. Fast, accurate building change detection is of great significance for urban planning, post-earthquake disaster relief, and similar applications.
There are three main strategies for building change detection: (1) extract buildings from remote sensing images of different periods and then perform change detection on the extraction results; (2) perform change detection directly on building features; (3) perform change detection first and then identify the parts of the changed area that belong to buildings. Change detection performed directly on building extraction results tends to depend excessively on the extraction accuracy, while change detection based directly on building features hinges on the effectiveness of those features, whose stability and applicability still need improvement. The invention therefore adopts a strategy of extracting buildings and detecting changes in parallel: building extraction and whole-scene change detection are completed independently, and the two results are then combined to obtain the building change detection result.
Traditional building extraction methods for remote sensing images express "what a building is" by empirically designing suitable features, and build feature sets for automatic identification and extraction. Common feature indicators include spectrum, length, edges, shape, texture, and shadow, but these vary markedly with season, illumination, atmospheric conditions, sensor quality, scale, building style, and environment. Empirically designed features can only handle specific data and cannot achieve true automation, so deep learning is now applied to building extraction from remote sensing images, replacing hand-crafted features with automatically learned multi-level feature representations. On the other hand, deep-learning-based methods depend strongly on large, high-quality sample databases; with only open-source data sets, theories and methods can be compared quantitatively but the methods cannot be put to practical use.
Disclosure of Invention
The invention aims to provide a method for detecting newly added buildings in high-resolution remote sensing images, so as to solve two technical problems of the prior art: the low accuracy of building feature extraction when deep learning is used, and the absence of a dedicated data set for newly-added-building detection in high-resolution remote sensing imagery.
In order to achieve the purpose, the invention adopts a method for detecting the newly added building by using the high-resolution remote sensing image, which comprises the following steps:
selecting high-resolution GF2 images of two available time phases, retaining the first time-phase high-resolution GF2 image for comparison, and performing superpixel segmentation on the second time-phase high-resolution GF2 image to obtain a superpixel object;
constructing a building extraction dataset using the second temporal high-resolution GF2 images;
inputting the building extraction data set into a multi-scale constraint coding and decoding network for training to obtain a second time-phase building binary image;
the super pixel object is combined with the second time-phase building binary image to obtain a second time-phase building target object;
performing differential processing on the first time-phase high-resolution GF2 image and the second time-phase high-resolution GF2 image to obtain a pixel-level change detection result;
and carrying out spatial position analysis on the pixel level change detection result and the second time-phase building target object to realize detection of the newly added building.
Wherein the construction of the building extraction data set using the second-phase high-resolution GF2 image comprises the following steps:
processing the second time-phase high-resolution GF2 image to obtain a morphological building index gray-scale map;
calculating the mean building index within each superpixel object, setting a threshold, and obtaining suspected building patches after segmentation;
converting the suspected building patches into final building labels through manual modification;
randomly cutting and expanding the high-resolution GF2 image and the final building label to obtain a building extraction data set;
the building extraction data set is divided into a training set, a verification set and a test set.
And in the process of randomly cropping and expanding the high-resolution GF2 image and the final building label to obtain the building extraction data set, the large image is divided into 512 × 512 tiles, and the data are expanded through horizontal, vertical and diagonal flipping.
Wherein the proportion of the training set, the verification set and the test set in the building extraction data set is 6:2:2.
The multi-scale constrained codec network comprises an encoder and a decoder. The encoder consists of a dual-path architecture and multi-scale branches; the dual-path architecture comprises a local information path and a global information path, the local information path uses dilated convolution to extract features, the global information path uses VGG16 to extract features, and the multi-scale branches acquire multi-scale information through different down-sampling factors.
The decoder adopts a multipath feature fusion module, which assigns different weights to feature maps with different receptive fields.
And performing differentiation processing by using an IRMAD algorithm in the process of performing differentiation processing on the first time-phase high-resolution GF2 image and the second time-phase high-resolution GF2 image to obtain a pixel-level change detection result.
And in the process of analyzing the spatial positions of the pixel-level change detection result and the second time-phase building target object to realize detection of the newly added building, the judgment specifically uses the intersection obtained by superimposing the two in spatial position.
The high-resolution remote sensing image newly-added building detection method of the invention performs superpixel segmentation on the second time-phase GF2 image to obtain superpixel objects, and constructs a building data set from that image. The building extraction data set is fed into a multi-scale constrained codec network for training to obtain a second time-phase building binary map; the network adopts a dual-path architecture that captures global and local information separately, and combining the two better distinguishes buildings from complex backgrounds and refines building details. The superpixel objects are combined with the building binary map to obtain second time-phase building target objects. A pixel-level change detection result is then obtained with the IRMAD algorithm, and a spatial-position overlay analysis based on the building target objects and the pixel-level change detection result realizes the detection of newly added buildings. This solves the technical problems of the prior art, namely the low accuracy of building feature extraction with deep learning and the absence of a dedicated data set for newly-added-building detection in high-resolution remote sensing imagery.
Drawings
In order to illustrate the embodiments of the invention or the prior art more clearly, the drawings used in their description are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a high-resolution remote sensing image newly-added building detection method of the invention.
Fig. 2 is a diagram of a second time-phase image segmentation result according to an embodiment of the invention.
FIG. 3 is a morphological building index feature map of an embodiment of the present invention.
FIG. 4 is a suspected building patch map according to an embodiment of the invention.
Fig. 5 is a final building label diagram of an embodiment of the present invention.
Fig. 6 is a network structure diagram of the multi-scale constraint codec network of the present invention.
FIG. 7 is a schematic diagram of the convolutional expansion of the multi-scale constrained codec network of the present invention.
FIG. 8 is a diagram of a semantic information path structure of the multi-scale constrained codec network of the present invention.
FIG. 9 is a schematic diagram of a feature fusion module of the multi-scale constrained codec network according to the present invention.
Fig. 10 is a second phase building object diagram of an embodiment of the present invention.
FIG. 11 is a diagram of pixel-level change detection results according to an embodiment of the invention.
Fig. 12 is a diagram showing the detection result of the newly added building according to the embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and are intended to be illustrative of the invention and should not be construed as limiting the invention.
In the description of the present invention, it is to be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like are used in the orientations and positional relationships indicated in the drawings only for the convenience of description and simplicity of description, and are not intended to indicate or imply that the referenced devices or elements must have a particular orientation, be constructed in a particular orientation, and be operated in a particular manner, and thus, are not to be construed as limiting the present invention. In addition, in the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
Referring to fig. 1, the present invention provides a method for detecting a new building with a high resolution remote sensing image, which includes the following steps:
s1: selecting high-resolution GF2 images of two available time phases, retaining the first time-phase high-resolution GF2 image for comparison, and performing superpixel segmentation on the second time-phase high-resolution GF2 image to obtain a superpixel object;
s2: constructing a building extraction dataset using the second temporal high-resolution GF2 images;
s3: inputting the building extraction data set into a multi-scale constraint coding and decoding network for training to obtain a second time-phase building binary image;
s4: the super pixel object and the second time-phase building binary image are combined to obtain a second time-phase building target object;
s5: performing differential processing on the first time-phase high-resolution GF2 image and the second time-phase high-resolution GF2 image to obtain a pixel-level change detection result;
s6: and carrying out spatial position analysis on the pixel level change detection result and the second time-phase building target object to realize detection of the newly added building.
Constructing a building extraction dataset using the second-phase high-resolution GF2 images, comprising the steps of:
s21, processing the second time-phase high-resolution GF2 image to obtain a morphological building index gray scale image;
s22, calculating the mean building index within each superpixel object, setting a threshold, and segmenting to obtain suspected building patches;
s23, converting the suspected building patches into final building labels through manual modification;
s24, randomly cutting and expanding the high-resolution GF2 image and the final building label to obtain a building extraction data set;
and S25, dividing the building extraction data set into a training set, a verification set and a test set.
And in the process of randomly cropping and expanding the high-resolution GF2 image and the final building label to obtain the building extraction data set, the large image is divided into 512 × 512 tiles, and the data are expanded through horizontal, vertical and diagonal flipping.
The proportion of the training set, the verification set and the test set in the building extraction data set is 6:2:2.
The multi-scale constrained codec network comprises an encoder and a decoder. The encoder is composed of a dual-path architecture and multi-scale branches; the dual-path architecture comprises a local information path and a global information path, the local information path uses dilated convolution to extract features, the global information path uses VGG16 to extract features, and the multi-scale branches acquire multi-scale information through different down-sampling factors.
The decoder adopts a multipath feature fusion module, which assigns different weights to feature maps with different receptive fields.
And performing differentiation processing by using an IRMAD algorithm in the process of performing differentiation processing on the first time-phase high-resolution GF2 image and the second time-phase high-resolution GF2 image to obtain a pixel-level change detection result.
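The IRMAD differencing step is not spelled out in this text. As a hedged sketch, the following implements one unweighted iteration of the scheme — i.e. the classical MAD transform: canonical correlation analysis between the two dates, MAD variates as differences of canonical variates, and a chi-square change score. Full IRMAD would additionally reweight each pixel by its no-change probability and iterate to convergence; the function name and toy data are illustrative only.

```python
import numpy as np

def mad_change(X, Y):
    """One unweighted iteration of the IR-MAD scheme (the classical MAD
    transform) on co-registered images flattened to (bands, pixels)."""
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    n = X.shape[1]
    Sxx, Syy, Sxy = Xc @ Xc.T / n, Yc @ Yc.T / n, Xc @ Yc.T / n
    # canonical vectors from the standard CCA eigenproblem
    M = np.linalg.solve(Sxx, Sxy) @ np.linalg.solve(Syy, Sxy.T)
    rho2, A = np.linalg.eig(M)
    rho2, A = np.real(rho2), np.real(A)
    order = np.argsort(rho2)[::-1]
    rho2, A = rho2[order], A[:, order]
    B = np.linalg.solve(Syy, Sxy.T) @ A
    # normalise canonical variates to unit variance
    A /= np.sqrt(np.diag(A.T @ Sxx @ A))
    B /= np.sqrt(np.diag(B.T @ Syy @ B))
    mad = A.T @ Xc - B.T @ Yc                     # MAD variates
    sigma2 = np.maximum(2.0 * (1.0 - np.sqrt(np.clip(rho2, 0.0, 1.0))), 1e-12)
    return np.sum(mad ** 2 / sigma2[:, None], axis=0)  # chi-square change score

# toy scene: 3 bands, 100 pixels; pixel 0 changes strongly between the dates
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 100))
Y = X + rng.normal(scale=0.05, size=(3, 100))     # mostly unchanged
Y[:, 0] += 5.0                                    # one changed pixel
score = mad_change(X, Y)
```

Pixels with a large chi-square score are the candidate changed pixels that the method then intersects with the building objects.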
And in the process of carrying out spatial position analysis on the pixel-level change detection result and the second time-phase building target object to realize detection of the newly added building, the judgment specifically uses the intersection obtained by superimposing the two in spatial position.
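The spatial overlay can be sketched as follows. The 0.5 overlap ratio below is an assumed decision rule for illustration only, since the text specifies just that the intersection of the two layers is used for the judgment.

```python
import numpy as np
from scipy import ndimage

def new_buildings(change_mask, building_mask, min_overlap=0.5):
    """Overlay analysis: a second-date building object is flagged as newly
    added when the changed-pixel fraction inside it exceeds min_overlap
    (the 0.5 threshold is an assumption, not from the patent)."""
    labels, n = ndimage.label(building_mask)
    out = np.zeros_like(building_mask, dtype=bool)
    for obj in range(1, n + 1):
        obj_mask = labels == obj
        overlap = (change_mask & obj_mask).sum() / obj_mask.sum()
        if overlap >= min_overlap:
            out |= obj_mask
    return out

# toy masks: two building objects; only the left one intersects the change map
buildings = np.zeros((10, 10), bool)
buildings[1:4, 1:4] = True          # object A
buildings[6:9, 6:9] = True          # object B
change = np.zeros((10, 10), bool)
change[0:5, 0:5] = True             # change covers object A only
added = new_buildings(change, buildings)
```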
Referring to fig. 2 to 12, the present invention provides a specific embodiment of a method for detecting a new building with high resolution remote sensing images, which comprises:
the research area is a color-stacking area of Guilin city of the Guangxi Zhuang nationality autonomous area, a typical sub-area of the area is selected for analysis, data come from 2016 and 2018 two-phase GF2 images, the second-phase images, namely the 2018-year GF2 images, are subjected to superpixel segmentation, a data set is constructed, a multi-scale constraint coding and decoding network (MSCNet) extracts a second-phase building target object, the two-phase image images, namely the 2016 and 2018-year GF2 images, are subjected to change detection to obtain a pixel level change detection result, and the pixel level change detection result and the building target object are subjected to spatial position analysis to realize new building detection.
(A) Superpixel segmentation
The Mean Shift algorithm comprises two steps: initial superpixel segmentation and merging of the segmented regions.
The initial superpixel segmentation is realized through a mode-point (mode-seeking) search, as follows: (1) set the kernel-function bandwidths of the coordinate space and the spectral space; (2) compute the Mean-Shift vector using a Gaussian kernel; (3) check whether the magnitude of the vector exceeds the specified threshold, and iterate accordingly to locate the mode points.
The merging of the segmentation regions is to merge the spatially adjacent and spectrally similar regions into the same object after the primary segmentation of the superpixel is completed, so as to implement image segmentation, and the segmentation result is shown in fig. 2.
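The mode-seeking iteration described above can be sketched in a few lines of NumPy. This is a toy illustration on a synthetic two-dimensional feature space, not the segmentation code of the invention; in practice the feature vector concatenates pixel coordinates and spectral values, each with its own kernel bandwidth, and the region-merging pass follows.

```python
import numpy as np

def mean_shift_modes(features, bandwidth=1.0, tol=1e-3, max_iter=100):
    """Shift every feature vector toward its local density mode with a
    Gaussian kernel; iteration stops when the largest shift magnitude
    falls below tol (step (3) in the text)."""
    points = features.astype(float).copy()
    for _ in range(max_iter):
        shifted = np.empty_like(points)
        for i, p in enumerate(points):
            d2 = np.sum((features - p) ** 2, axis=1)
            w = np.exp(-d2 / (2.0 * bandwidth ** 2))  # Gaussian kernel weights
            shifted[i] = (w[:, None] * features).sum(0) / w.sum()
        delta = np.linalg.norm(shifted - points, axis=1).max()
        points = shifted
        if delta < tol:
            break
    return points

# two well-separated clusters in a toy (x, y) feature space
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
modes = mean_shift_modes(feats, bandwidth=0.5)
```

Points converge to the two density modes; pixels sharing a mode form one initial superpixel before the merging step.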
(B) Building a data set
(1) Calculating a morphological building index feature map:
the Morphological Building Index (MBI) is based on the large spectral variation at the edge of the building and the small spectral variation at its interior. The index is constructed in consideration of the shape, direction, brightness, contrast, and other characteristics of the building. The steps for establishing MBI are as follows:
1) Calculating a luminance value
b(x) = \max_{1 \le k \le K} \mathrm{band}_k(x)    (1)
In the formula: K is the number of visible-light bands and \mathrm{band}_k(x) is the value of the k-th band at pixel x. The maximum over the visible bands is taken as the brightness of each pixel because the visible bands carry most of the spectral information of buildings.
2) Morphological white top-hat reconstruction
\mathrm{W\text{-}TH}(d,s) = b - \gamma_b^{re}(d,s)    (2)
In the formula: \gamma_b^{re}(d,s) denotes the morphological opening-by-reconstruction of the brightness image b, and d and s are the direction and scale of the linear structuring element. Because the spectral responses of buildings and roads are similar, and roads generally extend along only one or two directions while buildings span many, linear structuring elements with multiple directions and scales are chosen to separate roads from buildings.
3) Calculating the morphological Profile MP
\mathrm{MP}_{\mathrm{W\text{-}TH}}(d,s) = \mathrm{W\text{-}TH}(d,s)    (3)
4) Calculating a differential morphology Profile DMP
\mathrm{DMP}_{\mathrm{W\text{-}TH}}(d,s) = \left| \mathrm{MP}_{\mathrm{W\text{-}TH}}(d, s+\Delta s) - \mathrm{MP}_{\mathrm{W\text{-}TH}}(d,s) \right|    (4)
In the formula: S_{\min} \le s \le S_{\max}.
5) Calculating a morphological building index MBI
\mathrm{MBI} = \frac{\sum_{d,s} \mathrm{DMP}_{\mathrm{W\text{-}TH}}(d,s)}{D \times S}    (5)
In the formula: S = (S_{\max} - S_{\min}) / \Delta s + 1, and D is the number of directions used when computing the profiles.
The rationale for the MBI is that the differential morphological profiles of buildings have high local contrast, so pixels with large MBI values appear as buildings, as shown in FIG. 3.
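Steps 1) through 5) can be sketched as follows. This is a hedged simplification: it uses SciPy's plain grey-scale opening (`scipy.ndimage.grey_opening`) in place of the opening-by-reconstruction called for by the white top-hat, and toy scale parameters, so the values only approximate the true MBI.

```python
import numpy as np
from scipy import ndimage

def linear_footprint(length, direction):
    """Linear structuring element in one of four directions."""
    if direction == 'h':
        return np.ones((1, length), bool)
    if direction == 'v':
        return np.ones((length, 1), bool)
    eye = np.eye(length, dtype=bool)
    return eye if direction == 'd1' else eye[:, ::-1]

def mbi(bands, s_min=3, s_max=11, ds=2, directions=('h', 'v', 'd1', 'd2')):
    b = bands.max(axis=0)            # (1) brightness: max over visible bands
    scales = list(range(s_min, s_max + 1, ds))
    # (2) white top-hat per direction/scale (plain opening stands in for
    #     opening-by-reconstruction here, as a simplification)
    wth = {(d, s): b - ndimage.grey_opening(b, footprint=linear_footprint(s, d))
           for d in directions for s in scales}
    # (3)-(4) morphological profiles and their differentials
    dmp_sum = sum(np.abs(wth[(d, scales[k + 1])] - wth[(d, scales[k])])
                  for d in directions for k in range(len(scales) - 1))
    S = (s_max - s_min) // ds + 1
    return dmp_sum / (len(directions) * S)   # (5) MBI

# toy image: a bright 10x10 "building" on a dark background, 3 bands
img = np.zeros((3, 40, 40))
img[:, 15:25, 15:25] = 1.0
score = mbi(img)
```

On the toy scene the building interior scores higher than the flat background, matching the intuition that large MBI values indicate buildings.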
(2) The MBI mean within each superpixel is calculated; when the mean exceeds a specified threshold, the superpixel is regarded as a suspected building patch, producing Fig. 4, and the final building labels are generated after manual modification, as in Fig. 5.
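The per-superpixel thresholding just described can be sketched with `np.bincount`; the segment labels, MBI values and threshold below are toy stand-ins, not values from the patent.

```python
import numpy as np

def suspected_buildings(mbi_map, segments, threshold):
    """Mean MBI per superpixel via bincount; superpixels whose mean
    exceeds the threshold become suspected building patches."""
    n = segments.max() + 1
    sums = np.bincount(segments.ravel(), weights=mbi_map.ravel(), minlength=n)
    counts = np.bincount(segments.ravel(), minlength=n)
    means = sums / np.maximum(counts, 1)
    return means[segments] > threshold        # binary suspected-building mask

# toy example: two superpixels, only segment 1 has a high mean MBI
segments = np.array([[0, 0, 1, 1],
                     [0, 0, 1, 1]])
mbi_map = np.array([[0.1, 0.2, 0.8, 0.9],
                    [0.0, 0.1, 0.7, 1.0]])
mask = suspected_buildings(mbi_map, segments, threshold=0.5)
```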
(3) Random cropping and expansion: the invention crops the preprocessed images and label samples randomly, divides the large image into 512 × 512 tiles, expands the data through horizontal, vertical and diagonal flipping, and finally divides them into a training set, a validation set and a test set at a ratio of 6:2:2 for subsequent training and test evaluation.
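A minimal sketch of the cropping, flipping and 6:2:2 split might look as follows. The tile and image sizes are shrunk from the 512 × 512 of the text so the example stays small, and a non-overlapping tiling stands in for the random cropping.

```python
import numpy as np

def tile(image, label, size):
    """Cut an image/label pair into non-overlapping size x size tiles."""
    h, w = image.shape[:2]
    pairs = []
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            pairs.append((image[y:y+size, x:x+size], label[y:y+size, x:x+size]))
    return pairs

def augment(img, lab):
    """Horizontal, vertical and diagonal (transpose) flips."""
    return [(img, lab),
            (np.flip(img, 1), np.flip(lab, 1)),        # horizontal
            (np.flip(img, 0), np.flip(lab, 0)),        # vertical
            (img.swapaxes(0, 1), lab.swapaxes(0, 1))]  # diagonal

def split(samples, ratios=(6, 2, 2), seed=0):
    """Shuffle and split into train/val/test at the 6:2:2 ratio."""
    idx = np.random.default_rng(seed).permutation(len(samples))
    total = sum(ratios)
    a = len(samples) * ratios[0] // total
    b = a + len(samples) * ratios[1] // total
    return ([samples[i] for i in idx[:a]],
            [samples[i] for i in idx[a:b]],
            [samples[i] for i in idx[b:]])

image = np.zeros((32, 32, 3))
label = np.zeros((32, 32), np.uint8)
tiles = tile(image, label, 8)                       # 16 tiles
samples = [p for t in tiles for p in augment(*t)]   # 64 after augmentation
train, val, test = split(samples)
```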
(C) Building extraction with the multi-scale constrained codec network (MSCNet)
The network includes an encoder and a decoder; the encoder is built from a dual-path architecture and multi-scale branches. The dual-path architecture handles local and global information separately, along what are defined here as the local information path and the global information path. Specifically, the local information path adopts dilated convolution to counter the resolution loss of down-sampling and retains more local information. The global information path uses VGG16 to extract features, enlarging the receptive field to obtain global information; multi-scale branches are also designed on this path, acquiring multi-scale information through different down-sampling factors.
A dedicated feature fusion module is introduced in the decoder to fuse the features obtained above, and two strategies constrain the parameter updates on each path. First, a constraint (here, an optimization objective between the prediction of a path and the corresponding ground truth) is added after each path in the up-sampling step. Updating the parameters through multi-path constraints lets multi-resolution labels participate in back-propagation and weight updating, prevents the parameters from being biased toward a single constraint, and further strengthens the feature representation of each path. Specifically, a 1 × 1 convolutional layer followed by a sigmoid is used to obtain a dense prediction from the feature map at each resolution, and the corresponding label is down-sampled from the ground truth by bilinear interpolation; the overall loss is then used so that the network learns to integrate the multi-resolution label maps into the final prediction. Second, the final feature maps of all paths are fused into a feature map of 4 channels, from which the final prediction is obtained through a 1 × 1 convolution and a sigmoid. Under this strategy the final prediction aggregates the multi-path features, the multi-path constraints regularize the parameter updates, and every path plays an active role in back-propagation and model training. A schematic of the network structure is shown in Fig. 6.
(1) An encoder:
In order to preserve the original input image size and encode rich local information while maintaining an adequate receptive field, the algorithm uses dilated convolution in the local information path. The path contains three layers: the first two each comprise a stride-1 convolution followed by batch normalization and ReLU, and the third is a dilated convolution. The output feature map of this path therefore keeps the size of the original image, and because its spatial size is large it encodes rich local information. An ordinary convolution is expressed as
F(x,y) = \sum_{i=0}^{h-1} \sum_{j=0}^{w-1} O(x+i,\, y+j) \cdot H(i,j)    (6)
where O(x,y) is the pixel value of the input image at point (x,y) and H is the w \times h convolution kernel.
The dilated convolution is computed as
F(x,y) = \sum_{i} \sum_{j} O(x+i,\, y+j) \cdot H'(i,j)    (7)
where l is the dilation factor and H' is the dilated kernel obtained by inserting l-1 zeros between adjacent taps of H.
As equations (6) and (7) show, the dilated convolution is equivalent to filling the convolution kernel with zeros, which enlarges the receptive field of the kernel while retaining the original pixel information and the feature-map resolution. If the kernel size is k and the dilation rate is l, the effective size of the dilated kernel is k + (k − 1) × (l − 1). Compared with an ordinary convolution of the same size, the dilated convolution not only enlarges the receptive field but also maintains the same resolution, as illustrated schematically in Fig. 7.
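The zero-filling view of dilation and the effective-size formula k + (k − 1) × (l − 1) can be checked directly in NumPy; the helper names below are illustrative, not from the patent.

```python
import numpy as np

def dilate_kernel(H, l):
    """Insert l-1 zeros between kernel taps: the zero-filled kernel H'."""
    k = H.shape[0]
    Hp = np.zeros((k + (k - 1) * (l - 1),) * 2)
    Hp[::l, ::l] = H
    return Hp

def conv2d_valid(O, H):
    """Plain 'valid' sliding-window correlation, as in equation (6)."""
    kh, kw = H.shape
    oh, ow = O.shape[0] - kh + 1, O.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for y in range(oh):
        for x in range(ow):
            out[y, x] = np.sum(O[y:y+kh, x:x+kw] * H)
    return out

k, l = 3, 2
H = np.arange(1.0, 10.0).reshape(3, 3)
Hp = dilate_kernel(H, l)
effective = k + (k - 1) * (l - 1)   # 3 + 2*1 = 5
# dilated convolution == ordinary convolution with the zero-filled kernel
O = np.random.default_rng(1).random((8, 8))
dilated = conv2d_valid(O, Hp)
```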
The global information path uses VGG16 to extract features, enlarging the receptive field to capture global information; multi-scale branches are designed on this path, acquiring multi-scale information through different down-sampling factors. Specifically, after block 1 the path splits into three parallel, independent branches that rapidly down-sample the feature map to obtain large receptive fields. Each branch encodes features at a particular resolution, making full use of semantic information at different scales. Referring to Fig. 8, which gives a simple representation of the global information path and the multi-scale branch structure, let N_{s,r} denote the feature layer of a stage, where s indexes the branch and r the number of down-sampling steps; the resolution of the feature map of the s-th branch is the original size divided by 2^r. The full-resolution original image is the input to block 1; after the parallel branches, the output feature maps N_{1,2}, N_{2,3} and N_{3,4} are 1/4, 1/8 and 1/16 of the original size, respectively.
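Under the stated resolutions, the branch outputs relate to the input as sketched below; average pooling here is only a stand-in for the strided convolutions an actual implementation would use, and the channel count is arbitrary.

```python
import numpy as np

def avg_pool(x, factor):
    """Average-pool an (H, W, C) feature map by an integer factor."""
    h, w, c = x.shape
    return x[:h - h % factor, :w - w % factor].reshape(
        h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

x = np.random.default_rng(0).random((64, 64, 16))  # stand-in for block-1 output
N12 = avg_pool(x, 4)    # N_{1,2}: 1/4 of the original size
N23 = avg_pool(x, 8)    # N_{2,3}: 1/8
N34 = avg_pool(x, 16)   # N_{3,4}: 1/16
```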
(2) A decoder:
Buildings of different scales call for different receptive-field sizes. For larger buildings, features acquired with a large receptive field are important, while for smaller ones a large receptive field may capture excessive surrounding context and introduce errors. Traditional feature fusion is usually concatenation or addition; such simple schemes ignore the different receptive fields of the feature maps and neglect the specificity among features. In contrast, the multipath feature fusion module adopted by the invention assigns different weights to feature maps with different receptive fields and achieves better fusion, as shown in Fig. 9.
First, two or more input feature maps are concatenated along the channel dimension. Second, the concatenated maps pass through 3 × 3 convolution kernels for a preliminary fusion of feature-map information, and a global pooling operation extracts the overall information of each feature map. Then a 1 × 1 convolution lets the network learn weights from the overall information of each map. Finally, a sigmoid function yields the final weights, which are multiplied with the original feature maps. Through this fusion module, weights are assigned to feature maps with different receptive fields, reflecting the specificity of features at each receptive field and fusing the features better.
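The weighting idea can be sketched in a few lines of numpy. This is a simplified illustration under stated assumptions: the convolution weights are random stand-ins rather than trained values, and the 3 × 3 preliminary convolution is omitted for brevity.

```python
import numpy as np

# Minimal numpy sketch of the multipath fusion idea: channel concatenation ->
# global average pooling -> learned 1x1 weighting -> sigmoid -> reweighting.
rng = np.random.default_rng(0)

def fuse(maps):
    """maps: list of (C, H, W) feature maps with equal H and W."""
    x = np.concatenate(maps, axis=0)            # concat along channel dimension
    c = x.shape[0]
    pooled = x.mean(axis=(1, 2))                # global average pooling -> (C,)
    w_1x1 = rng.standard_normal((c, c)) * 0.1   # stand-in for 1x1 conv weights
    weights = 1.0 / (1.0 + np.exp(-(w_1x1 @ pooled)))  # sigmoid -> (0, 1)
    return x * weights[:, None, None]           # reweight each channel

fused = fuse([rng.random((8, 32, 32)), rng.random((8, 32, 32))])
print(fused.shape)  # (16, 32, 32)
```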
Since the target output is a binary classification of building and non-building pixels, the sigmoid function is chosen to generate the prediction for each layer:
y_{i,j} = σ(w · x_{i,j} + b)

σ(z) = 1 / (1 + e^{−z})

where w ∈ R^c and b ∈ R^1 represent the weight and the bias, respectively. The prediction y_{i,j} is limited to [0, 1].
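The per-pixel sigmoid prediction can be sketched with numpy; the feature map, weight vector and bias below are random stand-ins for illustration, not values from the patent.

```python
import numpy as np

# Per-pixel prediction y_{i,j} = sigmoid(w . x_{i,j} + b) over a (c, h, w)
# feature map, producing a probability in (0, 1) at every pixel.
rng = np.random.default_rng(1)
c, h, w_ = 4, 8, 8
features = rng.standard_normal((c, h, w_))
weight = rng.standard_normal(c)
bias = 0.0

logits = np.tensordot(weight, features, axes=1) + bias   # shape (h, w)
pred = 1.0 / (1.0 + np.exp(-logits))                     # each y_{i,j} in (0, 1)
print(pred.shape)  # (8, 8)
```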
For better convergence during training iterations, binary cross-entropy is chosen, rather than a simple mean squared error (MSE), to compute the k-th constraint C_k between each prediction and the corresponding ground truth. The formula is:
C_k = −(1 / (h_k · w_k)) Σ_{i=1}^{h_k} Σ_{j=1}^{w_k} [ g^k_{i,j} log y^k_{i,j} + (1 − g^k_{i,j}) log(1 − y^k_{i,j}) ]

where h_k and w_k are the height and width of the k-th prediction y^k and ground truth g^k. g^k_{i,j} has a value of 1 if the observation belongs to class 1, and 0 otherwise; y^k_{i,j} is the predicted probability that the pixel belongs to class 1.
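The per-layer constraint above is a standard pixel-averaged binary cross-entropy; a minimal numpy sketch with toy prediction and ground-truth maps:

```python
import numpy as np

# Binary cross-entropy constraint C_k, averaged over all h_k x w_k pixels.
def constraint(pred, gt, eps=1e-12):
    pred = np.clip(pred, eps, 1.0 - eps)   # avoid log(0)
    return -np.mean(gt * np.log(pred) + (1.0 - gt) * np.log(1.0 - pred))

gt = np.array([[1.0, 0.0], [0.0, 1.0]])
pred = np.full((2, 2), 0.5)                # maximally uncertain prediction
print(round(constraint(pred, gt), 4))      # 0.6931, i.e. -log(0.5)
```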
The MSCNet model generates a pyramid of feature maps layer by layer through sequential convolution and upsampling blocks and skip connections.
In strategy one, for each feature layer in the feature pyramid, a single 1 × 1 convolution kernel followed by sigmoid activation is applied to generate a prediction for that layer. The constraint of each layer can then be computed between that prediction and the associated ground truth. Ordered by distance from the final convolution layer, these constraints are denoted C_main, C′_main, C_sub1, C_sub2 and C_sub3. Thus, the final loss for MSCNet strategy one can be expressed as:
Loss = α·C_main + γ·C_sub1 + λ·C_sub2 + σ·C_sub3    (11)
wherein the sum of α, γ, λ and σ is set to 1.0.
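The strategy-one loss is a convex combination of the per-layer constraints; the coefficient values in this sketch are illustrative assumptions, not values given in the patent.

```python
# Deep-supervision loss of strategy one: a weighted sum of the per-layer
# constraints, with the four coefficients constrained to sum to 1.0.
def strategy_one_loss(c_main, c_sub1, c_sub2, c_sub3,
                      alpha=0.4, gamma=0.3, lam=0.2, sigma=0.1):
    assert abs(alpha + gamma + lam + sigma - 1.0) < 1e-9  # weights sum to 1.0
    return alpha * c_main + gamma * c_sub1 + lam * c_sub2 + sigma * c_sub3

print(strategy_one_loss(1.0, 1.0, 1.0, 1.0))  # 1.0 when all constraints are equal
```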
For strategy two, the feature layers in the feature pyramid are fused into a feature map containing 4 channels, and the final prediction map is obtained through a 1 × 1 convolution and a sigmoid function. Thus, the final loss for MSCNet strategy two can be expressed as:
Loss′ = C′_main    (12)
All layers are trained by mini-batch stochastic gradient descent (SGD) and back-propagation (BP) to minimize the final loss, so the MSCNet model learns to map the input multichannel remote sensing images to equal-size binary segmentation maps. Finally, the predicted binary image is combined with the superpixel objects to obtain the building target objects, as shown in FIG. 10.
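One simple way to combine a binary prediction with superpixel objects is a per-superpixel majority vote; the exact combination rule is not specified here, so this is a hedged sketch of one plausible scheme, with toy arrays.

```python
import numpy as np

# Hypothetical combination rule: a superpixel is kept as a building object
# when the majority of its pixels are predicted as building.
def superpixel_vote(binary_map, labels):
    """binary_map: (H, W) 0/1 prediction; labels: (H, W) superpixel ids."""
    out = np.zeros_like(binary_map)
    for sp in np.unique(labels):
        mask = labels == sp
        if binary_map[mask].mean() > 0.5:
            out[mask] = 1
    return out

labels = np.array([[0, 0, 1, 1], [0, 0, 1, 1]])
pred = np.array([[1, 1, 0, 1], [1, 0, 0, 0]])
print(superpixel_vote(pred, labels))  # superpixel 0 -> building, 1 -> background
```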
(D) IRMAD-based pixel level change detection
A difference image of the two time-phase remote sensing images is obtained with the IRMAD algorithm; a suitable threshold is then selected, and the pixels of the difference image are divided into changed and unchanged pixels. The basic principle of IRMAD-based pixel-level change detection is as follows: a random variable T related to the MAD components is introduced, and the pixels are iteratively re-weighted through a chi-square distribution probability function, so that unchanged pixels obtain larger weights during the iterations; the next iteration then runs with the new weights until convergence, producing a difference map in which brighter areas are more likely to have changed; finally, a threshold is applied to judge whether each pixel has changed, generating the pixel-level change detection result, please refer to FIG. 11.
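The re-weighting idea can be illustrated with a heavily simplified toy: the sum of squared, variance-normalised band differences stands in for the true MAD variate T, and for two bands the chi-square tail probability has the closed form P(χ²₂ > T) = exp(−T/2). This is a sketch of the weighting principle only, not the full IRMAD canonical-correlation computation.

```python
import numpy as np

# Toy IRMAD-style re-weighting for a simulated 2-band image pair: unchanged
# pixels (small T) keep weights near 1, changed pixels are driven toward 0.
rng = np.random.default_rng(2)
img1 = rng.normal(size=(2, 16, 16))
img2 = img1 + rng.normal(scale=0.1, size=(2, 16, 16))
img2[:, :4, :4] += 3.0                        # simulate a changed region

diff = img1 - img2
weights = np.ones((16, 16))
for _ in range(5):                            # a few weighting iterations
    var = np.average(diff.reshape(2, -1) ** 2, axis=1,
                     weights=weights.ravel())  # weighted band variances
    T = (diff ** 2 / var[:, None, None]).sum(axis=0)
    weights = np.exp(-T / 2.0)                # chi-square(2) tail probability

print(weights[:4, :4].mean() < weights[8:, 8:].mean())  # changed block downweighted
```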
(E) The specific discrimination process of the spatial position superposition method is as follows:
(1) Judge the intersection between each building object extracted from the later-phase image and the changed pixels, as follows:

n_i = | B_i^{t2} ∩ C |

where B_i^{t2} denotes the i-th building object of the later-phase image, C is the pixel-level change detection result, and n_i is the number of pixels contained in the intersection of the object and the change detection result.

(2) According to the size of n_i, a rule is set to judge newly added buildings: if n_i exceeds half the area of the building object B_i^{t2}, the object is a newly added building; otherwise, the building object is unchanged. FIG. 12 shows the detection result of the newly added buildings.
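The spatial-overlay rule above reduces to counting, per object, the pixels that fall inside the changed-pixel mask; a minimal sketch with toy label and change arrays:

```python
import numpy as np

# An object is flagged as a newly added building when more than half of its
# pixels lie inside the changed-pixel mask.
def new_buildings(objects, change_mask):
    """objects: (H, W) int labels (0 = background); change_mask: (H, W) 0/1."""
    added = []
    for obj in np.unique(objects):
        if obj == 0:
            continue
        mask = objects == obj
        n_i = int(change_mask[mask].sum())    # intersection pixel count
        if n_i > mask.sum() / 2:              # more than half of the object area
            added.append(int(obj))
    return added

objects = np.array([[1, 1, 0], [1, 1, 2], [0, 2, 2]])
change = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 1]])
print(new_buildings(objects, change))  # [1]: object 1 has 3 of 4 pixels changed
```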
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.

Claims (6)

1. A method for detecting a newly added building by using a high-resolution remote sensing image is characterized by comprising the following steps:
selecting available high-resolution GF2 images of two time phases, retaining the first time-phase high-resolution GF2 image for comparison, and performing superpixel segmentation on the second time-phase high-resolution GF2 image to obtain superpixel objects;
constructing a building extraction data set by using the second time-phase high-resolution GF2 image;
inputting the building extraction data set into a multi-scale constraint coding and decoding network for training to obtain a second time-phase building binary image;
the multi-scale constraint coding and decoding network comprises a coder and a decoder, wherein the coder consists of a dual-path architecture and multi-scale branches, the dual-path architecture comprises a local information path and a global information path, the local information path uses expansion convolution to extract features, the global information path adopts VGG16 to extract features, and the multi-scale branches acquire multi-scale information through different down-sampling multiples;
the decoder adopts a multipath feature fusion module which allocates different weights to feature maps of different perception fields;
the super pixel object is combined with the second time-phase building binary image to obtain a second time-phase building target object;
performing differential processing on the first time-phase high-resolution GF2 image and the second time-phase high-resolution GF2 image to obtain a pixel-level change detection result;
and carrying out spatial position analysis on the pixel level change detection result and the second time-phase building target object to realize detection of the newly added building.
2. The method for detecting the newly added building by using the high-resolution remote sensing image as claimed in claim 1, wherein the building extraction data set is constructed by using the second-time-phase high-resolution GF2 image, and the method comprises the following steps:
processing the second time-phase high-resolution GF2 image to obtain a morphological building index gray-scale map;
calculating the building index mean value in the super-pixel object, setting a threshold value and obtaining a suspected building pattern spot after segmentation;
the suspected building pattern spots are converted into final building labels through manual modification;
randomly cutting and expanding the high-resolution GF2 image and the final building label to obtain a building extraction data set;
the building extraction data set is divided into a training set, a verification set and a test set.
3. The method for detecting the newly added building by using the high-resolution remote sensing image according to claim 2, wherein in the process of randomly cutting and expanding the high-resolution GF2 image and the final building label to obtain the building extraction data set, the large-size image is cut into tiles of the specified size 512 × 512, and data expansion is carried out through horizontal flipping, vertical flipping and diagonal flipping.
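The tiling-plus-flip expansion in claim 3 can be sketched as follows; the helper name and non-overlapping crop layout are illustrative assumptions.

```python
import numpy as np

# Cut an image into 512x512 tiles, then expand each tile with horizontal,
# vertical and diagonal (both-axis) flips, quadrupling the data set.
def tile_and_augment(image, tile=512):
    tiles = []
    h, w = image.shape[:2]
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            t = image[y:y + tile, x:x + tile]
            tiles += [t, t[:, ::-1], t[::-1, :], t[::-1, ::-1]]
    return tiles

img = np.arange(1024 * 1024).reshape(1024, 1024)
tiles = tile_and_augment(img)
print(len(tiles), tiles[0].shape)  # 16 tiles: 4 crops x 4 orientations, each 512x512
```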
4. The method for detecting the newly added building in the high-resolution remote sensing image as claimed in claim 3, wherein the proportion of the training set, the verification set and the test set in the building extraction data set is 6:2:2.
5. The method for detecting the newly added building by using the high-resolution remote sensing image according to claim 4, wherein an IRMAD algorithm is used for the differencing in the process of obtaining a pixel-level change detection result by differencing the first time-phase high-resolution GF2 image and the second time-phase high-resolution GF2 image.
6. The method for detecting the newly added building by using the high-resolution remote sensing image as set forth in claim 5, wherein in the process of analyzing the spatial position of the pixel-level change detection result and the second time-phase building target object to detect the newly added building, the judgment is made from the intersection obtained by superimposing the pixel-level change detection result on the spatial position of the second time-phase building target object.
CN202110389794.5A 2021-04-12 2021-04-12 High-resolution remote sensing image newly-added building detection method Active CN113191213B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110389794.5A CN113191213B (en) 2021-04-12 2021-04-12 High-resolution remote sensing image newly-added building detection method

Publications (2)

Publication Number Publication Date
CN113191213A CN113191213A (en) 2021-07-30
CN113191213B true CN113191213B (en) 2023-01-03

Family

ID=76975505



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant