CN113191213A - High-resolution remote sensing image newly-added building detection method - Google Patents
- Publication number: CN113191213A (application CN202110389794.5A)
- Authority: CN (China)
- Prior art keywords: building, image, resolution, phase, time
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G06V20/176 — Terrestrial scenes; urban or other man-made structures
- G06F18/24 — Pattern recognition; classification techniques
- G06N3/045 — Neural networks; combinations of networks
- G06N3/048 — Neural networks; activation functions
- G06N3/084 — Learning methods; backpropagation, e.g. using gradient descent
- G06V10/267 — Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
- G06V10/44 — Local feature extraction by analysis of parts of the pattern, e.g. edges, contours, corners; connectivity analysis
- G06V10/462 — Salient features, e.g. scale invariant feature transforms [SIFT]
Abstract
The invention discloses a method for detecting newly added buildings in high-resolution remote sensing images. First, superpixel segmentation is performed on the acquired second-phase GF2 image to obtain superpixel objects, and a building data set is constructed from the same image. The building extraction data set is then input into a multi-scale constrained encoding-decoding network for training to obtain a second-phase building binary map. The network adopts a dual-path architecture to acquire global and local information separately; combining the two allows buildings to be better distinguished from complex backgrounds and their details refined. The superpixel objects are combined with the building binary map to obtain the second-phase building target objects. Finally, the IRMAD algorithm yields a pixel-level change detection result, and a spatial position overlay analysis of the building target objects and the pixel-level change detection result realizes the detection of newly added buildings.
Description
Technical Field
The invention relates to the technical field of remote sensing image processing, and in particular to a method for detecting newly added buildings in high-resolution remote sensing images.
Background
Buildings are important man-made ground-object targets in basic geographic databases, and automatic building change detection has long been a research hotspot and difficulty in artificial intelligence, photogrammetry, remote sensing and related fields. Fast and accurate building change detection is of great significance for urban planning, post-earthquake disaster relief and similar applications.
There are three main strategies for building change detection: first, extract buildings from remote sensing images of different periods and then detect changes from the extraction results; second, detect building changes directly from building features; third, perform change detection first and then judge which parts of the changed area belong to buildings. Change detection performed directly on top of building extraction depends too heavily on the extraction accuracy, while detecting changes directly from building features hinges on the effectiveness of those features, whose stability and applicability still need improvement. The invention therefore adopts a strategy of extracting buildings and detecting changes in parallel: building extraction and change detection of all ground objects are completed independently, and the two results are then combined to obtain the building change detection result.
Traditional building extraction from remote sensing images expresses "what is a building" by empirically designing appropriate features and building a corresponding feature set for automatic identification and extraction. Common feature indicators include spectrum, length, edges, shape, texture and shading, but these features vary significantly with season, illumination, atmospheric conditions, sensor quality, scale, building style and environment. Empirically designed features can only handle specific data and cannot truly achieve automation, so deep learning is now applied to remote sensing building extraction, replacing hand-crafted features with automatically learned multi-level feature representations. On the other hand, deep-learning-based methods depend strongly on large, high-accuracy sample databases; with only open-source data sets, theories and methods can be compared quantitatively but cannot be applied in practice.
Disclosure of Invention
The invention aims to provide a method for detecting newly added buildings in high-resolution remote sensing images, solving two technical problems of the prior art: low building feature extraction accuracy, and the lack of a dedicated data set when deep learning is used to extract buildings.
In order to achieve the purpose, the invention adopts a method for detecting the newly added building by using the high-resolution remote sensing image, which comprises the following steps:
selecting high-resolution GF2 images of two available phases, retaining the first-phase high-resolution GF2 image for later comparison, and performing superpixel segmentation on the second-phase high-resolution GF2 image to obtain superpixel objects;
constructing a building extraction data set using the second temporal high-resolution GF2 image;
inputting the building extraction data set into a multi-scale constraint coding and decoding network for training to obtain a second time-phase building binary image;
the super pixel object is combined with the second time-phase building binary image to obtain a second time-phase building target object;
performing differential processing on the first time-phase high-resolution GF2 image and the second time-phase high-resolution GF2 image to obtain a pixel-level change detection result;
and carrying out spatial position analysis on the pixel level change detection result and the second time-phase building target object to realize detection of the newly added building.
Wherein the building extraction data set is constructed by using the second-phase high-resolution GF2 image, and the method comprises the following steps:
processing the second time-phase high-resolution GF2 image to obtain a morphological building index gray map;
calculating the building index mean value in the super-pixel object, setting a threshold value and obtaining a suspected building pattern spot after segmentation;
the suspected building pattern spots are converted into final building labels through manual modification;
randomly cutting and expanding the high-resolution GF2 image and the final building label to obtain a building extraction data set;
the building extraction data set is divided into a training set, a validation set and a test set.
During the random cutting and expansion of the high-resolution GF2 image and the final building labels to obtain the building extraction data set, the large-size image is divided into patches of the specified 512 × 512 size, and the data are expanded by horizontal flipping, vertical flipping and diagonal flipping.
Wherein the ratio of the training set, the validation set and the test set in the building extraction data set is 6:2:2.
The multi-scale constrained encoding-decoding network comprises an encoder and a decoder. The encoder is composed of a dual-path architecture and multi-scale branches: the dual-path architecture comprises a local information path and a global information path, the local information path uses dilated convolution to extract features, the global information path uses VGG16 to extract features, and the multi-scale branches acquire multi-scale information through different down-sampling factors.
The decoder adopts a multipath feature fusion module, which assigns different weights to feature maps with different receptive fields.
And performing differentiation processing by using an IRMAD algorithm in the process of performing differentiation processing on the first time-phase high-resolution GF2 image and the second time-phase high-resolution GF2 image to obtain a pixel-level change detection result.
And in the process of analyzing the spatial positions of the pixel-level change detection result and the second-phase building target object to detect the newly added buildings, the judgment is made specifically on the intersection obtained by spatially overlaying the pixel-level change detection result and the second-phase building target object.
According to the method for detecting newly added buildings in high-resolution remote sensing images, superpixel segmentation is first performed on the acquired second-phase GF2 image to obtain superpixel objects, and a building data set is constructed from the same image. The building extraction data set is input into a multi-scale constrained encoding-decoding network for training to obtain a second-phase building binary map; the network adopts a dual-path architecture to acquire global and local information separately, and combining the two allows buildings to be better distinguished from complex backgrounds and their details refined. The superpixel objects are combined with the building binary map to obtain the second-phase building target objects. The IRMAD algorithm then yields a pixel-level change detection result, and a spatial position overlay analysis of the building target objects and the pixel-level change detection result realizes the detection of newly added buildings. This solves the prior-art problems of low building feature extraction accuracy and the lack of a dedicated data set when deep learning is used to extract buildings for newly-added-building detection in high-resolution remote sensing images.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flow chart of a high-resolution remote sensing image newly-added building detection method according to the present invention.
Fig. 2 is a diagram of a second time-phase image segmentation result according to an embodiment of the invention.
Fig. 3 is a morphological building index feature map of an embodiment of the present invention.
FIG. 4 is a suspected speckle pattern of a building, according to an embodiment of the invention.
Fig. 5 is a final building label diagram of an embodiment of the present invention.
Fig. 6 is a network structure diagram of the multi-scale constraint codec network of the present invention.
FIG. 7 is a schematic diagram of the convolutional expansion of the multi-scale constrained codec network of the present invention.
FIG. 8 is a diagram of a semantic information path structure of the multi-scale constrained codec network of the present invention.
FIG. 9 is a schematic diagram of a feature fusion module of the multi-scale constrained codec network according to the present invention.
Fig. 10 is a second phase building object diagram of an embodiment of the present invention.
FIG. 11 is a diagram of pixel level change detection results in accordance with an embodiment of the present invention.
Fig. 12 is a diagram showing the detection result of the newly added building according to the embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and are intended to be illustrative of the invention and should not be construed as limiting the invention.
In the description of the present invention, it is to be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like, indicate orientations or positional relationships that are based on the orientations or positional relationships illustrated in the drawings, are used for convenience in describing the invention and to simplify the description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed in a particular orientation, and be operated, and thus are not to be construed as limiting the invention. Further, in the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
Referring to fig. 1, the present invention provides a method for detecting a new building with a high resolution remote sensing image, which includes the following steps:
S1: selecting high-resolution GF2 images of two available phases, retaining the first-phase high-resolution GF2 image for later comparison, and performing superpixel segmentation on the second-phase high-resolution GF2 image to obtain superpixel objects;
s2: constructing a building extraction data set using the second temporal high-resolution GF2 image;
s3: inputting the building extraction data set into a multi-scale constraint coding and decoding network for training to obtain a second time-phase building binary image;
s4: the super pixel object and the second time-phase building binary image are combined to obtain a second time-phase building target object;
s5: performing differential processing on the first time-phase high-resolution GF2 image and the second time-phase high-resolution GF2 image to obtain a pixel-level change detection result;
s6: and carrying out spatial position analysis on the pixel level change detection result and the second time-phase building target object to realize detection of the newly added building.
Constructing a building extraction data set using the second-phase high-resolution GF2 image, comprising the steps of:
s21, processing the second time-phase high-resolution GF2 image to obtain a morphological building index gray scale map;
s22, calculating the building index mean value in the super pixel object, setting a threshold value and obtaining a suspected building pattern spot after segmentation;
s23, converting the suspected building pattern spots into final building labels through manual modification;
s24, randomly cutting and expanding the high-resolution GF2 image and the final building label to obtain a building extraction data set;
and S25, dividing the building extraction data set into a training set, a verification set and a test set.
In the process of randomly cutting and expanding the high-resolution GF2 image and the final building labels to obtain the building extraction data set, the large-size image is divided into patches of the specified 512 × 512 size, and the data are expanded by horizontal flipping, vertical flipping and diagonal flipping.
The ratio of the training set, the validation set and the test set in the building extraction data set is 6:2:2.
The multi-scale constrained encoding-decoding network comprises an encoder and a decoder. The encoder is composed of a dual-path architecture and multi-scale branches: the dual-path architecture comprises a local information path and a global information path, the local information path uses dilated convolution to extract features, the global information path uses VGG16 to extract features, and the multi-scale branches acquire multi-scale information through different down-sampling factors.
The decoder adopts a multipath feature fusion module which assigns different weights to feature maps with different receptive fields.
And performing differentiation processing by using an IRMAD algorithm in the process of performing differentiation processing on the first time-phase high-resolution GF2 image and the second time-phase high-resolution GF2 image to obtain a pixel-level change detection result.
And in the process of analyzing the spatial positions of the pixel-level change detection result and the second-phase building target object to detect the newly added buildings, the judgment is made specifically on the intersection obtained by spatially overlaying the pixel-level change detection result and the second-phase building target object.
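As a minimal illustration of the overlay step above (a sketch, assuming the change result and the building objects are binary NumPy masks on the same pixel grid):

```python
import numpy as np

def new_buildings(change_mask, building_mask):
    """Newly added buildings: pixels flagged as changed by the pixel-level
    change detection AND labelled as building in the second-phase object map
    (the spatial-position intersection)."""
    return np.logical_and(change_mask, building_mask)
```

Object-level decisions (e.g. keeping a building object only if enough of its pixels fall inside the change mask) would apply the same intersection logic per object.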
Referring to fig. 2 to 12, the present invention provides an embodiment of a method for detecting a new building with a high-resolution remote sensing image, which includes:
the research area is a color-overlapping area of Guangxi Zhuang autonomous area Guilin city, a typical sub-area of the area is selected for analysis, data come from 2016 and 2018 two-phase GF2 images, the second-phase image, namely the 2018 year GF2 image, is subjected to super-pixel segmentation, a data set is constructed, a multi-scale constraint coding and decoding network (MSCNet) is used for extracting a second-phase building target object, the two-phase image, namely the 2016 and 2018 year GF2 image, is subjected to change detection to obtain a pixel-level change detection result, and the pixel-level change detection result and the building target object are subjected to spatial position analysis to realize detection of a newly added building.
(A) Superpixel segmentation
The Mean Shift algorithm includes two steps, superpixel primary segmentation and merging of segmented regions.
The primary superpixel segmentation realizes an initial segmentation by means of a mode-point search technique. The specific process is: (1) set the kernel bandwidths of the coordinate space and the spectral space; (2) compute the Mean-Shift vector with a Gaussian kernel; (3) judge whether the modulus of the vector is larger than a specified threshold, and iterate the computation until the mode-point positions are determined.
The merging of the segmentation regions is to merge the spatially adjacent and spectrally similar regions into the same object after the primary segmentation of the superpixel is completed, so as to implement image segmentation, and the segmentation result is shown in fig. 2.
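The mode-point search of steps (1)–(3) can be sketched as follows; this is a simplified NumPy illustration (the function name and the flat feature layout are assumptions, a single bandwidth stands in for the separate coordinate/spectral bandwidths, and region merging is omitted):

```python
import numpy as np

def mean_shift_modes(features, bandwidth, max_iter=50, tol=1e-3):
    """Shift every feature vector toward its local density mode with a
    Gaussian kernel: the inner loop computes the Mean-Shift update
    (step 2) and iteration stops when the shift modulus drops below a
    threshold (step 3)."""
    features = np.asarray(features, dtype=float)
    shifted = features.copy()
    for _ in range(max_iter):
        moved = 0.0
        for i, p in enumerate(shifted):
            d2 = np.sum((features - p) ** 2, axis=1)    # squared distances to all points
            w = np.exp(-d2 / (2.0 * bandwidth ** 2))    # Gaussian kernel weights
            new_p = (w[:, None] * features).sum(axis=0) / w.sum()
            moved = max(moved, float(np.linalg.norm(new_p - p)))
            shifted[i] = new_p
        if moved < tol:                                 # modulus below threshold: converged
            break
    return shifted  # points sharing a mode belong to the same initial segment
```

In practice the feature vector of each pixel would concatenate its (row, col) coordinates and spectral values, each space with its own kernel bandwidth, and spatially adjacent, spectrally similar segments are then merged as described above.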
(B) Building a data set
(1) Calculating a morphological building index feature map:
the Morphological Building Index (MBI) is based on the large spectral variation at the edge of the building and the small spectral variation at its interior. The index is constructed in consideration of the characteristics of the shape, direction, brightness, contrast, and the like of the building. The steps for establishing MBI are as follows:
1) Calculate the brightness value:

b(x) = max_{1 ≤ k ≤ K} band_k(x)   (1)

In the formula, K is the number of visible-light bands and band_k(x) is the brightness value of the k-th band at pixel x. The maximum over the visible bands is selected at each pixel because the visible bands have a large influence on the spectral information of buildings.
2) Morphological white top-hat reconstruction:

W-TH(d, s) = b − γ_b^re(d, s)   (2)

In the formula, γ_b^re(d, s) is the opening-by-reconstruction of the brightness image b, and d and s represent the direction and scale of the linear structuring element, respectively. Because the spectral information of buildings and roads is relatively similar, while roads generally extend along one or two directions and buildings along many more, linear structuring elements with multiple directions and scales are selected to distinguish roads from buildings well.
3) Calculate the morphological profile MP:

MP_{W-TH}(d, s) = W-TH(d, s)   (3)

4) Calculate the differential morphological profile DMP:

DMP_{W-TH}(d, s) = |MP_{W-TH}(d, s + Δs) − MP_{W-TH}(d, s)|   (4)

In the formula: S_min ≤ s ≤ S_max.
5) Calculate the morphological building index MBI:

MBI = Σ_{d,s} DMP_{W-TH}(d, s) / (D × S)   (5)

In the formula, S = ((S_max − S_min)/Δs) + 1 and D is the number of directions used in calculating the building profiles.
MBI is built on the fact that the differential morphological profiles have high local contrast at buildings, so pixels with a larger MBI feature value are more likely to be buildings, as shown in Fig. 3.
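A compact NumPy/SciPy sketch of steps 1)–5), with the simplifying assumption that the opening-by-reconstruction is approximated by a plain grey-scale opening; `linear_footprint` and the default scale range are hypothetical choices:

```python
import numpy as np
from scipy.ndimage import grey_opening

def linear_footprint(length, direction):
    """Line-shaped structuring element (hypothetical helper).
    direction: 0 horizontal, 1 vertical, 2 diagonal, 3 anti-diagonal."""
    fp = np.zeros((length, length), dtype=bool)
    mid = length // 2
    if direction == 0:
        fp[mid, :] = True
    elif direction == 1:
        fp[:, mid] = True
    elif direction == 2:
        np.fill_diagonal(fp, True)
    else:
        np.fill_diagonal(np.fliplr(fp), True)
    return fp

def mbi(brightness, s_min=3, s_max=11, delta_s=2, n_dir=4):
    """White top-hat profiles MP(d, s), their differences DMP(d, s), and
    the average over all directions and scales.  A plain grey opening
    stands in for opening-by-reconstruction."""
    scales = list(range(s_min, s_max + 1, delta_s))
    dmps = []
    for d in range(n_dir):
        # W-TH(d, s) = brightness - opening with a linear element
        mp = [brightness - grey_opening(brightness, footprint=linear_footprint(s, d))
              for s in scales]
        dmps += [np.abs(mp[i + 1] - mp[i]) for i in range(len(mp) - 1)]
    return np.mean(dmps, axis=0)
```

For a multispectral input, step 1) would first take the per-pixel maximum over the visible bands to form `brightness`.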
(2) Compute the MBI mean within each superpixel; when the mean exceeds a specified threshold, the superpixel is regarded as a suspected building spot, yielding Fig. 4, and the final building labels are generated after manual modification, as in Fig. 5.
(3) Random cropping and expansion: the invention crops the preprocessed image and the label samples by random cropping, divides the large-size image into patches of the specified 512 × 512 size, and expands the data by horizontal flipping, vertical flipping and diagonal flipping. Finally the data are split at a ratio of 6:2:2 into a training set, a validation set and a test set for later training and test evaluation.
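The cropping, flip augmentation and 6:2:2 split can be sketched as below; `make_dataset` is a hypothetical helper, and non-overlapping tiles are used in place of truly random crops for brevity:

```python
import numpy as np

def make_dataset(image, label, tile=512, ratios=(0.6, 0.2, 0.2), seed=0):
    """Cut an image/label pair into tile-sized patches, expand each patch by
    horizontal, vertical and diagonal flips, shuffle, and split 6:2:2."""
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    patches = []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            img_p = image[y:y + tile, x:x + tile]
            lab_p = label[y:y + tile, x:x + tile]
            for aug in (lambda a: a,
                        lambda a: a[:, ::-1],      # horizontal flip
                        lambda a: a[::-1, :],      # vertical flip
                        lambda a: a[::-1, ::-1]):  # diagonal flip
                patches.append((aug(img_p).copy(), aug(lab_p).copy()))
    order = rng.permutation(len(patches))          # shuffle before splitting
    patches = [patches[i] for i in order]
    n_train = int(ratios[0] * len(patches))
    n_val = int(ratios[1] * len(patches))
    return (patches[:n_train],
            patches[n_train:n_train + n_val],
            patches[n_train + n_val:])
```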
(C) Multi-scale constrained codec network (MSCNet) extraction building
The network includes an encoder and a decoder, the encoder being built from a dual-path architecture and multi-scale branches. The dual-path architecture handles local and global information separately along two paths, defined as the local information path and the global information path. Specifically, the local information path adopts dilated convolution to counter the loss of resolution during down-sampling and retain more local information. The global information path uses VGG16 to extract features, enlarging the receptive field to capture global information; multi-scale branches are also designed on this path, acquiring multi-scale information through different down-sampling factors.
A dedicated feature fusion module is introduced in the decoding part to fuse the obtained features, and two strategies are adopted to constrain the parameter updates on each path. The first adds a constraint after each path in the up-sampling step (here a constraint is an optimization objective between the prediction of a path and the corresponding ground truth). Updating the parameters through multi-path constraints allows multi-resolution labels to be used in back-propagation and weight updating, prevents the parameters from being biased toward a single constraint, and further strengthens the feature representation of each path. Specifically, a 1 × 1 convolutional layer with a sigmoid produces dense predictions from the feature map at each resolution, and the corresponding labels are down-sampled from the ground truth by bilinear interpolation; the overall loss then lets the network learn to integrate the multi-resolution segmentation label maps into the final prediction. The second strategy fuses the final feature maps of the paths into a feature map with 4 channels and obtains the final prediction map through a 1 × 1 convolution and a sigmoid function; with this strategy the final prediction map aggregates the multi-path features and constrains the multi-path parameter updates, so every path plays an active role in back-propagation and model training. A schematic diagram of the network structure is shown in fig. 6.
(1) An encoder:
In order to keep the original input image size and encode rich local information while maintaining a sufficient receptive field, the algorithm of the invention uses dilated convolution in the local information path. The path contains three layers: the first two each consist of a stride-1 convolution followed by batch normalization and ReLU, and the third is a dilated convolution. The output feature map of this path therefore keeps the size of the original image, and its large spatial size lets it encode rich local information. An ordinary convolution can be written as

f(x, y) = Σ_{i=0}^{w−1} Σ_{j=0}^{h−1} O(x + i, y + j) · H(i, j)   (6)

where O(x, y) is the pixel value of the input image at point (x, y) and H is the convolution kernel it is multiplied by, of size w × h.

The dilated convolution is computed as

f(x, y) = Σ_{i=0}^{w−1} Σ_{j=0}^{h−1} O(x + l·i, y + l·j) · H′(i, j)   (7)

where l is the dilation rate and H′ is the dilated convolution kernel.

As equations (6) and (7) show, the dilated convolution is equivalent to filling the kernel with zeros; this enlarges the receptive field of the kernel while keeping the original pixel information, so the feature-map resolution is preserved. If the kernel size is k and the dilation rate is l, the actual effective size of the dilated kernel is k + (k − 1) × (l − 1). Compared with an ordinary convolution of the same nominal size, the dilated convolution enlarges the receptive field while maintaining the same resolution, as illustrated schematically in fig. 7.
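The effective-size formula and the "fill the kernel with zeros" view can be checked numerically; a small sketch, with hypothetical helper names:

```python
import numpy as np

def effective_kernel_size(k, l):
    """Effective extent of a k-by-k kernel at dilation rate l: k + (k-1)(l-1)."""
    return k + (k - 1) * (l - 1)

def dilate_kernel(H, l):
    """The zero-filling view of dilation: insert l-1 zeros between the taps
    of kernel H, producing an effective_kernel_size-sized kernel."""
    k = H.shape[0]
    K = np.zeros((effective_kernel_size(k, l),) * 2, dtype=H.dtype)
    K[::l, ::l] = H
    return K
```

Correlating an image with `dilate_kernel(H, l)` gives the same result as the dilated convolution of `H` at rate `l`, which is why the receptive field grows while only k × k parameters are learned.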
The global information path uses VGG16 to extract features, enlarging the receptive field to acquire global information; meanwhile, multi-scale branches are designed on this path, acquiring multi-scale information through different down-sampling factors. Specifically, three parallel independent branches split off after block 1, and the feature maps are rapidly down-sampled to obtain large receptive fields. Each branch encodes features at a particular resolution, making full use of semantic information at different scales. Fig. 8 gives a simple representation of the global information path and the multi-scale branch structure: N_{s,r} denotes a stage feature layer, where s indexes the branch and r the number of down-samplings, so the resolution of the feature map of the s-th branch subnet is the original size divided by 2^r. The full-resolution original image is the input of block 1; after the parallel branches, the output feature maps N_{1,2}, N_{2,3} and N_{3,4} are 1/4, 1/8 and 1/16 of the original size, respectively.
(2) A decoder:
Buildings of different scales depend differently on the receptive field size. For larger buildings, features acquired from a large receptive field are more important, while for smaller buildings a large receptive field may pick up excessive peripheral information and cause errors. Traditional feature fusion is usually concatenation or addition; such simple methods do not consider the different receptive fields of different feature maps and neglect the specificity among features. In contrast, the multipath feature fusion module adopted by the invention assigns different weights to the feature maps of different receptive fields and achieves better feature fusion, as shown in fig. 9.
First, two or more input feature maps are concatenated along the channel dimension. Second, the concatenated feature maps pass through 3 × 3 convolution kernels for a preliminary fusion of the feature information, and a global pooling operation extracts the information of each feature map. The pooled features then pass through a 1 × 1 convolution kernel, letting the network learn weights from the overall information of each feature map. Finally, the weights are obtained through a sigmoid function and multiplied with the original feature maps. Through this feature fusion module, weights are assigned to the feature maps under different receptive fields, reflecting the specificity of features at different receptive fields and fusing them better.
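A toy NumPy version of the weighting idea, with the learned 3 × 3 and 1 × 1 convolutions collapsed into a plain global average (an assumption for brevity, so this shows the mechanism only, not the trained module):

```python
import numpy as np

def fuse(feature_maps):
    """Weight each map by a sigmoid of its global average (a stand-in for
    the global pool -> 1x1 conv -> sigmoid branch) and sum the weighted maps."""
    stack = np.stack(feature_maps)      # maps stacked along a channel-like axis
    gap = stack.mean(axis=(1, 2))       # global average pooling per map
    w = 1.0 / (1.0 + np.exp(-gap))      # sigmoid -> per-map weight
    return (w[:, None, None] * stack).sum(axis=0)
```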
Since the target output is a binary classification of building versus non-building, the sigmoid function is chosen to generate the prediction for each layer:

y_(i,j) = sigmoid(w · x_(i,j) + b)

where w ∈ R^c and b ∈ R^1 represent the weight and the bias, respectively. The prediction y_(i,j) is limited to the range [0, 1].
For better convergence during the training iterations, binary cross-entropy rather than simple mean square error (MSE) is chosen to compute the k-th constraint C_k between each prediction and the corresponding ground truth. The formula is:
C_k = −(1 / (h_k · w_k)) Σ_(i=1..h_k) Σ_(j=1..w_k) [ g_k^(i,j) · log y_k^(i,j) + (1 − g_k^(i,j)) · log(1 − y_k^(i,j)) ]

where h_k and w_k are the height and width of the k-th prediction y_k and the ground truth g_k. If the observed pixel belongs to class 1, g_k^(i,j) has the value 1; otherwise, the value is 0. y_k^(i,j) is the predicted probability that the pixel belongs to class 1.
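Under the definitions above, the per-layer constraint can be sketched directly in NumPy. The clipping epsilon is an added numerical-stability assumption, not part of the patent's formula.

```python
import numpy as np

def bce_constraint(y, g, eps=1e-7):
    """Binary cross-entropy constraint C_k between a prediction y and its
    ground truth g, both (h_k, w_k) arrays, averaged over all pixels.
    The clipping epsilon is an added numerical-stability assumption."""
    y = np.clip(y, eps, 1.0 - eps)
    return -np.mean(g * np.log(y) + (1.0 - g) * np.log(1.0 - y))
```

A perfect prediction drives the constraint toward 0, while a maximally uncertain prediction of 0.5 everywhere yields log 2, which is why minimizing C_k pushes the per-pixel probabilities toward the ground truth.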
The MSCNet model generates a pyramidal feature map layer by layer through sequential convolution and up-sampling blocks and skip connections.
In strategy one, for each feature layer in the feature pyramid, a single-kernel 1 × 1 convolution with sigmoid activation is applied to generate the prediction for that layer. The constraint for each layer can then be computed as the binary cross-entropy between that prediction and the associated ground truth. According to their distance from the final convolutional layer, these constraints are denoted C_main, C′_main, C_sub1, C_sub2 and C_sub3. Thus, the final loss of MSCNet strategy one can be expressed as:
Loss = α·C_main + γ·C_sub1 + λ·C_sub2 + σ·C_sub3 (11)
where the sum of α, γ, λ and σ is set to 1.0.
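The weighted combination in formula (11) is a one-line computation. The sketch below uses made-up constraint values and an assumed weight split; the patent only requires that the four weights sum to 1.0.

```python
# Per-layer constraints (made-up values) and an assumed weight split; the
# patent only requires that alpha + gamma + lambda + sigma = 1.0.
c_main, c_sub1, c_sub2, c_sub3 = 0.30, 0.45, 0.50, 0.60
alpha, gamma, lam, sigma = 0.4, 0.2, 0.2, 0.2   # lam: `lambda` is reserved in Python
loss = alpha * c_main + gamma * c_sub1 + lam * c_sub2 + sigma * c_sub3
```

Constraining the weights to sum to 1.0 keeps the deep-supervision loss on the same scale as a single-layer loss, so the learning rate does not need retuning when auxiliary constraints are added.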
For strategy two, each feature layer in the feature pyramid is fused to form a feature map with 4 channels, and the final prediction map is obtained through a 1 × 1 convolution and a sigmoid function. Thus, the final loss of MSCNet strategy two can be expressed as:
Loss′ = C′_main (12)
All layers are trained with mini-batch stochastic gradient descent (SGD) and the back-propagation (BP) algorithm to minimize the final loss, and the MSCNet model learns to map the input multi-channel remote sensing image to a binary segmentation map of equal size. Finally, the predicted binary map is combined with the superpixel objects to obtain the building target objects, as shown in FIG. 10.
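The final object-level step, combining the predicted binary map with the superpixel objects, can be illustrated as a majority vote per superpixel. The 0.5 vote ratio is an assumption for this sketch; the patent does not state the combination rule explicitly.

```python
import numpy as np

def objects_from_binary(binary, superpixels, ratio=0.5):
    """Object-level building map: a superpixel becomes a building object when
    at least `ratio` of its pixels are predicted as building (ratio=0.5 is an
    assumed majority-vote threshold)."""
    out = np.zeros_like(binary)
    for label in np.unique(superpixels):
        mask = superpixels == label
        if binary[mask].mean() >= ratio:
            out[mask] = 1
    return out
```

Voting inside superpixels snaps the ragged pixel-wise prediction to segment boundaries, which is what turns the binary map into coherent building target objects.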
(D) IRMAD-based pixel level change detection
The IRMAD algorithm is used to obtain the difference image of the two temporal remote sensing images; a suitable threshold is then selected, and the pixels of the difference image are divided into changed and unchanged pixels. The basic principle of IRMAD-based pixel-level change detection is as follows: a random variable T related to the MAD components is introduced, and the pixels are iteratively reweighted through a chi-square distribution probability function so that unchanged pixels obtain larger weights during the iteration; the next iteration is then carried out with the new weights until convergence, producing a difference map in which brighter areas are more likely to have changed. Finally, a threshold is applied to judge whether each pixel has changed, generating the pixel-level change detection result; please refer to FIG. 11.
(E) The specific discrimination process of the spatial position superposition method is as follows:
(1) Determine the intersection of the building extraction result of the post-time-phase image with the changed pixels, as shown in formula (13):

N_i = Area(B_i ∩ C) (13)

where B_i represents the i-th building object of the post-time-phase image, C is the pixel-level change detection result, and N_i is the number of pixels contained in the intersection of the building object and the change detection result.

(2) According to the size of N_i, a rule is set to judge newly added buildings: if N_i exceeds half of the area of the building object B_i, the object is a newly added building; otherwise, the building is unchanged. FIG. 12 shows the detection result of the newly added buildings.
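The overlay rule of steps (1) and (2) — flag a building object as newly added when more than half of its pixels fall inside the change mask — can be sketched as:

```python
import numpy as np

def new_buildings(objects, change):
    """For each post-phase building object B_i, count the intersection N_i with
    the change mask C and flag the object as newly added when N_i exceeds half
    of the object's area (label 0 is assumed to be background)."""
    result = []
    for label in np.unique(objects):
        if label == 0:
            continue
        mask = objects == label
        n_i = np.logical_and(mask, change).sum()
        result.append((int(label), bool(n_i > mask.sum() / 2)))
    return result
```

The half-area rule tolerates partial misalignment between the pixel-level change map and the object boundaries, so an object is only reported when the change evidence covers most of it.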
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.
Claims (8)
1. A method for detecting newly added buildings in high-resolution remote sensing images, characterized by comprising the following steps:
selecting two time-phase high-resolution GF2 images, retaining the first time-phase high-resolution GF2 image for comparison, and performing superpixel segmentation on the second time-phase high-resolution GF2 image to obtain superpixel objects;
constructing a building extraction data set using the second time-phase high-resolution GF2 image;
inputting the building extraction data set into a multi-scale constraint coding and decoding network for training to obtain a second time-phase building binary image;
combining the superpixel objects with the second time-phase building binary image to obtain a second time-phase building target object;
performing difference processing on the first time-phase high-resolution GF2 image and the second time-phase high-resolution GF2 image to obtain a pixel-level change detection result;
and performing spatial position analysis on the pixel-level change detection result and the second time-phase building target object to detect the newly added buildings.
2. The method for detecting newly added buildings in high-resolution remote sensing images according to claim 1, wherein constructing the building extraction data set using the second time-phase high-resolution GF2 image comprises the following steps:
processing the second time-phase high-resolution GF2 image to obtain a morphological building index gray map;
calculating the mean building index within each superpixel object, setting a threshold, and obtaining suspected building pattern spots after segmentation;
converting the suspected building pattern spots into final building labels through manual modification;
randomly cropping and expanding the high-resolution GF2 image and the final building labels to obtain the building extraction data set;
and dividing the building extraction data set into a training set, a validation set and a test set.
3. The method for detecting newly added buildings in high-resolution remote sensing images according to claim 2, wherein in the process of randomly cropping and expanding the high-resolution GF2 image and the final building labels to obtain the building extraction data set, the large-size image is cropped into 512 × 512 tiles of the specified size, and data expansion is carried out through horizontal flipping, vertical flipping and diagonal flipping.
4. The method for detecting newly added buildings in high-resolution remote sensing images according to claim 3, wherein the ratio of the training set, the validation set and the test set in the building extraction data set is 6:2:2.
5. The method for detecting newly added buildings in high-resolution remote sensing images according to claim 4, wherein the multi-scale constraint coding and decoding network comprises an encoder and a decoder; the encoder is composed of a dual-path architecture and multi-scale branches, the dual-path architecture comprises a local information path and a global information path, the local information path uses dilated convolution to extract features, the global information path adopts VGG16 to extract features, and the multi-scale branches acquire multi-scale information through different down-sampling factors.
6. The method for detecting newly added buildings in high-resolution remote sensing images according to claim 5, wherein the decoder uses a multipath feature fusion module, and the multipath feature fusion module assigns different weights to the feature maps of different receptive fields.
7. The method for detecting newly added buildings in high-resolution remote sensing images according to claim 6, wherein an IRMAD algorithm is used for the difference processing of the first time-phase high-resolution GF2 image and the second time-phase high-resolution GF2 image to obtain the pixel-level change detection result.
8. The method for detecting newly added buildings in high-resolution remote sensing images according to claim 7, wherein in the process of performing the spatial position analysis of the pixel-level change detection result and the second time-phase building target object to detect the newly added buildings, the specific judgment is made from the intersection of the pixel-level change detection result and the second time-phase building target object after spatial superposition.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110389794.5A CN113191213B (en) | 2021-04-12 | 2021-04-12 | High-resolution remote sensing image newly-added building detection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113191213A true CN113191213A (en) | 2021-07-30 |
CN113191213B CN113191213B (en) | 2023-01-03 |
Family
ID=76975505
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110389794.5A Active CN113191213B (en) | 2021-04-12 | 2021-04-12 | High-resolution remote sensing image newly-added building detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113191213B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110136170A (en) * | 2019-05-13 | 2019-08-16 | 武汉大学 | A kind of remote sensing image building change detecting method based on convolutional neural networks |
CN110781775A (en) * | 2019-10-10 | 2020-02-11 | 武汉大学 | Remote sensing image water body information accurate segmentation method supported by multi-scale features |
CN110889449A (en) * | 2019-11-27 | 2020-03-17 | 中国人民解放军国防科技大学 | Edge-enhanced multi-scale remote sensing image building semantic feature extraction method |
CN111145131A (en) * | 2019-11-28 | 2020-05-12 | 中国矿业大学 | Infrared and visible light image fusion method based on multi-scale generation type countermeasure network |
CN111259797A (en) * | 2020-01-16 | 2020-06-09 | 南开大学 | Iterative remote sensing image road extraction method based on points |
CN111860233A (en) * | 2020-07-06 | 2020-10-30 | 中国科学院空天信息创新研究院 | SAR image complex building extraction method and system based on attention network selection |
CN112001293A (en) * | 2020-08-19 | 2020-11-27 | 四创科技有限公司 | Remote sensing image ground object classification method combining multi-scale information and coding and decoding network |
CN112287832A (en) * | 2020-10-29 | 2021-01-29 | 江苏科博空间信息科技有限公司 | High-resolution remote sensing image-based urban illegal building detection method |
CN112598675A (en) * | 2020-12-25 | 2021-04-02 | 浙江科技学院 | Indoor scene semantic segmentation method based on improved full convolution neural network |
Non-Patent Citations (1)
Title |
---|
ZHANG, Tian: "Hyperspectral Image Classification Based on Deep-Learning Spatial-Spectral Feature Extraction", China Masters' Theses Full-text Database, Engineering Science and Technology II *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113901877A (en) * | 2021-09-13 | 2022-01-07 | 广州市城市规划勘测设计研究院 | Method, device and equipment for detecting change of remote sensing image building and storage medium |
CN113807301A (en) * | 2021-09-26 | 2021-12-17 | 武汉汉达瑞科技有限公司 | Automatic extraction method and automatic extraction system for newly-added construction land |
CN113807301B (en) * | 2021-09-26 | 2024-06-07 | 武汉汉达瑞科技有限公司 | Automatic extraction method and automatic extraction system for newly-added construction land |
CN115880575A (en) * | 2022-10-26 | 2023-03-31 | 中国电子科技集团公司第五十四研究所 | Method for extracting newly added buildings from remote sensing images by combining change information and building characteristics |
CN115880575B (en) * | 2022-10-26 | 2023-05-16 | 中国电子科技集团公司第五十四研究所 | Remote sensing image newly-added building extraction method combining change information and building characteristics |
Also Published As
Publication number | Publication date |
---|---|
CN113191213B (en) | 2023-01-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113191213B (en) | High-resolution remote sensing image newly-added building detection method | |
CN109241913B (en) | Ship detection method and system combining significance detection and deep learning | |
CN110705457B (en) | Remote sensing image building change detection method | |
CN111191736B (en) | Hyperspectral image classification method based on depth feature cross fusion | |
CN110136170B (en) | Remote sensing image building change detection method based on convolutional neural network | |
CN111259906B (en) | Method for generating remote sensing image target segmentation countermeasures under condition containing multilevel channel attention | |
CN111625608B (en) | Method and system for generating electronic map according to remote sensing image based on GAN model | |
CN112084869B (en) | Compact quadrilateral representation-based building target detection method | |
CN111368769B (en) | Ship multi-target detection method based on improved anchor point frame generation model | |
CN111738111A (en) | Road extraction method of high-resolution remote sensing image based on multi-branch cascade void space pyramid | |
CN108052966A (en) | Remote sensing images scene based on convolutional neural networks automatically extracts and sorting technique | |
CN111401455B (en) | Remote sensing image deep learning classification method and system based on Capsules-Unet model | |
CN112215847B (en) | Method for automatically segmenting overlapped chromosomes based on counterstudy multi-scale features | |
CN115359372A (en) | Unmanned aerial vehicle video moving object detection method based on optical flow network | |
CN115359366A (en) | Remote sensing image target detection method based on parameter optimization | |
CN114550161A (en) | End-to-end three-dimensional target sparse detection method | |
CN115187786A (en) | Rotation-based CenterNet2 target detection method | |
CN114092824A (en) | Remote sensing image road segmentation method combining intensive attention and parallel up-sampling | |
CN114494821A (en) | Remote sensing image cloud detection method based on feature multi-scale perception and self-adaptive aggregation | |
CN118314353B (en) | Remote sensing image segmentation method based on double-branch multi-scale feature fusion | |
CN115810149A (en) | High-resolution remote sensing image building extraction method based on superpixel and image convolution | |
CN116343058A (en) | Global collaborative fusion-based multispectral and panchromatic satellite image earth surface classification method | |
Meyer et al. | Revealing the potential of spectral and textural predictor variables in a neural network-based rainfall retrieval technique | |
CN117953299A (en) | Land utilization classification method based on multi-scale remote sensing images | |
CN116912661A (en) | Target track prediction method and system with domain generalization capability |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||