CN113269717A - Building detection method and device based on remote sensing image - Google Patents
Building detection method and device based on remote sensing image
- Publication number
- CN113269717A CN113269717A CN202110382233.2A CN202110382233A CN113269717A CN 113269717 A CN113269717 A CN 113269717A CN 202110382233 A CN202110382233 A CN 202110382233A CN 113269717 A CN113269717 A CN 113269717A
- Authority
- CN
- China
- Prior art keywords
- remote sensing
- sensing image
- building
- building detection
- real
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10032—Satellite or aerial image; Remote sensing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20021—Dividing image into blocks, subimages or windows
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
Abstract
The invention provides a building detection method and a device based on a remote sensing image, wherein the method comprises the following steps: inputting a remote sensing image to be detected into a first network in a building detection model, and outputting the characteristics of the remote sensing image to be detected; inputting the characteristics of the remote sensing image to be detected into a second network in the building detection model, and outputting a prediction frame of a building in the remote sensing image to be detected; wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers; the building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label. The invention improves the recall rate of building detection, so that the building detection is more accurate.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a building detection method and device based on remote sensing images.
Background
The detection of specific buildings such as gas stations, schools and airports is of great importance in smart-city and military applications. Traditional surveying and mapping techniques are highly accurate, but they are time-consuming and labor-intensive, have long update cycles, and cannot meet the demands of rapidly changing urban construction.
With the rapid development of sensor and aerospace technologies, the temporal, spatial and spectral resolutions of remote sensing images keep improving. Remote sensing can now capture detailed ground-feature information in a short time, which makes it possible to detect specific types of buildings from remote sensing images.
Traditionally, the detection of particular buildings in remote sensing images has relied mainly on hand-crafted features such as corners, edges and textures. Although methods based on these features are easy to understand, their detection accuracy is often low because hand-crafted features carry limited information and lack spatial structure information. Furthermore, such methods transfer poorly and are difficult to apply universally across different types of buildings.
Convolutional neural networks (CNNs) are strong at mining spatial structure information and generalize well thanks to their automatic learning mechanism, but existing CNN-based approaches still suffer from low recall and inaccurate detection of buildings.
Disclosure of Invention
The invention provides a building detection method and device based on a remote sensing image, which are used to overcome the defects of low recall rate and inaccurate detection in prior-art remote sensing image building detection and to realize accurate detection of buildings in remote sensing images.
The invention provides a building detection method based on a remote sensing image, which comprises the following steps:
inputting a remote sensing image to be detected into a first network in a building detection model, and outputting the characteristics of the remote sensing image to be detected;
inputting the characteristics of the remote sensing image to be detected into a second network in the building detection model, and outputting a prediction frame of a building in the remote sensing image to be detected;
wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers;
the building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label.
According to the building detection method based on the remote sensing image provided by the invention, each SE-ResNeXt layer comprises a preset number of weight-shared transformation layers;
each transformation layer comprises two CBL structures;
each CBL structure comprises a convolution layer, a batch normalization layer and a LeakyReLU layer;
wherein, the convolution kernel sizes of the convolution layers in the two CBL structures are different.
According to the building detection method based on the remote sensing image, each SE-ResNeXt layer further comprises a SEnet layer, and each SE-ResNeXt layer is connected through one skip connection.
According to the building detection method based on the remote sensing image provided by the invention, the remote sensing image to be detected is input into a first network in a building detection model, and the characteristics of the remote sensing image to be detected are output, and the method also comprises the following steps:
calculating a first intersection-parallel ratio between a prediction frame and a real frame according to the overlapping area between the prediction frame and the real frame of the building in the remote sensing image sample, the distance between the central point of the prediction frame and the central point of the real frame, and the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame;
calculating a position loss between the prediction frame and the real frame according to the first intersection ratio;
and training the building detection model according to the position loss.
According to the building detection method based on the remote sensing image provided by the invention, the first intersection ratio between the prediction frame and the real frame is calculated through the following formula:

CIOU = IOU − l(Op, Ol)² / c² − α × ν

wherein CIOU is the first intersection ratio, IOU is the second intersection ratio calculated according to the overlapping area between the prediction frame and the real frame, Op represents the center point of the prediction frame, Ol represents the center point of the real frame, l(Op, Ol) represents the distance between Op and Ol, c represents the diagonal distance of the minimum bounding rectangle that simultaneously encloses the prediction frame and the real frame, ν represents the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame, and α represents the coefficient of ν.
According to the building detection method based on the remote sensing image, the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame is calculated through the following formula:

ν = (4/π²) × (arctan(wt/ht) − arctan(wp/hp))²

wherein wt represents the width of the real frame, ht represents the height of the real frame, wp represents the width of the prediction frame, and hp represents the height of the prediction frame.
According to the building detection method based on the remote sensing image provided by the invention, the coefficient α is calculated through the following formula:

α = ν / ((1 − IOU) + ν)
according to the building detection method based on the remote sensing image, provided by the invention, the position loss between the prediction frame and the real frame is calculated according to the first intersection ratio through the following formula:
CIOU_LOSS=Confidence×(2-wt×ht)×(1-CIOU);
wherein CIOU_LOSS is the position loss, CIOU is the first intersection ratio, wt represents the width of the real frame, ht represents the height of the real frame, and Confidence represents the confidence of the prediction frame.
The invention also provides a building detection device based on the remote sensing image, which comprises:
the extraction module is used for inputting the remote sensing image to be detected into a first network in a building detection model and outputting the characteristics of the remote sensing image to be detected;
the detection module is used for inputting the characteristics of the remote sensing image to be detected into a second network in the building detection model and outputting a prediction frame of a building in the remote sensing image to be detected;
wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers;
the building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the steps of the building detection method based on the remote sensing image.
According to the building detection method and device based on the remote sensing image, the first network in the building detection model is used as the feature extraction network; the first network comprises a plurality of parts, each comprising a down-sampling layer and one or more SE-ResNeXt layers, so that the first network maintains a large depth, which improves the recall rate of building detection and makes the detection more accurate, faster and more robust.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a method for building detection based on remote sensing images provided by the present invention;
FIG. 2 is a schematic structural diagram of a building detection model in the building detection method based on remote sensing images provided by the invention;
FIG. 3 is a schematic structural diagram of a building detection device based on remote sensing images provided by the invention;
fig. 4 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The building detection method based on remote sensing images of the invention is described below with reference to fig. 1, and comprises the following steps: step 101, inputting a remote sensing image to be detected into a first network in a building detection model, and outputting the characteristics of the remote sensing image to be detected; wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers;
the remote sensing image to be detected is a remote sensing image needing building detection.
The building detection model includes a first network and a second network. The first network is used for extracting the characteristics of the remote sensing image to be detected. The second network is used for building detection according to the features extracted by the first network.
Each part of the first network is treated as one nSR module. Optionally, the first network comprises 5 nSR modules, with n being 1, 2, 8, 8 and 4, respectively, as shown in FIG. 2.
The number of nSR modules and the value of n included in the first network may be set as desired.
Each nSR module is formed by stacking one down-sampling layer (Downsampling) and n SE-ResNeXt layers, as shown in FIG. 2, so that the first network maintains a large depth.
Optionally, the first layer of the first network is a CBL structure, which includes one convolution layer Conv2D, one batch normalization layer (BN) and one LeakyReLU layer, as indicated in FIG. 2.
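For illustration only, a minimal sketch of such a CBL structure is given below; PyTorch and the class name are assumptions of this sketch, not part of the patent.

```python
import torch.nn as nn

class CBL(nn.Module):
    """Minimal sketch of a CBL structure: Conv2D + batch normalization (BN) + LeakyReLU."""
    def __init__(self, in_channels, out_channels, kernel_size, stride=1):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size,
                              stride=stride, padding=kernel_size // 2, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, x):
        # convolution -> batch normalization -> LeakyReLU activation
        return self.act(self.bn(self.conv(x)))
```

With stride=2, the same block can also play the role of the down-sampling layer inside an nSR module.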
Optionally, the remote sensing image to be detected is preprocessed before the remote sensing image to be detected is subjected to building detection. The pre-processing includes geometric correction and image registration. The geometric correction is used for eliminating deformation of the geometric position, shape and other characteristics of the building on the remote sensing image caused by factors such as atmospheric refraction, earth curvature, topographic relief and the like.
Optionally, the preprocessed remote sensing image to be detected is cropped according to a preset resolution, for example 416 × 416. The cropped image blocks (Input) are then fed into the trained building detection model for detection.
The image blocks carrying the building prediction frames output by the building detection model are stitched back according to their original positions in the image to be detected, yielding a complete remote sensing image with the detection results.
Optionally, if the remote sensing image to be detected is an RGB image, the image block Input is an image of 416 × 416 × 3.
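A rough sketch of this cropping and stitching step is shown below; the tile size of 416 follows the description, while the helper names and the zero-padding of edge tiles are assumptions.

```python
import numpy as np

TILE = 416  # tile resolution used in this description

def split_into_tiles(image):
    """Cut an H x W x C remote sensing image into TILE x TILE blocks (edge tiles zero-padded)."""
    h, w, c = image.shape
    tiles, positions = [], []
    for y in range(0, h, TILE):
        for x in range(0, w, TILE):
            block = np.zeros((TILE, TILE, c), dtype=image.dtype)
            patch = image[y:y + TILE, x:x + TILE]
            block[:patch.shape[0], :patch.shape[1]] = patch
            tiles.append(block)
            positions.append((y, x))
    return tiles, positions

def stitch_tiles(tiles, positions, out_shape):
    """Paste annotated tiles back at their original positions to rebuild the full image."""
    canvas = np.zeros(out_shape, dtype=tiles[0].dtype)
    for block, (y, x) in zip(tiles, positions):
        h = min(TILE, out_shape[0] - y)
        w = min(TILE, out_shape[1] - x)
        canvas[y:y + h, x:x + w] = block[:h, :w]
    return canvas
```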
Optionally, as shown in FIG. 2, the second network produces a small-scale prediction output, a medium-scale prediction output and a large-scale prediction output.
For the small-scale prediction output, the remote sensing image to be detected is down-sampled multiple times by the first network to obtain a small-scale feature map; for example, a 416 × 416 remote sensing image is reduced to a 13 × 13 feature map by the first network, and after post-processing by the second network a prediction frame on the 13 × 13 scale is obtained.
For the medium-scale prediction output, the output add1 of the i-th part of the first network is concatenated, through a concat layer, with the up-sampled final output of the first network; the concatenated result is then post-processed by the second network to obtain a prediction frame on the 26 × 26 scale. For example, the i-th part is the second 8SR module.
For the large-scale prediction output, the output add2 of the j-th part of the first network is concatenated, through a concat layer, with the up-sampled result of the previous concatenation; the concatenated result is then post-processed by the second network to obtain a prediction frame on the 52 × 52 scale. For example, the j-th part is the first 8SR module. Here i > j.
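As a loose sketch of this three-scale wiring (the module layout, channel counts and output dimension below are assumptions rather than the exact second-network structure of FIG. 2):

```python
import torch
import torch.nn as nn

class ThreeScaleNeck(nn.Module):
    """Illustrative sketch: fuse backbone outputs into 13x13, 26x26 and 52x52 prediction maps."""
    def __init__(self, c_small=1024, c_mid=512, c_large=256, num_outputs=18):
        super().__init__()
        self.head_small = nn.Conv2d(c_small, num_outputs, 1)                    # 13 x 13 prediction
        self.up_mid = nn.Upsample(scale_factor=2, mode="nearest")
        self.head_mid = nn.Conv2d(c_small + c_mid, num_outputs, 1)              # 26 x 26 prediction
        self.up_large = nn.Upsample(scale_factor=2, mode="nearest")
        self.head_large = nn.Conv2d(c_small + c_mid + c_large, num_outputs, 1)  # 52 x 52 prediction

    def forward(self, feat_large, feat_mid, feat_small):
        # feat_small: 13x13 map from the last nSR module; feat_mid: 26x26 (add1); feat_large: 52x52 (add2)
        p_small = self.head_small(feat_small)
        fused_mid = torch.cat([self.up_mid(feat_small), feat_mid], dim=1)       # concat layer, medium scale
        p_mid = self.head_mid(fused_mid)
        fused_large = torch.cat([self.up_large(fused_mid), feat_large], dim=1)  # concat layer, large scale
        p_large = self.head_large(fused_large)
        return p_small, p_mid, p_large
```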
The second network in the present embodiment is not limited to the structure shown in fig. 2.
The building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label.
When the building detection model is trained, remote sensing image samples containing buildings are collected and marked with real frames and categories of the buildings.
All remote sensing image samples contain the same or different classes of buildings.
When the remote sensing image samples containing different types of buildings are used for training the building detection model, the building detection model can detect the different types of buildings.
Optionally, the remote sensing image sample is a 416 x 416 image block.
The building detection model is trained with the collected remote sensing image samples, thereby optimizing the weight parameters in the building detection model.
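A minimal sketch of this optimization step is shown below; the optimizer choice, hyperparameters and loss function are assumptions, not settings taken from the patent.

```python
import torch

def train(model, data_loader, loss_fn, epochs=100, lr=1e-3):
    """Illustrative weight-optimization loop; model, data loader and loss function are assumed to exist."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for image_blocks, real_frames in data_loader:   # 416 x 416 sample blocks with labelled real frames
            predictions = model(image_blocks)
            loss = loss_fn(predictions, real_frames)    # e.g. position + confidence + class loss
            optimizer.zero_grad()
            loss.backward()                             # backpropagation
            optimizer.step()                            # update the weight parameters
    return model
```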
According to the method, the first network in the building detection model is used as the feature extraction network; the first network comprises multiple parts, each comprising a down-sampling layer and one or more SE-ResNeXt layers, so that the first network maintains a large depth, which improves the recall rate of building detection and makes the detection more accurate, faster and more robust.
On the basis of the above embodiment, each SE-ResNeXt layer in this embodiment comprises a preset number of weight-shared transformation layers; each transformation layer comprises two CBL structures; each CBL structure comprises a convolution layer, a batch normalization layer and a LeakyReLU layer; wherein the convolution kernel sizes of the convolution layers in the two CBL structures are different.
Optionally, as shown in FIG. 2, the preset number (cardinality) is 16, and the 16 transformation layers are arranged in parallel.
The two CBL structures in each transformation layer are connected in series.
Optionally, the convolution kernel sizes of the two CBL structures in each transformation layer are 1 × 1 and 3 × 3, respectively.
The CBL structure is the most frequently used basic component in the building detection model.
In this embodiment, the learning capability of the building detection model is improved by using multiple weight-shared transformation layers, while the weight-sharing strategy keeps the number of model parameters from growing as model performance improves; in addition, this design avoids the gradient propagation problems caused by deepening or widening the model to improve accuracy.
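The sketch below illustrates one possible reading of these weight-shared transformation layers: the input channels are split across the 16 parallel paths and a single pair of CBL structures (1 × 1 then 3 × 3) serves every path. The channel-group split is an assumption of this sketch; the patent does not spell out how the sharing is realised.

```python
import torch
import torch.nn as nn

def cbl(in_ch, out_ch, k):
    """CBL structure (as sketched earlier): convolution + batch normalization + LeakyReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, k, padding=k // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.1, inplace=True),
    )

class SharedTransformLayers(nn.Module):
    """Sketch of cardinality=16 parallel transformation layers that share one set of weights."""
    def __init__(self, in_channels, branch_channels, cardinality=16):
        super().__init__()
        assert in_channels % cardinality == 0
        group_channels = in_channels // cardinality
        # one shared transformation layer: two CBL structures with 1x1 and 3x3 kernels in series
        self.transform = nn.Sequential(
            cbl(group_channels, branch_channels, 1),
            cbl(branch_channels, branch_channels, 3),
        )
        self.cardinality = cardinality

    def forward(self, x):
        groups = torch.chunk(x, self.cardinality, dim=1)   # split the input across the 16 parallel paths
        outputs = [self.transform(g) for g in groups]      # the same weights serve every path
        return torch.cat(outputs, dim=1)                    # splicing layer: concatenate the path outputs
```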
On the basis of the above embodiment, each SE-ResNeXt layer in this embodiment further comprises a SEnet layer, and each SE-ResNeXt layer is wrapped by one skip connection.
Optionally, as shown in FIG. 2, after the outputs of all the transformation layers are concatenated by the splicing layer, they pass sequentially through a convolutional layer, a batch normalization layer and the SEnet layer, which ensures that the dominant features learned in the current SE-ResNeXt layer are preferentially applied to the learning of the next SE-ResNeXt layer. In addition, the SEnet layer suppresses the interference of useless features on learning.
The structure of SEnet is shown in FIG. 2.
One skip connection is used around each SE-ResNeXt layer to avoid gradient problems in back propagation.
Through the use of SEnet, this embodiment introduces an attention mechanism into the model, so that the learned features are used selectively: dominant features are exploited preferentially while the interference of useless features is avoided.
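Building on the transformation-layer sketch above, the following sketch shows how the SEnet layer and the skip connection could wrap one SE-ResNeXt layer; the reduction ratio and the branch width are assumptions of this sketch.

```python
import torch.nn as nn

class SELayer(nn.Module):
    """Squeeze-and-excitation (SEnet) sketch; the reduction ratio of 16 is an assumption."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)            # global average pooling over H x W
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.excite(self.squeeze(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                       # re-weight channels so dominant features are emphasised

class SEResNeXtLayer(nn.Module):
    """Sketch of one SE-ResNeXt layer: parallel transformation layers -> splice -> conv + BN -> SEnet,
    with a skip connection around the whole layer (SharedTransformLayers is the sketch above)."""
    def __init__(self, channels, cardinality=16, branch_channels=8):
        super().__init__()
        self.transforms = SharedTransformLayers(channels, branch_channels, cardinality)
        self.fuse = nn.Sequential(
            nn.Conv2d(cardinality * branch_channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.se = SELayer(channels)

    def forward(self, x):
        # skip connection around the layer avoids gradient problems in back propagation
        return x + self.se(self.fuse(self.transforms(x)))
```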
On the basis of the foregoing embodiments, in this embodiment, the inputting a remote sensing image to be detected into a first network in a building detection model, and outputting characteristics of the remote sensing image to be detected further includes: calculating a first intersection-parallel ratio between a prediction frame and a real frame according to the overlapping area between the prediction frame and the real frame of the building in the remote sensing image sample, the distance between the central point of the prediction frame and the central point of the real frame, and the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame; calculating a position loss between the prediction frame and the real frame according to the first intersection ratio; and training the building detection model according to the position loss.
In the prior art, the frame regression of buildings is mainly guided by the intersection-over-union (IOU). However, the IOU only considers the overlapping region between the prediction box and the real box; when one box contains the other, or the two boxes do not intersect, it provides no optimization direction.
The first intersection ratio in this embodiment considers the overlapping area between the prediction box and the real box, the distance between the center points of the two boxes, and the consistency of their aspect ratios. It can therefore provide an optimization direction under various positional relations between the prediction frame and the real frame, making the frame regression of buildings more accurate and further improving building detection accuracy.
Optionally, a loss function is constructed based on the position loss and confidence loss between the predicted frame and the real frame of the building in the remote sensing image sample, and the class loss between the predicted class and the real class of the building in the remote sensing image sample.
Optionally, the second network of the building detection model outputs a prediction box, but the prediction box may not contain the target building, and the confidence loss of the prediction box is calculated according to whether the target is contained in the prediction box.
Optionally, the second network in the building detection model further outputs the category of the building, and the category loss is determined according to the predicted category and the real category of the building in the remote sensing image sample.
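A hedged sketch of how the three loss terms described above could be combined; the equal weighting is an assumption, and the individual terms are placeholders (the position loss is sketched after the CIOU formulas below).

```python
def total_loss(position_loss, confidence_loss, class_loss):
    """Illustrative combination of the three loss terms; equal weighting is an assumption."""
    # position loss: CIOU-based loss between prediction frames and real frames
    # confidence loss: whether a prediction frame actually contains a building
    # class loss: predicted building category vs. labelled category
    return position_loss + confidence_loss + class_loss
```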
On the basis of the above embodiment, in this embodiment, the first intersection ratio between the prediction frame and the real frame is calculated by the following formula:

CIOU = IOU − l(Op, Ol)² / c² − α × ν

wherein CIOU is the first intersection ratio, IOU is the second intersection ratio calculated according to the overlapping area between the prediction frame and the real frame, Op represents the center point of the prediction frame, Ol represents the center point of the real frame, l(Op, Ol) represents the distance between Op and Ol, c represents the diagonal distance of the minimum bounding rectangle that simultaneously encloses the prediction frame and the real frame, ν represents the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame, and α represents the coefficient of ν.
The present embodiment is not limited to a specific method of calculating the second intersection ratio according to the overlapping area between the prediction box and the real box.
On the basis of the above embodiment, in this embodiment, the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame is calculated by the following formula:

ν = (4/π²) × (arctan(wt/ht) − arctan(wp/hp))²

wherein wt represents the width of the real frame, ht represents the height of the real frame, wp represents the width of the prediction frame, and hp represents the height of the prediction frame.
On the basis of the above embodiment, the coefficient α is calculated in the present embodiment by the following formula:

α = ν / ((1 − IOU) + ν)

wherein α is used to balance the proportion of ν in the first intersection ratio.
On the basis of the above embodiment, in the present embodiment, the position loss between the prediction frame and the real frame is calculated according to the first intersection ratio by the following formula:
CIOU_LOSS=Confidence×(2-wt×ht)×(1-CIOU);
wherein CIOU_LOSS is the position loss, CIOU is the first intersection ratio, wt represents the width of the real frame, ht represents the height of the real frame, and Confidence represents the confidence of the prediction frame.
Optionally, when there is a building in the prediction box, the Confidence is 1, otherwise, the Confidence is 0.
wt and ht denote the normalized width and height of the real frame, respectively, each ranging between 0 and 1.
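The sketch below follows the formulas above for the first intersection ratio and the position loss; the (cx, cy, w, h) box layout with normalized widths and heights is an assumption of this sketch.

```python
import math
import torch

def ciou_and_loss(pred, target, confidence):
    """Illustrative CIOU / position-loss sketch; boxes are (cx, cy, w, h) tensors with w, h in [0, 1]."""
    px, py, pw, ph = pred.unbind(-1)
    tx, ty, tw, th = target.unbind(-1)

    # second intersection ratio (plain IOU) from the overlapping area
    inter_w = (torch.min(px + pw / 2, tx + tw / 2) - torch.max(px - pw / 2, tx - tw / 2)).clamp(min=0)
    inter_h = (torch.min(py + ph / 2, ty + th / 2) - torch.max(py - ph / 2, ty - th / 2)).clamp(min=0)
    inter = inter_w * inter_h
    iou = inter / (pw * ph + tw * th - inter + 1e-9)

    # squared distance l(Op, Ol)^2 between the centers, and squared diagonal c^2 of the minimum bounding rectangle
    center_dist2 = (px - tx) ** 2 + (py - ty) ** 2
    c_w = torch.max(px + pw / 2, tx + tw / 2) - torch.min(px - pw / 2, tx - tw / 2)
    c_h = torch.max(py + ph / 2, ty + th / 2) - torch.min(py - ph / 2, ty - th / 2)
    diag2 = c_w ** 2 + c_h ** 2 + 1e-9

    # aspect-ratio consistency v and its balancing coefficient alpha
    v = (4 / math.pi ** 2) * (torch.atan(tw / (th + 1e-9)) - torch.atan(pw / (ph + 1e-9))) ** 2
    alpha = v / (1 - iou + v + 1e-9)

    ciou = iou - center_dist2 / diag2 - alpha * v
    loss = confidence * (2 - tw * th) * (1 - ciou)   # CIOU_LOSS = Confidence x (2 - wt*ht) x (1 - CIOU)
    return ciou, loss
```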
The method of this embodiment was run on a Windows 10 system equipped with an RTX 2080Ti discrete graphics card (11 GB memory) and an i9-9900K processor. Experiments on remote sensing image samples show that, in this embodiment, the precision of gas station detection is improved by 40% on average, the recall rate is improved by 50% on average, and the number of parameters is reduced by 9 MB. The method achieves accurate detection of specific buildings in remote sensing images and has high application value.
The building detection device based on the remote sensing image provided by the invention is described below, and the building detection device based on the remote sensing image described below and the building detection method based on the remote sensing image described above can be referred to correspondingly.
As shown in fig. 3, the apparatus comprises an extraction module 301 and a detection module 302, wherein:
the extraction module 301 is configured to input a remote sensing image to be detected into a first network in a building detection model, and output characteristics of the remote sensing image to be detected;
the remote sensing image to be detected is a remote sensing image needing building detection.
The building detection model includes a first network and a second network. The first network is used for extracting the characteristics of the remote sensing image to be detected. The second network is used for building detection according to the features extracted by the first network.
The detection module 302 is configured to input the features of the remote sensing image to be detected into a second network in the building detection model, and output a prediction frame of a building in the remote sensing image to be detected;
the present embodiment is not limited to the structure of the second network.
Wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers;
each part of the first network is treated as one nSR module. Each nSR module is formed by the superposition of one down-sampling layer down-sampling and n SE-resenext layers.
The number of nSR modules and the value of n included in the first network may be set as desired.
The building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label.
When the building detection model is trained, remote sensing image samples containing buildings are collected and marked with real frames and categories of the buildings.
All remote sensing image samples contain the same or different classes of buildings.
When the remote sensing image samples containing different types of buildings are used for training the building detection model, the building detection model can detect the different types of buildings.
The building detection model is trained with the collected remote sensing image samples, thereby optimizing the weight parameters in the building detection model.
According to the device, the first network in the building detection model is used as the feature extraction network; the first network comprises multiple parts, each comprising a down-sampling layer and one or more SE-ResNeXt layers, so that the first network maintains a large depth, which improves the recall rate of building detection and makes the detection more accurate, faster and more robust.
On the basis of the above embodiment, each SE-ResNeXt layer in this embodiment comprises a preset number of weight-shared transformation layers; each transformation layer comprises two CBL structures; each CBL structure comprises a convolution layer, a batch normalization layer and a LeakyReLU layer; wherein the convolution kernel sizes of the convolution layers in the two CBL structures are different.
On the basis of the above embodiment, each SE-ResNeXt layer in this embodiment further comprises a SEnet layer, and each SE-ResNeXt layer is wrapped by one skip connection.
On the basis of the foregoing embodiments, in this embodiment, the method further includes a training module, configured to calculate a first intersection-parallel ratio between a prediction frame and a real frame of a building according to an overlapping area between the prediction frame and the real frame in the remote sensing image sample, a distance between a center point of the prediction frame and a center point of the real frame, and a consistency between an aspect ratio of the prediction frame and an aspect ratio of the real frame; calculating a position loss between the prediction frame and the real frame according to the first intersection ratio; and training the building detection model according to the position loss.
On the basis of the above embodiment, in this embodiment, the training module calculates the first intersection ratio between the prediction frame and the real frame by the following formula:

CIOU = IOU − l(Op, Ol)² / c² − α × ν

wherein CIOU is the first intersection ratio, IOU is the second intersection ratio calculated according to the overlapping area between the prediction frame and the real frame, Op represents the center point of the prediction frame, Ol represents the center point of the real frame, l(Op, Ol) represents the distance between Op and Ol, c represents the diagonal distance of the minimum bounding rectangle that simultaneously encloses the prediction frame and the real frame, ν represents the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame, and α represents the coefficient of ν.
On the basis of the above embodiment, in this embodiment, the training module calculates the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame by the following formula:

ν = (4/π²) × (arctan(wt/ht) − arctan(wp/hp))²

wherein wt represents the width of the real frame, ht represents the height of the real frame, wp represents the width of the prediction frame, and hp represents the height of the prediction frame.
On the basis of the above embodiment, in this embodiment, the training module calculates the coefficient α by the following formula:

α = ν / ((1 − IOU) + ν)
On the basis of the above embodiment, in this embodiment, the training module calculates the position loss between the prediction frame and the real frame according to the first intersection ratio by the following formula:
CIOU_LOSS=Confidence×(2-wt×ht)×(1-CIOU);
wherein CIOU_LOSS is the position loss, CIOU is the first intersection ratio, wt represents the width of the real frame, ht represents the height of the real frame, and Confidence represents the confidence of the prediction frame.
Fig. 4 illustrates a physical structure diagram of an electronic device, which may include, as shown in fig. 4: a processor (processor)410, a communication Interface 420, a memory (memory)430 and a communication bus 440, wherein the processor 410, the communication Interface 420 and the memory 430 are communicated with each other via the communication bus 440. Processor 410 may invoke logic instructions in memory 430 to perform a method for remote sensing image-based building detection, the method comprising: inputting a remote sensing image to be detected into a first network in a building detection model, and outputting the characteristics of the remote sensing image to be detected; inputting the characteristics of the remote sensing image to be detected into a second network in the building detection model, and outputting a prediction frame of a building in the remote sensing image to be detected; wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers; the building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label.
In addition, the logic instructions in the memory 430 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method for remote sensing image-based building detection provided by the above methods, the method comprising: inputting a remote sensing image to be detected into a first network in a building detection model, and outputting the characteristics of the remote sensing image to be detected; inputting the characteristics of the remote sensing image to be detected into a second network in the building detection model, and outputting a prediction frame of a building in the remote sensing image to be detected; wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers; the building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform the provided remote sensing image based building detection methods described above, the method comprising: inputting a remote sensing image to be detected into a first network in a building detection model, and outputting the characteristics of the remote sensing image to be detected; inputting the characteristics of the remote sensing image to be detected into a second network in the building detection model, and outputting a prediction frame of a building in the remote sensing image to be detected; wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers; the building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A building detection method based on remote sensing images is characterized by comprising the following steps:
inputting a remote sensing image to be detected into a first network in a building detection model, and outputting the characteristics of the remote sensing image to be detected;
inputting the characteristics of the remote sensing image to be detected into a second network in the building detection model, and outputting a prediction frame of a building in the remote sensing image to be detected;
wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers;
the building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label.
2. The remote sensing image-based building detection method according to claim 1, wherein each SE-ResNeXt layer comprises a preset number of weight-shared transformation layers;
each transformation layer comprises two CBL structures;
each CBL structure comprises a convolution layer, a batch normalization layer and a LeakyReLU layer;
wherein, the convolution kernel sizes of the convolution layers in the two CBL structures are different.
3. The remote sensing image-based building detection method of claim 2, wherein each SE-ResNeXt layer further comprises a SEnet layer, and each SE-ResNeXt layer is connected through one skip connection.
4. The remote sensing image-based building detection method according to any one of claims 1-3, wherein the inputting the remote sensing image to be detected into a first network in a building detection model and outputting the characteristics of the remote sensing image to be detected further comprises:
calculating a first intersection-parallel ratio between a prediction frame and a real frame according to the overlapping area between the prediction frame and the real frame of the building in the remote sensing image sample, the distance between the central point of the prediction frame and the central point of the real frame, and the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame;
calculating a position loss between the prediction frame and the real frame according to the first intersection ratio;
and training the building detection model according to the position loss.
5. The remote sensing image-based building detection method according to claim 4, wherein the first intersection ratio between the prediction frame and the real frame is calculated by the following formula:

CIOU = IOU − l(Op, Ol)² / c² − α × ν

wherein CIOU is the first intersection ratio, IOU is the second intersection ratio calculated according to the overlapping area between the prediction frame and the real frame, Op represents the center point of the prediction frame, Ol represents the center point of the real frame, l(Op, Ol) represents the distance between Op and Ol, c represents the diagonal distance of the minimum rectangle that surrounds both the prediction frame and the real frame, ν represents the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame, and α represents the coefficient of ν.
6. The remote sensing image-based building detection method according to claim 5, wherein the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame is calculated by the following formula:

ν = (4/π²) × (arctan(wt/ht) − arctan(wp/hp))²

wherein wt represents the width of the real frame, ht represents the height of the real frame, wp represents the width of the prediction frame, and hp represents the height of the prediction frame.
8. The remote sensing image-based building detection method according to claim 4, wherein the position loss between the prediction frame and the real frame is calculated from the first intersection ratio by the following formula:
CIOU_LOSS=Confidence×(2-wt×ht)×(1-CIOU);
wherein CIOU_LOSS is the position loss, CIOU is the first intersection ratio, wt represents the width of the real frame, ht represents the height of the real frame, and Confidence represents the confidence of the prediction frame.
9. A building detection device based on remote sensing images is characterized by comprising:
the extraction module is used for inputting the remote sensing image to be detected into a first network in a building detection model and outputting the characteristics of the remote sensing image to be detected;
the detection module is used for inputting the characteristics of the remote sensing image to be detected into a second network in the building detection model and outputting a prediction frame of a building in the remote sensing image to be detected;
wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers;
the building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, carries out the steps of the method for remote sensing image based building detection according to any of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110382233.2A CN113269717A (en) | 2021-04-09 | 2021-04-09 | Building detection method and device based on remote sensing image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110382233.2A CN113269717A (en) | 2021-04-09 | 2021-04-09 | Building detection method and device based on remote sensing image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113269717A true CN113269717A (en) | 2021-08-17 |
Family
ID=77228583
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110382233.2A Pending CN113269717A (en) | 2021-04-09 | 2021-04-09 | Building detection method and device based on remote sensing image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113269717A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114239755A (en) * | 2022-02-25 | 2022-03-25 | 北京智弘通达科技有限公司 | Intelligent identification method for color steel tile buildings along railway based on deep learning |
CN117115641A (en) * | 2023-07-20 | 2023-11-24 | 中国科学院空天信息创新研究院 | Building information extraction method and device, electronic equipment and storage medium |
-
2021
- 2021-04-09 CN CN202110382233.2A patent/CN113269717A/en active Pending
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114239755A (en) * | 2022-02-25 | 2022-03-25 | 北京智弘通达科技有限公司 | Intelligent identification method for color steel tile buildings along railway based on deep learning |
CN117115641A (en) * | 2023-07-20 | 2023-11-24 | 中国科学院空天信息创新研究院 | Building information extraction method and device, electronic equipment and storage medium |
CN117115641B (en) * | 2023-07-20 | 2024-03-22 | 中国科学院空天信息创新研究院 | Building information extraction method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |