CN113269717A - Building detection method and device based on remote sensing image - Google Patents


Info

Publication number: CN113269717A (application CN202110382233.2A; granted as CN113269717B)
Authority: CN (China)
Language: Chinese (zh)
Inventors: 魏永明, 高锦风, 陈玉, 李剑南
Assignee: Aerospace Information Research Institute of CAS
Application filed by Aerospace Information Research Institute of CAS
Legal status: Active (granted)
Prior art keywords: remote sensing image, frame, building, building detection


Classifications

    • G06T 7/0002: Image analysis; inspection of images, e.g. flaw detection
    • G06N 3/045: Neural networks; combinations of networks
    • G06N 3/084: Learning methods; backpropagation, e.g. using gradient descent
    • G06T 7/11: Segmentation; region-based segmentation
    • G06T 2207/10032: Image acquisition modality; satellite or aerial image; remote sensing
    • G06T 2207/20021: Dividing image into blocks, subimages or windows
    • G06T 2207/20081: Training; learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a building detection method and device based on remote sensing images. The method comprises: inputting a remote sensing image to be detected into a first network of a building detection model and outputting features of the image; and inputting those features into a second network of the model and outputting prediction frames for buildings in the image. The first network comprises multiple parts, each consisting of one downsampling layer and one or more SE-ResNeXt layers. The building detection model is trained with remote sensing image samples containing buildings as samples and the real frames of those buildings as labels. The invention improves the recall rate of building detection, making detection more accurate.

Description

Building detection method and device based on remote sensing image
Technical Field
The invention relates to the technical field of image processing, in particular to a building detection method and device based on remote sensing images.
Background
The detection of specific buildings such as gas stations, schools and airports is of great importance in smart cities and military applications. Although traditional surveying and mapping technology is highly precise, it is time- and labor-intensive and has a long update cycle, so it cannot meet the demands of rapidly updating and changing urban construction.
With the rapid development of sensors and aerospace technology, the temporal, spatial and spectral resolution of remote sensing images keeps increasing. Remote sensing can obtain more detailed ground-feature information in a shorter time, which makes it possible to detect a given type of building from remote sensing images.
Traditionally, the detection of a particular building in a remote sensing image has been based primarily on artificial features such as corners, edges and textures. Although the methods based on these features are easy to understand, the detection accuracy is often low due to the limited amount of information and the lack of spatial structure information in the artificially constructed features. Furthermore, these methods are poorly portable and difficult to use universally between different types of buildings.
Although a Convolutional Neural Network (CNN) has a strong capability for mining spatial structure information and, thanks to its automatic learning mechanism, strong universality, CNN-based building detection still suffers from a low recall rate and inaccurate detection.
Disclosure of Invention
The invention provides a building detection method and device based on a remote sensing image, which are used for solving the defects of low recall rate and inaccurate detection of the remote sensing image building detection in the prior art and realizing the accurate detection of the remote sensing image building.
The invention provides a building detection method based on a remote sensing image, which comprises the following steps:
inputting a remote sensing image to be detected into a first network in a building detection model, and outputting the characteristics of the remote sensing image to be detected;
inputting the characteristics of the remote sensing image to be detected into a second network in the building detection model, and outputting a prediction frame of a building in the remote sensing image to be detected;
wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers;
the building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label.
According to the building detection method based on the remote sensing image provided by the invention, each SE-ResNeXt layer comprises a preset number of weight-sharing transformation layers;
each transformation layer comprises two CBL structures;
each CBL structure comprises a convolution layer, a batch normalization layer and a LeakyReLU layer;
wherein the convolution kernel sizes of the convolution layers in the two CBL structures are different.
According to the building detection method based on the remote sensing image, each SE-ResNeXt layer further comprises an SEnet layer, and each SE-ResNeXt layer uses one skip-layer connection.
According to the building detection method based on the remote sensing image provided by the invention, before the remote sensing image to be detected is input into the first network in the building detection model and its features are output, the method further comprises the following steps:
calculating a first intersection ratio between a prediction frame and a real frame according to the overlapping area between the prediction frame and the real frame of the building in the remote sensing image sample, the distance between the center point of the prediction frame and the center point of the real frame, and the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame;
calculating a position loss between the prediction frame and the real frame according to the first intersection ratio;
and training the building detection model according to the position loss.
According to the building detection method based on the remote sensing image, provided by the invention, the first intersection ratio between the prediction frame and the real frame is calculated through the following formula:
$$\mathrm{CIOU} = \mathrm{IOU} - \frac{l^2(O_p, O_l)}{c^2} - \alpha\nu$$

wherein CIOU is the first intersection ratio; IOU is the second intersection ratio, calculated from the overlapping area between the prediction frame and the real frame; $O_p$ denotes the center point of the prediction frame; $O_l$ denotes the center point of the real frame; $l(O_p, O_l)$ denotes the distance between $O_p$ and $O_l$; $c$ denotes the diagonal length of the minimum bounding rectangle that simultaneously encloses the prediction frame and the real frame; $\nu$ denotes the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame; and $\alpha$ denotes the coefficient of $\nu$.
According to the building detection method based on the remote sensing image, the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame is calculated through the following formula:

$$\nu = \frac{4}{\pi^2}\left(\arctan\frac{w_t}{h_t} - \arctan\frac{w_p}{h_p}\right)^2$$

wherein $w_t$ denotes the width of the real frame, $h_t$ denotes the height of the real frame, $w_p$ denotes the width of the prediction frame, and $h_p$ denotes the height of the prediction frame.
According to the building detection method based on the remote sensing image, provided by the invention, the coefficient is calculated through the following formula:
$$\alpha = \frac{\nu}{(1 - \mathrm{IOU}) + \nu}$$
according to the building detection method based on the remote sensing image, provided by the invention, the position loss between the prediction frame and the real frame is calculated according to the first intersection ratio through the following formula:
$$\mathrm{CIOU\_LOSS} = \mathrm{Confidence} \times (2 - w_t \times h_t) \times (1 - \mathrm{CIOU})$$

wherein CIOU_LOSS is the position loss, CIOU is the first intersection ratio, $w_t$ denotes the width of the real frame, $h_t$ denotes the height of the real frame, and Confidence denotes the confidence of the prediction frame.
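The CIOU and position-loss formulas above can be sketched in plain Python. The (x1, y1, x2, y2) corner encoding of the boxes is an assumption for illustration; the patent does not fix a box representation, and here the loss takes the normalized width and height of the real frame as separate arguments.

```python
import math

# Sketch of the CIOU and position-loss formulas above. Boxes are assumed to
# be (x1, y1, x2, y2) corner tuples, which is an illustrative choice, not a
# representation taken from the patent.

def ciou(pred, true):
    """Complete IoU between two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = true

    # Second intersection ratio (plain IOU) from the overlapping area.
    ix1, iy1 = max(px1, tx1), max(py1, ty1)
    ix2, iy2 = min(px2, tx2), min(py2, ty2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    iou = inter / (area_p + area_t - inter)

    # Squared distance l^2(Op, Ol) between the two box centers.
    l2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 + \
         ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2

    # Squared diagonal c^2 of the minimum rectangle enclosing both boxes.
    cx1, cy1 = min(px1, tx1), min(py1, ty1)
    cx2, cy2 = max(px2, tx2), max(py2, ty2)
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2

    # Aspect-ratio consistency v and its coefficient alpha.
    wp, hp = px2 - px1, py2 - py1
    wt, ht = tx2 - tx1, ty2 - ty1
    v = (4 / math.pi ** 2) * (math.atan(wt / ht) - math.atan(wp / hp)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0

    return iou - l2 / c2 - alpha * v

def ciou_loss(pred, true, confidence, wt_norm, ht_norm):
    """CIOU_LOSS = Confidence x (2 - wt x ht) x (1 - CIOU), wt/ht normalized."""
    return confidence * (2 - wt_norm * ht_norm) * (1 - ciou(pred, true))
```

For identical boxes, IOU = 1, the center distance is 0 and ν = 0, so CIOU = 1 and the position loss vanishes; for disjoint boxes the distance term still supplies a gradient even though IOU = 0.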
The invention also provides a building detection device based on the remote sensing image, which comprises:
the extraction module is used for inputting the remote sensing image to be detected into a first network in a building detection model and outputting the characteristics of the remote sensing image to be detected;
the detection module is used for inputting the characteristics of the remote sensing image to be detected into a second network in the building detection model and outputting a prediction frame of a building in the remote sensing image to be detected;
wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers;
the building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the steps of the building detection method based on the remote sensing image.
According to the building detection method and device based on the remote sensing image, the first network in the building detection model serves as the feature extraction network; the first network comprises a plurality of parts, each comprising a downsampling layer and one or more SE-ResNeXt layers, so that the first network keeps a large depth. This improves the recall rate of building detection and makes building detection more accurate, faster and more robust.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a method for building detection based on remote sensing images provided by the present invention;
FIG. 2 is a schematic structural diagram of a building detection model in the building detection method based on remote sensing images provided by the invention;
FIG. 3 is a schematic structural diagram of a building detection device based on remote sensing images provided by the invention;
fig. 4 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The building detection method based on remote sensing images of the invention is described below with reference to fig. 1, and comprises the following steps: step 101, inputting a remote sensing image to be detected into a first network in a building detection model, and outputting the characteristics of the remote sensing image to be detected; wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers;
the remote sensing image to be detected is a remote sensing image needing building detection.
The building detection model includes a first network and a second network. The first network is used for extracting the characteristics of the remote sensing image to be detected. The second network is used for building detection according to the features extracted by the first network.
Each part of the first network is treated as one nSR module. Optionally, the first network comprises 5 nSR modules with n being 1, 2, 8, 8 and 4, respectively, as shown in fig. 2.
The number of nSR modules and the value of n included in the first network may be set as desired.
Each nSR module is formed by stacking one downsampling layer and n SE-ResNeXt layers, as shown in fig. 2, so that the first network maintains a large depth.
Optionally, the first layer of the first network is a CBL structure, which comprises one convolution layer (Conv2D), one batch normalization layer (BN) and one LeakyReLU layer, as shown in fig. 2.
Optionally, the remote sensing image to be detected is preprocessed before the remote sensing image to be detected is subjected to building detection. The pre-processing includes geometric correction and image registration. The geometric correction is used for eliminating deformation of the geometric position, shape and other characteristics of the building on the remote sensing image caused by factors such as atmospheric refraction, earth curvature, topographic relief and the like.
Optionally, the preprocessed remote sensing image to be detected is cropped into blocks of a preset resolution, for example 416 × 416. The cropped image blocks (Input) are fed into the trained building detection model for detection.
The image blocks carrying the building prediction frames output by the building detection model are then stitched together according to their original positions in the image to be detected, yielding a complete remote sensing image with the detection results.
Optionally, if the remote sensing image to be detected is an RGB image, the image block Input is an image of 416 × 416 × 3.
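The cropping-and-stitching step described above can be sketched with NumPy. The tile size (416 in the text) is a parameter here, and zero-padding of the ragged right/bottom edges is an assumption; the patent does not state how partial tiles are handled.

```python
import numpy as np

# Sketch of the tiling step described above: the preprocessed image is cut
# into fixed-size blocks (416 x 416 in the text), each block is detected
# independently, and the blocks are stitched back by their recorded origins.
# Zero-padding of partial edge tiles is an illustrative assumption.

def tile_image(image, size):
    """Split an H x W x C image into size x size tiles with their origins."""
    h, w, c = image.shape
    ph = (size - h % size) % size          # rows of zero padding needed
    pw = (size - w % size) % size          # cols of zero padding needed
    padded = np.pad(image, ((0, ph), (0, pw), (0, 0)))
    tiles = []
    for y in range(0, padded.shape[0], size):
        for x in range(0, padded.shape[1], size):
            tiles.append((y, x, padded[y:y + size, x:x + size]))
    return tiles, padded.shape

def stitch_tiles(tiles, padded_shape, orig_shape):
    """Reassemble tiles at their recorded origins and crop away the padding."""
    out = np.zeros(padded_shape, dtype=tiles[0][2].dtype)
    for y, x, t in tiles:
        out[y:y + t.shape[0], x:x + t.shape[1]] = t
    h, w, _ = orig_shape
    return out[:h, :w]

img = np.arange(5 * 7 * 3).reshape(5, 7, 3)   # small stand-in for an RGB scene
tiles, pshape = tile_image(img, 4)
restored = stitch_tiles(tiles, pshape, img.shape)
```

The round trip recovers the original image, which is what lets per-tile detections be placed back into the full scene.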
Step 102, inputting the characteristics of the remote sensing image to be detected into a second network in the building detection model, and outputting a prediction frame of a building in the remote sensing image to be detected;
optionally, as shown in fig. 2, the output of the second network includes a small-scale prediction result output, a medium-scale prediction result output, and a large-scale prediction result output.
For the small-scale prediction output, the remote sensing image to be detected is downsampled multiple times by the first network to obtain a small-scale feature map; for example, a 416 × 416 remote sensing image is reduced by the first network to a 13 × 13 feature map, which, after post-processing by the second network, yields prediction frames at the 13 × 13 scale.
For the medium-scale prediction output, the output add1 of the i-th part of the first network is spliced, through a concat layer, with the upsampled final output of the first network; the spliced result is then post-processed by the second network to obtain prediction frames at the 26 × 26 scale. For example, the i-th part is the second 8SR module.
For the large-scale prediction output, the output add2 of the j-th part of the first network is spliced, through a concat layer, with the upsampled version of the previous splicing result; the spliced result is then post-processed by the second network to obtain prediction frames at the 52 × 52 scale. For example, the j-th part is the first 8SR module, where i > j.
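The three output scales can be checked with a small NumPy sketch: a 416 × 416 input downsampled by overall strides of 32, 16 and 8 gives 13 × 13, 26 × 26 and 52 × 52 grids, and the concat layer joins a 2× upsampled deep feature map with a shallower one along the channel axis. The channel counts used here (512 and 256) are hypothetical, not values from the patent.

```python
import numpy as np

# Shape sketch of the multi-scale head described above. Channel counts are
# illustrative assumptions; only the spatial arithmetic follows the text.

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of an H x W x C feature map."""
    return np.repeat(np.repeat(feat, 2, axis=0), 2, axis=1)

def concat_channels(a, b):
    """Concatenate two feature maps along the channel axis (the concat layer)."""
    return np.concatenate([a, b], axis=2)

deep = np.zeros((13, 13, 512))   # final first-network output (stride 32)
skip = np.zeros((26, 26, 256))   # output add1 of the i-th part (stride 16)
merged = concat_channels(upsample2x(deep), skip)

grids = (416 // 32, 416 // 16, 416 // 8)   # the three prediction grids
```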
The second network in the present embodiment is not limited to the structure shown in fig. 2.
The building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label.
When the building detection model is trained, remote sensing image samples containing buildings are collected and marked with real frames and categories of the buildings.
All remote sensing image samples contain the same or different classes of buildings.
When the remote sensing image samples containing different types of buildings are used for training the building detection model, the building detection model can detect the different types of buildings.
Optionally, the remote sensing image sample is a 416 x 416 image block.
The building detection model is trained with the collected remote sensing image samples, thereby optimizing the weight parameters in the building detection model.
In this method, the first network of the building detection model serves as the feature extraction network; it comprises multiple parts, each containing a downsampling layer and one or more SE-ResNeXt layers, so that the first network keeps a large depth. This improves the recall rate of building detection and makes detection more accurate, faster and more robust.
On the basis of the above embodiment, each SE-ResNeXt layer in this embodiment includes a preset number of weight-sharing transformation layers; each transformation layer comprises two CBL structures; each CBL structure comprises a convolution layer, a batch normalization layer and a LeakyReLU layer; wherein the convolution kernel sizes of the convolution layers in the two CBL structures are different.
Optionally, as shown in fig. 2, the preset number (the cardinality) is 16. The 16 transformation layers are arranged in parallel.
The two CBL structures in each transformation layer are connected in series.
Optionally, the convolution kernel sizes of the two CBL structures in each transformation layer are 1 × 1 and 3 × 3, respectively.
The CBL structure is the most frequently used basic component in the building detection model.
In this embodiment, the learning capability of the building detection model is improved by using multiple weight-sharing transformation layers, while the weight-sharing strategy keeps the number of model parameters from growing as model performance improves; in addition, this design avoids the gradient propagation problems that arise when a model is deepened or widened to improve accuracy.
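The weight-sharing claim can be illustrated with a worked parameter count. Each transformation layer is modeled as a 1 × 1 convolution followed by a 3 × 3 convolution, ignoring the comparatively tiny BN and LeakyReLU parameters; the channel widths (256 in, 64 internal) are hypothetical, not values from the patent.

```python
# Worked parameter-count sketch for the weight-sharing transformation layers
# above. Channel widths c_in = 256 and d = 64 are illustrative assumptions.

def transform_params(c_in, d):
    """Parameters of one transformation layer: 1x1 (c_in->d) + 3x3 (d->d)."""
    return (1 * 1 * c_in * d) + (3 * 3 * d * d)

c_in, d, cardinality = 256, 64, 16
per_branch = transform_params(c_in, d)

# Independent weights: parameters scale linearly with the cardinality.
independent = cardinality * per_branch
# Weight sharing: all 16 branches reuse one set, so the count stays flat.
shared = per_branch
```

Under these assumed widths, sharing keeps the layer at one branch's worth of parameters while an independently weighted version would cost 16 times as much, which is the effect the paragraph above describes.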
On the basis of the above embodiment, each SE-ResNeXt layer in this embodiment further includes an SEnet layer, and each SE-ResNeXt layer uses one skip-layer connection.
Optionally, as shown in fig. 2, the outputs of all the transformation layers, after being spliced by the splicing layer, pass in turn through the convolution layer, the batch normalization layer and the SEnet layer, ensuring that the dominant features learned in the current SE-ResNeXt layer are preferentially applied in the learning process of the next SE-ResNeXt layer. In addition, adding SEnet suppresses the interference of useless features on learning.
The structure of SEnet is shown in fig. 2.
One skip-layer connection is used at each SE-ResNeXt layer to avoid gradient problems during backpropagation.
The embodiment introduces an attention mechanism into the model through the use of SEnet, so that the learned characteristics are selectively used, the dominant characteristics are preferentially utilized, and meanwhile, the interference of useless characteristics is avoided.
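The SEnet attention step can be sketched in a few lines of NumPy: squeeze each channel to a single value by global average pooling, pass it through a small two-layer bottleneck, gate with a sigmoid, and rescale the channels. The reduction ratio (4) and the random weights are illustrative assumptions.

```python
import numpy as np

# Minimal NumPy sketch of the SEnet (squeeze-and-excitation) step described
# above. The reduction ratio r = 4 and the random weights are assumptions,
# not values taken from the patent.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(feat, w1, w2):
    """Apply squeeze-and-excitation gating to an H x W x C feature map."""
    z = feat.mean(axis=(0, 1))               # squeeze: per-channel average, (C,)
    s = sigmoid(np.maximum(z @ w1, 0) @ w2)  # excitation: FC-ReLU-FC-sigmoid
    return feat * s                          # rescale each channel by its gate

rng = np.random.default_rng(0)
c, r = 8, 4
feat = rng.standard_normal((13, 13, c))
w1 = rng.standard_normal((c, c // r))
w2 = rng.standard_normal((c // r, c))
out = se_block(feat, w1, w2)
```

Because each gate lies in (0, 1), the block can only attenuate channels, which is how weak (useless) features are suppressed while dominant ones pass through nearly unchanged.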
On the basis of the foregoing embodiments, in this embodiment, before inputting the remote sensing image to be detected into the first network in the building detection model and outputting its features, the method further comprises: calculating a first intersection ratio between a prediction frame and a real frame according to the overlapping area between the prediction frame and the real frame of the building in the remote sensing image sample, the distance between the center point of the prediction frame and the center point of the real frame, and the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame; calculating a position loss between the prediction frame and the real frame according to the first intersection ratio; and training the building detection model according to the position loss.
In the prior art, frame regression for buildings is mainly guided by the intersection-over-union ratio (IOU). However, the IOU considers only the overlapping region between the prediction frame and the real frame; when one frame contains the other, or when the two frames do not intersect at all, it provides no optimization direction.
The first intersection ratio in this embodiment considers the overlapping area between the prediction frame and the real frame, the distance between the center points of the two frames, and the consistency of their aspect ratios. It can therefore provide an optimization direction under all positional relations between the prediction frame and the real frame, making the frame regression for buildings, and hence building detection, more accurate.
Optionally, a loss function is constructed based on the position loss and confidence loss between the predicted frame and the real frame of the building in the remote sensing image sample, and the class loss between the predicted class and the real class of the building in the remote sensing image sample.
Optionally, the second network of the building detection model outputs a prediction box, but the prediction box may not contain the target building, and the confidence loss of the prediction box is calculated according to whether the target is contained in the prediction box.
Optionally, the second network in the building detection model further outputs the category of the building, and the category loss is determined according to the predicted category and the real category of the building in the remote sensing image sample.
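The composite loss described in the preceding paragraphs can be sketched as follows. Using binary cross-entropy for the confidence and class terms is an assumption; the patent names those loss terms but does not give their exact form.

```python
import math

# Hedged sketch of the composite loss above: position loss (CIOU-based) plus
# confidence loss plus class loss. Binary cross-entropy for the latter two
# terms is an illustrative assumption, not the patent's stated formula.

def bce(p, y, eps=1e-7):
    """Binary cross-entropy between predicted probability p and label y."""
    p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def total_loss(position_loss, conf_pred, conf_label, cls_pred, cls_label):
    """Sum the position, confidence and class loss terms."""
    confidence_loss = bce(conf_pred, conf_label)
    class_loss = bce(cls_pred, cls_label)
    return position_loss + confidence_loss + class_loss
```

With perfect confidence and class predictions, the total reduces to the position loss alone, so the three terms decompose cleanly during training.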
On the basis of the above embodiment, in this embodiment, the first intersection ratio between the prediction frame and the real frame is calculated by the following formula:
$$\mathrm{CIOU} = \mathrm{IOU} - \frac{l^2(O_p, O_l)}{c^2} - \alpha\nu$$

wherein CIOU is the first intersection ratio; IOU is the second intersection ratio, calculated from the overlapping area between the prediction frame and the real frame; $O_p$ denotes the center point of the prediction frame; $O_l$ denotes the center point of the real frame; $l(O_p, O_l)$ denotes the distance between $O_p$ and $O_l$; $c$ denotes the diagonal length of the minimum bounding rectangle that simultaneously encloses the prediction frame and the real frame; $\nu$ denotes the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame; and $\alpha$ denotes the coefficient of $\nu$.
The present embodiment is not limited to a specific method of calculating the second intersection ratio according to the overlapping area between the prediction box and the real box.
On the basis of the above embodiment, in the present embodiment, the consistency between the aspect ratio of the prediction box and the aspect ratio of the real box is calculated by the following formula:
$$\nu = \frac{4}{\pi^2}\left(\arctan\frac{w_t}{h_t} - \arctan\frac{w_p}{h_p}\right)^2$$

wherein $w_t$ denotes the width of the real box, $h_t$ denotes the height of the real box, $w_p$ denotes the width of the prediction box, and $h_p$ denotes the height of the prediction box.
On the basis of the above embodiment, the coefficient is calculated in the present embodiment by the following formula:
$$\alpha = \frac{\nu}{(1 - \mathrm{IOU}) + \nu}$$

wherein $\alpha$ balances the proportion of $\nu$ in the first intersection ratio.
On the basis of the above embodiment, in the present embodiment, the position loss between the prediction frame and the real frame is calculated according to the first intersection ratio by the following formula:
$$\mathrm{CIOU\_LOSS} = \mathrm{Confidence} \times (2 - w_t \times h_t) \times (1 - \mathrm{CIOU})$$

wherein CIOU_LOSS is the position loss, CIOU is the first intersection ratio, $w_t$ denotes the width of the real frame, $h_t$ denotes the height of the real frame, and Confidence denotes the confidence of the prediction frame.
Optionally, when there is a building in the prediction box, the Confidence is 1, otherwise, the Confidence is 0.
$w_t$ and $h_t$ denote the normalized width and height of the real box, respectively, and both range between 0 and 1.
The program implementing the method of this embodiment was run on a Windows 10 operating system equipped with an RTX 2080 Ti discrete graphics card (11 GB of memory) and an i9-9900K processor. Experiments with remote sensing image samples show that this embodiment improves the precision of gas station detection by 40% on average and the recall rate by 50% on average, while reducing the number of parameters by 9 MB. The method achieves accurate detection of specific buildings in remote sensing images and has high application value.
The building detection device based on the remote sensing image provided by the invention is described below, and the building detection device based on the remote sensing image described below and the building detection method based on the remote sensing image described above can be referred to correspondingly.
As shown in fig. 3, the apparatus comprises an extraction module 301 and a detection module 302, wherein:
the extraction module 301 is configured to input a remote sensing image to be detected into a first network in a building detection model, and output characteristics of the remote sensing image to be detected;
the remote sensing image to be detected is a remote sensing image needing building detection.
The building detection model includes a first network and a second network. The first network is used for extracting the characteristics of the remote sensing image to be detected. The second network is used for building detection according to the features extracted by the first network.
The detection module 302 is configured to input the features of the remote sensing image to be detected into a second network in the building detection model, and output a prediction frame of a building in the remote sensing image to be detected;
the present embodiment is not limited to the structure of the second network.
Wherein the first network comprises a plurality of portions, each portion comprising a downsampling layer, and one or more SE-ResNeXt layers;
each part of the first network is treated as one nSR module. Each nSR module is formed by the superposition of one down-sampling layer down-sampling and n SE-resenext layers.
The number of nSR modules and the value of n included in the first network may be set as desired.
The building detection model is obtained by training by taking a remote sensing image sample containing a building as a sample and taking a real frame of the building in the remote sensing image sample as a label.
When the building detection model is trained, remote sensing image samples containing buildings are collected and marked with real frames and categories of the buildings.
All remote sensing image samples contain the same or different classes of buildings.
When the remote sensing image samples containing different types of buildings are used for training the building detection model, the building detection model can detect the different types of buildings.
The building detection model is trained with the collected remote sensing image samples, thereby optimizing the weight parameters in the building detection model.
In this device, the first network of the building detection model serves as the feature extraction network; it comprises multiple parts, each containing a downsampling layer and one or more SE-ResNeXt layers, so that the first network keeps a large depth. This improves the recall rate of building detection and makes detection more accurate, faster and more robust.
On the basis of the above embodiment, each SE-ResNeXt layer in this embodiment includes a preset number of weight-sharing transformation layers; each transformation layer comprises two CBL structures; each CBL structure comprises a convolution layer, a batch normalization layer, and a LeakyReLU layer; wherein the convolution kernels of the convolution layers in the two CBL structures have different sizes.
On the basis of the above embodiment, each SE-ResNeXt layer in this embodiment further includes an SEnet layer, and each SE-ResNeXt layer uses one skip-layer connection.
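The weight-shared transformation layers echo the ResNeXt idea of repeating one topology across parallel paths. A back-of-the-envelope parameter count (illustrative only: the 1×1/3×3 kernel pair, channel widths, and path count of 32 are my assumptions, not values fixed by the patent) shows why sharing weights across the preset number of transformation paths keeps the layer compact:

```python
def conv_params(c_in, c_out, k):
    """Weight count of a k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def transform_layer_params(c_in, c_mid, shared=True, paths=32):
    """One transformation layer = two CBL structures whose convolution
    kernels differ in size (assumed here to be 1x1 then 3x3).
    With weight sharing, all parallel paths reuse one set of weights."""
    one_path = conv_params(c_in, c_mid, 1) + conv_params(c_mid, c_mid, 3)
    return one_path if shared else one_path * paths

shared = transform_layer_params(256, 4)
unshared = transform_layer_params(256, 4, shared=False)
print(shared, unshared)  # -> 1168 37376
```

Under these assumed widths, sharing divides the transformation-layer weight count by the number of paths, which is one plausible reading of why the patent specifies weight-shared transformation layers.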
On the basis of the foregoing embodiments, this embodiment further includes a training module, configured to: calculate a first intersection ratio (intersection-over-union) between a prediction frame and a real frame of a building in the remote sensing image sample according to the overlapping area between the prediction frame and the real frame, the distance between the center point of the prediction frame and the center point of the real frame, and the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame; calculate a position loss between the prediction frame and the real frame according to the first intersection ratio; and train the building detection model according to the position loss.
On the basis of the above embodiment, in this embodiment, the training module calculates a first intersection ratio between the prediction box and the real box by using the following formula:
CIOU = IOU − l(O_p, O_l)²/c² − α×ν
wherein CIOU is the first intersection ratio; IOU is the second intersection ratio, calculated from the overlapping area between the prediction frame and the real frame; O_p denotes the center point of the prediction frame; O_l denotes the center point of the real frame; l(O_p, O_l) denotes the distance between O_p and O_l; c denotes the diagonal distance of the minimum bounding rectangle that simultaneously encloses the prediction frame and the real frame; ν denotes the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame; and α denotes the coefficient of ν.
On the basis of the above embodiment, in this embodiment, the training module calculates the consistency between the aspect ratio of the prediction box and the aspect ratio of the real box by the following formula:
ν = (4/π²) × (arctan(w_t/h_t) − arctan(w_p/h_p))²
wherein w_t denotes the width of the real frame, h_t the height of the real frame, w_p the width of the prediction frame, and h_p the height of the prediction frame.
On the basis of the above embodiment, in this embodiment, the training module calculates the coefficient by the following formula:
α = ν / ((1 − IOU) + ν)
on the basis of the above embodiment, in this embodiment, the training module calculates the position loss between the prediction frame and the real frame according to the first intersection ratio by the following formula:
CIOU_LOSS = Confidence × (2 − w_t×h_t) × (1 − CIOU);

wherein CIOU_LOSS is the position loss, CIOU is the first intersection ratio, w_t denotes the width of the real frame, h_t denotes the height of the real frame, and Confidence denotes the confidence of the prediction frame.
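Putting the preceding formulas together, the following is a minimal pure-Python sketch of the CIoU-based position loss. The corner-format (x1, y1, x2, y2) boxes, coordinates normalized to [0, 1], and the axis-aligned IoU helper are my assumptions; the patent only supplies the formulas themselves:

```python
import math

def iou(box_a, box_b):
    """Second intersection ratio: plain IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area(box_a) + area(box_b) - inter)

def ciou_loss(pred, truth, confidence):
    """CIOU_LOSS = Confidence x (2 - w_t x h_t) x (1 - CIOU).
    Assumes non-degenerate boxes with normalized widths/heights."""
    wp, hp = pred[2] - pred[0], pred[3] - pred[1]
    wt, ht = truth[2] - truth[0], truth[3] - truth[1]
    # squared distance between the two center points: l(O_p, O_l)^2
    d2 = ((pred[0] + pred[2]) / 2 - (truth[0] + truth[2]) / 2) ** 2 + \
         ((pred[1] + pred[3]) / 2 - (truth[1] + truth[3]) / 2) ** 2
    # squared diagonal of the smallest rectangle enclosing both boxes: c^2
    cx1, cy1 = min(pred[0], truth[0]), min(pred[1], truth[1])
    cx2, cy2 = max(pred[2], truth[2]), max(pred[3], truth[3])
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    # aspect-ratio consistency term and its coefficient
    v = (4 / math.pi ** 2) * (math.atan(wt / ht) - math.atan(wp / hp)) ** 2
    i = iou(pred, truth)
    alpha = v / ((1 - i) + v) if v > 0 else 0.0
    ciou = i - d2 / c2 - alpha * v
    return confidence * (2 - wt * ht) * (1 - ciou)

# Two offset 0.4x0.4 boxes, confidence 1.0:
print(ciou_loss((0.1, 0.1, 0.5, 0.5), (0.2, 0.2, 0.6, 0.6), 1.0))  # ≈ 1.1936
```

The `(2 - w_t×h_t)` factor up-weights small ground-truth boxes (small normalized area pushes the factor toward 2), which fits the small-building case common in remote sensing imagery.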
Fig. 4 illustrates a physical structure diagram of an electronic device. As shown in Fig. 4, the electronic device may include: a processor 410, a communication interface 420, a memory 430, and a communication bus 440, wherein the processor 410, the communication interface 420, and the memory 430 communicate with one another via the communication bus 440. The processor 410 may invoke logic instructions in the memory 430 to perform the building detection method based on remote sensing images, the method comprising: inputting a remote sensing image to be detected into a first network in a building detection model, and outputting features of the remote sensing image to be detected; inputting the features of the remote sensing image to be detected into a second network in the building detection model, and outputting a prediction frame of a building in the remote sensing image to be detected; wherein the first network comprises a plurality of parts, each part comprising a down-sampling layer and one or more SE-ResNeXt layers; and the building detection model is trained using remote sensing image samples containing buildings as samples, with the real frames of the buildings in the remote sensing image samples as labels.
In addition, the logic instructions in the memory 430 may be implemented in the form of software functional units and, when sold or used as independent products, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the building detection method based on remote sensing images provided above, the method comprising: inputting a remote sensing image to be detected into a first network in a building detection model, and outputting features of the remote sensing image to be detected; inputting the features of the remote sensing image to be detected into a second network in the building detection model, and outputting a prediction frame of a building in the remote sensing image to be detected; wherein the first network comprises a plurality of parts, each part comprising a down-sampling layer and one or more SE-ResNeXt layers; and the building detection model is trained using remote sensing image samples containing buildings as samples, with the real frames of the buildings in the remote sensing image samples as labels.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the building detection method based on remote sensing images provided above, the method comprising: inputting a remote sensing image to be detected into a first network in a building detection model, and outputting features of the remote sensing image to be detected; inputting the features of the remote sensing image to be detected into a second network in the building detection model, and outputting a prediction frame of a building in the remote sensing image to be detected; wherein the first network comprises a plurality of parts, each part comprising a down-sampling layer and one or more SE-ResNeXt layers; and the building detection model is trained using remote sensing image samples containing buildings as samples, with the real frames of the buildings in the remote sensing image samples as labels.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A building detection method based on remote sensing images, comprising:
inputting a remote sensing image to be detected into a first network in a building detection model, and outputting features of the remote sensing image to be detected;
inputting the features of the remote sensing image to be detected into a second network in the building detection model, and outputting a prediction frame of a building in the remote sensing image to be detected;
wherein the first network comprises a plurality of parts, each part comprising a down-sampling layer and one or more SE-ResNeXt layers;
the building detection model is trained using remote sensing image samples containing buildings as samples, with the real frames of the buildings in the remote sensing image samples as labels.

2. The building detection method based on remote sensing images according to claim 1, wherein each SE-ResNeXt layer comprises a preset number of weight-sharing transformation layers;
each transformation layer comprises two CBL structures;
each CBL structure comprises a convolution layer, a batch normalization layer, and a LeakyReLU layer;
wherein the convolution kernels of the convolution layers in the two CBL structures have different sizes.

3. The building detection method based on remote sensing images according to claim 2, wherein each SE-ResNeXt layer further comprises an SEnet layer, and each SE-ResNeXt layer uses one skip-layer connection.
4. The building detection method based on remote sensing images according to any one of claims 1-3, wherein before inputting the remote sensing image to be detected into the first network in the building detection model and outputting the features of the remote sensing image to be detected, the method further comprises:
calculating a first intersection ratio between a prediction frame and a real frame of a building in the remote sensing image sample according to the overlapping area between the prediction frame and the real frame, the distance between the center point of the prediction frame and the center point of the real frame, and the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame;
calculating a position loss between the prediction frame and the real frame according to the first intersection ratio;
training the building detection model according to the position loss.

5. The building detection method based on remote sensing images according to claim 4, wherein the first intersection ratio between the prediction frame and the real frame is calculated by the following formula:
CIOU = IOU − l(O_p, O_l)²/c² − α×ν
wherein CIOU is the first intersection ratio; IOU is the second intersection ratio, calculated from the overlapping area between the prediction frame and the real frame; O_p denotes the center point of the prediction frame; O_l denotes the center point of the real frame; l(O_p, O_l) denotes the distance between O_p and O_l; c denotes the diagonal distance of the minimum rectangle that simultaneously encloses the prediction frame and the real frame; ν denotes the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame; and α denotes the coefficient of ν.
6. The building detection method based on remote sensing images according to claim 5, wherein the consistency between the aspect ratio of the prediction frame and the aspect ratio of the real frame is calculated by the following formula:
ν = (4/π²) × (arctan(w_t/h_t) − arctan(w_p/h_p))²
wherein w_t denotes the width of the real frame, h_t the height of the real frame, w_p the width of the prediction frame, and h_p the height of the prediction frame.
7. The building detection method based on remote sensing images according to claim 5, wherein the coefficient is calculated by the following formula:
α = ν / ((1 − IOU) + ν)
8. The building detection method based on remote sensing images according to claim 4, wherein the position loss between the prediction frame and the real frame is calculated from the first intersection ratio by the following formula:
CIOU_LOSS = Confidence × (2 − w_t×h_t) × (1 − CIOU);
wherein CIOU_LOSS is the position loss, CIOU is the first intersection ratio, w_t denotes the width of the real frame, h_t denotes the height of the real frame, and Confidence denotes the confidence of the prediction frame.

9. A building detection device based on remote sensing images, comprising:
an extraction module, configured to input a remote sensing image to be detected into a first network in a building detection model and output features of the remote sensing image to be detected;
a detection module, configured to input the features of the remote sensing image to be detected into a second network in the building detection model and output a prediction frame of a building in the remote sensing image to be detected;
wherein the first network comprises a plurality of parts, each part comprising a down-sampling layer and one or more SE-ResNeXt layers;
the building detection model is trained using remote sensing image samples containing buildings as samples, with the real frames of the buildings in the remote sensing image samples as labels.

10. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the building detection method based on remote sensing images according to any one of claims 1 to 8.
CN202110382233.2A 2021-04-09 2021-04-09 Building detection method and device based on remote sensing image Active CN113269717B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110382233.2A CN113269717B (en) 2021-04-09 2021-04-09 Building detection method and device based on remote sensing image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110382233.2A CN113269717B (en) 2021-04-09 2021-04-09 Building detection method and device based on remote sensing image

Publications (2)

Publication Number Publication Date
CN113269717A true CN113269717A (en) 2021-08-17
CN113269717B CN113269717B (en) 2025-01-21

Family

ID=77228583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110382233.2A Active CN113269717B (en) 2021-04-09 2021-04-09 Building detection method and device based on remote sensing image

Country Status (1)

Country Link
CN (1) CN113269717B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114239755A (en) * 2022-02-25 2022-03-25 北京智弘通达科技有限公司 Intelligent identification method for color steel tile buildings along railway based on deep learning
CN117115641A (en) * 2023-07-20 2023-11-24 中国科学院空天信息创新研究院 Building information extraction method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018214195A1 (en) * 2017-05-25 2018-11-29 中国矿业大学 Remote sensing imaging bridge detection method based on convolutional neural network
CN108921190A (en) * 2018-05-24 2018-11-30 北京飞搜科技有限公司 A kind of image classification method, device and electronic equipment
CN109886106A (en) * 2019-01-15 2019-06-14 浙江大学 A method for detecting building changes in remote sensing images based on deep learning
CN111798417A (en) * 2020-06-19 2020-10-20 中国资源卫星应用中心 SSD-based remote sensing image target detection method and device


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张通 (Zhang Tong): "An automatic building detection method for high-resolution remote sensing imagery", 测绘地理信息, 22 November 2019 (2019-11-22) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114239755A (en) * 2022-02-25 2022-03-25 北京智弘通达科技有限公司 Intelligent identification method for color steel tile buildings along railway based on deep learning
CN117115641A (en) * 2023-07-20 2023-11-24 中国科学院空天信息创新研究院 Building information extraction method and device, electronic equipment and storage medium
CN117115641B (en) * 2023-07-20 2024-03-22 中国科学院空天信息创新研究院 Building information extraction method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113269717B (en) 2025-01-21

Similar Documents

Publication Publication Date Title
CN112016614B (en) Construction method of optical image target detection model, target detection method and device
CN109960742B (en) Local information searching method and device
CN113052109A (en) 3D target detection system and 3D target detection method thereof
CN115082674A (en) A 3D object detection method based on multimodal data fusion based on attention mechanism
CN112489099B (en) Point cloud registration method and device, storage medium and electronic equipment
CN115272691A (en) A training method, identification method and equipment for detecting model of steel bar binding state
CN114463503B (en) Method and device for integrating three-dimensional model and geographic information system
CN113269717A (en) Building detection method and device based on remote sensing image
WO2022100607A1 (en) Method for determining neural network structure and apparatus thereof
CN114089330A (en) Indoor mobile robot glass detection and map updating method based on depth image restoration
CN111611925A (en) Building detection and identification method and device
CN112668461B (en) Intelligent supervision system with wild animal identification function
CN116704505A (en) Target detection method, device, equipment and storage medium
RU2612571C1 (en) Method and system for recognizing urban facilities
CN114494870A (en) A dual-phase remote sensing image change detection method, model building method and device
TWI812888B (en) Image recognition method and image recognition system
CN113902712A (en) Image processing method, device, equipment and medium based on artificial intelligence
CN115345932A (en) A Laser SLAM Loop Closure Detection Method Based on Semantic Information
CN114943870A (en) Training method and device of line feature extraction model and point cloud matching method and device
CN114332533A (en) A Landslide Image Recognition Method and System Based on DenseNet
CN117612029B (en) A remote sensing image target detection method based on progressive feature smoothing and scale-adaptive dilated convolution
CN116152503B (en) Street view-oriented online extraction method and system of urban sky visual domain
CN118154524A (en) Image defect detection method, device, medium and electronic equipment
CN110084203A (en) Full convolutional network aircraft level detection method based on context relation
CN116310756A (en) Remains identification method, remains identification device, electronic equipment and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant