CN117115641B - Building information extraction method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN117115641B
CN117115641B
Authority
CN
China
Prior art keywords
building
feature
remote sensing
extraction
feature extraction
Prior art date
Legal status
Active
Application number
CN202310897212.3A
Other languages
Chinese (zh)
Other versions
CN117115641A (en)
Inventor
Wang Futao
Wang Shixin
Zhou Yi
Wang Zhenqing
Current Assignee
Aerospace Information Research Institute of CAS
Original Assignee
Aerospace Information Research Institute of CAS
Priority date
Filing date
Publication date
Application filed by Aerospace Information Research Institute of CAS
Priority to CN202310897212.3A
Publication of CN117115641A
Application granted
Publication of CN117115641B
Legal status: Active
Anticipated expiration


Classifications

    • G06V 20/176: Scenes; terrestrial scenes; urban or other man-made structures
    • G06N 3/0464: Neural networks; architectures; convolutional networks [CNN, ConvNet]
    • G06N 3/047: Neural networks; architectures; probabilistic or stochastic networks
    • G06N 3/048: Neural networks; architectures; activation functions
    • G06N 3/084: Neural networks; learning methods; backpropagation, e.g. using gradient descent
    • G06V 10/22: Image preprocessing; selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide detection or recognition
    • G06V 10/44: Extraction of image or video features; local feature extraction by analysis of parts of the pattern, e.g. edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 10/764: Recognition or understanding using pattern recognition or machine learning; classification, e.g. of video objects
    • G06V 10/7715: Processing image or video features in feature spaces; feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; mappings, e.g. subspace methods
    • G06V 10/806: Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction, or classification level; fusion of extracted features
    • G06V 20/13: Scenes; terrestrial scenes; satellite images
    • G06V 20/194: Scenes; terrestrial scenes; using hyperspectral data, i.e. more or other wavelengths than RGB
    • Y02D 10/00: Climate change mitigation technologies in ICT; energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a building information extraction method and device, an electronic device, and a storage medium, belonging to the technical field of image processing. The method comprises: acquiring target visible light remote sensing data and target near infrared remote sensing data of a target area; and inputting the target visible light remote sensing data and the target near infrared remote sensing data into a building extraction model to obtain the type and position information of buildings in the target area output by the building extraction model. The type and position information is obtained by the building extraction model performing feature extraction on the target visible light remote sensing data and the target near infrared remote sensing data separately and then fusing the extracted features. According to the building information extraction method and device, electronic device, and storage medium, feature extraction and fusion are performed separately on the visible light and near infrared remote sensing data by the building extraction model, and the near infrared data are used to supplement the features of the visible light data, so that the extraction capability of the model for buildings can be effectively improved.

Description

Building information extraction method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and apparatus for extracting building information, an electronic device, and a storage medium.
Background
Buildings are the main places of human production and daily life, and their spatial distribution is of great value in fields such as urban planning, real estate management, and disaster prevention and mitigation.
High-spatial-resolution remote sensing satellites can quickly acquire large-scale surface observation data, providing a way to obtain building information in a timely manner; for example, machine learning methods can be used to extract features from the satellites' optical data and thereby identify ground buildings.
However, the features extracted by the above method are not comprehensive, which in turn limits the building extraction capability.
Disclosure of Invention
The building information extraction method and device, electronic device, and storage medium provided by the invention are used to solve the defect in the prior art that extracting features from satellite optical data with machine learning to identify ground buildings yields incomplete features and thus limits building extraction capability. By performing separate feature extraction and fusion on visible light and near infrared remote sensing data with a building extraction model, and using the near infrared data to supplement the features of the visible light data, the extraction capability of the model for buildings can be effectively improved.
The invention provides a building information extraction method, which comprises the following steps:
acquiring target visible light remote sensing data and target near infrared remote sensing data of a target area;
inputting the target visible light remote sensing data and the target near infrared remote sensing data into a building extraction model to obtain the type and position information of the building in the target area output by the building extraction model;
the category position information is obtained by respectively carrying out feature extraction on the target visible light remote sensing data and the target near infrared remote sensing data by the building extraction model and then fusing the extracted features.
According to the building information extraction method provided by the invention, the building extraction model comprises: a first feature extraction network, a second feature extraction network, and a feature fusion network; the first feature extraction network and the second feature extraction network are parallel;
the output end of the first feature extraction network is connected with one input end of the feature fusion network, and the output end of the second feature extraction network is connected with the other input end of the feature fusion network;
the first feature extraction network is used for extracting features of the target visible light remote sensing image input by the first input end so as to obtain a first mode data feature set;
The second feature extraction network is used for extracting features of the target near infrared remote sensing image input by the second input end so as to obtain a second mode data feature set;
and the feature fusion network is used for carrying out fusion correction on the first modal data feature set and the second modal data feature set to generate the type and position information of the building in the target area.
According to the building information extraction method provided by the invention, the first feature extraction network and the second feature extraction network are both constructed based on HR-Net with IBN modules added.
According to the building information extraction method provided by the invention, the first feature extraction network comprises: a convolution module and a feature extraction module connected in sequence;
the convolution module comprises an instance normalization module;
the convolution module is used for carrying out dimension normalization on the target visible light remote sensing data to generate normalized remote sensing data;
the feature extraction module is used for performing downsampling and convolution on the normalized remote sensing data to generate the first modal data feature set;
the second feature extraction network is identical in structure to the first feature extraction network.
According to the building information extraction method provided by the invention, the feature fusion network comprises a connection module, an attention module, a dimension reduction module and a discriminator which are connected in sequence;
the connection module is used for carrying out channel dimension splicing on the first modal data feature set and the second modal data feature set so as to generate fusion features;
the attention module is used for carrying out weight correction on the fusion characteristics to generate correction characteristics;
the dimension reduction module is used for carrying out feature dimension reduction on the correction features to generate feature fusion results;
and the discriminator is used for classifying the characteristic fusion result to generate the type and position information of the building in the target area.
According to the building information extraction method provided by the invention, the building extraction model is obtained based on the following steps:
constructing a teacher model and a student model;
acquiring sample visible light remote sensing images, sample near infrared remote sensing images, and building type and position labels of a plurality of sample areas;
taking the combination of the sample visible light remote sensing image, sample near infrared remote sensing image, and building type and position label of any sample area as a training sample of that sample area, so as to obtain a plurality of training samples;
Training the teacher model by utilizing the plurality of training samples to obtain a trained teacher model;
extracting soft labels of each training sample from the trained teacher model;
and training the student model by using the training samples with soft labels, and taking the trained student model as the building extraction model.
The invention also provides a server provided with a processor; the server further comprises a memory and a program or instructions stored on the memory and runnable on the processor, the program or instructions, when executed by the processor, performing any one of the building information extraction methods described above.
The invention also provides a building information extraction device, which comprises: an acquisition module, used for acquiring target visible light remote sensing data and target near infrared remote sensing data of a target area; and an input module, used for inputting the target visible light remote sensing data and the target near infrared remote sensing data into a building extraction model to obtain the type and position information of buildings in the target area output by the building extraction model.
The invention also provides an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the building information extraction method described above when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a building information extraction method as described in any of the above.
The invention also provides a computer program product comprising a computer program which when executed by a processor implements a building information extraction method as described in any one of the above.
According to the building information extraction method and device, electronic device, and storage medium provided by the invention, feature extraction and fusion are performed separately on the visible light remote sensing data and the near infrared remote sensing data by the building extraction model, and the near infrared data are used to supplement the features of the visible light data, so that the extraction capability of the model for buildings can be effectively improved.
Drawings
In order to more clearly illustrate the technical solutions of the invention or of the prior art, the drawings used in the description of the embodiments or of the prior art are briefly described below. It is apparent that the drawings described below show some embodiments of the invention, and that other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a schematic flow chart of a building information extraction method provided by the invention;
FIG. 2 is a schematic diagram of a building extraction model provided by the invention;
FIG. 3 is a schematic structural view of BasicBlock provided by the invention;
FIG. 4 is a schematic structural view of BasicBlock-IBN provided by the invention;
FIG. 5 is a schematic diagram of the structure of Bottleneck provided by the present invention;
FIG. 6 is a schematic diagram of the structure of a Bottleneck-IBN provided by the present invention;
FIG. 7 is a schematic illustration of the structure of ESEP provided by the present invention;
FIG. 8 is a schematic flow diagram of knowledge distillation provided by the present invention;
fig. 9 is a schematic structural view of a building information extraction device provided by the present invention;
fig. 10 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Traditional building extraction methods fall into two classes: one uses unsupervised morphological building indices, and the other uses supervised machine learning.
The morphological building index (Morphological Building Index, MBI) extracts building footprints by establishing a relationship between implicit building features in the image (e.g., brightness, size, and contrast) and morphological operators (e.g., top-hat transforms); however, the accuracy of MBI is poor because ground features such as bare soil and roads resemble buildings.
To improve the performance of MBI, the Morphological Shadow Index (MSI) has been used to constrain it, and geometric and vegetation indices have been used to eliminate the effects of narrow roads and bright vegetation. Being unsupervised, morphological building indices require neither time-consuming preparation of training samples nor a training phase. However, threshold selection is subjective and easily interfered with by other ground features, so the extraction accuracy hardly meets requirements. Machine learning methods, represented by support vector machines, random forests, and artificial neural networks, are also widely used for building extraction.
These works target individual pixels in the image, training on and learning the features of each pixel. Because the context of a pixel is not considered, the extraction results are rough and many pixels are extracted incorrectly. To alleviate this problem, some works combine object-oriented techniques with machine learning: superpixels formed from several adjacent homogeneous pixels are taken as the analysis objects and classified. Incorporating neighborhood features into machine learning in this way greatly reduces the roughness of the extraction results, and converting the analysis object from pixels to superpixels greatly reduces model inference complexity. However, the extraction effect of such methods depends on the accuracy of the superpixels, and high-level semantic information cannot be analyzed.
In addition, because the building characteristics in the training set and the test set do not differ greatly, the reported accuracy overstates the models' actual generalization ability, which remains weak.
In recent years, deep learning has excelled in fields such as natural language processing, image analysis, and video recommendation. By virtue of its powerful feature extraction capability, deep learning no longer relies solely on shallow features of the target, and methods represented by fully convolutional neural networks have raised building extraction performance to a new level. Deeper architectures and more parameters, however, make deep learning methods increasingly demanding of training data: sufficient training data with sufficiently diverse building representations are needed for a model to learn more general features. Currently, four relatively popular large open-source datasets are available for building extraction: the Massachusetts dataset, the Vaihingen and Potsdam datasets of ISPRS, the Inria dataset, and the WHU Building dataset.
The Massachusetts dataset contains 151 aerial images of 1500 x 1500 pixels at 1 m resolution, but its quality is poor and it is not used in the latest literature. The ISPRS dataset consists of two aerial image subsets, the Vaihingen and Potsdam datasets. The Potsdam dataset contains 38 image tiles of 6000 x 6000 pixels at 0.05 m resolution, and the Vaihingen dataset contains 16 image tiles of 1000-4000 x 1000-4000 pixels at 0.09 m resolution. The ISPRS dataset covers only 13 square kilometers in total, with too few building instances for wide-ranging applications. The Inria dataset contains buildings from five urban scenes, covering 405 square kilometers at 0.3 m resolution, and can to some extent be used to test the extrapolation and generalization ability of deep learning methods. The WHU Building dataset contains aerial and satellite images of higher quality covering 450 and 550 square kilometers, respectively. Numerous building extraction works have been produced with the above open-source datasets.
Variants of the SegNet architecture have been trained on remote sensing images of urban areas, introducing multi-kernel convolution layers for fast aggregate prediction at multiple scales. Segmentation accuracy has been improved by a Siamese U-Net that shares weights between its two branches and takes as input the original image and its downsampled counterpart.
Edge-preserving convolutional neural networks combine edge detection with the building extraction process, so that the extracted buildings retain both positioning accuracy and edge accuracy.
A boundary refinement (Boundary Refinement, BR) convolution module improves building prediction by perceiving, for each pixel in the image, the direction to the center of the nearest object to which it may belong.
Edge features have also been fused with other learnable multi-scale features at the feature level to serve as prior constraints, and the effect has been further improved by means of attention mechanisms.
In practical applications of building extraction, the images of the extraction area and of the training area are often cross-domain, placing relatively high demands on the model's cross-domain extraction capability.
The following describes a building information extraction method, a device, an electronic apparatus, and a storage medium provided by an embodiment of the present invention with reference to fig. 1 to 10.
For the building information extraction method provided by the embodiment of the invention, the execution body may be an electronic device, or software, a functional module, or a functional entity in the electronic device capable of implementing the method; the electronic device in the embodiment of the invention includes, but is not limited to, a server. The present invention does not limit the execution body.
Fig. 1 is a flow chart of a building information extraction method provided by the present invention, as shown in fig. 1, including but not limited to the following steps:
first, in step S1, target visible light remote sensing data and target near infrared remote sensing data of a target area are acquired.
Most optical satellites have the capability to acquire near infrared information of the ground, and the supplementary effect of near infrared data on the features of visible light data is not negligible.
First, for a target area requiring building information extraction, a high-spatial-resolution remote sensing satellite can be used to capture the visible light remote sensing data and near infrared remote sensing data of the area, which serve as the target visible light remote sensing data and target near infrared remote sensing data, respectively.
Optionally, the building extraction model includes: the system comprises a first feature extraction network, a second feature extraction network and a feature fusion network; the first feature extraction network and the second feature extraction network are parallel;
The output end of the first feature extraction network is connected with one input end of the feature fusion network, and the output end of the second feature extraction network is connected with the other input end of the feature fusion network;
the first feature extraction network is used for extracting features of the target visible light remote sensing image input by the first input end so as to obtain a first mode data feature set;
the second feature extraction network is used for extracting features of the target near infrared remote sensing image input by the second input end so as to obtain a second mode data feature set;
and the feature fusion network is used for carrying out fusion correction on the first modal data feature set and the second modal data feature set to generate the type and position information of the building in the target area.
The building extraction model can be constructed based on a Building Fine-grained Extraction Network (BFE-Net) for extracting fine-grained buildings. BFE-Net adopts an Encoder-Decoder structure; the Encoder performs feature extraction on the visible light and near infrared data and can improve style invariance.
According to the building information extraction method provided by the invention, using two parallel feature extraction networks to extract features from the optical data and the near infrared data respectively effectively prevents interference with the feature extraction of the optical data, strengthens the auxiliary effect of the near infrared data, and achieves full mining of the data in both modes.
Optionally, the first feature extraction network and the second feature extraction network are each constructed based on HR-Net with IBN modules added.
To give the building extraction model multi-scale feature fusion characteristics, the feature extraction networks are constructed based on the High-Resolution Network (HR-Net), and Instance-Batch Normalization (IBN) modules, i.e. IBN-blocks, are added to HR-Net to improve the style invariance of the building extraction model.
According to the building information extraction method provided by the invention, the IBN-block is added in the HR-Net, so that the style invariance of a building extraction model is improved.
Optionally, the feature fusion network comprises a connection module, an attention module, a dimension reduction module and a discriminator which are sequentially connected;
the connection module is used for carrying out channel dimension splicing on the first modal data feature set and the second modal data feature set so as to generate fusion features;
the attention module is used for carrying out weight correction on the fusion characteristics to generate correction characteristics;
the dimension reduction module is used for carrying out feature dimension reduction on the correction features to generate feature fusion results;
And the discriminator is used for classifying the characteristic fusion result to generate the type and position information of the building in the target area.
FIG. 2 is a schematic structural diagram of the building extraction model provided by the invention. As shown in FIG. 2, the building extraction model comprises an Encoder and a Decoder;
the Encoder is used for performing feature extraction on visible light data (RGB Data) and near infrared data (NIR Data) and comprises two parallel feature extraction networks that do not share weights, namely the first feature extraction network and the second feature extraction network, which are identical in structure; each feature extraction network comprises a convolution module (Conv3, IN, ReLU) and a feature extraction module connected in sequence, the feature extraction module comprising Stage 1, Stage 2, Stage 3, and Stage 4;
wherein Stage 1 comprises 1 Bottleneck-IBN x 4; Stage 2 comprises 2 parallel BasicBlock-IBN x 4; Stage 3 comprises 3 parallel BasicBlock x 4; Stage 4 comprises 4 parallel BasicBlock; the feature results output by Stage 4 are superposed and fused to obtain a modal data feature set, which is input to the Decoder; the modal data feature sets comprise the first modal data feature set and the second modal data feature set;
the Decoder comprises the feature fusion network and is used for correcting and fusing the multi-modal features extracted by the Encoder. The feature fusion network comprises: 3 connection modules (concat), an Effective Squeeze-Excitation Plus attention module (ESEP-block), a dimension reduction module (Conv1, BN, ReLU), and a discriminator (Conv1, BN, Softmax);
the input ends of the first two concat modules are connected to the output end of the first feature extraction network and the output end of the second feature extraction network respectively, and their output ends are connected to the input end of the 3rd concat module; the output end of the 3rd concat module is connected to the input end of the ESEP-block.
The Softmax classifier in the discriminator outputs the type and position information of the buildings in the target area as the extraction result.
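To make the data flow concrete, the following is a minimal sketch of the dual-branch layout described above. PyTorch is assumed, since the patent names no framework; the class name BFENetSketch, the constructor arguments, and the assumption that each backbone returns a single fused feature map are illustrative, and ESEPBlock stands for the attention module sketched later in this description.

```python
import torch
import torch.nn as nn

class BFENetSketch(nn.Module):
    """Sketch of the dual-branch encoder / fusion decoder data flow.

    rgb_backbone and nir_backbone stand in for the two IBN-augmented
    HR-Net feature extraction networks (identical structure, unshared
    weights); ESEPBlock is the attention module sketched further below.
    """

    def __init__(self, rgb_backbone, nir_backbone, feat_ch, num_classes):
        super().__init__()
        self.rgb_backbone = rgb_backbone  # first feature extraction network
        self.nir_backbone = nir_backbone  # second feature extraction network
        self.attention = ESEPBlock(2 * feat_ch)  # defined in a later sketch
        # dimension reduction module: 1x1 conv + BN + ReLU
        self.reduce = nn.Sequential(
            nn.Conv2d(2 * feat_ch, feat_ch, kernel_size=1),
            nn.BatchNorm2d(feat_ch),
            nn.ReLU(inplace=True),
        )
        # discriminator: 1x1 conv + BN, followed by a softmax over classes
        self.classify = nn.Sequential(
            nn.Conv2d(feat_ch, num_classes, kernel_size=1),
            nn.BatchNorm2d(num_classes),
        )

    def forward(self, rgb, nir):
        f_rgb = self.rgb_backbone(rgb)            # first modal data feature set
        f_nir = self.nir_backbone(nir)            # second modal data feature set
        fused = torch.cat([f_rgb, f_nir], dim=1)  # channel-dimension concat
        fused = self.attention(fused)             # ESEP weight correction
        fused = self.reduce(fused)                # feature dimension reduction
        return torch.softmax(self.classify(fused), dim=1)  # per-pixel class probabilities
```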
According to the building information extraction method provided by the invention, the attention module in the feature fusion network uses only one FC layer, so that channel information is retained to the greatest extent and performance is improved; in addition to global average pooling, global max pooling is added, and the fully connected layers after the two poolings share weights, improving model learning capability while leaving complexity almost unchanged.
Optionally, the first feature extraction network includes: the convolution module and the feature extraction module are connected in sequence;
the convolution module comprises an example standardization module;
the convolution module is used for carrying out dimension normalization on the target visible light remote sensing data to generate normalized remote sensing data;
the feature extraction module is used for performing downsampling and convolution on the normalized remote sensing data to generate the first modal data feature set;
the second feature extraction network is identical in structure to the first feature extraction network.
Due to differences between buildings and the influence of factors such as illumination intensity, the styles in which buildings appear in images of different areas can differ; styles include shallow features such as color, brightness, and texture. To avoid large differences in output caused by differences in building style, the style invariance of the network is increased: in addition to using Batch Normalization (BN) modules in the Encoder and Decoder, Instance Normalization (IN) modules are added to the feature extraction networks for the normalization of features.
Batch Normalization (BN) computes the mean and standard deviation over all training samples in a batch; the computation is independent between channels, and the output statistics are affected by all samples in the batch. For a batch x with N training samples, C channels, height H, and width W, BN normalizes over the N, H, and W dimensions, leaving the C dimension:
$$\hat{x} = \gamma \cdot \frac{x - \mu_c}{\sqrt{\sigma_c^2 + \epsilon}} + \beta$$
wherein $\mu_c$ is the mean computed for channel c (over the N, H, and W dimensions); $\sigma_c$ is the corresponding standard deviation; $\epsilon$ is an offset (preventing $\sigma_c = 0$); $\hat{x}$ is the normalized value; and $\gamma$ and $\beta$ are scaling and translation variables.
BN retains the distinctions between individual samples, but also makes convolutional neural networks (Convolutional Neural Network, CNN) susceptible to appearance variations. Unlike BN, instance normalization (IN) computes the mean and standard deviation for each channel of each sample.
IN normalizes the H and W dimensions, preserving the N and C dimensions:
$$\hat{x} = \gamma \cdot \frac{x - \mu_{nc}}{\sqrt{\sigma_{nc}^2 + \epsilon}} + \beta$$
wherein $\mu_{nc}$ and $\sigma_{nc}$ are the mean and standard deviation computed for each sample n and channel c (over the H and W dimensions); $\epsilon$ is an offset (preventing $\sigma_{nc} = 0$); $\hat{x}$ is the normalized value; and $\gamma$ and $\beta$ are scaling and translation variables.
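As a concrete illustration of the two normalization axes, the following minimal sketch (PyTorch assumed; the scaling and translation variables $\gamma$ and $\beta$ are omitted for brevity) computes the BN and IN statistics on the same feature batch:

```python
import torch

x = torch.randn(8, 64, 128, 128)  # a feature batch of shape (N, C, H, W)
eps = 1e-5

# BN: statistics over the N, H, W dimensions, one (mu, sigma) per channel c
mu_c = x.mean(dim=(0, 2, 3), keepdim=True)                  # shape (1, C, 1, 1)
var_c = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
x_bn = (x - mu_c) / torch.sqrt(var_c + eps)

# IN: statistics over the H, W dimensions, one (mu, sigma) per sample n and channel c
mu_nc = x.mean(dim=(2, 3), keepdim=True)                    # shape (N, C, 1, 1)
var_nc = x.var(dim=(2, 3), unbiased=False, keepdim=True)
x_in = (x - mu_nc) / torch.sqrt(var_nc + eps)
```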
IN eliminates individual appearance differences between samples, but it also reduces useful global information that the classifier relies on for feature recognition.
The general conclusion is that BN is suitable for cognitive tasks such as image classification, semantic segmentation, and object detection, while IN is suitable for generative tasks such as style transfer, image denoising, and image generation.
IN brings invariance in the feature mean and variance statistics across different samples, while BN, as a global normalization, retains the variability between different samples.
If, in a cognitive task, the BN of a model is simply changed to IN, performance may be significantly reduced due to incorrect use of IN; but this does not mean that IN cannot benefit cognitive tasks.
To avoid the damage to feature discriminability caused by using IN, the invention adopts IN for part of the features in the first half of the building extraction model, while the deep layers keep BN unchanged; in this way, style invariance is designed into the model as a prior.
Fig. 3 is a schematic structural view of BasicBlock provided by the invention, Fig. 4 of BasicBlock-IBN, Fig. 5 of Bottleneck, and Fig. 6 of Bottleneck-IBN. As shown in Figs. 3-6, to reduce the feature variance caused by shallow-layer appearance while not interfering with deep-layer content recognition, IN is added only to the shallower part of the model: the Bottleneck blocks in Stage 1 and the BasicBlock blocks in Stage 2 of the original HR-Net are replaced with Bottleneck-IBN and BasicBlock-IBN, respectively. The added IN improves robustness to style changes, while the retained BN smoothly carries content-related information into the deep layers.
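The following is a minimal sketch of an IBN layer and a BasicBlock-IBN in the style of IBN-Net; the half-and-half channel split ratio and the placement of the IBN layer after the first convolution are conventions of that design and are assumptions here, not details given by the patent:

```python
import torch
import torch.nn as nn

class IBN(nn.Module):
    """IBN layer: IN on one half of the channels, BN on the other half."""

    def __init__(self, planes, ratio=0.5):
        super().__init__()
        self.half = int(planes * ratio)
        self.IN = nn.InstanceNorm2d(self.half, affine=True)
        self.BN = nn.BatchNorm2d(planes - self.half)

    def forward(self, x):
        a, b = torch.split(x, [self.half, x.size(1) - self.half], dim=1)
        return torch.cat([self.IN(a), self.BN(b)], dim=1)

class BasicBlockIBN(nn.Module):
    """BasicBlock whose first normalization is swapped for an IBN layer."""

    def __init__(self, planes):
        super().__init__()
        self.conv1 = nn.Conv2d(planes, planes, 3, padding=1, bias=False)
        self.norm1 = IBN(planes)             # IBN replaces the first BN
        self.conv2 = nn.Conv2d(planes, planes, 3, padding=1, bias=False)
        self.norm2 = nn.BatchNorm2d(planes)  # the deeper normalization stays BN
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.norm1(self.conv1(x)))
        out = self.norm2(self.conv2(out))
        return self.relu(out + x)            # residual connection
```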
For the data of the two modes, namely the visible light remote sensing data and the near infrared remote sensing data, parallel independent feature extraction is adopted: the first feature extraction network extracts features from the visible light remote sensing data, and the second feature extraction network extracts features from the near infrared remote sensing data. If the data of the two modes were directly input into one backbone network (backbone) by channel stacking, then on the one hand, because ImageNet pre-trained models are trained on RGB data, the feature extraction of the optical data could be interfered with; on the other hand, the auxiliary effect of the near infrared data could be weakened. Therefore, the invention uses two backbones with identical structure but unshared weights as the first and second feature extraction networks, achieving full feature mining of the data in both modes.
According to the building information extraction method provided by the invention, adding IN to the shallow layers of the feature extraction networks reduces the feature variance caused by shallow appearance without interfering with deep content recognition.
In the building extraction model provided by the invention, the Decoder fuses the multi-modal features extracted by the Encoder: the feature set of the optical data and the feature set of the near infrared data are concatenated in the channel dimension, the ESEP attention module then applies weight correction to each feature to further improve the feature representation capability, and finally a 1x1 convolution (Conv1) performs feature dimension reduction to obtain the final fused feature result, which is classified by the Softmax discriminator.
FIG. 7 is a schematic illustration of the structure of ESEP provided by the invention. As shown in FIG. 7, ESEP is a channel attention module and a more effective improvement of Squeeze-and-Excitation (SE).
In FIG. 7, the pooling symbols denote global average pooling (Global average pooling) and global max pooling (Global max pooling), and the convolution symbol denotes convolution with an m x n kernel and c channels.
SE is a common channel attention method in CNNs for explicitly modeling the interdependencies between feature map channels. SE compresses spatial correlations by global average pooling to learn channel-specific descriptors, and then rescales the input feature map using two fully connected layers (Fully Connected layer, FC) and a Sigmoid function to highlight only the useful channels.
The SE module has a limitation: to avoid a high model complexity burden, the two FC layers of the SE module must reduce the channel dimension, which leads to channel information loss.
ESEP uses only one FC layer, retaining channel information to the greatest extent and improving performance. In addition, SE uses global average pooling when compressing spatial correlations, which takes global information into account but attenuates locally important salient information; therefore, global max pooling is added alongside global average pooling, and the fully connected layers after the two poolings share weights, improving model learning capability while leaving complexity almost unchanged.
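A minimal sketch of the ESEP block as described above, with one fully connected layer (a 1x1 convolution with no channel reduction) applied with shared weights to both pooled descriptors, might look as follows; summing the two branches before the sigmoid gate is an assumption, since the patent does not state how they are combined:

```python
import torch
import torch.nn as nn

class ESEPBlock(nn.Module):
    """Sketch of ESEP channel attention: one FC layer (a 1x1 conv with no
    channel reduction) applied with shared weights to the globally
    average-pooled and max-pooled descriptors, then a sigmoid gate."""

    def __init__(self, channels):
        super().__init__()
        self.fc = nn.Conv2d(channels, channels, kernel_size=1)  # single FC, no reduction
        self.gate = nn.Sigmoid()

    def forward(self, x):
        avg = torch.mean(x, dim=(2, 3), keepdim=True)  # global average pooling
        mx = torch.amax(x, dim=(2, 3), keepdim=True)   # global max pooling
        w = self.gate(self.fc(avg) + self.fc(mx))      # shared-weight FC on both branches
        return x * w                                   # channel-wise reweighting
```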
A single building type cannot meet the finer requirements of practical applications. For example, in the disaster field, buildings are important disaster-bearing bodies, and buildings of different structural types differ greatly in their ability to resist disasters, so fine-grained building structure types can play a larger practical role in disaster relief and assessment. In other application fields, different types of buildings differ in material price, construction difficulty, space use, and aesthetic requirements.
Optionally, the building extraction model is obtained based on the following steps:
constructing a teacher model and a student model;
acquiring sample visible light remote sensing images, sample near infrared remote sensing images, and building type and position labels of a plurality of sample areas;
taking the combination of the sample visible light remote sensing image, sample near infrared remote sensing image, and building type and position label of any sample area as a training sample of that sample area, so as to obtain a plurality of training samples;
training the teacher model by utilizing the plurality of training samples to obtain a trained teacher model;
extracting soft labels of each training sample from the trained teacher model;
and training the student model by using the training samples with soft labels, and taking the trained student model as the building extraction model.
The building type and position label can be the type and position of each building in the sample area; the types of buildings may include: steel and reinforced concrete structures, hybrid structures, and brick structures.
In its training strategy, the Decoder of the building extraction model adopts knowledge distillation to fully mine the inter-class relationships of fine-grained buildings.
FIG. 8 is a schematic flow chart of knowledge distillation provided by the invention, as shown in FIG. 8, comprising:
for example, the teacher model in knowledge distillation is BFE-Net with HRNet48 as the baseline and the student model is BFE-Net with HRNet18 as the baseline.
Training samples are used to train the Teacher model (Teacher network); each training sample comprises a sample visible light remote sensing image (RGB Data), a sample near infrared remote sensing image (NIR Data), and a building type and position label serving as a hard label (Hard label).
After the teacher model is trained, soft labels (Soft labels) for each training sample are extracted from it, and knowledge distillation is used to mine the inter-class associations of fine-grained buildings, further optimizing the fine-grained extraction results.
Training samples with Soft labels (Soft labels) were then used to train the Student model (Student network).
In the deep learning field, data are usually annotated with hard labels, but in fact the same data contain information of different classes, and direct annotation as hard labels loses a large amount of information, affecting the effect of the final model. Correlations exist between different types of buildings; hard labels weaken the correlations between classes and carry a risk of overfitting. Label smoothing can generate soft labels, but such soft labels are set artificially and are strongly subjective. Compared with label smoothing, the main difference of knowledge distillation is that its soft labels are obtained through network inference, which aggregates the effective information of the dataset, retains the association information between classes, and removes part of the invalid redundant information. Models trained with such soft labels on the dataset are therefore more reliable than with label smoothing.
The general practice in training a model is to match the model's softmax distribution to the real labels, whereas knowledge distillation matches the student model's softmax distribution to that of the teacher model. The latter has an advantage over the former: the softmax distribution of the trained teacher model contains additional knowledge. A hard label can only state that a certain pixel in an image belongs to a steel and reinforced concrete structure and is neither a hybrid structure nor a brick structure; the trained teacher model can instead indicate that the pixel is most likely a steel and reinforced concrete structure, possibly a hybrid structure, but almost certainly not a brick structure.
First, the teacher model is trained, with the cross entropy between the model output and the hard label (Hard label) as the loss function.
As shown in FIG. 8, the loss function of the student model contains two loss branches: a distillation loss $L_{soft}$ and a conventional loss $L_{hard}$.
When obtaining soft labels from the teacher model, Test-Time Augmentation (TTA) is used to increase the reliability of the soft labels (Soft labels).
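A minimal sketch of soft-label extraction with TTA is given below; the flip-based augmentations and four-way averaging are assumptions, since the patent does not specify which augmentations are used:

```python
import torch

@torch.no_grad()
def soft_labels_with_tta(teacher, rgb, nir):
    """Average teacher probabilities over flips before distillation."""
    teacher.eval()
    probs = teacher(rgb, nir)
    for dims in ([2], [3], [2, 3]):          # vertical, horizontal, and both flips
        p = teacher(torch.flip(rgb, dims), torch.flip(nir, dims))
        probs = probs + torch.flip(p, dims)  # undo the flip before accumulating
    return probs / 4                         # soft labels for the student model
```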
The distillation loss $L_{soft}$ enables the student model to learn the rich knowledge in the teacher model (including intra-class and inter-class relationships), while the conventional loss alleviates the negative effects of partially wrong knowledge in the teacher model. The loss function of the student model is as follows:
$$L = \alpha L_{soft} + \beta L_{hard}, \qquad L_{soft} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{M} p_{ic}^{t} \log p_{ic}^{s}, \qquad L_{hard} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{M} y_{ic} \log p_{ic}^{s}$$
wherein $\alpha$ and $\beta$ are the weights of the distillation loss and the conventional loss; N is the number of pixels over which the loss is calculated; M is the number of categories; $p_{ic}^{t}$ is the probability (soft label) output by the teacher model that the i-th pixel belongs to class c; $p_{ic}^{s}$ is the probability output by the student model that the i-th pixel belongs to class c; and $y_{ic}$ is the probability (hard label) that the i-th pixel belongs to class c in the ground truth.
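Written out as code, the student loss might look like the sketch below; the per-pixel cross-entropy form of both branches and the default weights are assumptions consistent with the symbol definitions above:

```python
import torch
import torch.nn.functional as F

def student_loss(student_probs, teacher_probs, hard_target, alpha=0.5, beta=0.5):
    """Weighted sum of the distillation and conventional branches.

    student_probs, teacher_probs: (N, M, H, W) softmax outputs
    hard_target: (N, H, W) integer class map (the hard label)
    """
    eps = 1e-8
    log_s = torch.log(student_probs + eps)
    # L_soft: cross entropy of the student against the teacher's soft labels
    l_soft = -(teacher_probs * log_s).sum(dim=1).mean()
    # L_hard: conventional cross entropy against the ground truth
    l_hard = F.nll_loss(log_s, hard_target)
    return alpha * l_soft + beta * l_hard
```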
The backbones of BFE-Net and of the comparison methods use pre-trained models obtained by training on ImageNet. The total number of training epochs is 50, the batch size is 8, and the initial learning rate is 0.001; the optimizer is Adam with weight decay, with a decay factor of 0.001. In addition, the learning rate is adjusted with cosine annealing, the minimum learning rate is set to 0.00001, and 5 epochs of warmup are used to ensure training stability.
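The reported training configuration could be wired up as follows (a sketch; `model` is the assumed BFE-Net instance, and the exact composition of the warmup and annealing schedules is an assumption):

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR, LinearLR, SequentialLR

# model is the BFE-Net instance (assumed); hyperparameters as reported:
# 50 epochs, batch size 8, Adam with weight decay 0.001, initial lr 0.001,
# cosine annealing down to 1e-5, and 5 warmup epochs.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-3)
warmup = LinearLR(optimizer, start_factor=0.1, total_iters=5)   # 5 warmup epochs
cosine = CosineAnnealingLR(optimizer, T_max=45, eta_min=1e-5)   # remaining 45 epochs
scheduler = SequentialLR(optimizer, schedulers=[warmup, cosine], milestones=[5])
```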
According to the building information extraction method provided by the invention, the inter-class associations of fine-grained buildings are mined through knowledge distillation, further optimizing the fine-grained extraction results.
The $F_1$ score is used to evaluate the model's performance in extracting each type of building and the background. $F_1$ is the harmonic mean of precision and recall:
$$F_1 = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$
wherein precision is the ratio of true positives to the number of predicted positives, and recall is the ratio of true positives to the number of actually present positives.
The overall classification accuracy OA is used to evaluate the overall performance of the model. OA is the ratio of the number of correctly classified pixels to the total number of pixels in the sample.
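The two metrics can be restated directly in code; the sketch below derives per-class $F_1$ and OA from a confusion matrix (taking class 0 as the background and 4 classes in total, matching the $F_1$-0 notation and Table 1 below):

```python
import numpy as np

def f1_and_oa(pred, truth, num_classes=4):
    """Per-class F1 and overall accuracy from a confusion matrix.

    pred, truth: integer class maps (class 0 = background).
    """
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(truth.ravel(), pred.ravel()):
        cm[t, p] += 1
    tp = np.diag(cm).astype(float)
    precision = tp / np.maximum(cm.sum(axis=0), 1)  # TP / predicted positives
    recall = tp / np.maximum(cm.sum(axis=1), 1)     # TP / actual positives
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    oa = tp.sum() / cm.sum()                        # overall classification accuracy
    return f1, oa
```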
The invention quantitatively evaluates BFE-Net and the comparison methods; the evaluation results are shown in Table 1. $F_1$-0 denotes the $F_1$ of the background (areas other than buildings); $F_1$-1 denotes the $F_1$ of steel and reinforced concrete structure buildings; $F_1$-2 denotes the $F_1$ of hybrid structure buildings; $F_1$-3 denotes the $F_1$ of brick structure buildings.
The $F_1$-0, $F_1$-1, $F_1$-2, $F_1$-3, and OA of BFE-Net are 96.80%, 31.53%, 50.55%, 64.25%, and 93.23%, respectively.
The $F_1$ of steel and reinforced concrete structure buildings is the highest, which is related to the distinctiveness of this type's building features. The shadows of hybrid structure buildings are also obvious and easily confused with steel and reinforced concrete structure buildings, resulting in a lower $F_1$. Because some brick-wood structure buildings resemble other surrounding ground objects and are difficult to extract effectively, their $F_1$ is the lowest.
TABLE 1. Accuracy results for BFE-Net and other models

| Model   | F1-0 (%) | F1-1 (%) | F1-2 (%) | F1-3 (%) | OA (%) |
|---------|----------|----------|----------|----------|--------|
| BFE-Net | 96.80    | 31.53    | 50.55    | 64.25    | 93.23  |
To show the model effect more intuitively, the invention also visualizes the inference results on part of the test set. For images with better imaging quality, BFE-Net achieves good results; for brick-wood houses with poor imaging quality, BFE-Net can still obtain relatively good extraction results.
Further, in step S2, the target visible light remote sensing data and the target near infrared remote sensing data are input into the building extraction model to obtain the type and position information of buildings in the target area output by the building extraction model; the type and position information is obtained by the building extraction model performing feature extraction on the target visible light remote sensing data and the target near infrared remote sensing data separately and then fusing the extracted features.
After the target visible light remote sensing data of the target area are input into the first feature extraction network of the building extraction model and the target near infrared remote sensing data are input into the second feature extraction network, the building extraction model performs independent feature extraction on each kind of data and then fuses the extracted feature sets, thereby generating the type and position of each building in the target area as the type and position information.
According to the building information extraction method provided by the invention, feature extraction and fusion are performed separately on the visible light remote sensing data and the near infrared remote sensing data by the building extraction model, and the near infrared data are used to supplement the features of the visible light data, so that the extraction capability of the model for buildings can be effectively improved.
The server provided by the invention is described below, and the server described below and the building information extraction method described above can be referred to correspondingly.
The invention also provides a server provided with a processor; the server further comprises a memory and a program or instructions stored on the memory and runnable on the processor, the program or instructions, when executed by the processor, performing the building information extraction method according to any of the embodiments above, the method comprising: acquiring target visible light remote sensing data and target near infrared remote sensing data of a target area; inputting the target visible light remote sensing data and the target near infrared remote sensing data into a building extraction model to obtain the type and position information of buildings in the target area output by the building extraction model; the type and position information being obtained by the building extraction model performing feature extraction on the two kinds of data separately and then fusing the extracted features.
According to the server provided by the invention, feature extraction and fusion are performed separately on the visible light remote sensing data and the near infrared remote sensing data by the building extraction model, and the near infrared data are used to supplement the features of the visible light data, so that the extraction capability of the model for buildings can be effectively improved.
The building information extraction apparatus provided by the present invention will be described below, and the building information extraction apparatus described below and the building information extraction method described above may be referred to correspondingly to each other.
Fig. 9 is a schematic structural view of a building information extraction device provided by the present invention, and as shown in fig. 9, the device includes:
the acquisition module 901 is used for acquiring target visible light remote sensing data and target near infrared remote sensing data of a target area;
the input module 902 is configured to input the target visible light remote sensing data and the target near infrared remote sensing data into a building extraction model, so as to obtain the type and position information of buildings in the target area output by the building extraction model; the type and position information is obtained by the building extraction model performing feature extraction on the two kinds of data separately and then fusing the extracted features.
During operation of the device, the acquisition module 901 acquires the target visible light remote sensing data and the target near infrared remote sensing data of the target area, and the input module 902 inputs them into the building extraction model to obtain the type and position information of buildings in the target area output by the model.
According to the building information extraction device provided by the invention, feature extraction and fusion are performed separately on the visible light remote sensing data and the near infrared remote sensing data by the building extraction model, and the near infrared data are used to supplement the features of the visible light data, so that the extraction capability of the model for buildings can be effectively improved.
Fig. 10 is a schematic structural diagram of an electronic device provided by the present invention. As shown in Fig. 10, the electronic device may include: a processor 1010, a communication interface (Communications Interface) 1020, a memory 1030, and a communication bus 1040, wherein the processor 1010, the communication interface 1020, and the memory 1030 communicate with each other via the communication bus 1040. The processor 1010 may invoke logic instructions in the memory 1030 to perform the building information extraction method, comprising: acquiring target visible light remote sensing data and target near infrared remote sensing data of a target area; inputting the target visible light remote sensing data and the target near infrared remote sensing data into a building extraction model to obtain the type and position information of buildings in the target area output by the building extraction model; the type and position information being obtained by the building extraction model performing feature extraction on the two kinds of data separately and then fusing the extracted features.
Further, the logic instructions in the memory 1030 described above may be implemented in the form of software functional units and stored in a computer readable storage medium when sold or used as a stand alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program, the computer program being storable on a non-transitory computer readable storage medium, the computer program, when executed by a processor, being capable of performing the building information extraction method provided by the above methods, the method comprising: acquiring target visible light remote sensing data and target near infrared remote sensing data of a target area; inputting the target visible light remote sensing data and the target near infrared remote sensing data into a building extraction model to obtain the type and position information of the building in the target area output by the building extraction model; the category position information is obtained by respectively carrying out feature extraction on the target visible light remote sensing data and the target near infrared remote sensing data by the building extraction model and then fusing the extracted features.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the building information extraction method provided above, the method comprising: acquiring target visible light remote sensing data and target near infrared remote sensing data of a target area; inputting the target visible light remote sensing data and the target near infrared remote sensing data into a building extraction model to obtain the type and position information of the building in the target area output by the building extraction model; the type and position information is obtained by the building extraction model respectively performing feature extraction on the target visible light remote sensing data and the target near infrared remote sensing data and then fusing the extracted features.
The apparatus embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solution, in essence or the part contributing to the prior art, may be embodied in the form of a software product, which may be stored in a computer readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which comprises several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or in some parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and replacements do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A building information extraction method, characterized by comprising:
acquiring target visible light remote sensing data and target near infrared remote sensing data of a target area;
inputting the target visible light remote sensing data and the target near infrared remote sensing data into a building extraction model to obtain the type and position information of the building in the target area output by the building extraction model;
the type and position information is obtained by the building extraction model respectively performing feature extraction on the target visible light remote sensing data and the target near infrared remote sensing data and then fusing the extracted features;
the building extraction model comprises an Encoder part and a Decoder part; the Encoder is used for extracting features from the visible light data and the near infrared data, and comprises two parallel feature extraction networks that do not share weights, namely a first feature extraction network and a second feature extraction network of identical structure; each feature extraction network comprises a convolution module Conv3-IN-ReLU and a feature extraction module connected in sequence; the feature extraction module comprises Stage1, Stage2, Stage3 and Stage4; Stage1 includes 1 Bottleneck-IBN ×4 branch; Stage2 includes 2 parallel BasicBlock-IBN ×4 branches; Stage3 includes 3 parallel BasicBlock ×4 branches; Stage4 includes 4 parallel BasicBlock ×4 branches; the feature results output by Stage4 are superimposed and fused to obtain a modal data feature set, which is input to the Decoder; the modal data feature set comprises a first modal data feature set and a second modal data feature set; the Decoder comprises a feature fusion network used for correcting and fusing the multi-modal features extracted by the Encoder; the feature fusion network comprises: 3 connection modules concat, an Effective Squeeze-Excitation Plus attention module ESEP-block, a dimension reduction module Conv1-BN-ReLU and a discriminator Conv1-BN-Softmax; the input ends of the first 2 concat modules are respectively connected with the output end of the first feature extraction network and the output end of the second feature extraction network, the output ends of the first 2 concat modules are connected with the input end of the 3rd concat module, and the output end of the 3rd concat module is connected with the input end of the Effective Squeeze-Excitation Plus attention module ESEP-block.
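For illustration only (the following sketch is not part of the claims or the patented implementation), the two-branch, non-weight-sharing encoder with a concatenation-based fusion decoder described in claim 1 could be wired roughly as below in PyTorch. The SimpleBranch module is a stand-in for the HR-Net-style Stage1-Stage4 pipeline, and all class names, channel widths and tensor shapes are assumptions:

```python
# Illustrative sketch of a two-branch (visible-light / near-infrared) encoder
# with concat-based fusion. The patent uses HR-Net-style stages with IBN
# blocks; SimpleBranch is a simplified stand-in for demonstration only.
import torch
import torch.nn as nn

class SimpleBranch(nn.Module):
    """Stand-in for one non-weight-sharing feature extraction network."""
    def __init__(self, in_ch: int, out_ch: int = 64):
        super().__init__()
        self.stem = nn.Sequential(                      # Conv3-IN-ReLU-style stem
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.InstanceNorm2d(out_ch, affine=True),
            nn.ReLU(inplace=True),
        )
        self.body = nn.Sequential(                      # stand-in for Stage1..Stage4
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(self.stem(x))

class DualBranchExtractor(nn.Module):
    """Two parallel branches (no weight sharing) + channel-concat fusion."""
    def __init__(self, num_classes: int = 2, width: int = 64):
        super().__init__()
        self.rgb_branch = SimpleBranch(3, width)        # visible-light branch
        self.nir_branch = SimpleBranch(1, width)        # near-infrared branch
        self.fuse = nn.Sequential(                      # Conv1-BN-ReLU reduction
            nn.Conv2d(2 * width, width, 1, bias=False),
            nn.BatchNorm2d(width),
            nn.ReLU(inplace=True),
        )
        self.classifier = nn.Conv2d(width, num_classes, 1)  # per-pixel discriminator

    def forward(self, rgb, nir):
        f_rgb = self.rgb_branch(rgb)                    # first modal data feature set
        f_nir = self.nir_branch(nir)                    # second modal data feature set
        fused = torch.cat([f_rgb, f_nir], dim=1)        # channel-dimension concat
        return self.classifier(self.fuse(fused))

model = DualBranchExtractor()
rgb = torch.rand(1, 3, 256, 256)                        # visible-light tile
nir = torch.rand(1, 1, 256, 256)                        # co-registered NIR tile
logits = model(rgb, nir)                                # (1, 2, 256, 256)
```

The two branches deliberately do not share weights, since visible-light and near-infrared responses differ; only the decoder sees both modalities at once.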
2. The building information extraction method according to claim 1, wherein the building extraction model comprises: a first feature extraction network, a second feature extraction network and a feature fusion network; the first feature extraction network and the second feature extraction network are parallel;
the output end of the first feature extraction network is connected with one input end of the feature fusion network, and the output end of the second feature extraction network is connected with the other input end of the feature fusion network;
the first feature extraction network is used for extracting features of the target visible light remote sensing image input via the first input end to obtain a first modal data feature set;
the second feature extraction network is used for extracting features of the target near infrared remote sensing image input via the second input end to obtain a second modal data feature set;
and the feature fusion network is used for fusing and correcting the first modal data feature set and the second modal data feature set to generate the type and position information of the building in the target area.
3. The building information extraction method according to claim 2, wherein the first feature extraction network and the second feature extraction network are each constructed based on HR-Net to which IBN modules are added.
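As a hedged aside (not part of the claims), the IBN modules of claim 3 presumably follow IBN-Net, which splits a layer's channels between instance normalization (style invariance, useful across imaging modalities) and batch normalization (content discrimination). A minimal sketch, assuming the common 50/50 channel split:

```python
# Illustrative IBN layer in the spirit of IBN-Net: half of the channels are
# instance-normalized, the other half batch-normalized. The 50/50 split ratio
# is an assumption, not a figure taken from the patent.
import torch
import torch.nn as nn

class IBN(nn.Module):
    def __init__(self, planes: int, ratio: float = 0.5):
        super().__init__()
        self.half = int(planes * ratio)
        self.IN = nn.InstanceNorm2d(self.half, affine=True)   # style-invariant half
        self.BN = nn.BatchNorm2d(planes - self.half)           # content half

    def forward(self, x):
        split = torch.split(x, [self.half, x.size(1) - self.half], dim=1)
        return torch.cat([self.IN(split[0]), self.BN(split[1])], dim=1)

x = torch.rand(2, 64, 32, 32)
print(IBN(64)(x).shape)   # torch.Size([2, 64, 32, 32]) -- shape is preserved
```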
4. The building information extraction method according to claim 2, wherein the first feature extraction network comprises a convolution module and a feature extraction module connected in sequence;
the convolution module comprises an instance normalization module;
the convolution module is used for carrying out dimension normalization on the target visible light remote sensing data to generate normalized remote sensing data;
the feature extraction module is used for performing downsampling and convolution on the normalized remote sensing data to generate the first modal data feature set;
the second feature extraction network is identical in structure to the first feature extraction network.
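Purely as an illustrative sketch (the channel widths and the use of a stride-2 convolution for downsampling are assumptions), the Conv3-IN-ReLU convolution module followed by a downsampling step of the feature extraction module might look like:

```python
# Illustrative Conv3-IN-ReLU stem plus a strided-convolution downsampling step,
# approximating the claimed convolution module -> feature extraction module
# pipeline. Widths and the downsampling choice are assumptions.
import torch
import torch.nn as nn

stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1, bias=False),   # Conv3
    nn.InstanceNorm2d(64, affine=True),                        # IN: per-sample normalization
    nn.ReLU(inplace=True),                                     # ReLU
)
downsample = nn.Sequential(                                    # stride-2 conv halves H and W
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(128),
    nn.ReLU(inplace=True),
)

x = torch.rand(1, 3, 256, 256)
print(downsample(stem(x)).shape)   # torch.Size([1, 128, 128, 128])
```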
5. The building information extraction method according to claim 2, wherein the feature fusion network comprises a connection module, an attention module, a dimension reduction module and a discriminator which are connected in sequence;
the connection module is used for carrying out channel dimension splicing on the first modal data feature set and the second modal data feature set so as to generate fusion features;
the attention module is used for carrying out weight correction on the fusion characteristics to generate correction characteristics;
the dimension reduction module is used for carrying out feature dimension reduction on the correction features to generate a feature fusion result;
and the discriminator is used for classifying the feature fusion result to generate the type and position information of the building in the target area.
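For illustration, one plausible reading of claim 5's fusion head — concatenation, attention-based weight correction, 1×1-convolution dimension reduction, and a Softmax discriminator — is sketched below. The patent's ESEP-block ("Effective Squeeze-Excitation Plus") is not publicly specified, so the sketch substitutes a plain Effective Squeeze-Excitation gate; all names and widths are assumptions:

```python
# Illustrative fusion head: channel concat -> ESE-style channel attention ->
# Conv1-BN-ReLU dimension reduction -> Conv1-BN-Softmax discriminator.
import torch
import torch.nn as nn

class ESEAttention(nn.Module):
    """Effective Squeeze-Excitation: a single 1x1 conv on pooled features."""
    def __init__(self, channels: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Conv2d(channels, channels, kernel_size=1)
        self.gate = nn.Sigmoid()

    def forward(self, x):
        w = self.gate(self.fc(self.pool(x)))    # per-channel weights in (0, 1)
        return x * w                             # weight-correct the fused features

class FusionHead(nn.Module):
    def __init__(self, in_ch: int = 128, mid_ch: int = 64, num_classes: int = 2):
        super().__init__()
        self.attn = ESEAttention(in_ch)
        self.reduce = nn.Sequential(             # Conv1-BN-ReLU
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
        )
        self.discriminator = nn.Sequential(      # Conv1-BN-Softmax
            nn.Conv2d(mid_ch, num_classes, 1, bias=False),
            nn.BatchNorm2d(num_classes),
            nn.Softmax(dim=1),
        )

    def forward(self, f_rgb, f_nir):
        fused = torch.cat([f_rgb, f_nir], dim=1)   # connection module (concat)
        return self.discriminator(self.reduce(self.attn(fused)))

head = FusionHead()
probs = head(torch.rand(1, 64, 64, 64), torch.rand(1, 64, 64, 64))
print(probs.shape)   # torch.Size([1, 2, 64, 64]) -- per-pixel class probabilities
```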
6. The building information extraction method according to any one of claims 1 to 5, wherein the building extraction model is obtained through the following steps:
constructing a teacher model and a student model;
acquiring sample visible light remote sensing images, sample near infrared remote sensing images and building type position labels of a plurality of sample areas;
taking the combination of the sample visible light remote sensing image, the sample near infrared remote sensing image and the building type position label of each sample area as a training sample of that sample area, so as to obtain a plurality of training samples;
training the teacher model by utilizing the plurality of training samples to obtain a trained teacher model;
extracting soft labels of each training sample from the trained teacher model;
and training the student model by using the training samples with soft labels, and taking the trained student model as the building extraction model.
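To make the training procedure of claim 6 concrete, a minimal teacher-student knowledge-distillation sketch is given below; the temperature T, the weighting alpha, and the per-pixel loss formulation are assumptions rather than the patented settings:

```python
# Illustrative distillation step: a trained teacher supplies soft labels and
# the student is trained on hard labels plus temperature-softened teacher
# outputs. Hyperparameters are assumptions for demonstration only.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, hard_labels,
                      T: float = 4.0, alpha: float = 0.5):
    # Soft-label term: KL divergence between temperature-softened distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-label term: ordinary cross-entropy on the building type position labels.
    hard = F.cross_entropy(student_logits, hard_labels)
    return alpha * soft + (1 - alpha) * hard

# Per-pixel example: (batch, classes, H, W) logits and (batch, H, W) labels.
s = torch.randn(2, 2, 8, 8, requires_grad=True)   # student output
t = torch.randn(2, 2, 8, 8)                       # soft labels from the teacher
y = torch.randint(0, 2, (2, 8, 8))                # hard building labels
loss = distillation_loss(s, t, y)
loss.backward()                                   # gradients flow to the student only
```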
7. A server, comprising a processor, a memory, and a program or instructions stored on the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the building information extraction method according to any one of claims 1-6.
8. A building information extraction apparatus, comprising:
the acquisition module is used for acquiring target visible light remote sensing data and target near infrared remote sensing data of a target area;
the input module is used for inputting the target visible light remote sensing data and the target near infrared remote sensing data into a building extraction model so as to acquire the type and position information of the building in the target area output by the building extraction model;
the type and position information is obtained by the building extraction model respectively performing feature extraction on the target visible light remote sensing data and the target near infrared remote sensing data and then fusing the extracted features;
the building extraction model comprises an Encoder part and a Decoder part; the Encoder is used for extracting features from the visible light data and the near infrared data, and comprises two parallel feature extraction networks that do not share weights, namely a first feature extraction network and a second feature extraction network of identical structure; each feature extraction network comprises a convolution module Conv3-IN-ReLU and a feature extraction module connected in sequence; the feature extraction module comprises Stage1, Stage2, Stage3 and Stage4; Stage1 includes 1 Bottleneck-IBN ×4 branch; Stage2 includes 2 parallel BasicBlock-IBN ×4 branches; Stage3 includes 3 parallel BasicBlock ×4 branches; Stage4 includes 4 parallel BasicBlock ×4 branches; the feature results output by Stage4 are superimposed and fused to obtain a modal data feature set, which is input to the Decoder; the modal data feature set comprises a first modal data feature set and a second modal data feature set; the Decoder comprises a feature fusion network used for correcting and fusing the multi-modal features extracted by the Encoder; the feature fusion network comprises: 3 connection modules concat, an Effective Squeeze-Excitation Plus attention module ESEP-block, a dimension reduction module Conv1-BN-ReLU and a discriminator Conv1-BN-Softmax; the input ends of the first 2 concat modules are respectively connected with the output end of the first feature extraction network and the output end of the second feature extraction network, the output ends of the first 2 concat modules are connected with the input end of the 3rd concat module, and the output end of the 3rd concat module is connected with the input end of the Effective Squeeze-Excitation Plus attention module ESEP-block.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the building information extraction method of any one of claims 1-6.
10. A non-transitory computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the building information extraction method according to any one of claims 1-6.
CN202310897212.3A 2023-07-20 2023-07-20 Building information extraction method and device, electronic equipment and storage medium Active CN117115641B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310897212.3A CN117115641B (en) 2023-07-20 2023-07-20 Building information extraction method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN117115641A CN117115641A (en) 2023-11-24
CN117115641B true CN117115641B (en) 2024-03-22

Family

ID=88811788

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310897212.3A Active CN117115641B (en) 2023-07-20 2023-07-20 Building information extraction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117115641B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117454253B (en) * 2023-12-08 2024-04-02 深圳市蕾奥规划设计咨询股份有限公司 Building classification method, device, terminal equipment and storage medium

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020232905A1 (en) * 2019-05-20 2020-11-26 平安科技(深圳)有限公司 Superobject information-based remote sensing image target extraction method, device, electronic apparatus, and medium
JP2021103519A (en) * 2019-12-24 2021-07-15 ネイバー コーポレーションNAVER Corporation Method and system for normalizing smoothing feature of time space for behavior recognition
CN113269717A (en) * 2021-04-09 2021-08-17 中国科学院空天信息创新研究院 Building detection method and device based on remote sensing image
JP2021192261A (en) * 2020-09-11 2021-12-16 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Building extraction method, device, apparatus, and storage medium
CN113901972A (en) * 2021-12-09 2022-01-07 深圳市海清视讯科技有限公司 Method, device and equipment for detecting remote sensing image building and storage medium
WO2022027917A1 (en) * 2020-08-05 2022-02-10 深圳市优必选科技股份有限公司 Image processing method, apparatus and system, and electronic device and readable storage medium
WO2022062543A1 (en) * 2020-09-27 2022-03-31 上海商汤智能科技有限公司 Image processing method and apparatus, device and storage medium
CN114387512A (en) * 2021-12-28 2022-04-22 南京邮电大学 Remote sensing image building extraction method based on multi-scale feature fusion and enhancement
CN114708494A (en) * 2022-03-04 2022-07-05 中国农业科学院农业信息研究所 Rural homestead building identification method and system
CN114913428A (en) * 2022-04-26 2022-08-16 哈尔滨理工大学 Remote sensing image target detection system based on deep learning
CN115082806A (en) * 2022-05-30 2022-09-20 青海大学 Ground object extraction method for medium and high resolution satellite remote sensing image
CN115291210A (en) * 2022-07-26 2022-11-04 哈尔滨工业大学 Three-dimensional image pipeline identification method of 3D-CNN ground penetrating radar combined with attention mechanism
CN115375715A (en) * 2022-07-13 2022-11-22 中国科学院空天信息创新研究院 Target extraction method and device, electronic equipment and storage medium
US11521377B1 (en) * 2021-10-26 2022-12-06 Nanjing University Of Information Sci. & Tech. Landslide recognition method based on laplacian pyramid remote sensing image fusion
CN115565047A (en) * 2022-08-31 2023-01-03 华为技术有限公司 Multitasking method, medium, and electronic device
CN115641613A (en) * 2022-11-03 2023-01-24 西安电子科技大学 Unsupervised cross-domain pedestrian re-identification method based on clustering and multi-scale learning
CN116229235A (en) * 2023-03-06 2023-06-06 西安电子科技大学 Human body posture estimation network model and estimation method based on thermal imaging
KR20230086457A (en) * 2021-12-08 2023-06-15 호남대학교 산학협력단 Electronic apparatus for building fire detecting system and method thereof
CN116403162A (en) * 2023-04-11 2023-07-07 南京航空航天大学 Airport scene target behavior recognition method and system and electronic equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875787B (en) * 2018-05-23 2020-07-14 北京市商汤科技开发有限公司 Image recognition method and device, computer equipment and storage medium

Non-Patent Citations (13)

* Cited by examiner, † Cited by third party
Title
A Multi-Scale Edge Constraint Network for the Fine Extraction of Buildings from Remote Sensing Images; Zhenqing Wang et al.; Remote Sensing; 20230208; Vol. 15 (No. 927); pp. 1-20 *
A Review of Residual Convolutional Networks; Wuwen Qiu; International Journal of Knowledge and Language Processing; 20221231; Vol. 13 (No. 02); pp. 13-27 *
Big Earth Data in Support of Marine Sustainable Development; Futao Wang; Bulletin of Chinese Academy of Sciences; 20210820; Vol. 36 (No. 08); pp. 1-7 *
Fine-Grained Building Extraction With Multispectral Remote Sensing Imagery Using the Deep Model; Zhenqing Wang et al.; IEEE Transactions on Geoscience and Remote Sensing; 20231025; Vol. 61; Art. no. 4706013 *
IEU-Net building extraction from high-resolution remote sensing imagery; Wang Zhenqing et al.; Journal of Remote Sensing; 20211125; Vol. 25 (No. 11); pp. 2245-2254 *
MSBA: Multiple Scales, Branches and Attention Network With Bag of Tricks for Person Re-Identification; Hanlin Tan et al.; IEEE Access; 20200401; Vol. 8; pp. 63632-63642 *
Research on accelerating object detection network models based on Bottleneck; Xu Xiaocheng et al.; Journal of Guangxi University (Natural Science Edition); 20221025; Vol. 47 (No. 05); pp. 1306-1313 *
Research on water body extraction from microwave remote sensing assisted by multispectral imagery; Xiong Jinguo et al.; Journal of China Institute of Water Resources and Hydropower Research; 20120315; Vol. 10 (No. 01); pp. 23-29 *
Closed-loop prediction method for production logistics bottlenecks based on bottleneck polymorphism; Liu Zhi et al.; Computer Integrated Manufacturing Systems; 20121115; Vol. 18 (No. 11); pp. 2554-2561 *
Ultra-high-resolution target recognition in remote sensing images based on neural networks; Jiao Yunqing et al.; Journal of System Simulation; 20070720; Vol. 19 (No. 14); pp. 3223-3225 *
Research on text-based adversarial CAPTCHAs; Li Jianming et al.; Computer Engineering and Applications; 20221110; Vol. 59 (No. 21); pp. 278-286 *
Geographic matching of multi-source remote sensing images supported by target grid coding; Xia Liegang et al.; Geomatics World; 20200825; Vol. 27 (No. 04); pp. 36-40 *
U-Net classification of urban green space for GF-2 remote sensing imagery; Xu Zhiyu et al.; Journal of Image and Graphics; 20210316; Vol. 26 (No. 03); pp. 700-713 *

Also Published As

Publication number Publication date
CN117115641A (en) 2023-11-24

Similar Documents

Publication Publication Date Title
Zhang et al. A multi-level context-guided classification method with object-based convolutional neural network for land cover classification using very high resolution remote sensing images
CN111080628B (en) Image tampering detection method, apparatus, computer device and storage medium
Hui et al. Effective building extraction from high-resolution remote sensing images with multitask driven deep neural network
Yin et al. Hot region selection based on selective search and modified fuzzy C-means in remote sensing images
US9042648B2 (en) Salient object segmentation
US11854244B2 (en) Labeling techniques for a modified panoptic labeling neural network
CN113283435A (en) Remote sensing image semantic segmentation method based on multi-scale attention fusion
CN117115641B (en) Building information extraction method and device, electronic equipment and storage medium
CN112861690A (en) Multi-method fused remote sensing image change detection method and system
CN113838064B (en) Cloud removal method based on branch GAN using multi-temporal remote sensing data
CN112950780B (en) Intelligent network map generation method and system based on remote sensing image
CN116311254B (en) Image target detection method, system and equipment under severe weather condition
CN112836625A (en) Face living body detection method and device and electronic equipment
CN111274964B (en) Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle
Liu et al. Iris recognition in visible spectrum based on multi-layer analogous convolution and collaborative representation
CN116469020A (en) Unmanned aerial vehicle image target detection method based on multiscale and Gaussian Wasserstein distance
CN114005107A (en) Document processing method and device, storage medium and electronic equipment
CN112330562B (en) Heterogeneous remote sensing image transformation method and system
Bressan et al. Semantic segmentation with labeling uncertainty and class imbalance
CN116883303A (en) Infrared and visible light image fusion method based on characteristic difference compensation and fusion
CN117197763A (en) Road crack detection method and system based on cross attention guide feature alignment network
CN116798041A (en) Image recognition method and device and electronic equipment
CN111079807A (en) Ground object classification method and device
CN116740362A (en) Attention-based lightweight asymmetric scene semantic segmentation method and system
CN116543325A (en) Unmanned aerial vehicle image-based crop artificial intelligent automatic identification method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant