CN114549992A - Cross-resolution building image extraction method and device - Google Patents


Info

Publication number
CN114549992A
CN114549992A (application CN202210182051.5A)
Authority
CN
China
Prior art keywords
remote sensing
sensing image
resolution
building
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210182051.5A
Other languages
Chinese (zh)
Inventor
张立贤
付昊桓
徐一丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN202210182051.5A
Publication of CN114549992A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application provides a cross-resolution building image extraction method and device. The method comprises the following steps: acquiring a low-resolution remote sensing image containing a building; performing super-resolution reconstruction on the low-resolution remote sensing image by using a generator in a generative adversarial network to determine a super-resolution remote sensing image and the edge features of the building; and extracting the image of the building from the super-resolution remote sensing image by using an extraction model, guided by the edge features. The method realizes cross-resolution building image extraction, i.e., extracting a relatively high-resolution building image when only a low-resolution remote sensing image is available.

Description

Cross-resolution building image extraction method and device
Technical Field
The application relates to the technical field of surveying and mapping and the technical field of remote sensing image processing, in particular to a cross-resolution building image extraction method and device.
Background
Rapidly acquiring large-area building distribution information is one of the most important practical production tasks in the surveying, mapping, and remote sensing fields. With the rapid development of remote sensing technology, topographic mapping based on relatively high-resolution remote sensing images has shown unmatched efficiency advantages in rapid, large-area, medium- and small-scale surveying and mapping.
However, when extracting building images, the choice of remote sensing image has a significant influence on the extraction result. In actual mapping production, this choice is usually weighed on two aspects: image quality and acquisition cost. Image quality is generally measured by the physical and temporal characteristics of the remote sensing image, including hue and color difference, sharpness, detail expression, spatial resolution, and temporal resolution. Acquisition cost mainly concerns the time, labor, and capital required to obtain the images. Although, for a given extraction algorithm, higher image quality generally yields better building extraction results, higher-quality images inevitably come at the expense of higher acquisition costs.
Disclosure of Invention
The application provides a cross-resolution building image extraction method and device, which extract a building image meeting an expected high-resolution requirement from a relatively low-resolution remote sensing image.
In a first aspect, the present application provides a building image extraction method. The method comprises the following steps: acquiring a low-resolution remote sensing image containing a building; performing super-resolution reconstruction on the low-resolution remote sensing image by using a generator in a generative adversarial network to determine a super-resolution remote sensing image and the edge features of the building; and extracting the image of the building from the super-resolution remote sensing image by using an extraction model, guided by the edge features.
According to this scheme, the pre-trained generator super-resolves the low-resolution remote sensing image into a high-resolution remote sensing image at a given super-resolution ratio, so that the pre-trained extraction model can extract a high-resolution building image from it; the actual high-quality extraction requirement is thus met without increasing the cost of acquiring remote sensing images. In addition, because the edge features of the building in the super-resolved image are taken into account during extraction, the extraction model can delineate the building image more accurately.
In one possible embodiment, the generator includes a first network and a second network, and determining the super-resolution remote sensing image and the edge features of the building includes: inputting the low-resolution remote sensing image into the first network and determining the super-resolution remote sensing image from the output of the first network; and determining the edge features in the gradient map of the super-resolution remote sensing image with the second network, according to the prior information of the building.
According to this scheme, a two-stage generator performs the super-resolution computation on the low-resolution remote sensing image to obtain the high-resolution remote sensing image, and the edge features of the building are extracted from the gradient map of the resulting super-resolution remote sensing image.
In one possible embodiment, the generative adversarial network is obtained as follows: according to the super-resolution ratio, down-sampling the low-resolution remote sensing image to obtain a first remote sensing image sample whose resolution is lower than that of the low-resolution remote sensing image; and updating the parameters of the generator and the discriminator in the generative adversarial network according to the first remote sensing image sample and the low-resolution remote sensing image.
The super-resolution ratio determines the multiple between the resolution of the super-resolution remote sensing image produced by the generator and the resolution of the low-resolution remote sensing image.
According to this scheme, the generative adversarial network is trained on samples derived from the very low-resolution remote sensing image whose building image is to be extracted. On the one hand this saves training cost; on the other hand it avoids the problem that networks trained on large amounts of external data may not suit the low-resolution remote sensing image at hand, yielding generated high-resolution images that do not match reality.
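The down-sampling that produces the first remote sensing image sample can be sketched as follows. This is a minimal illustration under assumed choices (box-filter averaging, a toy 4x4 grayscale grid, hypothetical function names); the patent does not specify the exact down-sampling operator.

```python
# Illustrative sketch, not the patent's implementation: derive a lower-
# resolution training sample from the available low-resolution image by
# averaging ratio x ratio blocks (box-filter down-sampling).

def downsample(image, ratio):
    """Average-pool a 2D grayscale image by `ratio` in each dimension."""
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h - h % ratio, ratio):
        row = []
        for j in range(0, w - w % ratio, ratio):
            block = [image[i + di][j + dj]
                     for di in range(ratio) for dj in range(ratio)]
            row.append(sum(block) / (ratio * ratio))
        out.append(row)
    return out

# A 4x4 image down-sampled at a 2x ratio yields a coarser 2x2 sample; the
# (sample, original) pair then trains the generator, which at inference is
# applied to the original low-resolution image itself.
lr = [[1, 1, 3, 3],
      [1, 1, 3, 3],
      [5, 5, 7, 7],
      [5, 5, 7, 7]]
llr = downsample(lr, 2)
print(llr)  # [[1.0, 3.0], [5.0, 7.0]]
```

With a 4x super-resolution ratio, the same routine with `ratio=4` would map a 2 m image to an 8 m sample, matching the example given later in the description.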
In a possible embodiment, before the extracting the image of the building in the super-resolution remote sensing image by using an extraction model according to the edge feature, the method further comprises: and enhancing the edge area of the building in the super-resolution remote sensing image according to the edge characteristics.
In one possible implementation, extracting the image of the building from the super-resolution remote sensing image with the extraction model, guided by the edge features, includes: inputting the edge features and the super-resolution remote sensing image into the extraction model, and determining the image of the building from the output of the extraction model.
In one possible implementation, the extraction model is obtained by training: acquiring a second remote sensing image sample and a gradient map of the second remote sensing image sample; and updating the parameters of the extraction model according to the second remote sensing image sample, the gradient map of the second remote sensing image sample and the building label corresponding to the second remote sensing image sample.
In a possible embodiment, the second remote sensing image sample and a gradient map of the second remote sensing image sample are obtained by the generator.
In a second aspect, the present application further provides a building image extraction device. The device comprises an acquisition module, a super-resolution module, and an extraction module.
The acquisition module is used for acquiring a low-resolution remote sensing image containing a building.
The super-resolution module is used for performing super-resolution reconstruction on the low-resolution remote sensing image with a generator in a generative adversarial network, and determining the super-resolution remote sensing image and the edge features of the building.
The extraction module is used for extracting the image of the building from the super-resolution remote sensing image with an extraction model, guided by the edge features.
In a possible implementation, the generator includes a first network and a second network, and the super-resolution module is specifically configured to: input the low-resolution remote sensing image into the first network and determine the super-resolution remote sensing image from the output of the first network; and determine the edge features in the gradient map of the super-resolution remote sensing image with the second network, according to the prior information of the building.
In a possible implementation, the super-resolution module is further configured to obtain the generative adversarial network by: down-sampling the low-resolution remote sensing image according to the super-resolution ratio to obtain a first remote sensing image sample whose resolution is lower than that of the low-resolution remote sensing image; and updating the parameters of the generator and the discriminator in the generative adversarial network according to the first remote sensing image sample and the low-resolution remote sensing image.
In one possible implementation, the super-resolution module is further configured to: enhance the edge area of the building in the super-resolution remote sensing image according to the edge features.
In a possible implementation, the extraction module is specifically configured to: input the edge features and the super-resolution remote sensing image into the extraction model, and determine the image of the building from the output of the extraction model.
In a possible implementation, the extraction module further trains the extraction model by: acquiring a second remote sensing image sample and its gradient map; and updating the parameters of the extraction model according to the second remote sensing image sample, its gradient map, and the building label corresponding to the second remote sensing image sample.
In a possible embodiment, the second remote sensing image sample and a gradient map of the second remote sensing image sample are obtained by the generator.
In a third aspect, the present application also provides a computing device, comprising a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the extraction method provided by the foregoing first aspect or any one of its possible implementations.
In a fourth aspect, the present application further provides a computer-readable storage medium comprising instructions that, when executed on a computing device, cause the computing device to perform the extraction method provided by the first aspect or any one of its possible implementations.
In a fifth aspect, the present application further provides a computer program product, which when run on a computer, causes the computer to execute the extraction method provided in the first aspect or any one of the possible implementations of the first aspect.
Any of the above devices, computer storage media, or computer program products is configured to execute the corresponding method above, so for their beneficial effects, reference can be made to the beneficial effects of the corresponding schemes in the methods provided above, which are not repeated here.
Drawings
Fig. 1 is an overall flowchart of building extraction provided in an embodiment of the present application;
FIG. 2 is a flowchart of a building extraction method provided by an embodiment of the present application;
fig. 3 is a schematic structural diagram of a generative adversarial network provided in an embodiment of the present application;
fig. 4 is a flowchart of a training method of a generative adversarial network according to an embodiment of the present application;
FIG. 5 is a schematic diagram of obtaining training samples for a generative adversarial network according to an embodiment of the present application;
FIG. 6 is a flowchart of a training method of the extraction model according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a building extraction device provided in an embodiment of the present application;
fig. 8 is a schematic structural diagram of an extraction apparatus provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments of the present application will be described below with reference to the accompanying drawings.
In the description of the embodiments of the present application, the words "exemplary," "for example," or "for instance" are used to mean serving as an example, instance, or illustration. Any embodiment or design described with these words is not to be construed as preferred or advantageous over other embodiments or designs; rather, these words are intended to present relevant concepts in a concrete fashion.
In the description of the embodiments of the present application, the term "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A alone, B alone, or both A and B. In addition, the term "plurality" means two or more unless otherwise specified; for example, a plurality of systems refers to two or more systems, and a plurality of screen terminals refers to two or more screen terminals.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
In practical mapping applications, as described in the background art, when extracting a building image from a remote sensing image, a relationship between a quality requirement of the building image and an acquisition cost of the remote sensing image needs to be considered.
The same building can differ greatly in color, texture features, edge details, and even feature quantity across remote sensing images of different resolutions. To meet high quality requirements, a building image can be extracted from a high-resolution remote sensing image with an extraction model, but acquiring the original high-resolution remote sensing image is very costly.
If the extraction model is instead applied to a low-resolution remote sensing image, the cost problem of obtaining a high-resolution image is avoided, but the edge contours and extent details of the building in the resulting image are poor and cannot meet the high-quality extraction requirement.
In view of this, the present application provides a cross-resolution building image extraction method. The overall flow of building extraction, shown in fig. 1, comprises a super-resolution stage and a segmentation stage.
In the super-resolution stage, training samples for a super-resolution enhancement module (EASR) are produced by a dataset production module (IPG) from the low-resolution remote sensing images containing buildings (LR images, I_low), and the EASR is trained on them. The EASR adopts the structure of a generative adversarial network. After EASR training finishes, the trained generative adversarial network performs super-resolution reconstruction on the low-resolution remote sensing image to obtain a super-resolution remote sensing image (SR images, I_SR) and the edge features of the buildings.
In the segmentation stage, the edge features are input, together with I_SR, into the extraction model to extract the image of the building. The extraction model is a trained segmentation network (DES).
The method realizes high-quality building image extraction from a low-resolution remote sensing image: it meets the high-quality requirement for the building image while avoiding increased acquisition costs. In addition, combining the edge features of the building during extraction improves extraction accuracy.
The building image extraction method according to the embodiment of the present application is specifically described below with reference to fig. 2.
Fig. 2 is a flowchart of a cross-resolution building image extraction method according to an embodiment of the present application. The method can be applied to extraction equipment. The extraction equipment can realize the extraction of the building image.
As shown in fig. 2, the method includes steps S201 to S203 as follows.
In step S201, a low-resolution remote sensing image is acquired.
When sketching a topographic map, the extraction device can obtain an existing low-resolution remote sensing image containing buildings from a database. The low-resolution remote sensing image may be an image acquired by a satellite.
In step S202, super-resolution reconstruction is performed on the low-resolution remote sensing image using the generator in the generative adversarial network, obtaining a super-resolution remote sensing image and the edge features of the building. The super-resolution remote sensing image is equivalent to a high-resolution remote sensing image of the area covered by the low-resolution remote sensing image.
The generative adversarial network includes a generator and a discriminator, and is trained on the low-resolution remote sensing image. As shown in fig. 1, the IPG module processes the low-resolution remote sensing image to obtain training samples for the generative adversarial network. After training finishes, the generator serves as the EASR.
The extraction device inputs the low-resolution remote sensing image into the trained generator and determines the super-resolution remote sensing image and the edge features of the building from the generator's output. The generator extracts and encodes hierarchical features of the input image to obtain semantic features from low level to high level, from which it constructs a high-resolution remote sensing image with better detail expression. In this process, the generator first encodes the input data layer by layer and then decodes the resulting high-level semantic information, thereby reconstructing an image of higher resolution. In addition, to strengthen the generator's attention to edge details, the generator uses the gradient information of the image to guide the texture recovery of edge details, instead of conservatively suppressing most edge-detail features, which enhances its robustness.
Specifically, referring to the structure of the generative adversarial network shown in fig. 3, the Generator may include a first network and a second network. With continued reference to fig. 3, the extraction device may use the first network to reconstruct the low-resolution remote sensing image into a super-resolution remote sensing image. Using the second network and the prior information of the building, the extraction device extracts enhanced high-frequency information (Enhanced high frequency information), i.e., the edge features of the building, from the gradient map of the super-resolution remote sensing image.
The specific structure and training process of the generative countermeasure network will be described in detail with reference to fig. 3, and will not be described herein again.
In one example, to improve building extraction accuracy, the super-resolution remote sensing image may be enhanced using the edge features of the building obtained above, strengthening the edge portions of the building in the image. As shown in fig. 3, the gradient map of the initially obtained super-resolution image (Grad of SR base images) may be removed from that image (SR base images) and the edge features of the building added back, yielding the enhanced super-resolution remote sensing image (SR images).
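The enhancement step can be illustrated as a pixel-wise operation. The exact arithmetic is an assumption: the source states only that the gradient map is removed and the edge features added, so the element-wise form below, and all names and values, are hypothetical.

```python
# Hedged sketch of edge enhancement: subtract the raw gradient content of the
# initial SR image, then add back the learned building edge features, pixel-wise.

def enhance(sr_base, grad, edge_feat):
    """Compute sr_base - grad + edge_feat element-wise over 2D grids."""
    h, w = len(sr_base), len(sr_base[0])
    return [[sr_base[i][j] - grad[i][j] + edge_feat[i][j] for j in range(w)]
            for i in range(h)]

sr_base = [[10.0, 12.0], [10.0, 12.0]]
grad = [[0.0, 2.0], [0.0, 2.0]]      # high-frequency content of sr_base
edge = [[0.0, 3.0], [0.0, 3.0]]      # sharpened edges from the second network
print(enhance(sr_base, grad, edge))  # [[10.0, 13.0], [10.0, 13.0]]
```

The net effect in this toy case is that edge pixels become sharper (13.0 instead of 12.0) while flat regions are untouched.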
In step S203, the image of the building is extracted from the super-resolution remote sensing image using the extraction model, guided by the edge features.
After obtaining the super-resolution remote sensing image, the extraction device may, as shown in fig. 1, input the super-resolution remote sensing image and the edge features of the building into the extraction model, obtain the output produced by the extraction model after feature encoding, fusion, and decoding of the input data, and determine the image of the building from that output.
The extraction model adopts a dual-path segmentation network structure: the image and the edge features are passed through separate feature extractors from low level to high level. After the two feature paths are fused, the features in the building's edge area receive targeted enhancement, the corresponding semantic information is amplified accordingly, and irrelevant and redundant information around the building is suppressed, yielding a building extraction result of relatively high resolution. The specific structure and training process of the extraction model are described in detail later with reference to the accompanying drawings and are not repeated here.
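One plausible form of the edge-guided fusion is a multiplicative, attention-style amplification of image features near building edges. The source does not specify the fusion operator, so this sketch and all its names are illustrative assumptions only.

```python
# Assumed fusion operator: edge features act as a spatial attention map that
# amplifies image features on building boundaries and leaves the rest unchanged.

def fuse(image_feat, edge_feat):
    """Amplify image features where edge activation is high."""
    return [[f * (1.0 + e) for f, e in zip(frow, erow)]
            for frow, erow in zip(image_feat, edge_feat)]

feat = [[1.0, 1.0], [1.0, 1.0]]
edges = [[0.0, 0.5], [0.0, 0.5]]  # right column lies on a building edge
print(fuse(feat, edges))          # [[1.0, 1.5], [1.0, 1.5]]
```

A multiplicative form has the convenient property that regions with zero edge activation pass through unchanged, matching the description of suppressing irrelevant information away from buildings.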
In one example, when the super-resolution remote sensing image is input into the extraction model, the extraction device may further perform feature enhancement processing on the super-resolution remote sensing image using edge features of the building to enhance an edge portion of the building in the super-resolution remote sensing image.
With this method, on the one hand, the generator super-resolves the low-resolution remote sensing image into a relatively high-resolution one, so that a high-resolution building image can be extracted from a low-resolution remote sensing image without increasing image acquisition costs, meeting the actual high-quality extraction requirement. On the other hand, fusing the high-resolution image with the edge features of the building during extraction improves the boundary accuracy of the extraction result and suppresses redundant information around the building.
Based on the embodiment of the building image extraction method shown in fig. 2, the present application provides a training method for the generative adversarial network.
The generative adversarial network includes a generator and a discriminator. The generator performs super-resolution reconstruction on the low-resolution remote sensing image to obtain the high-resolution remote sensing image. Referring to the structural diagram of the generative adversarial network shown in fig. 3, the generator may specifically include a first network and a second network. The first network may be a residual neural network comprising a convolutional layer (Conv layer), an activation layer (LeakyReLU layer), residual blocks (Residual block) using skip connections, and a decoding layer (Decoder). The second network may be a neural network comprising a convolutional layer, an activation layer, and two branches (Frame branch and Mask branch). The discriminator is a neural network comprising several convolutional layers, activation layers, batch normalization layers, and a fully connected layer (Dense).
Fig. 4 is a flowchart of a training method of a generative adversarial network according to an embodiment of the present application. The method can be applied to the extraction device described above.
As shown in fig. 4, the method includes steps S401 and S402 as follows.
In step S401, a training sample is determined from the low-resolution remote sensing image.
The training samples comprise an initial image slice library corresponding to the low-resolution remote sensing image, an initial image pairing library corresponding to the first remote sensing image sample, and the edge feature labels of the buildings corresponding to the low-resolution remote sensing image. The resolution of the first remote sensing image sample is lower than that of the low-resolution remote sensing image.
The first remote sensing image sample can be obtained by the IPG module, which mines the internal autocorrelation of the low-resolution remote sensing image and, according to a preset super-resolution ratio, down-samples it to obtain a first remote sensing image sample at a different sampling scale from the low-resolution remote sensing image.
Taking the common 4x super-resolution reconstruction as an example, as shown in fig. 5, the low-resolution remote sensing image I_low with a sampling scale of 2 m is cropped and split (crop & split) into several small patches, which constitute the initial image slice library.
Then, for each patch in the initial image slice library, a lower-resolution counterpart is constructed according to the 4x super-resolution ratio, yielding the initial image pairing library.
Each image in the initial image pairing library has a resolution of 8 m, and the images in the pairing library correspond one-to-one to those in the slice library.
Each image in the initial image slice library serves as the high-resolution training data (HR training dataset, I_LR); each image in the initial image pairing library serves as the low-resolution training data (LR training dataset, I_LLR). The generator obtained by this training can super-resolve the low-resolution remote sensing image I_low into the high-resolution image I_HR. As shown in fig. 5, when I_low has a resolution of 2 m, the generator yields I_HR at a resolution of 0.5 m.
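The slice-and-pair construction above can be sketched as follows, under stated assumptions: fixed-size non-overlapping tiles and nearest-neighbour downsampling stand in for the crop & split and 4x down-scaling steps; the tile size and all names are hypothetical, not taken from the source.

```python
# Hedged sketch of IPG-style dataset production: split the 2 m image into
# tiles (initial image slice library), then pair each tile with a version
# down-sampled by the super-resolution ratio (initial image pairing library).

def crop_split(image, tile):
    """Cut a 2D image into non-overlapping tile x tile patches."""
    h, w = len(image), len(image[0])
    return [[row[j:j + tile] for row in image[i:i + tile]]
            for i in range(0, h - tile + 1, tile)
            for j in range(0, w - tile + 1, tile)]

def nn_downsample(patch, ratio):
    """Nearest-neighbour down-sampling by `ratio` (e.g. 4 for 2 m -> 8 m)."""
    return [row[::ratio] for row in patch[::ratio]]

def build_pairing_library(image, tile, ratio):
    slices = crop_split(image, tile)                       # HR side of each pair
    return [(nn_downsample(p, ratio), p) for p in slices]  # (LR, HR) pairs

image = [[r * 8 + c for c in range(8)] for r in range(8)]  # toy 8x8 "image"
pairs = build_pairing_library(image, tile=4, ratio=4)
print(len(pairs))   # 4 tiles
print(pairs[0][0])  # [[0]] -- one pixel per 4x4 tile at ratio 4
```

Training then treats each pair's low-resolution side as input and its original tile as the target, which is how the generator learns to upscale without any external high-resolution data.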
In one example, before training the network, the image data in the initial image slice library and the initial image pairing library may be further processed with various data augmentation methods, such as random horizontal flipping, random clockwise rotation, random scaling, random noise, random blurring, and random cropping.
The edge feature labels of the buildings corresponding to the low-resolution remote sensing images can be obtained by manually annotating the low-resolution remote sensing images.
According to this scheme, the training samples of the generative adversarial network are constructed by mining the internal autocorrelation of the low-resolution image, without introducing large amounts of external data. On the one hand this reduces the computation and the cost of obtaining remote sensing images; on the other hand it improves the accuracy of the network's super-resolution reconstruction of the low-resolution image. The adversarial training process relies on the prior information provided by the training data, and prior information derived from large amounts of external data may not suit the low-resolution remote sensing image to be processed, whereas the autocorrelation of low-scale features within the remote sensing image itself provides more valuable prior information for image reconstruction.
In step S402, parameters of the generator and the discriminator in the generative adversarial network are updated according to the training samples.
After the training samples are determined, the generative adversarial network is trained. During training, the generator continually learns to reconstruct ever more realistic high-resolution remote sensing images in order to fool the discriminator, while the discriminator continually learns to judge whether its input is a real high-resolution remote sensing image. Through this game between generator and discriminator, the generator learns to reconstruct super-resolution remote sensing images of higher quality, closer to real high-resolution imagery.
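The adversarial game described above can be sketched as a standard GAN training loop. The tiny generator and discriminator below are illustrative stand-ins (the patent's networks, with their encoder/decoder and gradient branches, are far larger), and all hyperparameters are assumptions:

```python
import torch
import torch.nn as nn

# Tiny stand-ins for the generator (LR -> SR, 4x) and the discriminator.
G = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1),
                  nn.Upsample(scale_factor=4, mode="nearest"))
D = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1),
                  nn.Flatten(), nn.Linear(8 * 32 * 32, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

lr_batch = torch.rand(2, 3, 16, 16)   # low-resolution inputs
hr_batch = torch.rand(2, 3, 64, 64)   # real high-resolution targets
real, fake = torch.ones(2, 1), torch.zeros(2, 1)

for _ in range(2):
    # Discriminator step: learn to tell real HR images from reconstructions.
    sr = G(lr_batch).detach()
    loss_d = bce(D(hr_batch), real) + bce(D(sr), fake)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: learn to fool the discriminator.
    sr = G(lr_batch)
    loss_g = bce(D(sr), real)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

print(float(loss_g) > 0)  # True
```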
The specific training process comprises the following steps:
S4021, an image from the initial image matching library, serving as the first remote sensing image sample, is input into the first network of the generator. The layers and modules of the first network perform feature encoding and decoding and output a super-resolution remote sensing image (SR image), which is the first network's prediction for the input low-resolution remote sensing image.
S4022, the gradient map of the super-resolution remote sensing image is input into the second network of the generator. The convolutional and activation layers of the second network perform feature encoding to obtain texture information. The frame branch of the second network extracts high-frequency information from the texture information, while the mask branch, guided by the prior information about the buildings in the low-resolution remote sensing image, obtains building-region information from the texture information. The decoding layer of the second network then fuses and decodes the high-frequency information from the frame branch with the building-region information from the mask branch, and outputs an enhanced high-resolution gradient map (enhanced high-frequency information) of the super-resolution remote sensing image, which indicates the predicted edge features of the buildings in the low-resolution remote sensing image. Combining the outputs of the two branches before decoding, rather than decoding each branch separately, helps filter out redundant high-frequency information.
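The gradient map fed to the second network can be computed with a standard edge operator; the patent does not name one, so the Sobel operator used below is an assumption:

```python
import numpy as np

def gradient_map(gray: np.ndarray) -> np.ndarray:
    """Gradient magnitude of a single-channel image via Sobel filters.

    The patent feeds a gradient map of the SR image to the second
    network; the Sobel operator here is an assumed, common choice.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    p = np.pad(gray, 1, mode="edge")
    h, w = gray.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(3):
        for j in range(3):
            win = p[i:i + h, j:j + w]
            gx += kx[i, j] * win
            gy += ky[i, j] * win
    return np.hypot(gx, gy)  # gradient magnitude per pixel

# A vertical step edge produces a strong response along the edge columns.
img = np.zeros((8, 8)); img[:, 4:] = 1.0
g = gradient_map(img)
print(g[:, 3:5].max() > 0)  # True
```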
S4023, the super-resolution remote sensing image and its enhanced gradient map are input into the discriminator to obtain the discriminator's output. As shown in fig. 3, the discriminator extracts semantic information from the lower layers to the higher layers of the input image, recombines it into a 1024-dimensional feature vector through a fully connected layer, and then passes it through a final fully connected layer with a single output; this output indicates whether the discriminator judges the input to be a real high-resolution remote sensing image or a reconstructed (fake) one.
S4024, the outputs of the first network, the second network, and the discriminator are substituted into a loss function to compute a loss value, and the parameters of the generator and the discriminator are updated according to this loss value. The loss function is a weighted sum of the image-based pixel-level loss function (shown in equation (1)), the perceptual loss function (the feature-distance function shown in equation (2)), the adversarial loss function (shown in equation (3)), and the pixel-level loss function based on image gradients (shown in equation (4)).
Under the influence of the pixel-level loss function, the first network tends to render details more smoothly overall so as to obtain a lower pixel-level loss, thereby losing the high-frequency detail we need. Therefore, the super-resolution remote sensing image output by the first network is feature-enhanced with the edge features output by the second network. As shown in fig. 3, the gradient map of the image output by the first network is subtracted from that image, and the enhanced high-frequency information (i.e., the building edge features) output by the second network is added, yielding the feature-enhanced super-resolution remote sensing image.
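The subtract-then-add enhancement described above is simple per-pixel arithmetic; the sketch below uses illustrative constant arrays in place of real images and gradient maps:

```python
import numpy as np

def enhance(sr: np.ndarray, grad_sr: np.ndarray, enhanced_grad: np.ndarray) -> np.ndarray:
    """Swap the SR image's own high-frequency component for the enhanced one.

    Following the description: subtract the SR image's gradient map,
    then add the enhanced gradient map produced by the second network.
    """
    return sr - grad_sr + enhanced_grad

sr = np.full((4, 4), 0.5)             # super-resolution image (stand-in)
grad_sr = np.full((4, 4), 0.1)        # its own gradient map (stand-in)
enhanced = np.full((4, 4), 0.3)       # enhanced gradient from the second network
out = enhance(sr, grad_sr, enhanced)
print(out[0, 0])  # 0.7
```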
The image-based pixel-level loss function, shown in equation (1), computes the per-pixel L1 distance $L_{pix}$ between the super-resolution remote sensing image and the high-resolution remote sensing image of the first remote sensing image sample:

$$L_{pix} = \frac{1}{N}\sum_{i=1}^{N}\left\|\hat{y}_i - y_i\right\|_1 \tag{1}$$

In equation (1), $y_i$ denotes the i-th pixel of the high-resolution remote sensing image of the first remote sensing image sample (i.e., the original low-resolution remote sensing image), $\hat{y}_i$ denotes the i-th pixel of the super-resolution remote sensing image of the first remote sensing image sample, $N$ is the number of pixels, and $\|\cdot\|_1$ denotes the L1 norm.
The image-based perceptual loss function, shown in equation (2), measures the distance in feature space between the super-resolution remote sensing image and the high-resolution remote sensing image of the first remote sensing image sample. A pre-trained 19-layer VGG network (denoted by the function $\phi$) extracts the feature maps of the two images, which are substituted into equation (2) to compute the loss value $L_{percep}$:

$$L_{percep} = \frac{1}{N}\sum_{i=1}^{N}\left\|\phi(\hat{y})_i - \phi(y)_i\right\|_1 \tag{2}$$
The image-based adversarial loss function, shown in equation (3), encourages the generator to produce perceptually more realistic images:

$$L_{adv} = -\frac{1}{N}\sum_{i=1}^{N}\log D(\hat{y}_i) \tag{3}$$

In equation (3), $D(\hat{y}_i)$ denotes the probability value, output by the discriminator, for the i-th pixel of the super-resolution remote sensing image of the first remote sensing image sample.
The pixel-level loss function based on image gradients, shown in equation (4), computes the per-pixel L1 distance $L_{g\_pix}$ between the gradients of the super-resolution remote sensing image and of the high-resolution remote sensing image of the first remote sensing image sample:

$$L_{g\_pix} = \frac{1}{N}\sum_{i=1}^{N}\left\|F(\hat{y}_i) - F(y_i)\right\|_1 \tag{4}$$

In equation (4), $F(\hat{y}_i)$ denotes the gradient of the i-th pixel of the super-resolution remote sensing image of the first remote sensing image sample, and $F(y_i)$ denotes the gradient of the i-th pixel of its high-resolution remote sensing image.
The adversarial loss function based on image gradients, shown in equation (5), encourages the generator to produce perceptually more realistic gradients:

$$L_{g\_adv} = -\frac{1}{N}\sum_{i=1}^{N}\log D\!\left(F(\hat{y}_i)\right) \tag{5}$$
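The weighted combination of the loss terms in equations (1)–(5) can be sketched as follows; the weight values are illustrative, since the patent does not specify them:

```python
import numpy as np

def l1(a, b):
    """Mean L1 distance between two arrays (the form of equations (1) and (4))."""
    return float(np.mean(np.abs(a - b)))

def total_loss(terms: dict, weights: dict) -> float:
    """Weighted sum of the individual loss terms, as described in the text."""
    return sum(weights[k] * terms[k] for k in terms)

sr, hr = np.zeros((4, 4)), np.ones((4, 4))
terms = {
    "pix": l1(sr, hr),     # equation (1): pixel-level L1
    "percep": 0.0,         # equation (2): from VGG feature maps
    "adv": 0.0,            # equation (3): from the discriminator
    "g_pix": l1(sr, hr),   # equation (4): L1 on gradient maps
    "g_adv": 0.0,          # equation (5): adversarial on gradients
}
# Illustrative weights only; the patent does not fix these values.
weights = {"pix": 1.0, "percep": 1.0, "adv": 5e-3, "g_pix": 1.0, "g_adv": 5e-3}
print(total_loss(terms, weights))  # 2.0
```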
Based on the embodiment of the building image extraction method shown in fig. 2 above, an embodiment of the present application further provides a training method for the extraction model.
Referring to fig. 2, the extraction model adopts a two-path structure that makes full use of the super-resolution remote sensing image generated in the previous stage and the enhanced building edge features to extract the high-resolution building image efficiently and accurately. One path performs feature encoding on the super-resolution remote sensing image, extracting semantic information layer by layer from low layers to high layers. The other path performs feature encoding on the building edge features. Since the purpose of the present application is to extract buildings as well as possible, the building edge features are intuitively of great value for reconstructing building edges and suppressing redundant information around buildings. A separate parallel path, rather than a serial or other arrangement, is dedicated to this feature extraction mainly to strengthen the guiding effect of the edge features on the extraction result: after the two parallel encoding paths, the two sets of encoded features carry equal importance in dimension and weight, which gives the edge features better guiding value for the subsequent reconstruction from the encoded results.
After encoding, a fusion layer and a decoding layer in the extraction model fuse the two sets of encoded features and then decode them to obtain the building extraction result. The fusion layer is designed mainly to let the model better combine the multiple levels of the two extracted feature paths, so that the decoding layer can more easily reconstruct the building from the extracted features.
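The two-path encode–fuse–decode structure can be sketched as a minimal PyTorch module; channel widths, depths, and layer choices are illustrative only and much smaller than the patent's actual model:

```python
import torch
import torch.nn as nn

class TwoPathExtractor(nn.Module):
    """Minimal sketch of the two-path extraction model.

    One path encodes the SR image, the other encodes the building edge
    features; a fusion layer concatenates them and a decoding layer
    predicts the building mask.
    """
    def __init__(self):
        super().__init__()
        self.image_path = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.edge_path = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.fuse = nn.Conv2d(32, 16, 1)              # fusion layer
        self.decode = nn.Conv2d(16, 1, 3, padding=1)  # decoding layer

    def forward(self, image: torch.Tensor, edges: torch.Tensor) -> torch.Tensor:
        # Encode each input on its own path, then fuse and decode.
        f = torch.cat([self.image_path(image), self.edge_path(edges)], dim=1)
        return torch.sigmoid(self.decode(self.fuse(f)))

model = TwoPathExtractor()
mask = model(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64))
print(mask.shape)  # torch.Size([1, 1, 64, 64])
```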
Fig. 6 is a flowchart of a training method for extracting a model according to an embodiment of the present disclosure. The method can be applied to the extraction device described above. As shown in fig. 6, the method includes steps S601-S602 as follows.
In step S601, training samples are obtained, where the training samples include: the second remote sensing image sample, the edge characteristics of the building corresponding to the second remote sensing image sample and the building image sample corresponding to the second remote sensing image sample.
The gradient map of the second remote sensing image sample is processed by the second network in the generator to obtain the edge features of the buildings corresponding to the second remote sensing image sample. The building label corresponding to the second remote sensing image sample can be determined manually.
In step S602, parameters of the extraction model are updated according to the training samples.
The second remote sensing image sample and the edge features of the buildings corresponding to it are input into the extraction model; after the extraction model performs feature encoding, fusion, and decoding, the building image output by the extraction model is obtained.
The per-pixel L1 distance between the building image sample and the building image output by the extraction model is determined using a loss function, shown in equation (6):

$$L_{seg} = \frac{1}{N}\sum_{i=1}^{N}\left\|\hat{s}_i - s_i\right\|_1 \tag{6}$$

In equation (6), $\hat{s}_i$ denotes the i-th pixel of the building image output by the extraction model, and $s_i$ denotes the i-th pixel of the building image sample.
In one example, the aforementioned individual loss functions may be weighted and summed to obtain a total loss function, and the parameters of the generative adversarial network and the extraction model are updated according to this total loss function.
Based on the method embodiment shown in fig. 2, an embodiment of the present application further provides a cross-resolution building image extraction device for implementing the method embodiment described above with reference to fig. 2. As shown in fig. 7, the extraction apparatus 700 includes an obtaining module 701, a super-resolution module 702, and an extraction module 703.
The obtaining module 701 is configured to obtain a low-resolution remote sensing image, where the low-resolution remote sensing image includes a building.
The super-resolution module 702 is configured to perform super-resolution reconstruction on the low-resolution remote sensing image by using a generator in the generative adversarial network, and to determine the super-resolution remote sensing image and the edge features of the building.
The extraction module 703 is configured to extract an image of the building from the super-resolution remote sensing image by using an extraction model according to the edge feature.
It should be noted that when the extraction apparatus 700 of the embodiment shown in fig. 7 executes the extraction method, the division into the above functional modules is only illustrative; in practical applications, the above functions may be distributed among different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above. Moreover, the extraction device of the above embodiment and the extraction method embodiment shown in fig. 2 belong to the same concept; its specific implementation process is described in detail in the method embodiment and is not repeated here.
Fig. 8 is a schematic hardware structure diagram of an extraction apparatus 800 according to an embodiment of the present application.
The extraction apparatus 800 can be used to implement the building extraction method, the training method of the generative adversarial network, and the training method of the extraction model described above. Referring to fig. 8, the extraction apparatus 800 includes a processor 820, a memory 820, a communication interface 840, and a bus 840; the processor 820, the memory 820, and the communication interface 840 are connected to one another through the bus 840, although they may also be connected by means other than the bus 840.
The memory 820 may be various types of storage media, such as Random Access Memory (RAM), read-only memory (ROM), non-volatile RAM (NVRAM), Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), flash memory, optical memory, hard disk, and the like.
The processor 820 may be a general-purpose processor, i.e., a processor that performs certain steps and/or operations by reading and executing content stored in a memory such as the memory 820; for example, it may be a central processing unit (CPU). The processor 820 may include at least one circuit to perform all or part of the steps of the methods provided by the embodiments shown in fig. 2, fig. 4, or fig. 6.
The communication interface 840 includes an input/output (I/O) interface, a physical interface, a logical interface, and the like for realizing interconnection of devices inside the extracting apparatus 800, and an interface for realizing interconnection of the extracting apparatus 800 with other apparatuses (e.g., other extracting apparatuses or user apparatuses). The physical interface may be an ethernet interface, a fiber optic interface, an ATM interface, or the like.
Bus 840 may be any type of communication bus, such as a system bus, used to interconnect processor 820, memory 820, and communication interface 840.
The above devices may be respectively disposed on separate chips, or at least a part or all of the devices may be disposed on the same chip. Whether each device is separately located on a different chip or integrated on one or more chips is often dependent on the needs of the product design. The embodiment of the present application does not limit the specific implementation form of the above device.
The extraction device 800 shown in fig. 8 is merely exemplary, and in implementation, the extraction device 800 may further include other components, which are not listed here.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium accessible by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), among others.
It is to be understood that the various numerical references referred to in the embodiments of the present application are merely for descriptive convenience and are not intended to limit the scope of the embodiments of the present application. It should be understood that, in the embodiment of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiment of the present application.
The objects, technical solutions, and advantages of the present application have been described in further detail through the above embodiments. It should be understood that the above are only exemplary embodiments of the present application and are not intended to limit its scope; any modification, equivalent substitution, or improvement made on the basis of the technical solutions of the present application shall fall within the scope of the present application.

Claims (10)

1. A building image extraction method is characterized by comprising the following steps:
acquiring a low-resolution remote sensing image, wherein the low-resolution remote sensing image comprises a building;
performing super-resolution reconstruction on the low-resolution remote sensing image by using a generator in a generative adversarial network, and determining a super-resolution remote sensing image and edge features of the building;
and extracting the image of the building from the super-resolution remote sensing image by using an extraction model according to the edge characteristics.
2. The method of claim 1, wherein the generator comprises a first network and a second network, and wherein the determining the super-resolution remote sensing image and the edge feature of the building comprises:
inputting the low-resolution remote sensing image into the first network, and determining the super-resolution remote sensing image according to the output of the first network;
and determining the edge characteristics in the gradient map of the super-resolution remote sensing image by utilizing the second network according to the prior information of the building.
3. The method according to claim 1, wherein the generative adversarial network is obtained by:
according to the super-resolution ratio, performing down-sampling processing on the low-resolution remote sensing image to obtain a first remote sensing image sample, wherein the resolution of the first remote sensing image sample is smaller than that of the low-resolution remote sensing image;
and updating parameters of a generator and a discriminator in the generative adversarial network according to the first remote sensing image sample and the low-resolution remote sensing image.
4. The method according to claim 1, wherein before the extracting the image of the building in the super-resolution remote sensing image by using an extraction model according to the edge feature, the method further comprises:
and enhancing the edge area of the building in the super-resolution remote sensing image according to the edge characteristics.
5. The method according to claim 1, wherein the extracting the image of the building in the super-resolution remote sensing image by using an extraction model according to the edge feature comprises:
and inputting the edge characteristics and the super-resolution remote sensing image into the extraction model, and determining the image of the building according to the output of the extraction model.
6. The method of claim 1, wherein the extraction model is obtained by training:
acquiring a second remote sensing image sample and a gradient map of the second remote sensing image sample;
and updating the parameters of the extraction model according to the second remote sensing image sample, the gradient map of the second remote sensing image sample and the building label corresponding to the second remote sensing image sample.
7. The method of claim 6, wherein the second remotely sensed image sample and a gradient map of the second remotely sensed image sample are obtained by the generator.
8. A building image extraction device, comprising:
the acquisition module is used for acquiring a low-resolution remote sensing image, and the low-resolution remote sensing image comprises a building;
the super-resolution module is used for performing super-resolution reconstruction on the low-resolution remote sensing image by using a generator in a generative adversarial network and determining a super-resolution remote sensing image and edge features of the building;
and the extraction module is used for extracting the image of the building from the super-resolution remote sensing image by utilizing an extraction model according to the edge characteristics.
9. A computing device, comprising: a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the method of any of claims 1 to 7.
10. A computer-readable storage medium comprising instructions that, when executed on a computing device, cause the computing device to perform the method of any of claims 1 to 7.

Publications (1)

Publication Number Publication Date
CN114549992A true CN114549992A (en) 2022-05-27

Family

ID=81678583


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7429482B1 (en) 2023-10-27 2024-02-08 マイクロベース株式会社 Building change detection device, method and program

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111899168A (en) * 2020-07-02 2020-11-06 中国地质大学(武汉) Remote sensing image super-resolution reconstruction method and system based on feature enhancement
WO2021056969A1 (en) * 2019-09-29 2021-04-01 中国科学院长春光学精密机械与物理研究所 Super-resolution image reconstruction method and device
CN112734638A (en) * 2020-12-24 2021-04-30 桂林理工大学 Remote sensing image super-resolution reconstruction method and device and storage medium


Non-Patent Citations (1)

Title
LIXIAN ZHANG et al.: "Making Low-Resolution Satellite Images Reborn: A Deep Learning Approach for Super-Resolution Building Extraction", Remote Sensing, 22 July 2021, pages 3-4 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination