CN111625608B

CN111625608B - Method and system for generating electronic map according to remote sensing image based on GAN model

Info

Publication number: CN111625608B
Application number: CN202010310125.XA
Authority: CN
Inventors: 陈占龙; 李静涛; 王润; 孙晨星
Original assignee: China University of Geosciences
Current assignee: China University of Geosciences
Priority date: 2020-04-20
Filing date: 2020-04-20
Publication date: 2023-04-07
Anticipated expiration: 2040-04-20
Also published as: CN111625608A

Abstract

The invention discloses a method and a system for generating an electronic map according to remote sensing images based on a GAN model. The method and the system improve the generator architecture and the loss function of the GAN model and provide an adaptive solution for generating electronic map color rendering and road identification. The generator architecture of the GAN model consists of three parts: a down-sampling layer, a residual block, an up-sampling layer, comprising 6 residual blocks and 2 long-jump connections. The model loss function contains loss terms for optimizing color rendering and road generation for generating an electronic map, in addition to adaptive perceptual loss and adaptive countervailing loss. In addition, the present invention also controls color rendering using binary image channels of specific terrain elements. The results show that: the quality of the electronic map generated by the method and the system disclosed by the invention is superior to that of the existing picture translation model under the evaluation of visual vision and classical evaluation indexes, the pixel-level translation accuracy is improved by 40%, and the FID evaluation value is reduced by 38%.

Description

Method and system for generating electronic map according to remote sensing image based on GAN model

Technical Field

The invention belongs to the field of geographic science, and particularly relates to a method, a system and an electronic device for generating an electronic map according to a remote sensing image.

Background

The picture translation means that a picture in the field A is input, and the picture in the corresponding field B is output after processing and conversion. Many problems in the field of computer vision can be attributed to picture translation, for example, the super-resolution problem can be regarded as the conversion from a low-resolution picture to a high-resolution picture, and the picture colorization can be regarded as the conversion of a single-channel gray-scale image and a multi-channel color image. Although the convolutional neural network has a good effect in solving the picture translation task of the painting art, the problems of low quality of generated pictures, complex design of loss functions and the like exist. Until the appearance of a GAN (generation countermeasure network) network model, a new method is provided in a picture translation task, and the GAN well solves the problems existing in the convolutional neural network by training a discriminator to automatically learn a loss function and generates a picture with higher quality.

In recent years, there are many GAN-based picture translation models, such as: pix2pix is proposed based on CGAN improvement, the generator no longer generates data from random noise, but reads in from a given picture, which is the basis of many subsequent GAN-based picture translation models; the CycleGAN is composed of two generators and two discriminators, and a cycle limit is added on the basis of the loss function of the original GAN, so that the problem of training a picture translation model under the condition of no paired data sets is solved; the VAE + GAN model further solves the problem by adding a shared hidden space; pix2pixHD can generate a color map with resolution as high as 2048 x 1024 from a semantic segmentation map by using two generators, a global generation network and a local promotion network; textureGAN enables texture control of the generated picture by introducing local texture loss and local content loss.

At present, pix2pix and CycleGAN are used as a general picture translation frame and can be directly applied to a map generation task, but the generated electronic map cannot accurately identify and render surface feature elements such as forest lands, water areas, roads and the like, and meanwhile, the problems of blurred texture, low quality and the like exist.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a model established based on mapGAN from a specific scene generated by an electronic map, aiming at the problems of fuzzy texture, low quality and the like in the prior art, and to adopt a plurality of targeted optimization measures to improve the accuracy and the attractiveness of the generated map.

The technical scheme adopted by the invention for solving the technical problems is as follows: a method for generating an electronic map according to remote sensing images based on a GAN model is constructed, and comprises the following steps:

s1, constructing a GAN-generation confrontation network model, wherein the GAN network model comprises a generator and a discriminator; the discriminator is used for comparing and discriminating the electronic map generated by the generator with a target map sample; wherein:

the generator comprises a down-sampling layer, M residual blocks and an up-sampling layer, wherein the down-sampling layer adopts multilayer convolution to reduce the width and height of the characteristic matrix, and the up-sampling layer adopts multilayer convolution and utilizes a deconvolution technology to improve the width and height of the characteristic matrix;

the discriminator adopts the receptive field blocks with the size of 70 x 70 to judge the electronic map generated by the generator and outputs a matrix representing the discrimination result of each block;

s2, acquiring a training data set comprising a plurality of remote sensing images, and inputting the training data set into the GAN model constructed in the step S1 for network training; the input channel of the GAN model comprises an RGB channel of a remote sensing image and N binary image channels representing different surface feature elements, wherein each binary image uses 0-1 coding; n is greater than 1;

in the training process, in order to promote GAN model learning, N pieces of binary image channel information are used to construct a first loss term to distinguish the difference of color rendering of the generated electronic map and the target electronic map in terms of N different surface feature elements, wherein the first loss term is defined as:

wherein i represents the ith feature element,

and &>

Respectively represents the binary image extracted for the i element in the electronic map generated by the generator and the target electronic map, and is/are selected>

The total number of pixels occupied by the i element in the target electronic map, lambda ₁ A weight coefficient being a first loss term;

and S3, inputting the remote sensing image to be processed into the trained GAN model to obtain a corresponding target electronic map.

The invention discloses a system for generating an electronic map according to remote sensing images based on a GAN model, which comprises the following modules:

the generation countermeasure network model building module is used for building a GAN-generation countermeasure network model, and the GAN network model comprises a generator and a discriminator; the discriminator is used for comparing and discriminating the electronic map generated by the generator with a target map sample; wherein:

the discriminator adopts reception field blocks to judge the electronic map generated by the generator and outputs a matrix representing the discrimination result of each block;

the model training module is used for acquiring a training data set comprising a plurality of remote sensing images, and inputting the training data set into the GAN model constructed by the confrontation network model construction module for network training; the input channel of the GAN model comprises an RGB channel of a remote sensing image and N binary image channels representing different surface feature elements, wherein each binary image uses 0-1 coding; n is greater than 1;

the model training module comprises a first loss item construction module, which is used for constructing a first loss item defined by the following formula in order to promote the learning of the GAN model by using N binary image channel information in the training process so as to judge the difference of color rendering of the generated electronic map and the target electronic map in the aspects of N different surface feature elements:

in the formula, i representsThe ith feature element of the table is shown,

and &>

and the target electronic map generation module is used for inputting the remote sensing image to be processed into the trained GAN model to obtain a corresponding target electronic map.

The method and the system for generating the electronic map according to the remote sensing image based on the GAN model have the following beneficial effects that:

1. 6 residual blocks are added in a generator of the GAN model, and the performance of the model can be improved by properly increasing the network depth on the premise of not generating the problem of gradient propagation;

2. the model loss function contains loss terms for optimizing color rendering and road generation for generating an electronic map, in addition to adaptive perceptual loss and adaptive countervailing loss.

Drawings

The invention will be further described with reference to the following drawings and examples, in which:

FIG. 1 is a graph of the results of partial testing of pix2pix and mapGAN models on a Facades data set;

fig. 2 is a partial test diagram of the mapGAN model in the task of picture translation where the photo scene environment is converted from day to night.

FIG. 3 is a flowchart illustrating steps of a method for generating an electronic map from a remote sensing image based on a GAN model according to the present invention;

FIG. 4 is a GAN model generator architecture diagram;

FIG. 5 is a pair of training data for rendering green space into a square for aesthetics;

FIG. 6 is a model structure diagram of a method for generating an electronic map from a remote sensing image based on a GAN model according to the present invention.

Detailed Description

For a more clear understanding of the technical features, objects and effects of the present invention, embodiments of the present invention will now be described in detail with reference to the accompanying drawings.

1. Description of the experimental conditions:

the map translation task of the invention trains the remote sensing-electronic data set at 1096 and tests the remote sensing-electronic data set at 1042, wherein the data set is from satellites and electronic map tiles on a Google map, and the picture size is 600 pixels by 600 pixels.

The experiment carried out by the invention is that 1 block of GPU with NVIDA M40, 4 blocks of CPU and RAM with Inter Xeon Platinum 8163@2.5GHz: 30 GiB.

2. Electronic map generation quality analysis

The present embodiment uses an improved GAN model (hereinafter referred to as mapGAN model) for tile map generation flip-over and compares the quality of the generated map with a pix2pix, cycleGAN model on the same test set. In the training phase, 3 binary image channels are acquired by using an OpenCV library. Table 1 shows the evaluation results of the generation quality of mapGAN and other model maps under different evaluation indexes, where the sample data set for calculating the true feature distribution of the electronic map includes 2000 electronic maps, and the sample data set for calculating the feature distribution of the model-generated electronic map includes 1000 electronic maps. The result shows that the map generation result of the mapGAN model is better than the pix2pix and cycleGAN models in pixel level translation accuracy, kernel MMD and FID evaluation.

Table 1 evaluation results of models under different evaluation indexes

3. Model expansibility analysis

Although the mapGAN model is proposed to solve the specific application problem of accurately generating an electronic map from a remote sensing image, the mapGAN model can be used as a brand-new image translation model after input control and related loss items of a binary image channel in the mapGAN are removed. To further explore the scalability of this model, the mapGAN model of this embodiment was modified as follows:

(1) Canceling binary map input related to an expressway, a water area and a forest land;

(2) Cancelling the use of L _road ，L _color And L _s A loss term;

(3) Mixing L with _{f_vgg} The calculated feature loss is instead a comparison between the generator generated picture and the target generated picture. The remaining model settings remain unchanged. The invention performs mapGAN expansibility test on two different picture translation tasks.

And (3) translating the semantic tags to generate photos: using a CMP concepts dataset, containing 400 training samples, a translation task was generated for the conversion from the building outside semantic segmentation map to the building outside photo. The experiment was trained and tested on the Facades dataset using pix2pix and mapGAN simultaneously, and the results of the experiment are shown in figure 1. Intuitively, the translation result of mapGAN on this task is not much different from pix2 pix. To give a more accurate comparison, the Kernel MMD index evaluation was further used and the data showed that the evaluation of pix2pix was 0.13 and the evaluation of mapgan was 0.11, which is 15.4% less than the comparison.

Converting the photo scene from day to night: the data set source is an outdoor scene data set used by P-Y.Laffont, and the translation task is to input a picture taken in the daytime of a place and output a picture of the place at night. The results of the experiment are shown in FIG. 2. The result shows that mapGAN can effectively separate the ground features and the environment in the input picture and correctly learn the state attributes of various ground features at night. The pix2pix model was also introduced here to train tests on the same dataset and evaluated against mapGAN under Kernel MMD evaluation criteria, showing a 33.3% reduction in the results for pix2pix of 0.12 and mapGAN of 0.08.

Example 1:

the following will explain in detail the steps of a method for generating an electronic map from a remote sensing image based on a GAN model, specifically refer to fig. 3:

the method comprises the following 3 steps:

step S1:

constructing a GAN-generation confrontation network model, wherein the GAN network model comprises a generator and a discriminator; the discriminator is used for comparing and discriminating the electronic map generated by the generator with a target map sample; wherein:

the generator comprises a down-sampling layer which adopts multilayer convolution to reduce the width and height of the characteristic matrix, M residual blocks and an up-sampling layer which adopts multilayer convolution and utilizes a deconvolution technology to improve the width and height of the characteristic matrix (please refer to FIG. 4 for the structure of a generator network model);

the discriminator comprises a receptive field block for judging the electronic map generated by the generator and outputting a matrix representing a discrimination result;

in the present step, the GAN-generated countermeasure network model includes a generative model (i.e., generator) and a discriminative model (i.e., discriminator). Where the discriminant model is used to determine whether a given picture is a true picture (a picture taken from a data set), the task of generating the model is to create a picture that looks like a true picture, i.e. to generate a picture through the model that closely resembles the picture you want.

At the beginning, the two models are not trained, the two models are subjected to antagonistic training together, the generated model generates a picture to deceive the discrimination model, then the discrimination model judges whether the picture is true or false, finally, the two models are stronger and stronger in the training process of the two models, and finally, the two models reach a steady state.

Under the current embodiment, the following 2 improvements are made to the generator:

1. the generator comprises a down-sampling layer, an up-sampling layer and a residual block, wherein a convolution kernel of 7*7 is cancelled, and 3 convolution kernels of 3*3 are adopted, so that the purpose of reducing the receptive field of the convolution kernel is to enhance the sensitivity of the generator to the detailed characteristics of each picture block and reduce the training parameters.

2. The number of the residual blocks is 6, that is, in the generator, 6 residual blocks are added between the lower sampling layer and the upper sampling layer, wherein each residual block is composed of 2 convolution layers, and the performance of the model can be improved by properly increasing the network depth on the premise of not generating the problem of gradient propagation.

In the present embodiment, the following improvements are made to the discriminator:

in the present embodiment, the discriminator uses the segments of the receptive field with the size of 70 × 70 to judge the electronic map generated by the generator. In the convolutional neural network, a Receptive Field (Receptive Field) refers to a region of an input image that can be seen by a certain point on a feature map, that is, a point on the feature map is obtained by calculating the size region of the Receptive Field in the input image.

In summary, based on the improvement points of the generator and the discriminator, the key point of step S1 is the structural design of the GAN model, and currently, by adding the residual block and reducing the receptive field of the convolution kernel, the performance of the network model is further improved, and the sensitivity of the model to the detailed features of each picture block is enhanced.

Step S2:

acquiring a training data set comprising a plurality of remote sensing images, and inputting the training data set into the GAN model constructed in the step S1 for network training; the input channel of the GAN model comprises an RGB channel of a remote sensing image and N binary image channels representing different surface feature elements, wherein each binary image uses 0-1 coding; n is greater than 1;

in the training process, in order to promote the GAN model learning, a first loss term defined by formula (1) is constructed by using N binary image channel information to judge the difference of color rendering of the generated electronic map and the target electronic map in terms of N different ground feature elements:

wherein i represents the ith feature element,

and &>

Total number of pixels occupied by i element in target electronic map, lambda ₁ A weight coefficient being a first loss term;

in the current step, how to perform correct color rendering on the generated electronic map by model learning faces many difficulties, which are reflected as follows: (1) In the standard electronic map manufacturing, when color rendering is performed, attribute information of various aspects of the map area needs to be referred to, for example, an expressway is rendered to be orange according to geographic element information extracted from a geographic entity database, and a national-way provincial road is rendered to be yellow, but the attribute information neural network cannot be extracted from a remote sensing image. (2) The model needs to take into account aesthetics when rendering the colors of the generated electronic map. For example, sometimes to increase map aesthetics, standard electronic mapping will render the greens into standard squares, although this may contain some non-greens components, as shown in fig. 5.

In order to overcome the above (1) difficult point, in this embodiment, with the aid of the generation control concept of the CGAN model, the input channels of the GAN model are designed to be six, which are: remote sensing image RGB channel and 3 binary images; wherein, 3 binary images respectively represent the information of the forest land, the water area and the highway by using 0-1 codes, and the pixel range with the value of 1 represents the color rendering range of the corresponding element.

Currently, in order to reflect the color difference, the difference size of the color rendering of the generated electronic map and the target electronic map on 3 elements of the forest land, the water area and the expressway is judged by constructing the following first loss terms:

/>

(a1) Wherein i represents the ith feature element,

and &>

Respectively represents the binary image extracted aiming at the i element in the electronic map generated by the generator network and the target electronic map, and is/are selected>

The total number of pixels occupied by the i element in the target electronic map, lambda ₁ Is the weight coefficient of the first loss term.

The difference size of the color rendering is considered in the above, however, the correct identification and display of the roads in the remote sensing image is an important task for electronic map generation, which is only the basis for applications such as navigation. In the aspect of road identification generation, the model not only needs to learn to connect roads shielded by scattered trees to ensure the continuity of the roads, but also generates a road boundary which is straight and smooth. To do both, the present embodiment constructs a second loss term defined by formula (a 2), increasing the restriction of the generated electronic map in terms of road continuity and smoothness:

(a2) In the formula (I), the compound is shown in the specification,

and &>

Electronically representing an electronic map and an object generated by a generator network, respectivelyTwo-value map extracted for a road element in the map, based on the evaluation of the road condition>

Total number of pixels, lambda, occupied by a road in the target electronic map ₂ Is the loss term weight.

Because the countertraining is widely applied to the picture translation model, the countertraining can guide the generator to generate more vivid pictures by automatically learning the loss function by using a trainable arbiter in the training process.

However, the GAN original penalty function presents a problem: that is, when the generator is updated, when the generated false sample is far away from the decision boundary and still is on the side of the real sample, the original sigmoid cross entropy loss function is used to cause the gradient to disappear. Therefore, the present embodiment adopts the least square loss function proposed by the present embodiment to solve the problem of premature disappearance of the gradient, and combines the classical L1 loss function to further improve the stability of the mapGAN model training.

In order to improve the stability of the mapGAN model training, based on the formulas (a 1) - (a 2), the final loss function of the GAN model in this embodiment, i.e. the third loss term, is defined by the formula (a 3):

L(D)＝minL _adv (D)

L(G)＝min(L _adv (G)+L _color +L _road +L _p )； (a3)

in the formula, L (D) represents a loss function of the discriminator, and L (G) represents a loss function of the generator.

Wherein the adaptive counter-loss term L _adv Defined by equations (a 4) - (a 5) as:

in the formula, L _adv (D) Adaptive countering loss function, L, of finger arbiter _adv (G) Adaptive penalty function, L, for finger generators ₁ Is L ₁ A loss function; lambda ₆ And λ ₇ A weight coefficient representing a corresponding loss term; x represents a target electronic map, c represents all binary image channels in generator input, y represents a remote sensing image input to the GAN model, and p _ data (y) and p _ data (x) represent probability distribution obeyed by y and x respectively; g (y, c) represents an electronic map generated by the generator with y and c as input, D (G (y, c), c, y) represents the discrimination probability of the discriminator with G (y, c), c and y as input and further output; d (x, c, y) represents the discrimination probability of the discriminator with x, c, y as input and further output.

In summary, the key point of step S2 is that, in order to improve the stability of the mapGAN model training, the final loss function of the GAN model, that is, the third loss term is designed, and in consideration of the difference in color rendering between the generated electronic map and the target electronic map in terms of N different feature elements and the limitation of the electronic map in terms of road continuity and smoothness, a trainable discriminator is used to automatically learn the third loss term to guide the generator to generate a more realistic picture during the training process.

And step S3:

based on the two steps, the remote sensing image to be processed can be input into the trained GAN model at present, and therefore the corresponding target electronic map is obtained.

Example 2:

in order to effectively help the model reuse the features extracted by the downsampling during the electronic map construction in the upsampling stage, two long-jump connections are added between the downsampling layer and the upsampling layer in the embodiment, so that cross-layer information transfer is achieved. Please refer to fig. 4.

Example 3:

the map generation discussed in the invention can be regarded as a special style migration, in the scene of the invention, the content picture is a remote sensing image, the style picture is an electronic map, and the purpose of the map generation is to reserve a certain content of the remote sensing image and convert the remote sensing image into a picture with a network electronic map style.

The loss function during the training of the style migration model is generally designed into two items, one item measures the content similarity between the input content picture and the output composite picture, and the other item measures the style similarity between the input style picture and the output composite picture based on the Gram matrix.

The invention makes the following improvements on the basis of the method: (1) A feature loss term consisting of the features extracted by the discriminator is added. (2) And selecting a vgg-19 model for the layer number of the characteristic loss establishment by adopting a priori test method. In this embodiment, the adaptive perceptual loss function L _p The mathematical formula is as follows:

L _p ＝L _{f_d} +L _{f_vgg} +L _s

in the formula, L _{f_d} Representing a loss of characteristics of the discriminator, L _{f_vgg} Representing the loss of features of the feature extraction and matching structure in the generator by adopting a vgg-19 model, L _s Representing a generator style loss term constructed using the generated electronic map and the target electronic map, the mathematical expression of each of the above loss terms being defined by the following formula:

L _S ＝λ ₅ (Gram(s)-Gram(x))2；

in the above 3-term equation, λ ₃ 、λ ₄ 、λ ₅ A weight coefficient representing each loss term; t represents the number of network layers, n _t Features representing the t-th network layer; d represents a discriminator, s represents the generated electronic map, x represents the target electronic map, and c represents all binary image channels in the input of the generator; d _t The characteristics extracted by the discriminator at the t layer are taken as input; v represents the vgg-19 model employed in the generator, V _t (. Star) denotes vgg-19 model extraction with ". Star" as inputThe content characteristics of the obtained picture; style loss term L _s Expressing the generated picture style characteristics by using a Gram matrix, the Gram matrix being defined by the following formula:

in the formula (I), the compound is shown in the specification,

and &>

Respectively represent the j-th feature matrix and the k-th feature matrix in the t-th network layer of the vgg-19 model, and the meaning of the rest variables is the same as the formula in the foregoing. The Gram matrix is defined as the sum of the inner products of any two feature matrices in a particular network layer.

When the method for generating an electronic map from a remote sensing image based on a GAN model disclosed in the present invention is applied to a system, the system structure thereof refers to fig. 6.

The invention discloses a system for generating an electronic map according to a remote sensing image based on a GAN model, which comprises a generation confrontation network model building module L1, a model training module L2 and a target electronic map generating module L3, wherein:

a generation countermeasure network model construction module L1 that stores an execution program for executing the step S1 disclosed in the embodiments 1 to 3, specifically for constructing the GAN model;

a model training module L2 storing an execution program for executing the step S2 disclosed in embodiments 1 to 3, which includes constructing a first loss function by a first loss term constructing module L21 in the model training module L2, a second loss function by a second loss term constructing module L22 in the model training module L2, and a final loss function of the GAN model by a third loss term constructing module L23 in the model training module L2, during training;

a target electronic map generation module L3 that stores an execution program for executing step S3 disclosed in embodiment 1, specifically for generating a final target electronic map.

The above description is made of the content of the implementation method, the execution flow when the implementation method is applied to the system, and the system configuration. The implementation method is not limited to the experimental conditions and the experimental environment disclosed in the embodiment, and adaptive adjustment can be made on selection of the data source and the workstation to achieve a better implementation effect.

In summary, according to the method and system for generating the electronic map according to the remote sensing image based on the GAN model, 6 residual blocks are added in a generator of the GAN model, and the performance of the model can be improved by properly increasing the network depth on the premise of not generating the problem of gradient propagation; in the training process, the loss function of the model comprises an adaptive perception loss item and an adaptive countermeasure loss item which are set per se, and also comprises a loss item for optimizing color rendering of the generated electronic map and road generation, and by distinguishing the difference of the color rendering of the generated electronic map and a target electronic map on N different ground feature elements and increasing the limitation of the generated electronic map on road continuity and smoothness, a trainable discriminator is used for automatically learning the loss function to guide the generator to generate more vivid pictures in the training process.

While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A method for generating an electronic map according to remote sensing images based on a GAN model is characterized by comprising the following steps:

s1, constructing a GAN-generation confrontation network model, wherein the GAN model comprises a generator and a discriminator; the discriminator is used for comparing and discriminating the electronic map generated by the generator with a target map sample; wherein:

the discriminator comprises a reception field block for judging the electronic map generated by the generator and outputting a matrix representing the discrimination result;

wherein i represents the ith feature element,

and &>

Occupied by i element in target electronic mapTotal number of pixels, λ ₁ A weight coefficient being a first loss term;

2. The method for generating an electronic map according to remote sensing images of claim 1, wherein in step S1, the downsampled layer, the residual block and the upsampled layer are 3 convolution kernels of 3*3.

3. The method for generating an electronic map according to the remote sensing image as claimed in claim 2, wherein in step S1, two long jump connections are added between the down-sampling layer and the up-sampling layer of the generator, so that the features extracted by the down-sampling layer are used for realizing cross-layer information transfer when the electronic map is constructed in the up-sampling stage.

4. A method of generating an electronic map from remotely sensed images as recited in claim 3, wherein in step S1, the generator comprises 6 residual blocks, each residual block being composed of two convolutional layers.

5. The method for generating an electronic map according to the remote sensing image of claim 4, wherein in step S2, the input channels of the GAN model comprise 3 binary map channels respectively representing information of forest land, water area and road;

constructing a second loss term defined by equation (2) to increase the restriction of the generated electronic map in terms of road continuity and smoothness:

in the formula (I), the compound is shown in the specification,

and &>

Respectively represents the two-value map extracted aiming at the road elements in the electronic map and the target electronic map generated by the generator>

Total number of pixels, lambda, occupied by a road in the target electronic map ₂ Is the weight coefficient of the loss term.

6. The method of claim 5, wherein the loss function of the GAN model further comprises an adaptive perceptual loss term L _p And adaptive countering loss term L _adv (ii) a In conjunction with equations (1) - (2), the final loss function of the GAN model is defined by equation (3):

L(D)＝minL _adv (D)

L(G)＝min(L _adv (G)+L _color +L _road +L _p )； (3)

wherein L (D) represents the loss function of the discriminator and L (G) represents the loss function of the generator;

the adaptive counter-loss term L _adv Defined by equations (4) - (5) as:

in the formula, L _adv (D) Adaptive countering loss function, L, of finger arbiter _adv (G) Adaptive countering loss function of finger generator, L ₁ Is L ₁ A loss function; lambda [ alpha ] ₆ And λ ₇ A weight coefficient representing a corresponding loss term; x represents a target electronic map, c represents all binary image channels in generator input, y represents a remote sensing image input to the GAN model, and p _ data (y) and p _ data (x) represent probability distribution obeyed by y and x respectively; g (y, c) represents that the generator takes y and c as inputD (G (y, c), c, y) represents the discrimination probability of the discriminator with G (y, c), c, y as input and further output; d (x, c, y) represents the discrimination probability of the discriminator which takes x, c and y as input and further outputs; e (, x) is the average of "".

7. The method of claim 6, wherein the adaptive perceptual loss term L is selected from the group consisting of _p Is defined as follows: l is _p ＝L _{f_d} +L _{f_vgg} +L _s Wherein:

L _{f_d} representing a loss of characteristics of the discriminator, L _{f_vgg} Representing the loss of features of the feature extraction and matching structure in the generator by adopting a vgg-19 model, L _s Representing the generator-style loss term constructed using the generation electronic map and the target electronic map, the mathematical expressions of the above-described respective loss terms are defined by equations (6) to (8):

L _s ＝λ ₅ (Gram(s)-Gram(x)) ² ； (8)

in the formula, λ ₃ 、λ ₄ 、λ ₅ A weight coefficient representing each loss term; t represents the number of network layers, n _t Features representing the t-th network layer; d represents a discriminator, s represents the generated electronic map, x represents the target electronic map, and c represents all binary image channels in the input of the generator; d _t The character extracted at the t-th layer by the discriminator with the character as input; v represents the vgg-19 model employed in the generator, V _t The (x) represents the vgg-19 model and takes 'x' as the input to extract the picture content characteristics;

style loss term L _s Expressing generated picture style features by using a Gram matrixThe definition is shown by equation (9):

in the formula (I), the compound is shown in the specification,

and &>

Respectively represent the jth and kth feature matrices in the t network layer of the model vgg-19.

8. A system for generating an electronic map according to remote sensing images based on a GAN model is characterized by comprising the following modules:

wherein i represents the ith feature element,

and &>

9. The system for generating an electronic map according to remote sensing images of claim 8, wherein the model training module further comprises a second loss item construction module;

the second loss item constructing module is configured to construct a second loss item when an input channel of the GAN model includes 3 binary map channels respectively representing forest land, water area, and road information, and increase a limit of the generated electronic map in terms of road continuity and smoothness, where a mathematical expression of the second loss item is:

in the formula (I), the compound is shown in the specification,

and &>

10. The system for generating an electronic map according to remote sensing images of claim 9, wherein the model training module further comprises a third loss term construction module;

the third loss term construction module is used for combining the self-adaptive perception loss term L _p And adaptive countering loss term L _adv And a first loss term and a second loss term, and constructing a final loss function of the GAN model, wherein the mathematical expression of the loss function is as follows:

L(D)＝minL _adv (D)

L(G)＝min(L _adv (G)+L _color +L _road +L _p )；

wherein the adaptive counter-loss term L _adv Defined by the following equation:

in the formula, L _adv (D) Adaptive countering loss function, L, of finger arbiter _adv (G) The adaptive counter-loss function of the finger generator,L ₁ is L ₁ A loss function; lambda ₆ And λ ₇ A weight coefficient representing a corresponding loss term; y represents a remote sensing image input to the GAN model, and p _ data (y) and p _ data (x) represent probability distribution obeyed by y and x respectively; g (y, c) represents an electronic map generated by the generator with y and c as input, D (G (y, c), c, y) represents the discrimination probability of the discriminator with G (y, c), c and y as input and further output; d (x, c, y) represents the discrimination probability of the discriminator which takes x, c and y as input and further outputs; e (, is an average of "");

wherein the adaptive perceptual loss term L _p Is defined as: l is _p ＝L _{f_d} +L _{f_vgg} +L _s (ii) a In the formula, L _{f_d} Representing a loss of characteristics of the discriminator, L _{f_vgg} Representing the loss of features of the feature extraction and matching structure in the generator by adopting a vgg-19 model, L _s Representing a generator style loss term constructed using the generated electronic map and the target electronic map, the mathematical expression of each of the above loss terms being defined by the following formula:

L _S ＝λ ₅ (Gram(s)-Gram(x)) ² ；

in the formula, λ ₃ 、λ ₄ 、λ ₅ A weight coefficient representing each loss term; t represents the number of network layers, _t nfeatures representing the t-th network layer; d represents a discriminator, s represents the generated electronic map, x represents the target electronic map, and c represents all binary image channels in the input of the generator; d _t The characteristics extracted by the discriminator at the t layer are taken as input; v represents the vgg-19 model employed in the generator, V _t The model vgg-19 takes "+" as input and extracted picture content characteristics; style loss term L _s By using a Gram matrixTo express the style characteristics of the picture, the Gram matrix definition is defined by the following formula:

in the formula (I), the compound is shown in the specification,

and &>

Respectively represent the jth and kth feature matrices in the t network layer of the model vgg-19. />