CN116051683B - Remote sensing image generation method, storage medium and device based on style self-organization - Google Patents


Info

Publication number
CN116051683B
CN116051683B (application CN202211642255.9A)
Authority
CN
China
Prior art keywords
target
remote sensing
initial
training
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211642255.9A
Other languages
Chinese (zh)
Other versions
CN116051683A (en)
Inventor
许光銮
陈佳良
张文凯
阮航
李硕轲
李霁豪
袁志强
周瑞雪
Current Assignee
Aerospace Information Research Institute of CAS
Original Assignee
Aerospace Information Research Institute of CAS
Priority date
Filing date
Publication date
Application filed by Aerospace Information Research Institute of CAS filed Critical Aerospace Information Research Institute of CAS
Priority to CN202211642255.9A priority Critical patent/CN116051683B/en
Publication of CN116051683A publication Critical patent/CN116051683A/en
Application granted granted Critical
Publication of CN116051683B publication Critical patent/CN116051683B/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
        • G06T 11/00 - 2D [Two Dimensional] image generation
        • G06T 11/60 - Editing figures and text; Combining figures or text
        • G06T 3/00 - Geometric image transformations in the plane of the image
        • G06T 3/04 - Context-preserving transformations, e.g. by using an importance map
        • G06T 5/00 - Image enhancement or restoration
        • G06T 5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
        • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
        • G06T 2207/10 - Image acquisition modality
        • G06T 2207/10032 - Satellite or aerial image; Remote sensing
        • G06T 2207/20 - Special algorithmic details
        • G06T 2207/20081 - Training; Learning
        • G06T 2207/20084 - Artificial neural networks [ANN]
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
        • G06N 3/00 - Computing arrangements based on biological models
        • G06N 3/02 - Neural networks
        • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the field of remote sensing image generation, and discloses a remote sensing image generation method, storage medium and device based on style self-organization. The method comprises: creating initial mask data; acquiring preset control parameters; and inputting the initial mask data and the control parameters into a remote sensing sample generator to generate a target remote sensing image. When creating the initial mask data, several configuration parameters can be designed adaptively, so that initial mask data corresponding to a variety of scenes can be created quickly and used to generate the corresponding target remote sensing images. A large number of remote sensing image samples for different scenes can therefore be generated, suiting the training of different deep learning network models. Because generating the remote sensing image samples requires no manual annotation, their cost is low. In addition, the remote sensing sample generator can be controlled through the target iteration number, so that the corresponding target remote sensing images are generated more quickly.

Description

Remote sensing image generation method, storage medium and device based on style self-organization
Technical Field
The present invention relates to the field of remote sensing image generation, and in particular, to a remote sensing image generation method, a storage medium, and a device based on style self-organization.
Background
With the development of artificial intelligence technology, neural network models for tasks such as machine recognition, target detection and semantic segmentation have appeared in many business fields. These models are trained with sample data, and improving their recognition accuracy requires training them with as much sample data as possible. However, the number of existing remote sensing samples covering the various complex environmental backgrounds is limited and cannot meet the training requirements of the various deep learning networks.
Existing remote sensing sample generation methods mainly aim at acquiring high-resolution hyperspectral remote sensing samples in the pan-sharpening direction. However, such methods place high requirements on the input data, and each generated sample corresponds to a single scene, so they are not suitable for training deep learning network models in different research directions. A remote sensing sample generation method that has a low training cost and suits multi-directional research is therefore an urgent problem to be solved.
Disclosure of Invention
In view of the above technical problems, the invention adopts the following technical solution:
according to one aspect of the present invention, there is provided a remote sensing image generation method based on style self-organization, the method comprising the steps of:
initial mask data is created. The initial mask data includes a background mask and at least one target mask. Each target mask has a plurality of configuration parameters, and the configuration parameters of different target masks are different. The configuration parameters comprise the types, the number and the setting positions of the target object masks in the background mask. Each target mask comprises contour information of a corresponding target.
Preset control parameters are acquired. The control parameters include a target image size and a target iteration number A, where A ∈ [15, 25].
The initial mask data and the control parameters are input into a remote sensing sample generator to generate a target remote sensing image. The target remote sensing image comprises object images of the types corresponding to the target masks and a scene background image corresponding to the background mask. The size of the target remote sensing image is the target image size, and the number of calculation iterations of the remote sensing sample generator while generating the target remote sensing image is the target iteration number. Each object image in the target remote sensing image carries a label generated from its target mask category.
According to a second aspect of the present invention, there is provided a non-transitory computer readable storage medium storing a computer program which when executed by a processor implements a remote sensing image generation method based on style self-organization as described above.
According to a third aspect of the present invention, there is provided an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing a remote sensing image generation method based on style self-organization as described above when executing the computer program.
The invention has at least the following beneficial effects:
in the invention, when the initial mask data is created, the configuration parameters of the target masks can be designed adaptively according to the requirements, so that initial mask data corresponding to various scenes can be created quickly. The initial mask data and the control parameters are then input into the remote sensing sample generator to generate the corresponding target remote sensing image, which is the remote sensing sample data generated from the initial mask data. Each object image in the target remote sensing image carries a label generated from the corresponding target mask category. Because many different sets of initial mask data can be generated rapidly and in large quantity through autonomous combination, remote sensing image samples of many different scenes can be generated correspondingly, suiting the training of deep learning network models in different directions. Meanwhile, because the generation process requires no large amount of manual annotation, the cost of the generated remote sensing image samples is low.
In addition, because the target iteration number A ∈ [15, 25], the remote sensing sample generator can be controlled so that, while high image quality is still obtained, the corresponding target remote sensing image is generated more quickly; the generation time can generally be kept within 2 seconds.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a remote sensing image generating method based on style self-organization according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
As a possible embodiment of the present invention, as shown in fig. 1, there is provided a remote sensing image generating method based on style self-organization, the method including the steps of:
s100, creating initial mask data. The initial mask data includes a background mask and at least one target mask. Each target mask has a plurality of configuration parameters, and the configuration parameters of different target masks are different. The configuration parameters include the type, number and placement of the object masks in the background mask. Each object mask includes profile information of a corresponding object.
In this step, the object mask may be a contour image of an object in a corresponding scene, such as a corresponding contour image of a vehicle such as an automobile, a ship, or an airplane. The background mask can be a solid-color image background, the color of the background mask and the image color of the target object mask need to have larger difference, and the difference between the remote sensing sample generators is facilitated. For example, the background mask may be set to black in color and the target mask may be set to blue in the morning in color.
The category, number and placement of the target masks in the background mask can be set arbitrarily by adjusting the configuration parameters of the target masks in the initial mask data. Thus, many sets of initial mask data can be created quickly according to the usage requirements. The style of each set of initial mask data is determined by the specific arrangement of the target masks in the background mask. Since the target masks can be freely determined by adjusting the configuration parameters, the styles of the initial mask data in this embodiment can be combined freely, so a large amount of initial mask data in different styles can be generated.
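As a purely illustrative sketch (not part of the patent disclosure), step S100 could be implemented along the following lines. The rectangular "contours", the color values, and the configuration-dictionary keys are assumptions made here for clarity; the patent only requires a solid background, contour-shaped target masks, and per-mask category, count and placement parameters.

```python
import numpy as np

BACKGROUND = np.array([0, 0, 0], dtype=np.uint8)      # black background mask
TARGET_COLOR = np.array([0, 0, 255], dtype=np.uint8)  # blue target masks

def create_initial_mask(size, configs, rng=None):
    """Place target-object masks on a solid background (cf. step S100)."""
    if rng is None:
        rng = np.random.default_rng(0)
    h, w = size
    image = np.broadcast_to(BACKGROUND, (h, w, 3)).copy()
    labels = np.zeros((h, w), dtype=np.int32)          # 0 marks background
    for cfg in configs:
        th, tw = cfg["shape_hw"]                       # contour bounding box
        for _ in range(cfg["count"]):
            y = int(rng.integers(0, h - th))
            x = int(rng.integers(0, w - tw))
            image[y:y + th, x:x + tw] = TARGET_COLOR   # simplified rectangular contour
            labels[y:y + th, x:x + tw] = cfg["category"]
    return image, labels

mask_img, label_map = create_initial_mask(
    (256, 256),
    [{"category": 1, "count": 3, "shape_hw": (16, 24)},   # e.g. small vehicles
     {"category": 4, "count": 1, "shape_hw": (60, 40)}])  # e.g. an airplane
```

Varying the `configs` list is exactly the autonomous combination described above: different categories, counts and placements yield initial mask data of different styles.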
S200, acquiring preset control parameters. The control parameters include a target image size and a target iteration number A, where A ∈ [15, 25].
Preferably, A = 20. In this case the remote sensing sample generator can be controlled so that, while high image quality is still obtained, the corresponding target remote sensing image is generated more quickly; the generation time can be kept within 2 seconds.
The target image size is the length and width of the target image, which are set to be equal in this embodiment. Specifically, the target image size is determined by the resolution of the generated image and the image processing capability of the corresponding hardware. The hardware may be a GPU (graphics processing unit).
Preferably, the target image side length (length or width) is between 256 and 512 pixels.
S300, inputting the initial mask data and the control parameters into a remote sensing sample generator to generate a target remote sensing image. The target remote sensing image comprises object images of the types corresponding to the target masks and a scene background image corresponding to the background mask. The size of the target remote sensing image is the target image size, and the number of calculation iterations of the remote sensing sample generator while generating the target remote sensing image is the target iteration number. Each object image in the target remote sensing image carries a label generated from its target mask category. Preferably, the remote sensing sample generator may be a target DDPM (Denoising Diffusion Probabilistic Model) obtained by training a DDPM diffusion model.
Specifically, the target remote sensing image is a sample image with a label. Each object corresponds to a label, and the label may be category information of the corresponding object. Because the generated target remote sensing image has a corresponding label, the target remote sensing image can be used as a training sample of the deep learning network.
In the invention, when the initial mask data is created, the configuration parameters of the target masks can be designed adaptively according to the requirements, so that initial mask data corresponding to various scenes can be created quickly. The initial mask data and the control parameters are then input into the remote sensing sample generator to generate the corresponding target remote sensing image, which is the remote sensing sample data generated from the initial mask data. Each object image in the target remote sensing image carries a label generated from the corresponding target mask category. Because many different sets of initial mask data can be generated rapidly and in large quantity through autonomous combination, remote sensing image samples of many different scenes can be generated correspondingly, suiting the training of deep learning network models in different directions. Meanwhile, because the generation process requires no large amount of manual annotation, the cost of the generated remote sensing image samples is low.
In addition, because the target iteration number A ∈ [15, 25], the remote sensing sample generator can be controlled so that, while high image quality is still obtained, the corresponding target remote sensing image is generated more quickly; the generation time can generally be kept within 2 seconds.
As another embodiment of the present invention, the method further comprises:
s400, acquiring an initial training sample set of an initial sample generator from the ISAID data set.
Specifically, the iSAID dataset is an existing dataset with 2806 images containing a total of 655,451 target objects in 15 categories; it is the first large-scale instance segmentation dataset in the remote sensing field. The initial training samples need to be selected according to the scenes that the corresponding task may involve and according to how much they help the remote sensing sample generator learn.
In this embodiment, the selected initial training sample set involves eight target types, namely small vehicles, large vehicles, basketball courts, ships, airplanes, track-and-field grounds, bridges and storage tanks, with a total of 1411 remote sensing images used as initial training data. Specifically, a remote sensing image in the iSAID dataset and the mask data corresponding to that image form a pair of training data: the remote sensing image is the actual training image, and the corresponding mask data serves as the training label. The initial sample generator is trained in a supervised manner to generate the remote sensing sample generator. The pattern of the mask data corresponding to an image is the same as the pattern of the initial mask data, i.e., it has a background mask of the same color.
S500, cropping each sample in the initial training sample set to generate a first initial training sample set. The first initial training sample set includes a plurality of cropped samples, each of a preset size.
Specifically, when cropping, the preset size may be any size within the range of 128 to 512 pixels.
S600, randomly flipping the cropped samples in the first initial training sample set to generate a first target training sample set.
Specifically, after the cropping and flipping processes, the first target training sample set includes 57 pairs of training images with clear pictures and relatively small target masks. These can train the remote sensing sample generator's ability to predict from initial mask data whose target masks are small.
On the premise of preserving the integrity of large targets such as sports grounds and the resolution of small targets such as small vehicles, the set also includes 453 RS (Remote Sensing) images of the same specification and the 453 corresponding mask images. These 453 pairs of image data involve the eight target types listed above. The mask information differs across the image data, the corresponding target scenes are diversified, and the label categories they contain are richer. This improves the remote sensing sample generator's learning of diversified scenes and labels.
In addition, the first target training sample set further includes a number of samples whose target masks are incomplete, such as sample images containing only part of a sports ground or basketball court. This improves the remote sensing sample generator's ability to compute the edges of incomplete target masks. The set also includes 46 pure background images, used to evaluate the model's scene restoration capability.
In this embodiment, cropping and random flipping increase the amount of initial training data and, at the same time, its diversity, which improves the learning ability of the initial sample generator and, in turn, the prediction ability of the finally trained remote sensing sample generator.
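Steps S500 and S600 can be sketched as follows. This is an assumption-laden illustration, not the patent's implementation: the uniform random crop position and the 50% flip probabilities are choices made here, and the image and its mask are transformed together so the pair stays aligned.

```python
import numpy as np

def crop_and_flip(image, mask, crop_hw, rng):
    """Randomly crop an image/mask pair to crop_hw, then randomly flip both."""
    ch, cw = crop_hw
    h, w = image.shape[:2]
    y = int(rng.integers(0, h - ch + 1))
    x = int(rng.integers(0, w - cw + 1))
    img = image[y:y + ch, x:x + cw]
    msk = mask[y:y + ch, x:x + cw]
    if rng.random() < 0.5:                  # horizontal flip
        img, msk = img[:, ::-1], msk[:, ::-1]
    if rng.random() < 0.5:                  # vertical flip
        img, msk = img[::-1], msk[::-1]
    return img.copy(), msk.copy()

rng = np.random.default_rng(7)
image = np.zeros((600, 600, 3), dtype=np.uint8)
mask = np.zeros((600, 600), dtype=np.int32)
crop_img, crop_msk = crop_and_flip(image, mask, (512, 512), rng)
```

Running this once per source image (or several times with different random states) multiplies the amount and diversity of training data, as described above.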
As another embodiment of the present invention, the preset size is 512 × 512 pixels.
After generating the first set of target training samples, the method further comprises:
and S610, reducing the size of each sample in the first target training sample set to 256dpi and 256dpi to generate a second target training sample set.
Because remote sensing images are large, when the cropped image size is 128 pixels there is often the problem that no target object is cropped in, or that only a small part of a target object's image is cropped in. The initial sample generator then learns incompletely and irregularly, and the resulting remote sensing sample generator cannot reach high prediction precision. Meanwhile, the resulting remote sensing sample generator performs poorly on predicted images generated from low-resolution inputs, and the category of the generated target object is unstable when the target area of the input image is blurred; for example, the target mask corresponding to one vehicle type may be misjudged as the label class of another vehicle type.
To solve the newly emerging problems above, this embodiment sets the preset size to 512 × 512 pixels. Because the cropping window is larger, the problem of the target object being missed entirely, or only a small part of it being cropped in, can largely be avoided. The 512 × 512 image is then reduced to a 256 × 256 image. This ensures as far as possible that the cropped image contains the target object, or most of its area, while the reduced image size ensures that the processor has sufficient computing resources. The learning ability of the initial sample generator is thereby improved, and the finally obtained remote sensing sample generator reaches higher prediction precision.
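The crop-large-then-downscale strategy of step S610 can be sketched as below. Average pooling over 2 × 2 pixel blocks stands in for the patent's (unspecified) resampling method; any standard image resize would serve the same purpose.

```python
import numpy as np

def crop512_downscale256(image, top, left):
    """Crop a 512x512 window, then average-pool it down to 256x256."""
    patch = image[top:top + 512, left:left + 512].astype(np.float32)
    # group pixels into 2x2 blocks and average them: 512x512 -> 256x256
    pooled = patch.reshape(256, 2, 256, 2, -1).mean(axis=(1, 3))
    return pooled.astype(np.uint8)

big = np.full((1024, 1024, 3), 200, dtype=np.uint8)
small = crop512_downscale256(big, 128, 256)   # a 256x256x3 training sample
```

The larger window keeps whole targets inside the crop; the downscale keeps the per-sample compute cost the same as a direct 256 × 256 crop.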
As another embodiment of the present invention, after S610 generates the second target training sample set, the method further comprises:
and S620, performing noise adding processing on each sample in the second target training sample set to generate a third target training sample set.
The noise addition process includes:
S621, generating background noise in the background image area of the sample. The background noise follows a Gaussian distribution with μ = 10 and σ² = 2, where μ is the mean of the Gaussian distribution and σ² is its variance.
After training with the training data of the above embodiments, the target remote sensing images generated by the finally obtained remote sensing sample generator still suffer from a poor rendering effect of the background color and from overly uniform generated scenes.
This is because the color of the background mask in the existing training samples is a single color: the values of the data matrix corresponding to the background mask are all identical, so the initial sample generator cannot differentiate within the background mask during training, which causes the poor background rendering and the uniform generated scenes.
By generating background noise in the background image area of each sample in the second target training sample set, this embodiment sets different noise values for different backgrounds. The differences between the pixels of the sample background image are thus enlarged, which improves the initial sample generator's ability to generate background images during training. Correspondingly, the target remote sensing images generated by the final remote sensing sample generator have more detailed and richer imagery, the rendering effect of the background color is improved, and the scene details of the generated images are richer and more realistic.
As another embodiment of the present invention, the method further comprises:
and S630, training the initial sample generator by using the first target training sample set or the second target training sample set or the third target training sample set to generate a plurality of groups of model parameters corresponding to the initial sample generator. And obtaining model parameters corresponding to a group of initial sample generators after completing training for 1 ten thousand times.
S640, taking several groups of model parameters from the training-iteration interval [950,000, 1,050,000] and averaging them to generate the target model parameters.
Preferably, the model parameters obtained after 970,000, 980,000, 990,000, 1,000,000 and 1,010,000 training iterations are averaged to generate the target model parameters.
S650, configuring the target model parameters into an initial sample generator to generate a remote sensing sample generator.
Specifically, too many training iterations easily lead to overfitting of the final model, while too few easily lead to underfitting. Therefore, in this embodiment, several groups of model parameters are selected from the interval where the model trains best and averaged, so that the generated target model parameters avoid, as far as possible, the errors present in any single group, and the finally generated remote sensing sample generator has better prediction precision.
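The averaging of step S640 is plain element-wise parameter averaging, sketched below. The dictionary-of-arrays "state dicts" are a stand-in here; real framework checkpoints (e.g. torch state dicts) would be handled the same way after conversion to arrays.

```python
import numpy as np

def average_checkpoints(state_dicts):
    """Average several saved groups of model parameters element-wise (cf. S640)."""
    keys = state_dicts[0].keys()
    return {k: np.mean([sd[k] for sd in state_dicts], axis=0) for k in keys}

# e.g. parameters saved after 970k, 980k, 990k, 1000k and 1010k iterations;
# the parameter name "conv.weight" and the values are illustrative only
checkpoints = [{"conv.weight": np.full((2, 2), float(i))}
               for i in (97, 98, 99, 100, 101)]
target_params = average_checkpoints(checkpoints)
```

Loading `target_params` back into the initial sample generator then yields the remote sensing sample generator of step S650.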
As another embodiment of the present invention, before S630 trains the initial sample generator with the first, second or third target training sample set, the method further comprises:
and S700, performing model optimization processing on the first initial sample generator to generate an initial sample generator. The model optimization process is used to adjust the structure of the first initial sample generator to improve the prediction accuracy of the generated initial sample generator.
In this embodiment, an existing DDPM diffusion model is optimized and improved, so that the generated initial sample generator can have higher prediction accuracy.
As another embodiment of the present invention, the first initial sample generator includes a DDPM diffusion model.
The model optimization process comprises the following steps:
s710, configuring a discriminator for the DDPM diffusion model, and generating an initial sample generator. The initial sample generator is a generative antagonism network. The discriminator is a discriminator built by a resnet network.
This embodiment turns the initial sample generator into a generative adversarial network by configuring a discriminator for the DDPM diffusion model. A generative adversarial network (GAN) is a deep learning model and one of the most promising methods in recent years for unsupervised learning on complex distributions. The framework comprises at least two modules: a generative model, here the DDPM diffusion model, and a discriminative model, i.e., the discriminator. A reasonably good output is produced by having the generative model and the discriminative model learn in a game against each other.
The loss value G_loss of the DDPM diffusion model satisfies:
G_loss = MSE(noise, f(x)) · (1 − λ) + λ · C(out_ones, D(fake)).
The loss value D_loss of the discriminator satisfies:
D_loss = C(out_ones, D(true)) + C(out_zeros, D(fake)).
Here C is the binary cross-entropy function and MSE is the mean squared error function; both are existing loss functions. noise is random Gaussian noise, and f(x) is the Gaussian noise fitted by the DDPM diffusion model. fake is the image matrix generated by the DDPM diffusion model, and D(fake) is the feature matrix of size 512 × 1 obtained after the discriminator extracts features from fake. true is the real image in the training sample, and D(true) is the feature matrix of size 512 × 1 obtained after the discriminator extracts features from true. out_zeros is an all-zeros matrix of the same size as D(fake), and out_ones is an all-ones matrix of the same size as D(fake). λ is the influence coefficient, λ ∈ [0, 1]; preferably, λ = 0.2.
By introducing the GAN architecture, this embodiment lets the discriminator automatically detect whether an image generated by the DDPM diffusion model is real: if the discriminator judges it real, the DDPM diffusion model is rewarded; if the discriminator judges it false, the DDPM diffusion model is penalized. Adding the term λ · C(out_ones, D(fake)) to G_loss allows the final loss value to be adjusted adaptively. Specifically, when the discriminator judges that the image generated by the DDPM diffusion model is close to the real one, λ · C(out_ones, D(fake)) is small, so the loss value G_loss is also small; when the discriminator judges that the generated image differs greatly from the real one, λ · C(out_ones, D(fake)) is large, so G_loss is also large.
In this way, the remote sensing sample generator finally produced by training, i.e., the trained DDPM diffusion model, can generate target remote sensing images that are closer to real images and more realistic.
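The two loss terms can be written out directly. In this sketch C (binary cross-entropy) and MSE (mean squared error) are implemented explicitly, and the D(·) outputs are assumed to be probabilities in (0, 1), e.g. after a sigmoid; the function names are chosen to match the formulas above.

```python
import numpy as np

def C(target, pred, eps=1e-7):
    """Binary cross-entropy over matrices, clipped for numerical safety."""
    p = np.clip(pred, eps, 1.0 - eps)
    return float(np.mean(-(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))))

def MSE(a, b):
    """Mean squared error over matrices."""
    return float(np.mean((a - b) ** 2))

def g_loss(noise, f_x, d_fake, lam=0.2):
    # G_loss = MSE(noise, f(x)) * (1 - lambda) + lambda * C(out_ones, D(fake))
    return MSE(noise, f_x) * (1.0 - lam) + lam * C(np.ones_like(d_fake), d_fake)

def d_loss(d_true, d_fake):
    # D_loss = C(out_ones, D(true)) + C(out_zeros, D(fake))
    return C(np.ones_like(d_true), d_true) + C(np.zeros_like(d_fake), d_fake)
```

With λ = 0 the generator loss reduces to the plain DDPM noise-fitting MSE, and the adversarial term grows exactly when the discriminator scores the generated image as fake, matching the reward/penalty behaviour described above.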
As another embodiment of the present invention, the first initial sample generator includes a DDPM diffusion model.
The model optimization process comprises the following steps:
s720, replacing the Unet network in the DDPM diffusion model with 3 parallel-arranged Unet networks, and generating an initial sample generator.
The first Unet network is used for processing noise data of the initial sample generator in the denoising iteration time of [0, n/3 ].
The second Unet network is used for processing noise data of which the number of denoising iterations belongs to (n/3, 2 n/3) of the initial sample generator.
The third Unet network is used for processing noise data of the initial sample generator in the (2 n/3, n) of denoising iteration times, wherein n is the total denoising iteration times corresponding to the initial sample generator when processing training samples each time.
Taking n = 300 as an example:
the first Unet network processes the noise data of the initial sample generator whose denoising iteration number lies in [0, 100];
the second Unet network processes the noise data whose denoising iteration number lies in (100, 200);
the third Unet network processes the noise data whose denoising iteration number lies in (200, 300).
In the actual generation of the target remote sensing image, the model in this embodiment proceeds as follows:
The initial sample is input into the first Unet network for denoising iteration. After the initial sample completes its 100th iteration, the iterated sample output at that point is taken as the first initial sample, which is then input into the second Unet network for further iteration. After the 200th iteration is completed, the output sample is taken as the second initial sample and input into the third Unet network for iteration. After the 300th iteration is completed, the output sample is taken as the third initial sample, which is the finally generated target remote sensing image.
In this embodiment, the backbone Unet network of the existing DDPM diffusion model is increased from one to three, and each Unet network is respectively responsible for the noise processing of the DDPM diffusion model in the early, middle, and late stages of iterative denoising. The inference process can thus be divided into three stages (initial, intermediate, and final), which are handled separately by the three Unet networks. Because the noise, the denoising proportion, and the degree of denoising differ between stages, setting up three UNet networks allows each stage to be processed in a targeted manner. This reduces the coupling between the UNet networks of different denoising stages and further refines the denoising process. Although the model becomes larger, the denoising effect at each stage is greatly improved, and the overall result is better than that of the original model.
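The staged inference procedure, in which the sample is handed from one Unet to the next at steps n/3 and 2n/3, can be sketched as follows. The loop and the stand-in callables are illustrative only; in the actual model each net would be a trained Unet that predicts and removes noise at step t.

```python
import numpy as np

def staged_denoise(x, unets, n=300):
    """Run an n-step iterative denoising loop, handing each step to the
    Unet responsible for its stage. `unets` is a list of 3 callables
    (x, t) -> x standing in for the trained networks."""
    for t in range(1, n + 1):
        if t <= n // 3:
            net = unets[0]      # early stage: steps 1 .. n/3
        elif t <= 2 * n // 3:
            net = unets[1]      # middle stage: steps n/3+1 .. 2n/3
        else:
            net = unets[2]      # late stage: steps 2n/3+1 .. n
        x = net(x, t)
    return x
```

With n = 300, each of the three networks handles exactly 100 consecutive denoising steps, matching the hand-off points (100th and 200th iterations) described above.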
Embodiments of the present invention also provide a non-transitory computer-readable storage medium that may be disposed in an electronic device to store at least one instruction or at least one program for implementing the above method embodiments, the at least one instruction or the at least one program being loaded and executed by a processor to implement the methods provided by the embodiments described above.
Embodiments of the present invention also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Embodiments of the present invention also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention described in the present specification when the program product is run on the electronic device.
While certain specific embodiments of the invention have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.

Claims (10)

1. A remote sensing image generation method based on style self-organization is characterized by comprising the following steps:
creating initial mask data; the initial mask data comprises a background mask and at least one target mask; each target object mask is provided with a plurality of configuration parameters, and the configuration parameters of different target object masks are different; the configuration parameters comprise the types and the number of the target object masks and the setting positions in the background mask; each target object mask comprises contour information of a corresponding target object;
acquiring preset control parameters; the control parameters comprise a target image size and a target iteration number A; A ∈ [15, 25];
inputting the initial mask data and the control parameters into a remote sensing sample generator to generate a target remote sensing image; the target remote sensing image comprises object images of the types corresponding to the target object masks and a scene background image corresponding to the background mask; the size of the target remote sensing image is the target image size, and the number of calculation iterations of the remote sensing sample generator in generating the target remote sensing image is the target iteration number; the target remote sensing image carries, for each target object mask, a class label of the corresponding generated object image.
2. The method according to claim 1, wherein the method further comprises:
acquiring an initial training sample set of an initial sample generator from an ISAID data set;
cutting each sample in the initial training sample set to generate a first initial training sample set; the first initial training sample set includes a plurality of cropped samples; the size of each cutting sample is a preset size;
and randomly turning over a plurality of cut samples in the first initial training sample set to generate a first target training sample set.
3. The method according to claim 2, wherein the preset size is 512dpi x 512dpi;
after generating the first set of target training samples, the method further comprises:
and reducing the size of each sample in the first target training sample set to 256dpi x 256dpi to generate a second target training sample set.
4. A method according to claim 3, wherein after generating the second set of target training samples, the method further comprises:
performing noise adding processing on each sample in the second target training sample set to generate a third target training sample set;
the noise adding process includes:
generating background noise in a background image region of the sample; the background noise obeys a Gaussian distribution with μ = 10 and σ² = 2, wherein μ is the mean of the Gaussian distribution and σ² is the variance of the Gaussian distribution.
5. The method according to claim 4, wherein the method further comprises:
training an initial sample generator using the first target training sample set, the second target training sample set, or the third target training sample set to generate multiple groups of model parameters corresponding to the initial sample generator; after every 10,000 training iterations, one group of model parameters corresponding to the initial sample generator is obtained;
obtaining multiple groups of model parameters from the training-iteration interval [950,000, 1,050,000] and performing mean-value calculation on them to generate target model parameters;
performing mean-value calculation on the model parameters obtained after 970,000, 980,000, 990,000, 1,000,000 and 1,010,000 training iterations respectively to generate the target model parameters;
and configuring the target model parameters into the initial sample generator to generate the remote sensing sample generator.
6. The method of claim 5, wherein prior to training the initial sample generator using the first or second or third target training sample set, the method further comprises:
model optimization processing is carried out on the first initial sample generator, and an initial sample generator is generated; the model optimization process is used for adjusting the structure of the first initial sample generator to improve the prediction accuracy of the generated initial sample generator.
7. The method of claim 6, wherein the first initial sample generator comprises a DDPM diffusion model;
the model optimization process comprises the following steps:
configuring a discriminator for the DDPM diffusion model, generating an initial sample generator; the initial sample generator is a generation type countermeasure network; the discriminator is a discriminator constructed by a resnet network;
wherein the loss value G_loss of the DDPM diffusion model satisfies:

G_loss = MSE(noise, f(x)) + λ × C(out_ones, D(fake))

the loss value D_loss of the discriminator satisfies:

D_loss = C(out_ones, D(true)) + C(out_zeros, D(fake))

wherein C is a binary cross-entropy function and MSE is a mean-square-error function; noise is random Gaussian noise; f(x) is the Gaussian noise fitted by the DDPM diffusion model; fake is the image matrix generated by the DDPM diffusion model; D(fake) is the feature matrix obtained after the discriminator performs feature extraction on fake, with a size of 512 × 1; true is the real image in the training sample; D(true) is the feature matrix obtained after the discriminator performs feature extraction on true, with a size of 512 × 1; out_zeros is an all-0 matrix of the same size as D(fake); out_ones is an all-1 matrix of the same size as D(fake); λ is the influence coefficient, λ ∈ [0, 1].
8. The method of claim 6, wherein the first initial sample generator comprises a DDPM diffusion model;
the model optimization process comprises the following steps:
replacing the Unet network in the DDPM diffusion model with 3 parallel-arranged Unet networks to generate an initial sample generator;
the first Unet network is used to process noise data for which the denoising iteration number of the initial sample generator lies in [0, n/3];
the second Unet network is used to process noise data for which the denoising iteration number lies in (n/3, 2n/3);
the third Unet network is used to process noise data for which the denoising iteration number lies in (2n/3, n), where n is the total number of denoising iterations performed by the initial sample generator each time it processes a training sample.
9. A non-transitory computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements a remote sensing image generation method based on style ad hoc as claimed in any one of claims 1 to 8.
10. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements a remote sensing image generation method based on style ad hoc as claimed in any one of claims 1 to 8.
CN202211642255.9A 2022-12-20 2022-12-20 Remote sensing image generation method, storage medium and device based on style self-organization Active CN116051683B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211642255.9A CN116051683B (en) 2022-12-20 2022-12-20 Remote sensing image generation method, storage medium and device based on style self-organization


Publications (2)

Publication Number Publication Date
CN116051683A CN116051683A (en) 2023-05-02
CN116051683B true CN116051683B (en) 2023-07-04

Family

ID=86117269

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211642255.9A Active CN116051683B (en) 2022-12-20 2022-12-20 Remote sensing image generation method, storage medium and device based on style self-organization

Country Status (1)

Country Link
CN (1) CN116051683B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116909750B (en) * 2023-07-26 2023-12-22 江苏中天吉奥信息技术股份有限公司 Image-based scene white film rapid production method
CN116777906B (en) * 2023-08-17 2023-11-14 常州微亿智造科技有限公司 Abnormality detection method and abnormality detection device in industrial detection

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112884791A (en) * 2021-02-02 2021-06-01 重庆市地理信息和遥感应用中心 Method for constructing large-scale remote sensing image semantic segmentation model training sample set
WO2021114832A1 (en) * 2020-05-28 2021-06-17 平安科技(深圳)有限公司 Sample image data enhancement method, apparatus, electronic device, and storage medium
CN114066718A (en) * 2021-10-14 2022-02-18 特斯联科技集团有限公司 Image style migration method and device, storage medium and terminal


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Intelligent Cloud Mask Method for Remote Sensing Images and System Implementation; Gao Xingyu; China Master's Theses Full-text Database, Engineering Science and Technology II; C028-302 *


Similar Documents

Publication Publication Date Title
CN109886121B (en) Human face key point positioning method for shielding robustness
CN116051683B (en) Remote sensing image generation method, storage medium and device based on style self-organization
CN109977918B (en) Target detection positioning optimization method based on unsupervised domain adaptation
CN109859190B (en) Target area detection method based on deep learning
CN110827213B (en) Super-resolution image restoration method based on generation type countermeasure network
CN112308860B (en) Earth observation image semantic segmentation method based on self-supervision learning
Zhang et al. NLDN: Non-local dehazing network for dense haze removal
CN108898145A (en) A kind of image well-marked target detection method of combination deep learning
CN111612008A (en) Image segmentation method based on convolution network
CN110059769B (en) Semantic segmentation method and system based on pixel rearrangement reconstruction and used for street view understanding
CN110443257B (en) Significance detection method based on active learning
CN112101364B (en) Semantic segmentation method based on parameter importance increment learning
CN112132145A (en) Image classification method and system based on model extended convolutional neural network
CN112861718A (en) Lightweight feature fusion crowd counting method and system
CN112580662A (en) Method and system for recognizing fish body direction based on image features
Wang et al. A feature-supervised generative adversarial network for environmental monitoring during hazy days
CN115063318A (en) Adaptive frequency-resolved low-illumination image enhancement method and related equipment
CN116563680A (en) Remote sensing image feature fusion method based on Gaussian mixture model and electronic equipment
CN111814693A (en) Marine ship identification method based on deep learning
CN113158860B (en) Deep learning-based multi-dimensional output face quality evaluation method and electronic equipment
CN115471831A (en) Image significance detection method based on text reinforcement learning
CN111310820A (en) Foundation meteorological cloud chart classification method based on cross validation depth CNN feature integration
CN110110651B (en) Method for identifying behaviors in video based on space-time importance and 3D CNN
CN110826563A (en) Finger vein segmentation method and device based on neural network and probability map model
CN111563462A (en) Image element detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant