CN116051683B - Remote sensing image generation method, storage medium and device based on style self-organization - Google Patents


Info

Publication number
CN116051683B
CN116051683B (application CN202211642255.9A)
Authority
CN
China
Prior art keywords
target
remote sensing
initial
training
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211642255.9A
Other languages
Chinese (zh)
Other versions
CN116051683A (en)
Inventor
许光銮
陈佳良
张文凯
阮航
李硕轲
李霁豪
袁志强
周瑞雪
Current Assignee
Aerospace Information Research Institute of CAS
Original Assignee
Aerospace Information Research Institute of CAS
Priority date
Filing date
Publication date
Application filed by Aerospace Information Research Institute of CAS filed Critical Aerospace Information Research Institute of CAS
Priority to CN202211642255.9A priority Critical patent/CN116051683B/en
Publication of CN116051683A publication Critical patent/CN116051683A/en
Application granted granted Critical
Publication of CN116051683B publication Critical patent/CN116051683B/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
        • G06T 11/00 - 2D [Two Dimensional] image generation
        • G06T 11/60 - Editing figures and text; Combining figures or text
        • G06T 3/00 - Geometric image transformations in the plane of the image
        • G06T 3/04 - Context-preserving transformations, e.g. by using an importance map
        • G06T 5/00 - Image enhancement or restoration
        • G06T 5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
        • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
        • G06T 2207/10 - Image acquisition modality
        • G06T 2207/10032 - Satellite or aerial image; Remote sensing
        • G06T 2207/20 - Special algorithmic details
        • G06T 2207/20081 - Training; Learning
        • G06T 2207/20084 - Artificial neural networks [ANN]
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
        • G06N 3/00 - Computing arrangements based on biological models
        • G06N 3/02 - Neural networks
        • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the field of remote sensing image generation, and discloses a remote sensing image generation method, storage medium and device based on style self-organization. The method comprises: creating initial mask data; acquiring preset control parameters; and inputting the initial mask data and the control parameters into a remote sensing sample generator to generate a target remote sensing image. When creating the initial mask data, several configuration parameters can be designed adaptively, so that initial mask data corresponding to a variety of scenes can be created quickly and used to generate the corresponding target remote sensing images. A large number of remote sensing image samples for different scenes can therefore be generated, suiting the training of different deep learning network models. Because generating the remote sensing image samples requires no manual annotation, their cost is low. In addition, the remote sensing sample generator can be controlled through the target iteration number, so that the corresponding target remote sensing images are generated more quickly.

Description

Remote sensing image generation method, storage medium and device based on style self-organization
Technical Field
The present invention relates to the field of remote sensing image generation, and in particular, to a remote sensing image generation method, a storage medium, and a device based on style self-organization.
Background
With the development of artificial intelligence technology, neural network models for tasks such as machine recognition, target detection and semantic segmentation have appeared in many business fields. These models are trained with sample data, and improving their recognition accuracy requires training them with as much sample data as possible. However, the number of existing remote sensing samples covering the various complex environmental backgrounds is limited and cannot meet the training requirements of the various deep learning networks.
Existing remote sensing sample generation methods mainly aim at acquiring high-resolution hyperspectral remote sensing samples in the pan-sharpening direction. However, such methods place high requirements on the input data, and each generated sample corresponds to a single scene, so they are not suitable for training deep learning network models in different research directions. A remote sensing sample generation method that has a low training cost and suits multi-directional research is therefore an urgent problem to be solved.
Disclosure of Invention
In view of the above technical problems, the invention adopts the following technical solution:
according to one aspect of the present invention, there is provided a remote sensing image generation method based on style self-organization, the method comprising the steps of:
initial mask data is created. The initial mask data includes a background mask and at least one target mask. Each target mask has a plurality of configuration parameters, and the configuration parameters of different target masks are different. The configuration parameters comprise the types, the number and the setting positions of the target object masks in the background mask. Each target mask comprises contour information of a corresponding target.
Preset control parameters are acquired. The control parameters include a target image size and a target iteration number A, where A ∈ [15, 25].
The initial mask data and the control parameters are input into a remote sensing sample generator to generate a target remote sensing image. The target remote sensing image comprises object images of the types corresponding to the target masks and a scene background image corresponding to the background mask. The size of the target remote sensing image is the target image size, and the number of calculation iterations of the remote sensing sample generator while generating the target remote sensing image is the target iteration number. Each object image in the target remote sensing image carries a label generated from its target mask category.
According to a second aspect of the present invention, there is provided a non-transitory computer readable storage medium storing a computer program which when executed by a processor implements a remote sensing image generation method based on style self-organization as described above.
According to a third aspect of the present invention, there is provided an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing a remote sensing image generation method based on style self-organization as described above when executing the computer program.
The invention has at least the following beneficial effects:
in the invention, when the initial mask data is created, the configuration parameters of the target masks can be designed adaptively according to the requirements, so that initial mask data corresponding to various scenes can be created quickly. The initial mask data and the control parameters are then input into the remote sensing sample generator to generate the corresponding target remote sensing image, which is the remote sensing sample data generated from the initial mask data. Each object image in the target remote sensing image carries a label generated from the corresponding target mask category. Because many different sets of initial mask data can be generated rapidly and in large quantity through autonomous combination, remote sensing image samples of many different scenes can be generated correspondingly, suiting the training of deep learning network models in different directions. Meanwhile, because the generation process requires no large amount of manual annotation, the cost of the generated remote sensing image samples is low.
In addition, because the target iteration number A ∈ [15, 25], the remote sensing sample generator can be controlled so that, while high image quality is still obtained, the corresponding target remote sensing image is generated more quickly; the generation time can generally be kept within 2 seconds.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a remote sensing image generating method based on style self-organization according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
As a possible embodiment of the present invention, as shown in fig. 1, there is provided a remote sensing image generating method based on style self-organization, the method including the steps of:
s100, creating initial mask data. The initial mask data includes a background mask and at least one target mask. Each target mask has a plurality of configuration parameters, and the configuration parameters of different target masks are different. The configuration parameters include the type, number and placement of the object masks in the background mask. Each object mask includes profile information of a corresponding object.
In this step, the object mask may be a contour image of an object in a corresponding scene, such as a corresponding contour image of a vehicle such as an automobile, a ship, or an airplane. The background mask can be a solid-color image background, the color of the background mask and the image color of the target object mask need to have larger difference, and the difference between the remote sensing sample generators is facilitated. For example, the background mask may be set to black in color and the target mask may be set to blue in the morning in color.
The category, number and placement of the target masks in the background mask can be set arbitrarily by adjusting the configuration parameters of the target masks in the initial mask data. Thus, many sets of initial mask data can be created quickly according to the usage requirements. The style of each set of initial mask data is determined by the specific arrangement of the target masks in the background mask. Since the target masks can be freely determined by adjusting the configuration parameters, the styles of the initial mask data in this embodiment can be combined freely, so a large amount of initial mask data in different styles can be generated.
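As a purely illustrative sketch (not part of the patent disclosure), step S100 could be implemented along the following lines. The rectangular "contours", the color values, and the configuration-dictionary keys are assumptions made here for clarity; the patent only requires a solid background, contour-shaped target masks, and per-mask category, count and placement parameters.

```python
import numpy as np

BACKGROUND = np.array([0, 0, 0], dtype=np.uint8)      # black background mask
TARGET_COLOR = np.array([0, 0, 255], dtype=np.uint8)  # blue target masks

def create_initial_mask(size, configs, rng=None):
    """Place target-object masks on a solid background (cf. step S100)."""
    if rng is None:
        rng = np.random.default_rng(0)
    h, w = size
    image = np.broadcast_to(BACKGROUND, (h, w, 3)).copy()
    labels = np.zeros((h, w), dtype=np.int32)          # 0 marks background
    for cfg in configs:
        th, tw = cfg["shape_hw"]                       # contour bounding box
        for _ in range(cfg["count"]):
            y = int(rng.integers(0, h - th))
            x = int(rng.integers(0, w - tw))
            image[y:y + th, x:x + tw] = TARGET_COLOR   # simplified rectangular contour
            labels[y:y + th, x:x + tw] = cfg["category"]
    return image, labels

mask_img, label_map = create_initial_mask(
    (256, 256),
    [{"category": 1, "count": 3, "shape_hw": (16, 24)},   # e.g. small vehicles
     {"category": 4, "count": 1, "shape_hw": (60, 40)}])  # e.g. an airplane
```

Varying the `configs` list is exactly the autonomous combination described above: different categories, counts and placements yield initial mask data of different styles.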
S200, acquiring preset control parameters. The control parameters include a target image size and a target iteration number A, where A ∈ [15, 25].
Preferably, A = 20. In this case the remote sensing sample generator can be controlled so that, while high image quality is still obtained, the corresponding target remote sensing image is generated more quickly; the generation time can be kept within 2 seconds.
The target image size is the length and width of the target image, which are set to be equal in this embodiment. Specifically, the target image size is determined by the resolution of the generated image and the image processing capability of the corresponding hardware. The hardware may be a GPU (graphics processing unit).
Preferably, the target image side length (length or width) is between 256 and 512 pixels.
S300, inputting the initial mask data and the control parameters into a remote sensing sample generator to generate a target remote sensing image. The target remote sensing image comprises object images of the types corresponding to the target masks and a scene background image corresponding to the background mask. The size of the target remote sensing image is the target image size, and the number of calculation iterations of the remote sensing sample generator while generating the target remote sensing image is the target iteration number. Each object image in the target remote sensing image carries a label generated from its target mask category. Preferably, the remote sensing sample generator may be a target DDPM (Denoising Diffusion Probabilistic Model) obtained by training a DDPM diffusion model.
Specifically, the target remote sensing image is a sample image with a label. Each object corresponds to a label, and the label may be category information of the corresponding object. Because the generated target remote sensing image has a corresponding label, the target remote sensing image can be used as a training sample of the deep learning network.
In the invention, when the initial mask data is created, the configuration parameters of the target masks can be designed adaptively according to the requirements, so that initial mask data corresponding to various scenes can be created quickly. The initial mask data and the control parameters are then input into the remote sensing sample generator to generate the corresponding target remote sensing image, which is the remote sensing sample data generated from the initial mask data. Each object image in the target remote sensing image carries a label generated from the corresponding target mask category. Because many different sets of initial mask data can be generated rapidly and in large quantity through autonomous combination, remote sensing image samples of many different scenes can be generated correspondingly, suiting the training of deep learning network models in different directions. Meanwhile, because the generation process requires no large amount of manual annotation, the cost of the generated remote sensing image samples is low.
In addition, because the target iteration number A ∈ [15, 25], the remote sensing sample generator can be controlled so that, while high image quality is still obtained, the corresponding target remote sensing image is generated more quickly; the generation time can generally be kept within 2 seconds.
As another embodiment of the present invention, the method further comprises:
s400, acquiring an initial training sample set of an initial sample generator from the ISAID data set.
Specifically, the iSAID dataset is an existing dataset with 2806 images containing a total of 655,451 target objects in 15 categories; it is the first large-scale instance segmentation dataset in the remote sensing field. The initial training samples need to be selected according to the scenes that the corresponding task may involve and according to how much they help the remote sensing sample generator learn.
In this embodiment, the selected initial training sample set involves eight target types, namely small vehicles, large vehicles, basketball courts, ships, airplanes, track-and-field grounds, bridges and storage tanks, with a total of 1411 remote sensing images used as initial training data. Specifically, a remote sensing image in the iSAID dataset and the mask data corresponding to that image form a pair of training data: the remote sensing image is the actual training image, and the corresponding mask data serves as the training label. The initial sample generator is trained in a supervised manner to generate the remote sensing sample generator. The pattern of the mask data corresponding to an image is the same as the pattern of the initial mask data, i.e., it has a background mask of the same color.
S500, cropping each sample in the initial training sample set to generate a first initial training sample set. The first initial training sample set includes a plurality of cropped samples, each of a preset size.
Specifically, when cropping, the preset size may be any size within the range of 128 to 512 pixels.
S600, randomly flipping the cropped samples in the first initial training sample set to generate a first target training sample set.
Specifically, after the cropping and flipping processes, the first target training sample set includes 57 pairs of training images with clear pictures and relatively small target masks. These can train the remote sensing sample generator's ability to predict from initial mask data whose target masks are small.
On the premise of preserving the integrity of large targets such as sports grounds and the resolution of small targets such as small vehicles, the set also includes 453 RS (Remote Sensing) images of the same specification and the 453 corresponding mask images. These 453 pairs of image data involve the eight target types listed above. The mask information differs across the image data, the corresponding target scenes are diversified, and the label categories they contain are richer. This improves the remote sensing sample generator's learning of diversified scenes and labels.
In addition, the first target training sample set further includes a number of samples whose target masks are incomplete, such as sample images containing only part of a sports ground or basketball court. This improves the remote sensing sample generator's ability to compute the edges of incomplete target masks. The set also includes 46 pure background images, used to evaluate the model's scene restoration capability.
In this embodiment, cropping and random flipping increase the amount of initial training data and, at the same time, its diversity, which improves the learning ability of the initial sample generator and, in turn, the prediction ability of the finally trained remote sensing sample generator.
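Steps S500 and S600 can be sketched as follows. This is an assumption-laden illustration, not the patent's implementation: the uniform random crop position and the 50% flip probabilities are choices made here, and the image and its mask are transformed together so the pair stays aligned.

```python
import numpy as np

def crop_and_flip(image, mask, crop_hw, rng):
    """Randomly crop an image/mask pair to crop_hw, then randomly flip both."""
    ch, cw = crop_hw
    h, w = image.shape[:2]
    y = int(rng.integers(0, h - ch + 1))
    x = int(rng.integers(0, w - cw + 1))
    img = image[y:y + ch, x:x + cw]
    msk = mask[y:y + ch, x:x + cw]
    if rng.random() < 0.5:                  # horizontal flip
        img, msk = img[:, ::-1], msk[:, ::-1]
    if rng.random() < 0.5:                  # vertical flip
        img, msk = img[::-1], msk[::-1]
    return img.copy(), msk.copy()

rng = np.random.default_rng(7)
image = np.zeros((600, 600, 3), dtype=np.uint8)
mask = np.zeros((600, 600), dtype=np.int32)
crop_img, crop_msk = crop_and_flip(image, mask, (512, 512), rng)
```

Running this once per source image (or several times with different random states) multiplies the amount and diversity of training data, as described above.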
As another embodiment of the present invention, the preset size is 512 × 512 pixels.
After generating the first set of target training samples, the method further comprises:
and S610, reducing the size of each sample in the first target training sample set to 256dpi and 256dpi to generate a second target training sample set.
Because remote sensing images are large, when the cropped image size is 128 pixels there is often the problem that no target object is cropped in, or that only a small part of a target object's image is cropped in. The initial sample generator then learns incompletely and irregularly, and the resulting remote sensing sample generator cannot reach high prediction precision. Meanwhile, the resulting remote sensing sample generator performs poorly on predicted images generated from low-resolution inputs, and the category of the generated target object is unstable when the target area of the input image is blurred; for example, the target mask corresponding to one vehicle type may be misjudged as the label class of another vehicle type.
To solve the newly emerging problems above, this embodiment sets the preset size to 512 × 512 pixels. Because the cropping window is larger, the problem of the target object being missed entirely, or only a small part of it being cropped in, can largely be avoided. The 512 × 512 image is then reduced to a 256 × 256 image. This ensures as far as possible that the cropped image contains the target object, or most of its area, while the reduced image size ensures that the processor has sufficient computing resources. The learning ability of the initial sample generator is thereby improved, and the finally obtained remote sensing sample generator reaches higher prediction precision.
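The crop-large-then-downscale strategy of step S610 can be sketched as below. Average pooling over 2 × 2 pixel blocks stands in for the patent's (unspecified) resampling method; any standard image resize would serve the same purpose.

```python
import numpy as np

def crop512_downscale256(image, top, left):
    """Crop a 512x512 window, then average-pool it down to 256x256."""
    patch = image[top:top + 512, left:left + 512].astype(np.float32)
    # group pixels into 2x2 blocks and average them: 512x512 -> 256x256
    pooled = patch.reshape(256, 2, 256, 2, -1).mean(axis=(1, 3))
    return pooled.astype(np.uint8)

big = np.full((1024, 1024, 3), 200, dtype=np.uint8)
small = crop512_downscale256(big, 128, 256)   # a 256x256x3 training sample
```

The larger window keeps whole targets inside the crop; the downscale keeps the per-sample compute cost the same as a direct 256 × 256 crop.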
As another embodiment of the present invention, after S610 generates the second target training sample set, the method further comprises:
and S620, performing noise adding processing on each sample in the second target training sample set to generate a third target training sample set.
The noise addition process includes:
S621, generating background noise in the background image area of the sample. The background noise follows a Gaussian distribution with μ = 10 and σ² = 2, where μ is the mean of the Gaussian distribution and σ² is its variance.
After training with the training data of the above embodiments, the target remote sensing images generated by the finally obtained remote sensing sample generator still suffer from a poor rendering effect of the background color and from overly uniform generated scenes.
This is because the color of the background mask in the existing training samples is a single color: the values of the data matrix corresponding to the background mask are all identical, so the initial sample generator cannot differentiate within the background mask during training, which causes the poor background rendering and the uniform generated scenes.
By generating background noise in the background image area of each sample in the second target training sample set, this embodiment sets different noise values for different backgrounds. The differences between the pixels of the sample background image are thus enlarged, which improves the initial sample generator's ability to generate background images during training. Correspondingly, the target remote sensing images generated by the final remote sensing sample generator have more detailed and richer imagery, the rendering effect of the background color is improved, and the scene details of the generated images are richer and more realistic.
As another embodiment of the present invention, the method further comprises:
and S630, training the initial sample generator by using the first target training sample set or the second target training sample set or the third target training sample set to generate a plurality of groups of model parameters corresponding to the initial sample generator. And obtaining model parameters corresponding to a group of initial sample generators after completing training for 1 ten thousand times.
S640, taking several groups of model parameters from the training-iteration interval [950,000, 1,050,000] and averaging them to generate the target model parameters.
Preferably, the model parameters obtained after 970,000, 980,000, 990,000, 1,000,000 and 1,010,000 training iterations are averaged to generate the target model parameters.
S650, configuring the target model parameters into an initial sample generator to generate a remote sensing sample generator.
Specifically, too many training iterations easily lead to overfitting of the final model, while too few easily lead to underfitting. Therefore, in this embodiment, several groups of model parameters are selected from the interval where the model trains best and averaged, so that the generated target model parameters avoid, as far as possible, the errors present in any single group, and the finally generated remote sensing sample generator has better prediction precision.
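The averaging of step S640 is plain element-wise parameter averaging, sketched below. The dictionary-of-arrays "state dicts" are a stand-in here; real framework checkpoints (e.g. torch state dicts) would be handled the same way after conversion to arrays.

```python
import numpy as np

def average_checkpoints(state_dicts):
    """Average several saved groups of model parameters element-wise (cf. S640)."""
    keys = state_dicts[0].keys()
    return {k: np.mean([sd[k] for sd in state_dicts], axis=0) for k in keys}

# e.g. parameters saved after 970k, 980k, 990k, 1000k and 1010k iterations;
# the parameter name "conv.weight" and the values are illustrative only
checkpoints = [{"conv.weight": np.full((2, 2), float(i))}
               for i in (97, 98, 99, 100, 101)]
target_params = average_checkpoints(checkpoints)
```

Loading `target_params` back into the initial sample generator then yields the remote sensing sample generator of step S650.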
As another embodiment of the present invention, before S630 trains the initial sample generator with the first, second or third target training sample set, the method further comprises:
and S700, performing model optimization processing on the first initial sample generator to generate an initial sample generator. The model optimization process is used to adjust the structure of the first initial sample generator to improve the prediction accuracy of the generated initial sample generator.
In this embodiment, an existing DDPM diffusion model is optimized and improved, so that the generated initial sample generator can have higher prediction accuracy.
As another embodiment of the present invention, the first initial sample generator includes a DDPM diffusion model.
The model optimization process comprises the following steps:
s710, configuring a discriminator for the DDPM diffusion model, and generating an initial sample generator. The initial sample generator is a generative antagonism network. The discriminator is a discriminator built by a resnet network.
This embodiment turns the initial sample generator into a generative adversarial network by configuring a discriminator for the DDPM diffusion model. A generative adversarial network (GAN) is a deep learning model and one of the most promising methods in recent years for unsupervised learning on complex distributions. The framework comprises at least two modules: a generative model, here the DDPM diffusion model, and a discriminative model, i.e., the discriminator. A reasonably good output is produced by having the generative model and the discriminative model learn in a game against each other.
The loss value G_loss of the DDPM diffusion model satisfies:
G_loss = MSE(noise, f(x)) · (1 − λ) + λ · C(out_ones, D(fake)).
The loss value D_loss of the discriminator satisfies:
D_loss = C(out_ones, D(true)) + C(out_zeros, D(fake)).
Here C is the binary cross-entropy function and MSE is the mean squared error function; both are existing loss functions. noise is random Gaussian noise, and f(x) is the Gaussian noise fitted by the DDPM diffusion model. fake is the image matrix generated by the DDPM diffusion model, and D(fake) is the feature matrix of size 512 × 1 obtained after the discriminator extracts features from fake. true is the real image in the training sample, and D(true) is the feature matrix of size 512 × 1 obtained after the discriminator extracts features from true. out_zeros is an all-zeros matrix of the same size as D(fake), and out_ones is an all-ones matrix of the same size as D(fake). λ is the influence coefficient, λ ∈ [0, 1]; preferably, λ = 0.2.
By introducing the GAN architecture, this embodiment lets the discriminator automatically detect whether an image generated by the DDPM diffusion model is real: if the discriminator judges it real, the DDPM diffusion model is rewarded; if the discriminator judges it false, the DDPM diffusion model is penalized. Adding the term λ · C(out_ones, D(fake)) to G_loss allows the final loss value to be adjusted adaptively. Specifically, when the discriminator judges that the image generated by the DDPM diffusion model is close to the real one, λ · C(out_ones, D(fake)) is small, so the loss value G_loss is also small; when the discriminator judges that the generated image differs greatly from the real one, λ · C(out_ones, D(fake)) is large, so G_loss is also large.
In this way, the remote sensing sample generator finally produced by training, i.e., the trained DDPM diffusion model, can generate target remote sensing images that are closer to real images and more realistic.
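The two loss terms can be written out directly. In this sketch C (binary cross-entropy) and MSE (mean squared error) are implemented explicitly, and the D(·) outputs are assumed to be probabilities in (0, 1), e.g. after a sigmoid; the function names are chosen to match the formulas above.

```python
import numpy as np

def C(target, pred, eps=1e-7):
    """Binary cross-entropy over matrices, clipped for numerical safety."""
    p = np.clip(pred, eps, 1.0 - eps)
    return float(np.mean(-(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))))

def MSE(a, b):
    """Mean squared error over matrices."""
    return float(np.mean((a - b) ** 2))

def g_loss(noise, f_x, d_fake, lam=0.2):
    # G_loss = MSE(noise, f(x)) * (1 - lambda) + lambda * C(out_ones, D(fake))
    return MSE(noise, f_x) * (1.0 - lam) + lam * C(np.ones_like(d_fake), d_fake)

def d_loss(d_true, d_fake):
    # D_loss = C(out_ones, D(true)) + C(out_zeros, D(fake))
    return C(np.ones_like(d_true), d_true) + C(np.zeros_like(d_fake), d_fake)
```

With λ = 0 the generator loss reduces to the plain DDPM noise-fitting MSE, and the adversarial term grows exactly when the discriminator scores the generated image as fake, matching the reward/penalty behaviour described above.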
As another embodiment of the present invention, the first initial sample generator includes a DDPM diffusion model.
The model optimization process comprises the following steps:
s720, replacing the Unet network in the DDPM diffusion model with 3 parallel-arranged Unet networks, and generating an initial sample generator.
The first Unet network is used for processing noise data of the initial sample generator in the denoising iteration time of [0, n/3 ].
The second Unet network is used for processing noise data of which the number of denoising iterations belongs to (n/3, 2 n/3) of the initial sample generator.
The third Unet network is used for processing noise data of the initial sample generator in the (2 n/3, n) of denoising iteration times, wherein n is the total denoising iteration times corresponding to the initial sample generator when processing training samples each time.
Taking n = 300 as an example:
the first Unet network processes the noise data of the initial sample generator whose denoising iteration number lies in [0, 100];
the second Unet network processes the noise data whose denoising iteration number lies in (100, 200);
the third Unet network processes the noise data whose denoising iteration number lies in (200, 300).
In the actual generation of the target remote sensing image, the model in this embodiment proceeds as follows:
The initial sample is input into the first Unet network for denoising iteration. After the initial sample completes its 100th iteration, the iterated sample output at that point is taken as the first initial sample, which is then input into the second Unet network for further iteration. After the 200th iteration is completed, the output sample is taken as the second initial sample and input into the third Unet network for iteration. After the 300th iteration is completed, the output sample is taken as the third initial sample, which is the finally generated target remote sensing image.
In this embodiment, the backbone Unet network of the existing DDPM diffusion model is increased from one to three, and each Unet network is respectively responsible for the noise processing of the DDPM diffusion model in the early, middle, and late stages of iterative denoising. The inference process can thus be divided into three stages (initial, intermediate, and final), which are handled separately by the three Unet networks. Because the noise, the denoising proportion, and the degree of denoising differ between stages, setting up three UNet networks allows each stage to be processed in a targeted manner. This reduces the coupling between the UNet networks of different denoising stages and further refines the denoising process. Although the model becomes larger, the denoising effect at each stage is greatly improved, and the overall result is better than that of the original model.
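The staged inference procedure, in which the sample is handed from one Unet to the next at steps n/3 and 2n/3, can be sketched as follows. The loop and the stand-in callables are illustrative only; in the actual model each net would be a trained Unet that predicts and removes noise at step t.

```python
import numpy as np

def staged_denoise(x, unets, n=300):
    """Run an n-step iterative denoising loop, handing each step to the
    Unet responsible for its stage. `unets` is a list of 3 callables
    (x, t) -> x standing in for the trained networks."""
    for t in range(1, n + 1):
        if t <= n // 3:
            net = unets[0]      # early stage: steps 1 .. n/3
        elif t <= 2 * n // 3:
            net = unets[1]      # middle stage: steps n/3+1 .. 2n/3
        else:
            net = unets[2]      # late stage: steps 2n/3+1 .. n
        x = net(x, t)
    return x
```

With n = 300, each of the three networks handles exactly 100 consecutive denoising steps, matching the hand-off points (100th and 200th iterations) described above.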
Embodiments of the present invention also provide a non-transitory computer-readable storage medium that may be disposed in an electronic device to store at least one instruction or at least one program for implementing the above method embodiments, the at least one instruction or the at least one program being loaded and executed by a processor to implement the methods provided by the embodiments described above.
Embodiments of the present invention also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Embodiments of the present invention also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention described in the present specification when the program product is run on the electronic device.
While certain specific embodiments of the invention have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.

Claims (10)

1. A remote sensing image generation method based on style self-organization is characterized by comprising the following steps:
creating initial mask data; the initial mask data comprises a background mask and at least one target mask; each target object mask is provided with a plurality of configuration parameters, and the configuration parameters of different target object masks are different; the configuration parameters comprise the types and the number of the target object masks and the setting positions in the background mask; each target object mask comprises contour information of a corresponding target object;
acquiring preset control parameters; the control parameters comprise a target image size and a target iteration number A; A ∈ [15, 25];
inputting the initial mask data and the control parameters into a remote sensing sample generator to generate a target remote sensing image; the target remote sensing image comprises object images of the types corresponding to the target object masks and a scene background image corresponding to the background mask; the size of the target remote sensing image is the target image size, and the number of calculation iterations of the remote sensing sample generator in generating the target remote sensing image is the target iteration number; the target remote sensing image carries, for each target object mask, a class label of the corresponding generated object image.
2. The method according to claim 1, wherein the method further comprises:
acquiring an initial training sample set of an initial sample generator from an ISAID data set;
cutting each sample in the initial training sample set to generate a first initial training sample set; the first initial training sample set includes a plurality of cropped samples; the size of each cutting sample is a preset size;
and randomly turning over a plurality of cut samples in the first initial training sample set to generate a first target training sample set.
3. The method according to claim 2, wherein the preset size is 512dpi x 512dpi;
after generating the first set of target training samples, the method further comprises:
and reducing the size of each sample in the first target training sample set to 256dpi x 256dpi to generate a second target training sample set.
4. A method according to claim 3, wherein after generating the second set of target training samples, the method further comprises:
performing noise adding processing on each sample in the second target training sample set to generate a third target training sample set;
the noise adding process includes:
generating background noise in a background image region of the sample; the background noise obeys a Gaussian distribution with μ = 10 and σ² = 2, wherein μ is the mean of the Gaussian distribution and σ² is the variance of the Gaussian distribution.
5. The method according to claim 4, wherein the method further comprises:
training an initial sample generator using the first target training sample set, the second target training sample set, or the third target training sample set to generate multiple groups of model parameters corresponding to the initial sample generator; after every 10,000 training iterations, one group of model parameters corresponding to the initial sample generator is obtained;
obtaining multiple groups of model parameters from the training-iteration interval [950,000, 1,050,000] and performing mean-value calculation on them to generate target model parameters;
performing mean-value calculation on the model parameters obtained after 970,000, 980,000, 990,000, 1,000,000 and 1,010,000 training iterations respectively to generate the target model parameters;
and configuring the target model parameters into the initial sample generator to generate the remote sensing sample generator.
6. The method of claim 5, wherein prior to training the initial sample generator using the first or second or third target training sample set, the method further comprises:
model optimization processing is carried out on the first initial sample generator, and an initial sample generator is generated; the model optimization process is used for adjusting the structure of the first initial sample generator to improve the prediction accuracy of the generated initial sample generator.
7. The method of claim 6, wherein the first initial sample generator comprises a DDPM diffusion model;
the model optimization process comprises the following steps:
configuring a discriminator for the DDPM diffusion model, generating an initial sample generator; the initial sample generator is a generation type countermeasure network; the discriminator is a discriminator constructed by a resnet network;
wherein the loss value G_loss of the DDPM diffusion model satisfies:

G_loss = MSE(noise, f(x)) + λ × C(out_ones, D(fake))

the loss value D_loss of the discriminator satisfies:

D_loss = C(out_ones, D(true)) + C(out_zeros, D(fake))

wherein C is a binary cross-entropy function and MSE is a mean-square-error function; noise is random Gaussian noise; f(x) is the Gaussian noise fitted by the DDPM diffusion model; fake is the image matrix generated by the DDPM diffusion model; D(fake) is the feature matrix obtained after the discriminator performs feature extraction on fake, with a size of 512 × 1; true is the real image in the training sample; D(true) is the feature matrix obtained after the discriminator performs feature extraction on true, with a size of 512 × 1; out_zeros is an all-0 matrix of the same size as D(fake); out_ones is an all-1 matrix of the same size as D(fake); λ is the influence coefficient, λ ∈ [0, 1].
8. The method of claim 6, wherein the first initial sample generator comprises a DDPM diffusion model;
the model optimization process comprises the following steps:
replacing the Unet network in the DDPM diffusion model with 3 parallel-arranged Unet networks to generate an initial sample generator;
the first Unet network is used to process noise data for which the denoising iteration number of the initial sample generator lies in [0, n/3];
the second Unet network is used to process noise data for which the denoising iteration number lies in (n/3, 2n/3);
the third Unet network is used to process noise data for which the denoising iteration number lies in (2n/3, n), where n is the total number of denoising iterations performed by the initial sample generator each time it processes a training sample.
9. A non-transitory computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements a remote sensing image generation method based on style ad hoc as claimed in any one of claims 1 to 8.
10. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements a remote sensing image generation method based on style ad hoc as claimed in any one of claims 1 to 8.
CN202211642255.9A 2022-12-20 2022-12-20 Remote sensing image generation method, storage medium and device based on style self-organization Active CN116051683B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211642255.9A CN116051683B (en) 2022-12-20 2022-12-20 Remote sensing image generation method, storage medium and device based on style self-organization


Publications (2)

Publication Number Publication Date
CN116051683A CN116051683A (en) 2023-05-02
CN116051683B true CN116051683B (en) 2023-07-04

Family

ID=86117269

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211642255.9A Active CN116051683B (en) 2022-12-20 2022-12-20 Remote sensing image generation method, storage medium and device based on style self-organization

Country Status (1)

Country Link
CN (1) CN116051683B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116909750B (en) * 2023-07-26 2023-12-22 江苏中天吉奥信息技术股份有限公司 Image-based scene white film rapid production method
CN116777906B (en) * 2023-08-17 2023-11-14 常州微亿智造科技有限公司 Abnormality detection method and abnormality detection device in industrial detection

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112884791A (en) * 2021-02-02 2021-06-01 重庆市地理信息和遥感应用中心 Method for constructing large-scale remote sensing image semantic segmentation model training sample set
WO2021114832A1 (en) * 2020-05-28 2021-06-17 平安科技(深圳)有限公司 Sample image data enhancement method, apparatus, electronic device, and storage medium
CN114066718A (en) * 2021-10-14 2022-02-18 特斯联科技集团有限公司 Image style migration method and device, storage medium and terminal


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Intelligent Cloud Mask Method for Remote Sensing Images and System Implementation; Gao Xingyu; China Master's Theses Full-text Database, Engineering Science and Technology II; C028-302 *


Similar Documents

Publication Publication Date Title
CN109886121B (en) Human face key point positioning method for shielding robustness
CN116051683B (en) Remote sensing image generation method, storage medium and device based on style self-organization
CN109977918B (en) Target detection positioning optimization method based on unsupervised domain adaptation
CN109859190B (en) Target area detection method based on deep learning
CN110827213B (en) Super-resolution image restoration method based on generation type countermeasure network
CN112308860B (en) Earth observation image semantic segmentation method based on self-supervision learning
Zhang et al. NLDN: Non-local dehazing network for dense haze removal
CN108898145A (en) A kind of image well-marked target detection method of combination deep learning
CN111612008A (en) Image segmentation method based on convolution network
CN110059769B (en) Semantic segmentation method and system based on pixel rearrangement reconstruction and used for street view understanding
CN110443257B (en) Significance detection method based on active learning
CN112101364B (en) Semantic segmentation method based on parameter importance increment learning
CN112132145A (en) Image classification method and system based on model extended convolutional neural network
CN112861718A (en) Lightweight feature fusion crowd counting method and system
CN112580662A (en) Method and system for recognizing fish body direction based on image features
Wang et al. A feature-supervised generative adversarial network for environmental monitoring during hazy days
CN115063318A (en) Adaptive frequency-resolved low-illumination image enhancement method and related equipment
CN116563680A (en) Remote sensing image feature fusion method based on Gaussian mixture model and electronic equipment
CN111814693A (en) Marine ship identification method based on deep learning
CN113158860B (en) Deep learning-based multi-dimensional output face quality evaluation method and electronic equipment
CN115471831A (en) Image significance detection method based on text reinforcement learning
CN111310820A (en) Foundation meteorological cloud chart classification method based on cross validation depth CNN feature integration
CN110110651B (en) Method for identifying behaviors in video based on space-time importance and 3D CNN
CN110826563A (en) Finger vein segmentation method and device based on neural network and probability map model
CN111563462A (en) Image element detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant