CN114419195A - Image synthesis method and device based on relation embedding and storage medium - Google Patents

Image synthesis method and device based on relation embedding and storage medium

Info

Publication number
CN114419195A
Authority
CN
China
Prior art keywords
image
synthesis
background
foreground
relationship
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111457354.5A
Other languages
Chinese (zh)
Inventor
朱鹏飞 (Zhu Pengfei)
贾安 (Jia An)
汪廉杰 (Wang Lianjie)
刘洋 (Liu Yang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202111457354.5A
Publication of CN114419195A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/60 Editing figures and text; Combining figures or text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a method, an apparatus, and a storage medium for image synthesis based on relationship embedding. The method comprises the following steps: embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model; performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image; training the relationship-embedded image synthesis model to obtain an image synthesizer comprising a generator and a discriminator; training a composite image score classifier, based on a data set of composite images, to automatically score composite images; and performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model. The apparatus comprises a processor and a memory. The image synthesis relationship designed by the invention enables the foreground and background images to complete better image synthesis.

Description

Image synthesis method and device based on relation embedding and storage medium
Technical Field
The present invention relates to the field of image synthesis, and in particular, to a method, an apparatus, and a storage medium for image synthesis based on relationship embedding.
Background
In early work on image synthesis, researchers combined ideas from graphics and mathematics. The most classical approach is Poisson fusion, which introduces the image gradient domain through a general interpolation mechanism for solving the Poisson equation: when images are fused, color gradients replace color intensities, producing a more realistic blend. Later, as deep learning research developed, data shortage became a serious problem: models could not learn valuable features from limited data, and manual labeling was time-consuming and labor-intensive. Researchers therefore turned to deep learning methods for image synthesis, which alleviated these problems to a certain extent. GANs (generative adversarial networks) help solve the image generation problem: DCGAN (deep convolutional GAN) can generate images belonging to a specific class, and LAPGAN (Laplacian pyramid GAN) can generate images from coarse to fine using a Laplacian pyramid.
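The Poisson-fusion approach described above is available directly in OpenCV as seamlessClone; the following is a minimal sketch, where the file names and the whole-image mask are illustrative assumptions:

```python
import cv2
import numpy as np

# Load a foreground object image and a background scene (paths are illustrative).
fg = cv2.imread("foreground.jpg")
bg = cv2.imread("background.jpg")

# Binary mask marking the region of fg to blend; for demonstration we simply
# take the whole foreground rectangle.
mask = 255 * np.ones(fg.shape[:2], dtype=np.uint8)

# Paste the foreground at the center of the background. seamlessClone solves
# a Poisson equation in the gradient domain, so color gradients rather than
# raw intensities are matched across the seam.
center = (bg.shape[1] // 2, bg.shape[0] // 2)
composite = cv2.seamlessClone(fg, bg, mask, center, cv2.NORMAL_CLONE)
cv2.imwrite("composite.jpg", composite)
```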
GANs do help a model learn valuable representations, but a GAN alone cannot synthesize two independent images. Researchers have proposed a simple cut-and-paste method from the data synthesis perspective: objects are extracted under box-level labeling and pasted into plausible scenes to obtain new, more realistic synthetic data, but realism alone cannot guarantee that the data will enhance the performance of the trained model. Subsequent research therefore combined image synthesis with adversarial methods, proposing a new image synthesis network to learn synthesized images and then training a synthesizer with this method to generate useful synthetic samples, thereby helping to improve the performance of the target network. At present, little image synthesis work relies entirely on copy-and-paste. People prefer to use virtual engines to generate large amounts of synthetic data, and some work builds synthetic data sets on virtual image scenes. In addition, synthetic data helps people better understand the real world: composite images may be divided into realistic and unrealistic ones, and realistic composite regions can be re-colored to facilitate a better understanding of natural color statistics and color perception. Based on prior knowledge of the real world, one finds that the foreground of a composite image is often incompatible with its background, and that the data distributions of synthetic and real data sets differ.
In summary, the field of image synthesis currently faces the following problems:
1. there is a data distribution difference between synthetic data and real data, which increases the difficulty of model learning;
2. the relation between the synthetic foreground and the background cannot be regularized, and deep learning has not described the foreground-background relation well;
3. designing specific 3D models is costly, making data synthesized with 3D models expensive;
4. the difference between synthetic data and real data sets is difficult to evaluate, so the effectiveness of a synthesis model cannot be quantified after data are synthesized;
5. massive numbers of images cannot be synthesized automatically with quality evaluation completed.
Disclosure of Invention
The invention provides a method, an apparatus, and a storage medium for image synthesis based on relationship embedding. The image synthesis relationship designed by the invention enables the foreground and background images to complete better image synthesis; the consistency of the composite images is improved through the training and learning of the relationship-embedded image synthesis model; and the image synthesis of the present invention can be used to expand useful data sets as desired, as described in detail below:
In a first aspect, a method for relationship-embedding-based image synthesis comprises the following steps:
embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
training the relationship-embedded image synthesis model to obtain an image synthesizer, wherein the image synthesizer comprises: a generator and a discriminator;
training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
and performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
In one embodiment, the reciprocal relationship is:
RSI = Relation(B, B + F)
where B denotes the restored background image, F the foreground image required for synthesis, and Relation the relationship between them.
The input of the image synthesis model comprises a background image x and a background image y containing a foreground image, where x, y ∈ R^{H×W×C}; R is the feature space, H the image height, W the width, and C the number of channels.
In one embodiment, the image synthesis model comprises a generator and a discriminator.
The generator is composed of a background-image appearance encoder, a foreground-image object encoder, and decoders (G_x, G_y).
The appearance encoder encodes the background image to obtain the background appearance feature vector; the object feature vector of the foreground image is obtained by the object encoder in the same way. The decoders (G_x, G_y) combine these feature vectors to generate new images: G_x generates the restored image and G_y generates the composite image.
(D_x, D_y) are two discriminators for the restored image and the composite image, respectively: D_x distinguishes the decoded restored image from the natural image, and D_y distinguishes the decoded composite image from the natural image.
Preferably, the image synthesis work includes a repair route mode and a synthesis route mode.
The repair route mode: extract the target position feature f_1 from the target background image B; extract the target background feature B_2 from the natural image F+B; take f_1 and B_2 as the generator input to generate the f_1+B_2 restored image.
The synthesis route mode: extract the background feature B_1 from the target background image B; extract the foreground feature f_2 from the natural image F+B; take f_2 and B_1 as the generator input to generate the f_2+B_1 composite image.
Further, the method further comprises: evaluating the quality of image synthesis based on an automatic synthesis score index and an ideal synthesis score index.
The automatic synthesis score is produced by a classifier trained on copy-paste composite images manually annotated with appearance, size, and position scores; once trained, it scores composite images automatically.
The proportionality coefficients of the ideal synthesis score all take the maximum value of 1.
In a second aspect, an apparatus for relationship embedding based image synthesis, the apparatus comprising:
the embedding module is used for embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
the learning module is used for performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
an obtaining module, configured to train the relationship-embedded image synthesis model to obtain an image synthesizer, where the image synthesizer includes: a generator and a discriminator;
the training module is used for training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
and the scoring module is used for performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
In a third aspect, an apparatus for image synthesis based on relationship embedding comprises: a processor and a memory, the memory having stored therein program instructions, the processor calling the program instructions stored in the memory to cause the apparatus to perform the method steps of any one of the first aspect.
In a fourth aspect, a computer-readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to carry out the method steps of any one of the first aspect.
The technical scheme provided by the invention has the following beneficial effects:
1. the method can better complete both image synthesis work and image restoration work, and demonstrates that image synthesis and image restoration are mutually reversible;
2. the method can learn the position and the foreground proportion of image synthesis, improving the consistency of the appearance, size, and position characteristics of the composite image;
3. the method can be applied in deep learning as a new data enhancement method; experiments prove that it is suitable for application and popularization in the field of image synthesis.
Drawings
FIG. 1 is a diagram illustrating the relationship definition for relationship-embedding-based image synthesis;
FIG. 2 is a diagram of the relationship-embedded image synthesis model;
FIG. 3 is a logic flow diagram of the relationship-embedded image synthesis algorithm;
FIG. 4 is a flow diagram of image synthesis work using the relationship-embedded image synthesis model;
FIG. 5 is an example graph of evaluation scoring for relationship-embedding-based image synthesis;
FIG. 6 is an overall flow diagram of relationship-embedding-based image synthesis;
FIG. 7 is a schematic diagram of an apparatus for image synthesis based on relationship embedding;
FIG. 8 is another structural diagram of an apparatus for image synthesis based on relationship embedding.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in further detail below.
Example 1
An embodiment of the present invention provides a method for image synthesis based on relationship embedding, and referring to fig. 1 to 4, the method includes the following steps:
101: embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
102: performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
103: training the relationship-embedded image synthesis model to obtain an image synthesizer, wherein the image synthesizer includes: a generator and a discriminator;
104: training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
105: performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
The input of the image synthesis model in step 102 includes: a background image x and a background image y containing a foreground image, where x, y ∈ R^{H×W×C}; R is the feature space, H the image height, W the width, and C the number of channels.
In one embodiment, the image synthesis model in steps 101 and 102 comprises a generator and a discriminator.
The generator is composed of a background-image appearance encoder, a foreground-image object encoder, and decoders (G_x, G_y).
The appearance encoder encodes the background image to obtain the background appearance feature vector; the object feature vector of the foreground image is obtained by the object encoder in the same way. The decoders (G_x, G_y) combine these feature vectors to generate new images: G_x generates the restored image and G_y generates the composite image.
(D_x, D_y) are two discriminators for the restored image and the composite image, respectively: D_x distinguishes the decoded restored image from the natural image, and D_y distinguishes the decoded composite image from the natural image.
In one embodiment, the image synthesis work of step 105 comprises a repair route and a synthesis route.
The repair route mode: extract the target position feature f_1 from the target background image B; extract the target background feature B_2 from the natural image F+B; take f_1 and B_2 as the generator input to generate the f_1+B_2 restored image.
The synthesis route mode: extract the background feature B_1 from the target background image B; extract the foreground feature f_2 from the natural image F+B; take f_2 and B_1 as the generator input to generate the f_2+B_1 composite image.
In one embodiment, the method further comprises: evaluating the quality of image synthesis based on an automatic synthesis score index and an ideal synthesis score index.
The automatic synthesis score is produced by a classifier trained on copy-paste composite images manually annotated with appearance, size, and position scores; once trained, it scores composite images automatically. The proportionality coefficients of the ideal synthesis score all take the maximum value of 1.
In summary, through steps 101 to 105, the embodiment of the present invention achieves better synthesis of the foreground and background images, and improves the consistency of the composite images through the training and learning of the relationship-embedded image synthesis model.
Example 2
The scheme of example 1 is further described below with reference to specific calculation formulas and examples, which are described in detail below:
First, the relationship-embedded image synthesis model
1. Definition of relationship embedding
In the embodiment of the present invention, relationship embedding is defined over the relationship between image synthesis and image restoration, abbreviated RSI (Relationship between Image Synthesis and Inpainting). To a certain extent, image synthesis and restoration can be regarded as a reciprocal relationship, and the RSI is embedded into the synthesis model of the embodiment of the present invention.
Thus, the learning objective of the synthesis model is to learn the reciprocal relationship between image synthesis and image restoration, so that the model learns information such as the position and proportion of the synthesized foreground, improving as much as possible the consistency of the composite image's appearance, the size of the foreground image, and its placement in the background image.
The RSI is illustrated in fig. 1, taking a face without glasses and a face with glasses as an example: the upper row shows the synthesis process in RSI and the lower row shows the repair process, and together the two processes form the RSI relationship, formalized as:
RSI = Relation(B, B + F)    (1)
where B denotes the background image restored by inpainting, F the foreground image required by synthesis, and Relation the relationship between them.
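The patent does not prescribe how the (B, B+F) pairs of equation (1) are obtained in practice. One plausible construction, sketched here purely as an assumption, recovers B from a natural image by classical inpainting of the annotated foreground region (make_rsi_pair is a hypothetical helper):

```python
import cv2

def make_rsi_pair(natural_img, fg_mask):
    """Construct an RSI training pair (B, B+F) from a natural image.

    natural_img: the B+F image (background plus foreground object), BGR uint8.
    fg_mask:     uint8 mask with 255 inside the foreground object.
    Returns (B, B_plus_F): an inpainted approximation of the pure background
    and the original natural image.
    """
    # Erase the foreground region and fill it by classical inpainting,
    # approximating the restored background B of equation (1).
    background = cv2.inpaint(natural_img, fg_mask, inpaintRadius=3,
                             flags=cv2.INPAINT_TELEA)
    return background, natural_img
```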
2. The relationship-embedded image synthesis model
The framework of the relationship-embedded image synthesis model of the embodiment of the present invention is designed based on the VAE (Variational Auto-Encoder) and the GAN (Generative Adversarial Network). As shown in fig. 2, the input to the model is two images: a background image, defined as x, and a background image containing a foreground image, defined as y, where x, y ∈ R^{H×W×C}; R is the feature space, H the image height, W the width, and C the number of channels. For convenience of description, the relationship-embedded image synthesis model is referred to as ISRE (Image Synthesizer Based on Relational Embedding); the details of the model are described below.
2.1 Generator design of ISRE
As shown in fig. 2, the generator is constructed from encoders and decoders: a background-image appearance encoder, a foreground-image object encoder, and decoders (G_x, G_y). The appearance encoder encodes the background image to obtain the background appearance feature vector; the object feature vector of the foreground image is obtained by the object encoder in the same way. The decoders (G_x, G_y) then combine these feature vectors to generate new images: G_x generates the restored image and G_y generates the composite image.
It is noted that the image x ∈ X represents a natural background image not containing the foreground of the composite image, the foreground object o ∈ O represents the foreground image of the target composite image, and the image y ∈ Y represents a natural image containing the foreground image of the composite image.
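A minimal PyTorch sketch of this generator structure follows; the layer counts, channel widths, and normalization choices are assumptions for illustration, since the patent fixes only the roles of the encoders and decoders:

```python
import torch
import torch.nn as nn

def down(cin, cout):
    # Stride-2 convolution block used by both encoders.
    return nn.Sequential(nn.Conv2d(cin, cout, 4, stride=2, padding=1),
                         nn.InstanceNorm2d(cout), nn.ReLU(inplace=True))

def up(cin, cout):
    # Stride-2 transposed convolution block used by the decoders.
    return nn.Sequential(nn.ConvTranspose2d(cin, cout, 4, stride=2, padding=1),
                         nn.InstanceNorm2d(cout), nn.ReLU(inplace=True))

class Encoder(nn.Module):
    """Shared shape for the background-appearance and foreground-object encoders."""
    def __init__(self, in_ch=3, feat_ch=256):
        super().__init__()
        self.net = nn.Sequential(down(in_ch, 64), down(64, 128), down(128, feat_ch))

    def forward(self, img):
        return self.net(img)  # spatial feature map serving as the feature vector

class Decoder(nn.Module):
    """G_x / G_y: fuse a background code and a foreground code into an image."""
    def __init__(self, feat_ch=256, out_ch=3):
        super().__init__()
        self.net = nn.Sequential(up(2 * feat_ch, 128), up(128, 64),
                                 nn.ConvTranspose2d(64, out_ch, 4, 2, 1), nn.Tanh())

    def forward(self, bg_code, fg_code):
        # Concatenate the two codes along the channel axis and decode an image.
        return self.net(torch.cat([bg_code, fg_code], dim=1))
```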
2.2 Discriminator design of ISRE
D in fig. 2 denotes the discriminators, where (D_x, D_y) are two discriminators for the restored image and the composite image, respectively. D_x aims to distinguish the restored image decoded by G_x from the natural image, where G_x takes two inputs: a parameter representing the appearance coding information of the background image and a parameter representing the coding information of the foreground-image object. D_y aims to distinguish the composite image decoded by G_y from the natural image, where G_y takes two inputs: a parameter representing the background appearance coding information of the foreground image and a parameter representing the position feature information of the foreground object obtained from the background image.
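A matching sketch of the discriminators, continuing the code above; the PatchGAN-style layout is an assumption, as the patent does not fix the discriminator architecture:

```python
import torch.nn as nn

class Discriminator(nn.Module):
    """D_x / D_y: classify an image as natural (1) or generated (0) per patch."""
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 1, 4, stride=1, padding=1), nn.Sigmoid())

    def forward(self, img):
        return self.net(img)  # map of per-patch probabilities that img is natural
```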
Second, the relationship-embedded image synthesis algorithm
The overall flow of the algorithm designed by the embodiment of the present invention is shown in fig. 3, where B denotes a single background image and F+B denotes a foreground-plus-background natural image. Based on the RSI definition, two learning routes are designed: the upper route is the repair route of the algorithm and the lower route is the synthesis route. In detail:
1. Repair route algorithm
The first step: extract the target position feature f_1 from the target background image B.
The second step: extract the target background feature B_2 from the natural image F+B.
The third step: take f_1 and B_2 as the generator input to generate the f_1+B_2 restored image.
2. Synthesis route algorithm
The first step: extract the background feature B_1 from the target background image B.
The second step: extract the foreground feature f_2 from the natural image F+B.
The third step: take f_2 and B_1 as the generator input to generate the f_2+B_1 composite image.
Finally, an object of the embodiments of the present invention is to synthesize massive numbers of f_2+B_1 images, which are then applied as a data set to specific engineering scenes.
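Reusing the Encoder and Decoder sketches above, one training iteration over the two routes could look like the following; treating the object encoder as the source of position features is an assumption for illustration:

```python
enc_bg, enc_fg = Encoder(), Encoder()   # appearance encoder, object/position encoder
G_x, G_y = Decoder(), Decoder()         # repair decoder, synthesis decoder

def two_routes(B, FB):
    """B: pure background image batch; FB: natural foreground+background batch."""
    # Repair route: position feature f1 from B, background feature B2 from F+B.
    f1, B2 = enc_fg(B), enc_bg(FB)
    repaired = G_x(B2, f1)              # the f1+B2 restored image

    # Synthesis route: background feature B1 from B, foreground feature f2 from F+B.
    B1, f2 = enc_bg(B), enc_fg(FB)
    composite = G_y(B1, f2)             # the f2+B1 composite image
    return repaired, composite
```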
3. Design of the loss functions
To make the model learn better and make the two learning routes effectively learn the "synthesis-restoration" relationship, the embodiment of the present invention designs five loss functions to constrain the target model, detailed as follows:
1) Loss functions associated with f_1+B_2
Two loss functions constrain the f_1+B_2 synthesis effect: loss_1 constrains the foreground features of the f_1+B_2 composite image, and loss_2 constrains its background features (equations (2) and (3)).
2) Loss functions associated with f_2+B_1
Two loss functions constrain the f_2+B_1 synthesis effect: loss_3 constrains the foreground features of the f_2+B_1 composite image, and loss_4 constrains its background features (equations (4) and (5)).
3) Generation adversarial loss
To encourage the generated composite images to be indistinguishable from natural images, the embodiment of the present invention employs a generation adversarial loss, where G_x and G_y attempt to generate realistic images while D_x and D_y attempt to distinguish natural images from generated ones. The generation adversarial loss is defined as:
L_adv = E[log D_x(x)] + E{log[1 - D_x(G_x(B_y, F_x))]} + E[log D_y(y)] + E{log[1 - D_y(G_y(B_x, F_y))]}    (6)
where B_y denotes the background features of y and F_x the foreground features of x; correspondingly, B_x denotes the background features of x and F_y the foreground features of y; D_x(x) and D_y(y) denote the discriminator outputs used to distinguish composite images from natural images, and E denotes mathematical expectation.
4) Target learning function
The above loss functions are used jointly when training the generator and the discriminator, so the embodiment of the present invention defines the complete target learning function as:
L = α(loss_1 + loss_2) + β(loss_3 + loss_4) + L_adv    (7)
where α is the f_1+B_2 composite-image loss weight and β is the f_2+B_1 composite-image loss weight, both taking values between 0 and 1.
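A sketch of this objective follows; since the formulas of loss_1 to loss_4 appear only as equation images, they are assumed here to be masked L1 reconstruction terms, equation (6) is evaluated in its literal log form, and fg_mask_x / fg_mask_y are hypothetical binary foreground masks:

```python
import torch
import torch.nn.functional as F_nn

def adversarial_value(D_x, D_y, x, y, repaired, composite, eps=1e-8):
    """Literal value of equation (6); the discriminators maximize it and the
    generators minimize the two fake terms."""
    return (torch.log(D_x(x) + eps).mean()
            + torch.log(1 - D_x(repaired) + eps).mean()
            + torch.log(D_y(y) + eps).mean()
            + torch.log(1 - D_y(composite) + eps).mean())

def total_loss(x, y, repaired, composite, D_x, D_y,
               fg_mask_x, fg_mask_y, alpha=0.5, beta=0.5):
    """Complete target learning function of equation (7), under the stated
    assumption that loss_1..loss_4 are masked L1 reconstruction terms."""
    # loss_1 / loss_2: foreground and background terms of the f1+B2 image.
    loss1 = F_nn.l1_loss(repaired * fg_mask_x, x * fg_mask_x)
    loss2 = F_nn.l1_loss(repaired * (1 - fg_mask_x), x * (1 - fg_mask_x))
    # loss_3 / loss_4: foreground and background terms of the f2+B1 image.
    loss3 = F_nn.l1_loss(composite * fg_mask_y, y * fg_mask_y)
    loss4 = F_nn.l1_loss(composite * (1 - fg_mask_y), y * (1 - fg_mask_y))
    l_adv = adversarial_value(D_x, D_y, x, y, repaired, composite)
    return alpha * (loss1 + loss2) + beta * (loss3 + loss4) + l_adv
```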
5) Image synthesis using the ISRE model
After the ISRE model is trained through the above steps, it can be used for image synthesis. The process is shown in fig. 4, taking a street-view data set as an example: after training the ISRE model on the street-view data, a foreground image (such as the car in fig. 4) and a background image (such as the street scene in fig. 4) are input into the ISRE model, the synthesis of the two is completed inside the model, and the output is the expected composite image. Likewise, the algorithm of the embodiment of the present invention can be trained on other data sets and used as a synthesis model for image synthesis, thereby expanding data sets and alleviating the problem of insufficient data in deep learning.
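After training, producing one composite sample needs only the synthesis route; continuing the sketches above (load_image is a hypothetical loader returning a normalized image tensor batch):

```python
street = load_image("street.jpg")   # background image
car = load_image("car.jpg")         # image containing the foreground object
with torch.no_grad():
    sample = G_y(enc_bg(street), enc_fg(car))  # the expected composite image
```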
Third, design of the new evaluation indexes
This embodiment provides two general evaluation indexes for image synthesis. According to our research, these are the first general evaluation indexes in the field of image synthesis, and they are therefore among the important contributions of the embodiment of the present invention. Details follow:
1. Automatic composite score ACS
The automatic composite score ACS (Automatic Composite Score) aims to evaluate the performance of the automatic synthesis strategy proposed by the relationship-embedded image synthesis algorithm. Let C_i denote the synthesis result of the i-th composite image. The ACS is defined as follows:
ACS = λ_1 a(C_i) + λ_2 s(C_i) + λ_3 p(C_i)    (8)
where a, s, p denote the composite scores of appearance, size, and position, respectively; {λ_1, λ_2, λ_3} are the proportional coefficients corresponding to a, s, p, taking values between 0 and 1 and representing each term's significance in the score. a, s, p are predicted by a Composite Image Score Classifier (CISC), which is obtained by training, specifically:
A batch of images is synthesized in advance by copy-paste, and these images are then scored manually; each image's score comprises three parts: appearance, size, and position. After the classifier is trained, inputting a composite image yields its three scores a, s, p, each between 0 and 1, with values closer to 1 indicating a better synthesis effect. That is:
{a, s, p} = CISC(C_i)    (9)
2. Ideal composite score ICS
The ideal composite score ICS (Ideal Composite Score) evaluates the maximum performance a conventional image synthesis algorithm can achieve under an ideal synthesis strategy. It can be regarded as the upper bound of the automatic composite score, i.e., the case where the proportional coefficients all take the maximum value of 1, and is intended to stimulate relationship-embedded image synthesis algorithms to propose better and more effective automatic synthesis strategies. Let C_i denote the synthesis result of the i-th composite image. The ICS is defined as follows:
ICS = a(C_i) + s(C_i) + p(C_i)    (10)
As shown in fig. 5, after the ISRE model is trained, test images are scored to obtain the a, s, and p scores of each composite image. Fig. 5 illustrates the ideal composite score; the automatic composite score additionally multiplies each term by its coefficient weight.
In summary, the embodiment of the present invention is an image synthesis method based on relationship embedding; the overall flow is shown in fig. 6. First, the ISRE model designed in this embodiment is obtained through training; then the model is used to complete the image synthesis work; finally, the quality of the composite images is evaluated.
Example 3
An apparatus for image composition based on relationship embedding, referring to fig. 7, the apparatus comprising:
the embedding module 1 is used for embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
the learning module 2 is used for performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
an obtaining module 3, configured to train the relationship-embedded image synthesis model to obtain an image synthesizer, where the image synthesizer includes: a generator and a discriminator;
the training module 4 is used for training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
and the scoring module 5 is used for performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
Wherein the image synthesis model comprises a generator and a discriminator.
The generator is composed of a background-image appearance encoder, a foreground-image object encoder, and decoders (G_x, G_y).
The appearance encoder encodes the background image to obtain the background appearance feature vector; the object feature vector of the foreground image is obtained by the object encoder in the same way. The decoders (G_x, G_y) combine these feature vectors to generate new images: G_x generates the restored image and G_y generates the composite image.
(D_x, D_y) are two discriminators for the restored image and the composite image, respectively: D_x distinguishes the decoded restored image from the natural image, and D_y distinguishes the decoded composite image from the natural image.
In summary, through the above modules, the embodiment of the present invention achieves better synthesis of the foreground and background images, and improves the consistency of the composite images through the training and learning of the relationship-embedded image synthesis model.
Example 4
An apparatus for image composition based on relationship embedding, referring to fig. 8, the apparatus comprising: a processor 6 and a memory 7, the memory 7 having stored therein program instructions, the processor 6 calling the program instructions stored in the memory 7 to cause the apparatus to perform the following method steps in embodiment 1:
embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
training the relationship-embedded image synthesis model to obtain an image synthesizer, wherein the image synthesizer comprises: a generator and a discriminator;
training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
and performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
In one embodiment, the reciprocal relationship is:
RSI = Relation(B, B + F)
where B denotes the restored background image, F the foreground image required for synthesis, and Relation the relationship between them.
The input of the image synthesis model comprises a background image x and a background image y containing a foreground image, where x, y ∈ R^{H×W×C}; R is the feature space, H the image height, W the width, and C the number of channels.
In one embodiment, the image synthesis model includes a generator and a discriminator.
The generator is composed of a background-image appearance encoder, a foreground-image object encoder, and decoders (G_x, G_y).
The appearance encoder encodes the background image to obtain the background appearance feature vector; the object feature vector of the foreground image is obtained by the object encoder in the same way. The decoders (G_x, G_y) combine these feature vectors to generate new images: G_x generates the restored image and G_y generates the composite image.
(D_x, D_y) are two discriminators for the restored image and the composite image, respectively: D_x distinguishes the decoded restored image from the natural image, and D_y distinguishes the decoded composite image from the natural image.
Preferably, the image synthesis work comprises a repair route and a synthesis route.
The repair route mode: extract the target position feature f_1 from the target background image B; extract the target background feature B_2 from the natural image F+B; take f_1 and B_2 as the generator input to generate the f_1+B_2 restored image.
The synthesis route mode: extract the background feature B_1 from the target background image B; extract the foreground feature f_2 from the natural image F+B; take f_2 and B_1 as the generator input to generate the f_2+B_1 composite image.
Further, the method also includes: evaluating the quality of image synthesis based on an automatic synthesis score index and an ideal synthesis score index.
The automatic synthesis score is produced by a classifier trained on copy-paste composite images manually annotated with appearance, size, and position scores; once trained, it scores composite images automatically.
The proportionality coefficients of the ideal synthesis score all take the maximum value of 1.
In summary, through the processor and the memory, the embodiment of the present invention achieves better synthesis of the foreground and background images, and improves the consistency of the composite images through the training and learning of the relationship-embedded image synthesis model.
It should be noted that the device description in the above embodiments corresponds to the method description in the embodiments, and the embodiments of the present invention are not described herein again.
The processor 6 and the memory 7 may be implemented by devices having computing capability, such as a computer, a single-chip microcomputer, or a microcontroller; the embodiment of the present invention does not limit the specific choice, which is made according to the needs of the practical application.
The memory 7 and the processor 6 transmit data signals through the bus 8, which is not described in detail in the embodiment of the present invention.
Example 5
Based on the same inventive concept, an embodiment of the present invention further provides a computer-readable storage medium, where the storage medium includes a stored program, and when the program runs, the apparatus on which the storage medium is located is controlled to execute the method steps in the foregoing embodiments.
The computer readable storage medium includes, but is not limited to, flash memory, hard disk, solid state disk, and the like.
It should be noted that the descriptions of the readable storage medium in the above embodiments correspond to the descriptions of the method in the embodiments, and the descriptions of the embodiments of the present invention are not repeated here.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions according to the embodiments of the invention are brought about in whole or in part when the computer program instructions are loaded and executed on a computer.
The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on or transmitted over a computer-readable storage medium. The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium or a semiconductor medium, etc.
In the embodiment of the present invention, except for the specific description of the model of each device, the model of other devices is not limited, as long as the device can perform the above functions.
Those skilled in the art will appreciate that the drawings are only schematic illustrations of preferred embodiments, and the above-described embodiments of the present invention are merely provided for description and do not represent the merits of the embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A method for relationship-embedding-based image synthesis, the method comprising the steps of:
embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
training the relationship-embedded image synthesis model to obtain an image synthesizer, wherein the image synthesizer comprises: a generator and a discriminator;
training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
and performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
2. The method of claim 1, wherein the reciprocal relationship is:
RSI = Relation(B, B + F)
where B denotes the restored background image, F the foreground image required for synthesis, and Relation the relationship between them.
3. The method of claim 1, wherein the input of the image synthesis model comprises: a background image x and a background image y containing a foreground image, where x, y ∈ R^{H×W×C}; R is the feature space, H the image height, W the width, and C the number of channels.
4. The method of claim 2, wherein the image synthesis model comprises a generator and a discriminator,
the generator being composed of a background-image appearance encoder, a foreground-image object encoder, and decoders (G_x, G_y);
the appearance encoder encoding the background image to obtain the background appearance feature vector, the object feature vector of the foreground image being obtained by the object encoder in the same way, and the decoders (G_x, G_y) combining these feature vectors to generate new images, G_x generating the restored image and G_y generating the composite image;
(D_x, D_y) being two discriminators for the restored image and the composite image, respectively, D_x distinguishing the decoded restored image from the natural image, and D_y distinguishing the decoded composite image from the natural image.
5. The method of claim 4, wherein the image synthesis work comprises a repair route mode and a synthesis route mode:
the repair route mode: extracting the target position feature f_1 from the target background image B; extracting the target background feature B_2 from the natural image F+B; taking f_1 and B_2 as the generator input to generate the f_1+B_2 restored image;
the synthesis route mode: extracting the background feature B_1 from the target background image B; extracting the foreground feature f_2 from the natural image F+B; taking f_2 and B_1 as the generator input to generate the f_2+B_1 composite image.
6. The method of claim 5, wherein the loss functions of the image synthesis model are:
1) loss functions associated with f_1+B_2:
loss_1 constrains the foreground features of the f_1+B_2 composite image, and loss_2 constrains the background features of the f_1+B_2 composite image;
2) loss functions associated with f_2+B_1:
loss_3 constrains the foreground features of the f_2+B_1 composite image, and loss_4 constrains the background features of the f_2+B_1 composite image;
3) a generation adversarial loss, in which G_x and G_y are used to generate realistic synthetic images and D_x and D_y are used to distinguish natural images from generated composite images:
L_adv = E[log D_x(x)] + E{log[1 - D_x(G_x(B_y, F_x))]} + E[log D_y(y)] + E{log[1 - D_y(G_y(B_x, F_y))]}
where B_y denotes the background features of y and F_x the foreground features of x; B_x denotes the background features of x and F_y the foreground features of y; D_x(x) and D_y(y) denote the discriminator outputs used to distinguish composite images from natural images, and E denotes mathematical expectation.
7. The method of image synthesis based on relationship embedding of claim 1, wherein the method further comprises: evaluating the quality of image synthesis based on an automatic synthesis score index and an ideal synthesis score index;
the automatic synthesis score being produced by a classifier trained on copy-paste composite images manually annotated with appearance, size, and position scores, the trained classifier scoring composite images automatically;
the proportionality coefficients of the ideal synthesis score all taking the maximum value of 1.
8. An apparatus for image composition based on relationship embedding, the apparatus comprising:
the embedding module is used for embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
the learning module is used for performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
an obtaining module, configured to train the relationship-embedded image synthesis model to obtain an image synthesizer, where the image synthesizer includes: a generator and a discriminator;
the training module is used for training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
and the scoring module is used for performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
9. An apparatus for image composition based on relationship embedding, the apparatus comprising: a processor and a memory, the memory having stored therein program instructions, the processor calling upon the program instructions stored in the memory to cause the apparatus to perform the method steps of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to carry out the method steps of any of claims 1-7.
CN202111457354.5A 2021-12-01 2021-12-01 Image synthesis method and device based on relation embedding and storage medium Pending CN114419195A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111457354.5A CN114419195A (en) 2021-12-01 2021-12-01 Image synthesis method and device based on relation embedding and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111457354.5A CN114419195A (en) 2021-12-01 2021-12-01 Image synthesis method and device based on relation embedding and storage medium

Publications (1)

Publication Number Publication Date
CN114419195A true CN114419195A (en) 2022-04-29

Family

ID=81264669

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111457354.5A Pending CN114419195A (en) 2021-12-01 2021-12-01 Image synthesis method and device based on relation embedding and storage medium

Country Status (1)

Country Link
CN (1) CN114419195A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117333495A (en) * 2023-12-01 2024-01-02 浙江口碑网络技术有限公司 Image detection method, device, equipment and storage medium
CN117333495B (en) * 2023-12-01 2024-03-19 浙江口碑网络技术有限公司 Image detection method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110310221B (en) Multi-domain image style migration method based on generation countermeasure network
JP6395158B2 (en) How to semantically label acquired images of a scene
CN109255831A (en) The method that single-view face three-dimensional reconstruction and texture based on multi-task learning generate
CN110728219A (en) 3D face generation method based on multi-column multi-scale graph convolution neural network
CN112489164B (en) Image coloring method based on improved depth separable convolutional neural network
Van Hoorick Image outpainting and harmonization using generative adversarial networks
CN113255813A (en) Multi-style image generation method based on feature fusion
CN112884758B (en) Defect insulator sample generation method and system based on style migration method
CN114463492B (en) Self-adaptive channel attention three-dimensional reconstruction method based on deep learning
CN110852935A (en) Image processing method for human face image changing with age
CN115049556A (en) StyleGAN-based face image restoration method
CN112819951A (en) Three-dimensional human body reconstruction method with shielding function based on depth map restoration
Zhu et al. Label-guided generative adversarial network for realistic image synthesis
CN113781324A (en) Old photo repairing method
CN113379715A (en) Underwater image enhancement and data set true value image acquisition method
Shi et al. Improving 3d-aware image synthesis with a geometry-aware discriminator
CN114419195A (en) Image synthesis method and device based on relation embedding and storage medium
CN108924528A (en) A kind of binocular stylization real-time rendering method based on deep learning
CN110097615B (en) Stylized and de-stylized artistic word editing method and system
CN114494387A (en) Data set network generation model and fog map generation method
Hu et al. Image style transfer based on generative adversarial network
CN116523985B (en) Structure and texture feature guided double-encoder image restoration method
CN117078556A (en) Water area self-adaptive underwater image enhancement method
CN116863053A (en) Point cloud rendering enhancement method based on knowledge distillation
CN110197226A (en) A kind of unsupervised image interpretation method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination