CN114419195A - Image synthesis method and device based on relation embedding and storage medium - Google Patents

Image synthesis method and device based on relation embedding and storage medium

Info

Publication number
CN114419195A
Authority
CN
China
Prior art keywords
image
synthesis
background
foreground
relationship
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111457354.5A
Other languages
Chinese (zh)
Inventor
朱鹏飞 (Zhu Pengfei)
贾安 (Jia An)
汪廉杰 (Wang Lianjie)
刘洋 (Liu Yang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202111457354.5A
Publication of CN114419195A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/60 Editing figures and text; Combining figures or text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a method, an apparatus, and a storage medium for image synthesis based on relationship embedding. The method comprises the following steps: embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model; performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image; training the relationship-embedded image synthesis model to obtain an image synthesizer comprising a generator and a discriminator; training a composite image score classifier, based on a data set of composite images, to automatically score composite images; and performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model. The apparatus comprises a processor and a memory. The image synthesis relationship designed by the invention enables the foreground and background images to complete better image synthesis.

Description

Image synthesis method and device based on relation embedding and storage medium
Technical Field
The present invention relates to the field of image synthesis, and in particular, to a method, an apparatus, and a storage medium for image synthesis based on relationship embedding.
Background
In early work on image synthesis, researchers combined ideas from graphics and mathematics. The most classical approach is Poisson fusion, which introduces the image gradient domain through a general interpolation mechanism for solving the Poisson equation: when images are fused, color gradients replace color intensities, producing a more realistic blend. Later, as deep learning research developed, data shortage became a serious problem: models could not learn valuable features from limited data, and manual labeling was time-consuming and labor-intensive. Researchers therefore turned to deep learning methods for image synthesis, which alleviated these problems to a certain extent. GANs (generative adversarial networks) help solve the image generation problem: DCGAN (deep convolutional GAN) can generate images belonging to a specific class, and LAPGAN (Laplacian pyramid GAN) can generate images from coarse to fine using a Laplacian pyramid.
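The Poisson-fusion approach described above is available directly in OpenCV as seamlessClone; the following is a minimal sketch, where the file names and the whole-image mask are illustrative assumptions:

```python
import cv2
import numpy as np

# Load a foreground object image and a background scene (paths are illustrative).
fg = cv2.imread("foreground.jpg")
bg = cv2.imread("background.jpg")

# Binary mask marking the region of fg to blend; for demonstration we simply
# take the whole foreground rectangle.
mask = 255 * np.ones(fg.shape[:2], dtype=np.uint8)

# Paste the foreground at the center of the background. seamlessClone solves
# a Poisson equation in the gradient domain, so color gradients rather than
# raw intensities are matched across the seam.
center = (bg.shape[1] // 2, bg.shape[0] // 2)
composite = cv2.seamlessClone(fg, bg, mask, center, cv2.NORMAL_CLONE)
cv2.imwrite("composite.jpg", composite)
```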
GANs do help a model learn valuable representations, but a GAN alone cannot synthesize two independent images. Researchers have proposed a simple cut-and-paste method from the data synthesis perspective: objects are extracted under box-level labeling and pasted into plausible scenes to obtain new, more realistic synthetic data, but realism alone cannot guarantee that the data will enhance the performance of the trained model. Subsequent research therefore combined image synthesis with adversarial methods, proposing a new image synthesis network to learn synthesized images and then training a synthesizer with this method to generate useful synthetic samples, thereby helping to improve the performance of the target network. At present, little image synthesis work relies entirely on copy-and-paste. People prefer to use virtual engines to generate large amounts of synthetic data, and some work builds synthetic data sets on virtual image scenes. In addition, synthetic data helps people better understand the real world: composite images may be divided into realistic and unrealistic ones, and realistic composite regions can be re-colored to facilitate a better understanding of natural color statistics and color perception. Based on prior knowledge of the real world, one finds that the foreground of a composite image is often incompatible with its background, and that the data distributions of synthetic and real data sets differ.
In summary, the field of image synthesis currently faces the following problems:
1. there is a data distribution difference between synthetic data and real data, which increases the difficulty of model learning;
2. the relation between the synthetic foreground and the background cannot be regularized, and deep learning has not described the foreground-background relation well;
3. designing specific 3D models is costly, making data synthesized with 3D models expensive;
4. the difference between synthetic data and real data sets is difficult to evaluate, so the effectiveness of a synthesis model cannot be quantified after data are synthesized;
5. massive numbers of images cannot be synthesized automatically with quality evaluation completed.
Disclosure of Invention
The invention provides a method, an apparatus, and a storage medium for image synthesis based on relationship embedding. The image synthesis relationship designed by the invention enables the foreground and background images to complete better image synthesis; the consistency of the composite images is improved through the training and learning of the relationship-embedded image synthesis model; and the image synthesis of the present invention can be used to expand useful data sets as desired, as described in detail below:
In a first aspect, a method for relationship-embedding-based image synthesis comprises the following steps:
embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
training the relationship-embedded image synthesis model to obtain an image synthesizer, wherein the image synthesizer comprises: a generator and a discriminator;
training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
and performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
In one embodiment, the reciprocal relationship is:
RSI = Relation(B, B + F)
where B denotes the restored background image, F the foreground image required for synthesis, and Relation the relationship between them.
The input of the image synthesis model comprises a background image x and a background image y containing a foreground image, where x, y ∈ R^{H×W×C}; R is the feature space, H the image height, W the width, and C the number of channels.
In one embodiment, the image synthesis model comprises a generator and a discriminator.
The generator is composed of a background-image appearance encoder, a foreground-image object encoder, and decoders (G_x, G_y).
The appearance encoder encodes the background image to obtain the background appearance feature vector; the object feature vector of the foreground image is obtained by the object encoder in the same way. The decoders (G_x, G_y) combine these feature vectors to generate new images: G_x generates the restored image and G_y generates the composite image.
(D_x, D_y) are two discriminators for the restored image and the composite image, respectively: D_x distinguishes the decoded restored image from the natural image, and D_y distinguishes the decoded composite image from the natural image.
Preferably, the image synthesis work includes a repair route mode and a synthesis route mode.
The repair route mode: extract the target position feature f_1 from the target background image B; extract the target background feature B_2 from the natural image F+B; take f_1 and B_2 as the generator input to generate the f_1+B_2 restored image.
The synthesis route mode: extract the background feature B_1 from the target background image B; extract the foreground feature f_2 from the natural image F+B; take f_2 and B_1 as the generator input to generate the f_2+B_1 composite image.
Further, the method further comprises: evaluating the quality of image synthesis based on an automatic synthesis score index and an ideal synthesis score index.
The automatic synthesis score is produced by a classifier trained on copy-paste composite images manually annotated with appearance, size, and position scores; once trained, it scores composite images automatically.
The proportionality coefficients of the ideal synthesis score all take the maximum value of 1.
In a second aspect, an apparatus for relationship embedding based image synthesis, the apparatus comprising:
the embedding module is used for embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
the learning module is used for performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
an obtaining module, configured to train the relationship-embedded image synthesis model to obtain an image synthesizer, where the image synthesizer includes: a generator and a discriminator;
the training module is used for training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
and the scoring module is used for performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
In a third aspect, an apparatus for image synthesis based on relationship embedding comprises: a processor and a memory, the memory having stored therein program instructions, the processor calling the program instructions stored in the memory to cause the apparatus to perform the method steps of any one of the first aspect.
In a fourth aspect, a computer-readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to carry out the method steps of any one of the first aspect.
The technical scheme provided by the invention has the following beneficial effects:
1. the method can better complete both image synthesis work and image restoration work, and demonstrates that image synthesis and image restoration are mutually reversible;
2. the method can learn the position and the foreground proportion of image synthesis, improving the consistency of the appearance, size, and position characteristics of the composite image;
3. the method can be applied in deep learning as a new data enhancement method; experiments prove that it is suitable for application and popularization in the field of image synthesis.
Drawings
FIG. 1 is a diagram illustrating the relationship definition for relationship-embedding-based image synthesis;
FIG. 2 is a diagram of the relationship-embedded image synthesis model;
FIG. 3 is a logic flow diagram of the relationship-embedded image synthesis algorithm;
FIG. 4 is a flow diagram of image synthesis work using the relationship-embedded image synthesis model;
FIG. 5 is an example graph of evaluation scoring for relationship-embedding-based image synthesis;
FIG. 6 is an overall flow diagram of relationship-embedding-based image synthesis;
FIG. 7 is a schematic diagram of an apparatus for image synthesis based on relationship embedding;
FIG. 8 is another structural diagram of an apparatus for image synthesis based on relationship embedding.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in further detail below.
Example 1
An embodiment of the present invention provides a method for image synthesis based on relationship embedding, and referring to fig. 1 to 4, the method includes the following steps:
101: embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
102: performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
103: training the relationship-embedded image synthesis model to obtain an image synthesizer, wherein the image synthesizer includes: a generator and a discriminator;
104: training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
105: performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
The input of the image synthesis model in step 102 includes: a background image x and a background image y containing a foreground image, where x, y ∈ R^{H×W×C}; R is the feature space, H the image height, W the width, and C the number of channels.
In one embodiment, the image synthesis model in steps 101 and 102 comprises a generator and a discriminator.
The generator is composed of a background-image appearance encoder, a foreground-image object encoder, and decoders (G_x, G_y).
The appearance encoder encodes the background image to obtain the background appearance feature vector; the object feature vector of the foreground image is obtained by the object encoder in the same way. The decoders (G_x, G_y) combine these feature vectors to generate new images: G_x generates the restored image and G_y generates the composite image.
(D_x, D_y) are two discriminators for the restored image and the composite image, respectively: D_x distinguishes the decoded restored image from the natural image, and D_y distinguishes the decoded composite image from the natural image.
In one embodiment, the image synthesis work of step 105 comprises a repair route and a synthesis route.
The repair route mode: extract the target position feature f_1 from the target background image B; extract the target background feature B_2 from the natural image F+B; take f_1 and B_2 as the generator input to generate the f_1+B_2 restored image.
The synthesis route mode: extract the background feature B_1 from the target background image B; extract the foreground feature f_2 from the natural image F+B; take f_2 and B_1 as the generator input to generate the f_2+B_1 composite image.
In one embodiment, the method further comprises: evaluating the quality of image synthesis based on an automatic synthesis score index and an ideal synthesis score index.
The automatic synthesis score is produced by a classifier trained on copy-paste composite images manually annotated with appearance, size, and position scores; once trained, it scores composite images automatically. The proportionality coefficients of the ideal synthesis score all take the maximum value of 1.
In summary, through steps 101 to 105, the embodiment of the present invention achieves better synthesis of the foreground and background images, and improves the consistency of the composite images through the training and learning of the relationship-embedded image synthesis model.
Example 2
The scheme of example 1 is further described below with reference to specific calculation formulas and examples, which are described in detail below:
First, the relationship-embedded image synthesis model
1. Definition of relationship embedding
In the embodiment of the present invention, relationship embedding is defined over the relationship between image synthesis and image restoration, abbreviated RSI (Relationship between Image Synthesis and Inpainting). To a certain extent, image synthesis and restoration can be regarded as a reciprocal relationship, and the RSI is embedded into the synthesis model of the embodiment of the present invention.
Thus, the learning objective of the synthesis model is to learn the reciprocal relationship between image synthesis and image restoration, so that the model learns information such as the position and proportion of the synthesized foreground, improving as much as possible the consistency of the composite image's appearance, the size of the foreground image, and its placement in the background image.
The RSI is illustrated in fig. 1, taking a face without glasses and a face with glasses as an example: the upper row shows the synthesis process in RSI and the lower row shows the repair process, and together the two processes form the RSI relationship, formalized as:
RSI = Relation(B, B + F)    (1)
where B denotes the background image restored by inpainting, F the foreground image required by synthesis, and Relation the relationship between them.
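The patent does not prescribe how the (B, B+F) pairs of equation (1) are obtained in practice. One plausible construction, sketched here purely as an assumption, recovers B from a natural image by classical inpainting of the annotated foreground region (make_rsi_pair is a hypothetical helper):

```python
import cv2

def make_rsi_pair(natural_img, fg_mask):
    """Construct an RSI training pair (B, B+F) from a natural image.

    natural_img: the B+F image (background plus foreground object), BGR uint8.
    fg_mask:     uint8 mask with 255 inside the foreground object.
    Returns (B, B_plus_F): an inpainted approximation of the pure background
    and the original natural image.
    """
    # Erase the foreground region and fill it by classical inpainting,
    # approximating the restored background B of equation (1).
    background = cv2.inpaint(natural_img, fg_mask, inpaintRadius=3,
                             flags=cv2.INPAINT_TELEA)
    return background, natural_img
```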
2. The relationship-embedded image synthesis model
The framework of the relationship-embedded image synthesis model of the embodiment of the present invention is designed based on the VAE (Variational Auto-Encoder) and the GAN (Generative Adversarial Network). As shown in fig. 2, the input to the model is two images: a background image, defined as x, and a background image containing a foreground image, defined as y, where x, y ∈ R^{H×W×C}; R is the feature space, H the image height, W the width, and C the number of channels. For convenience of description, the relationship-embedded image synthesis model is referred to as ISRE (Image Synthesizer Based on Relational Embedding); the details of the model are described below.
2.1 Generator design of ISRE
As shown in fig. 2, the generator is constructed from encoders and decoders: a background-image appearance encoder, a foreground-image object encoder, and decoders (G_x, G_y). The appearance encoder encodes the background image to obtain the background appearance feature vector; the object feature vector of the foreground image is obtained by the object encoder in the same way. The decoders (G_x, G_y) then combine these feature vectors to generate new images: G_x generates the restored image and G_y generates the composite image.
It is noted that the image x ∈ X represents a natural background image not containing the foreground of the composite image, the foreground object o ∈ O represents the foreground image of the target composite image, and the image y ∈ Y represents a natural image containing the foreground image of the composite image.
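A minimal PyTorch sketch of this generator structure follows; the layer counts, channel widths, and normalization choices are assumptions for illustration, since the patent fixes only the roles of the encoders and decoders:

```python
import torch
import torch.nn as nn

def down(cin, cout):
    # Stride-2 convolution block used by both encoders.
    return nn.Sequential(nn.Conv2d(cin, cout, 4, stride=2, padding=1),
                         nn.InstanceNorm2d(cout), nn.ReLU(inplace=True))

def up(cin, cout):
    # Stride-2 transposed convolution block used by the decoders.
    return nn.Sequential(nn.ConvTranspose2d(cin, cout, 4, stride=2, padding=1),
                         nn.InstanceNorm2d(cout), nn.ReLU(inplace=True))

class Encoder(nn.Module):
    """Shared shape for the background-appearance and foreground-object encoders."""
    def __init__(self, in_ch=3, feat_ch=256):
        super().__init__()
        self.net = nn.Sequential(down(in_ch, 64), down(64, 128), down(128, feat_ch))

    def forward(self, img):
        return self.net(img)  # spatial feature map serving as the feature vector

class Decoder(nn.Module):
    """G_x / G_y: fuse a background code and a foreground code into an image."""
    def __init__(self, feat_ch=256, out_ch=3):
        super().__init__()
        self.net = nn.Sequential(up(2 * feat_ch, 128), up(128, 64),
                                 nn.ConvTranspose2d(64, out_ch, 4, 2, 1), nn.Tanh())

    def forward(self, bg_code, fg_code):
        # Concatenate the two codes along the channel axis and decode an image.
        return self.net(torch.cat([bg_code, fg_code], dim=1))
```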
2.2 Discriminator design of ISRE
D in fig. 2 denotes the discriminators, where (D_x, D_y) are two discriminators for the restored image and the composite image, respectively. D_x aims to distinguish the restored image decoded by G_x from the natural image, where G_x takes two inputs: a parameter representing the appearance coding information of the background image and a parameter representing the coding information of the foreground-image object. D_y aims to distinguish the composite image decoded by G_y from the natural image, where G_y takes two inputs: a parameter representing the background appearance coding information of the foreground image and a parameter representing the position feature information of the foreground object obtained from the background image.
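A matching sketch of the discriminators, continuing the code above; the PatchGAN-style layout is an assumption, as the patent does not fix the discriminator architecture:

```python
import torch.nn as nn

class Discriminator(nn.Module):
    """D_x / D_y: classify an image as natural (1) or generated (0) per patch."""
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 1, 4, stride=1, padding=1), nn.Sigmoid())

    def forward(self, img):
        return self.net(img)  # map of per-patch probabilities that img is natural
```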
Second, the relationship-embedded image synthesis algorithm
The overall flow of the algorithm designed by the embodiment of the present invention is shown in fig. 3, where B denotes a single background image and F+B denotes a foreground-plus-background natural image. Based on the RSI definition, two learning routes are designed: the upper route is the repair route of the algorithm and the lower route is the synthesis route. In detail:
1. Repair route algorithm
The first step: extract the target position feature f_1 from the target background image B.
The second step: extract the target background feature B_2 from the natural image F+B.
The third step: take f_1 and B_2 as the generator input to generate the f_1+B_2 restored image.
2. Synthesis route algorithm
The first step: extract the background feature B_1 from the target background image B.
The second step: extract the foreground feature f_2 from the natural image F+B.
The third step: take f_2 and B_1 as the generator input to generate the f_2+B_1 composite image.
Finally, an object of the embodiments of the present invention is to synthesize massive numbers of f_2+B_1 images, which are then applied as a data set to specific engineering scenes.
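Reusing the Encoder and Decoder sketches above, one training iteration over the two routes could look like the following; treating the object encoder as the source of position features is an assumption for illustration:

```python
enc_bg, enc_fg = Encoder(), Encoder()   # appearance encoder, object/position encoder
G_x, G_y = Decoder(), Decoder()         # repair decoder, synthesis decoder

def two_routes(B, FB):
    """B: pure background image batch; FB: natural foreground+background batch."""
    # Repair route: position feature f1 from B, background feature B2 from F+B.
    f1, B2 = enc_fg(B), enc_bg(FB)
    repaired = G_x(B2, f1)              # the f1+B2 restored image

    # Synthesis route: background feature B1 from B, foreground feature f2 from F+B.
    B1, f2 = enc_bg(B), enc_fg(FB)
    composite = G_y(B1, f2)             # the f2+B1 composite image
    return repaired, composite
```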
3. Design of the loss functions
To make the model learn better and make the two learning routes effectively learn the "synthesis-restoration" relationship, the embodiment of the present invention designs five loss functions to constrain the target model, detailed as follows:
1) Loss functions associated with f_1+B_2
Two loss functions constrain the f_1+B_2 synthesis effect: loss_1 constrains the foreground features of the f_1+B_2 composite image, and loss_2 constrains its background features (equations (2) and (3)).
2) Loss functions associated with f_2+B_1
Two loss functions constrain the f_2+B_1 synthesis effect: loss_3 constrains the foreground features of the f_2+B_1 composite image, and loss_4 constrains its background features (equations (4) and (5)).
3) Generation adversarial loss
To encourage the generated composite images to be indistinguishable from natural images, the embodiment of the present invention employs a generation adversarial loss, where G_x and G_y attempt to generate realistic images while D_x and D_y attempt to distinguish natural images from generated ones. The generation adversarial loss is defined as:
L_adv = E[log D_x(x)] + E{log[1 - D_x(G_x(B_y, F_x))]} + E[log D_y(y)] + E{log[1 - D_y(G_y(B_x, F_y))]}    (6)
where B_y denotes the background features of y and F_x the foreground features of x; correspondingly, B_x denotes the background features of x and F_y the foreground features of y; D_x(x) and D_y(y) denote the discriminator outputs used to distinguish composite images from natural images, and E denotes mathematical expectation.
4) Target learning function
The above loss functions are used jointly when training the generator and the discriminator, so the embodiment of the present invention defines the complete target learning function as:
L = α(loss_1 + loss_2) + β(loss_3 + loss_4) + L_adv    (7)
where α is the f_1+B_2 composite-image loss weight and β is the f_2+B_1 composite-image loss weight, both taking values between 0 and 1.
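A sketch of this objective follows; since the formulas of loss_1 to loss_4 appear only as equation images, they are assumed here to be masked L1 reconstruction terms, equation (6) is evaluated in its literal log form, and fg_mask_x / fg_mask_y are hypothetical binary foreground masks:

```python
import torch
import torch.nn.functional as F_nn

def adversarial_value(D_x, D_y, x, y, repaired, composite, eps=1e-8):
    """Literal value of equation (6); the discriminators maximize it and the
    generators minimize the two fake terms."""
    return (torch.log(D_x(x) + eps).mean()
            + torch.log(1 - D_x(repaired) + eps).mean()
            + torch.log(D_y(y) + eps).mean()
            + torch.log(1 - D_y(composite) + eps).mean())

def total_loss(x, y, repaired, composite, D_x, D_y,
               fg_mask_x, fg_mask_y, alpha=0.5, beta=0.5):
    """Complete target learning function of equation (7), under the stated
    assumption that loss_1..loss_4 are masked L1 reconstruction terms."""
    # loss_1 / loss_2: foreground and background terms of the f1+B2 image.
    loss1 = F_nn.l1_loss(repaired * fg_mask_x, x * fg_mask_x)
    loss2 = F_nn.l1_loss(repaired * (1 - fg_mask_x), x * (1 - fg_mask_x))
    # loss_3 / loss_4: foreground and background terms of the f2+B1 image.
    loss3 = F_nn.l1_loss(composite * fg_mask_y, y * fg_mask_y)
    loss4 = F_nn.l1_loss(composite * (1 - fg_mask_y), y * (1 - fg_mask_y))
    l_adv = adversarial_value(D_x, D_y, x, y, repaired, composite)
    return alpha * (loss1 + loss2) + beta * (loss3 + loss4) + l_adv
```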
5) Image synthesis using the ISRE model
After the ISRE model is trained through the above steps, it can be used for image synthesis. The process is shown in fig. 4, taking a street-view data set as an example: after training the ISRE model on the street-view data, a foreground image (such as the car in fig. 4) and a background image (such as the street scene in fig. 4) are input into the ISRE model, the synthesis of the two is completed inside the model, and the output is the expected composite image. Likewise, the algorithm of the embodiment of the present invention can be trained on other data sets and used as a synthesis model for image synthesis, thereby expanding data sets and alleviating the problem of insufficient data in deep learning.
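After training, producing one composite sample needs only the synthesis route; continuing the sketches above (load_image is a hypothetical loader returning a normalized image tensor batch):

```python
street = load_image("street.jpg")   # background image
car = load_image("car.jpg")         # image containing the foreground object
with torch.no_grad():
    sample = G_y(enc_bg(street), enc_fg(car))  # the expected composite image
```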
Third, design of the new evaluation indexes
This embodiment provides two general evaluation indexes for image synthesis. According to our research, these are the first general evaluation indexes in the field of image synthesis, and they are therefore among the important contributions of the embodiment of the present invention. Details follow:
1. Automatic composite score ACS
The automatic composite score ACS (Automatic Composite Score) aims to evaluate the performance of the automatic synthesis strategy proposed by the relationship-embedded image synthesis algorithm. Let C_i denote the synthesis result of the i-th composite image. The ACS is defined as follows:
ACS = λ_1 a(C_i) + λ_2 s(C_i) + λ_3 p(C_i)    (8)
where a, s, p denote the composite scores of appearance, size, and position, respectively; {λ_1, λ_2, λ_3} are the proportional coefficients corresponding to a, s, p, taking values between 0 and 1 and representing each term's significance in the score. a, s, p are predicted by a Composite Image Score Classifier (CISC), which is obtained by training, specifically:
A batch of images is synthesized in advance by copy-paste, and these images are then scored manually; each image's score comprises three parts: appearance, size, and position. After the classifier is trained, inputting a composite image yields its three scores a, s, p, each between 0 and 1, with values closer to 1 indicating a better synthesis effect. That is:
{a, s, p} = CISC(C_i)    (9)
2. Ideal composite score ICS
The ideal composite score ICS (Ideal Composite Score) evaluates the maximum performance a conventional image synthesis algorithm can achieve under an ideal synthesis strategy. It can be regarded as the upper bound of the automatic composite score, i.e., the case where the proportional coefficients all take the maximum value of 1, and is intended to stimulate relationship-embedded image synthesis algorithms to propose better and more effective automatic synthesis strategies. Let C_i denote the synthesis result of the i-th composite image. The ICS is defined as follows:
ICS = a(C_i) + s(C_i) + p(C_i)    (10)
As shown in fig. 5, after the ISRE model is trained, test images are scored to obtain the a, s, and p scores of each composite image. Fig. 5 illustrates the ideal composite score; the automatic composite score additionally multiplies each term by its coefficient weight.
In summary, the embodiment of the present invention is an image synthesis method based on relationship embedding; the overall flow is shown in fig. 6. First, the ISRE model designed in this embodiment is obtained through training; then the model is used to complete the image synthesis work; finally, the quality of the composite images is evaluated.
Example 3
An apparatus for image composition based on relationship embedding, referring to fig. 7, the apparatus comprising:
the embedding module 1 is used for embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
the learning module 2 is used for performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
an obtaining module 3, configured to train the relationship-embedded image synthesis model to obtain an image synthesizer, where the image synthesizer includes: a generator and a discriminator;
the training module 4 is used for training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
and the scoring module 5 is used for performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
Wherein the image synthesis model comprises a generator and a discriminator.
The generator is composed of a background-image appearance encoder, a foreground-image object encoder, and decoders (G_x, G_y).
The appearance encoder encodes the background image to obtain the background appearance feature vector; the object feature vector of the foreground image is obtained by the object encoder in the same way. The decoders (G_x, G_y) combine these feature vectors to generate new images: G_x generates the restored image and G_y generates the composite image.
(D_x, D_y) are two discriminators for the restored image and the composite image, respectively: D_x distinguishes the decoded restored image from the natural image, and D_y distinguishes the decoded composite image from the natural image.
In summary, through the above modules, the embodiment of the present invention achieves better synthesis of the foreground and background images, and improves the consistency of the composite images through the training and learning of the relationship-embedded image synthesis model.
Example 4
An apparatus for image composition based on relationship embedding, referring to fig. 8, the apparatus comprising: a processor 6 and a memory 7, the memory 7 having stored therein program instructions, the processor 6 calling the program instructions stored in the memory 7 to cause the apparatus to perform the following method steps in embodiment 1:
embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
training the relationship-embedded image synthesis model to obtain an image synthesizer, wherein the image synthesizer comprises: a generator and a discriminator;
training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
and performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
In one embodiment, the reciprocal relationship is:
RSI = Relation(B, B + F)
where B denotes the restored background image, F the foreground image required for synthesis, and Relation the relationship between them.
The input of the image synthesis model comprises a background image x and a background image y containing a foreground image, where x, y ∈ R^{H×W×C}; R is the feature space, H the image height, W the width, and C the number of channels.
In one embodiment, the image synthesis model includes a generator and a discriminator.
The generator is composed of a background-image appearance encoder, a foreground-image object encoder, and decoders (G_x, G_y).
The appearance encoder encodes the background image to obtain the background appearance feature vector; the object feature vector of the foreground image is obtained by the object encoder in the same way. The decoders (G_x, G_y) combine these feature vectors to generate new images: G_x generates the restored image and G_y generates the composite image.
(D_x, D_y) are two discriminators for the restored image and the composite image, respectively: D_x distinguishes the decoded restored image from the natural image, and D_y distinguishes the decoded composite image from the natural image.
Preferably, the image synthesis work comprises a repair route and a synthesis route.
The repair route mode: extract the target position feature f_1 from the target background image B; extract the target background feature B_2 from the natural image F+B; take f_1 and B_2 as the generator input to generate the f_1+B_2 restored image.
The synthesis route mode: extract the background feature B_1 from the target background image B; extract the foreground feature f_2 from the natural image F+B; take f_2 and B_1 as the generator input to generate the f_2+B_1 composite image.
Further, the method also includes: evaluating the quality of image synthesis based on an automatic synthesis score index and an ideal synthesis score index.
The automatic synthesis score is produced by a classifier trained on copy-paste composite images manually annotated with appearance, size, and position scores; once trained, it scores composite images automatically.
The proportionality coefficients of the ideal synthesis score all take the maximum value of 1.
In summary, through the processor and the memory, the embodiment of the present invention achieves better synthesis of the foreground and background images, and improves the consistency of the composite images through the training and learning of the relationship-embedded image synthesis model.
It should be noted that the device description in the above embodiments corresponds to the method description in the embodiments, and the embodiments of the present invention are not described herein again.
The processor 6 and the memory 7 may be implemented by devices having computing capability, such as a computer, a single-chip microcomputer, or a microcontroller; the embodiment of the present invention does not limit the specific choice, which is made according to the needs of the practical application.
The memory 7 and the processor 6 transmit data signals through the bus 8, which is not described in detail in the embodiment of the present invention.
Example 5
Based on the same inventive concept, an embodiment of the present invention further provides a computer-readable storage medium, where the storage medium includes a stored program, and when the program runs, the apparatus on which the storage medium is located is controlled to execute the method steps in the foregoing embodiments.
The computer readable storage medium includes, but is not limited to, flash memory, hard disk, solid state disk, and the like.
It should be noted that the descriptions of the readable storage medium in the above embodiments correspond to the descriptions of the method in the embodiments, and the descriptions of the embodiments of the present invention are not repeated here.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions according to the embodiments of the invention are brought about in whole or in part when the computer program instructions are loaded and executed on a computer.
The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on or transmitted over a computer-readable storage medium. The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium or a semiconductor medium, etc.
In the embodiment of the present invention, except for the specific description of the model of each device, the model of other devices is not limited, as long as the device can perform the above functions.
Those skilled in the art will appreciate that the drawings are only schematic illustrations of preferred embodiments, and the above-described embodiments of the present invention are merely provided for description and do not represent the merits of the embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A method for relationship-embedding-based image synthesis, the method comprising the steps of:
embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
training the relationship-embedded image synthesis model to obtain an image synthesizer, wherein the image synthesizer comprises: a generator and a discriminator;
training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
and performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
2. The method of claim 1, wherein the reciprocal relationship is:
RSI = Relation(B, B + F)
where B denotes the restored background image, F the foreground image required for synthesis, and Relation the relationship between them.
3. The method of claim 1, wherein the input of the image synthesis model comprises: a background image x and a background image y containing a foreground image, where x, y ∈ R^{H×W×C}; R is the feature space, H the image height, W the width, and C the number of channels.
4. The method of claim 2, wherein the image synthesis model comprises a generator and a discriminator,
the generator being composed of a background-image appearance encoder, a foreground-image object encoder, and decoders (G_x, G_y);
the appearance encoder encoding the background image to obtain the background appearance feature vector, the object feature vector of the foreground image being obtained by the object encoder in the same way, and the decoders (G_x, G_y) combining these feature vectors to generate new images, G_x generating the restored image and G_y generating the composite image;
(D_x, D_y) being two discriminators for the restored image and the composite image, respectively, D_x distinguishing the decoded restored image from the natural image, and D_y distinguishing the decoded composite image from the natural image.
5. The method of claim 4, wherein the image synthesis work comprises a repair route mode and a synthesis route mode:
the repair route mode: extracting the target position feature f_1 from the target background image B; extracting the target background feature B_2 from the natural image F+B; taking f_1 and B_2 as the generator input to generate the f_1+B_2 restored image;
the synthesis route mode: extracting the background feature B_1 from the target background image B; extracting the foreground feature f_2 from the natural image F+B; taking f_2 and B_1 as the generator input to generate the f_2+B_1 composite image.
6. The method of claim 5, wherein the loss functions of the image synthesis model are:
1) loss functions associated with f_1+B_2:
loss_1 constrains the foreground features of the f_1+B_2 composite image, and loss_2 constrains the background features of the f_1+B_2 composite image;
2) loss functions associated with f_2+B_1:
loss_3 constrains the foreground features of the f_2+B_1 composite image, and loss_4 constrains the background features of the f_2+B_1 composite image;
3) a generation adversarial loss, in which G_x and G_y are used to generate realistic synthetic images and D_x and D_y are used to distinguish natural images from generated composite images:
L_adv = E[log D_x(x)] + E{log[1 - D_x(G_x(B_y, F_x))]} + E[log D_y(y)] + E{log[1 - D_y(G_y(B_x, F_y))]}
where B_y denotes the background features of y and F_x the foreground features of x; B_x denotes the background features of x and F_y the foreground features of y; D_x(x) and D_y(y) denote the discriminator outputs used to distinguish composite images from natural images, and E denotes mathematical expectation.
7. The method of image synthesis based on relationship embedding of claim 1, wherein the method further comprises: evaluating the quality of image synthesis based on an automatic synthesis score index and an ideal synthesis score index;
the automatic synthesis score being produced by a classifier trained on copy-paste composite images manually annotated with appearance, size, and position scores, the trained classifier scoring composite images automatically;
the proportionality coefficients of the ideal synthesis score all taking the maximum value of 1.
8. An apparatus for image composition based on relationship embedding, the apparatus comprising:
the embedding module is used for embedding the reciprocal relation between image synthesis and image restoration into an image synthesis model;
the learning module is used for performing mutual supervision learning on the images based on the reciprocal relation, so that the image synthesis model learns the object features of the foreground image and the features of the background image;
an obtaining module, configured to train the relationship-embedded image synthesis model to obtain an image synthesizer, where the image synthesizer includes: a generator and a discriminator;
the training module is used for training a composite image score classifier, based on a data set of composite images, to automatically score composite images;
and the scoring module is used for performing image synthesis work and scoring the composite images based on the image synthesizer, the composite image score classifier, the foreground image, the background image, and the trained relationship-embedded image synthesis model.
9. An apparatus for image composition based on relationship embedding, the apparatus comprising: a processor and a memory, the memory having stored therein program instructions, the processor calling upon the program instructions stored in the memory to cause the apparatus to perform the method steps of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to carry out the method steps of any of claims 1-7.
CN202111457354.5A 2021-12-01 2021-12-01 Image synthesis method and device based on relation embedding and storage medium Pending CN114419195A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111457354.5A CN114419195A (en) 2021-12-01 2021-12-01 Image synthesis method and device based on relation embedding and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111457354.5A CN114419195A (en) 2021-12-01 2021-12-01 Image synthesis method and device based on relation embedding and storage medium

Publications (1)

Publication Number Publication Date
CN114419195A true CN114419195A (en) 2022-04-29

Family

ID=81264669

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111457354.5A Pending CN114419195A (en) 2021-12-01 2021-12-01 Image synthesis method and device based on relation embedding and storage medium

Country Status (1)

Country Link
CN (1) CN114419195A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117333495A (en) * 2023-12-01 2024-01-02 浙江口碑网络技术有限公司 Image detection method, device, equipment and storage medium
CN117333495B (en) * 2023-12-01 2024-03-19 浙江口碑网络技术有限公司 Image detection method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110310221B (en) Multi-domain image style migration method based on generation countermeasure network
JP6395158B2 (en) How to semantically label acquired images of a scene
CN109255831A (en) The method that single-view face three-dimensional reconstruction and texture based on multi-task learning generate
CN110728219A (en) 3D face generation method based on multi-column multi-scale graph convolution neural network
CN112489164B (en) Image coloring method based on improved depth separable convolutional neural network
Van Hoorick Image outpainting and harmonization using generative adversarial networks
CN113255813A (en) Multi-style image generation method based on feature fusion
CN112884758B (en) Defect insulator sample generation method and system based on style migration method
CN114463492B (en) Self-adaptive channel attention three-dimensional reconstruction method based on deep learning
CN110852935A (en) Image processing method for human face image changing with age
CN115049556A (en) StyleGAN-based face image restoration method
CN112819951A (en) Three-dimensional human body reconstruction method with shielding function based on depth map restoration
Zhu et al. Label-guided generative adversarial network for realistic image synthesis
CN113781324A (en) Old photo repairing method
CN113379715A (en) Underwater image enhancement and data set true value image acquisition method
Shi et al. Improving 3d-aware image synthesis with a geometry-aware discriminator
CN114419195A (en) Image synthesis method and device based on relation embedding and storage medium
CN108924528A (en) A kind of binocular stylization real-time rendering method based on deep learning
CN110097615B (en) Stylized and de-stylized artistic word editing method and system
CN114494387A (en) Data set network generation model and fog map generation method
Hu et al. Image style transfer based on generative adversarial network
CN116523985B (en) Structure and texture feature guided double-encoder image restoration method
CN117078556A (en) Water area self-adaptive underwater image enhancement method
CN116863053A (en) Point cloud rendering enhancement method based on knowledge distillation
CN110197226A (en) A kind of unsupervised image interpretation method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination