CN111583168A - Image synthesis method, image synthesis device, computer equipment and storage medium


Info

Publication number
CN111583168A
CN111583168A (application CN202010559322.5A)
Authority
CN
China
Prior art keywords
domain
image
target area
area image
background
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010559322.5A
Other languages
Chinese (zh)
Inventor
周康明
高凯珺
Current Assignee
Shanghai Eye Control Technology Co Ltd
Original Assignee
Shanghai Eye Control Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Eye Control Technology Co Ltd filed Critical Shanghai Eye Control Technology Co Ltd
Priority to CN202010559322.5A priority Critical patent/CN111583168A/en
Publication of CN111583168A publication Critical patent/CN111583168A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/20221 Image fusion; Image merging

Abstract

The application relates to an image synthesis method, an image synthesis device, a computer device and a storage medium. First, a first target area image is acquired from a target area image group and a first background image is acquired from a background image group. Second, the domain of the first target area image and the domain of the first background image are calculated by a domain decision model. According to these two domains, the first target area image and the first background image are converted into the same domain by a domain conversion model to obtain at least one of a second target area image and a second background image, so that the foreground and the background are transferred into the same domain. Finally, the second target area image is synthesized with the first background image or the second background image, or the first target area image is synthesized with the second background image.

Description

Image synthesis method, image synthesis device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an image synthesis method and apparatus, a computer device, and a storage medium.
Background
Training a model requires a large amount of labeled data, and labeling is a time-consuming and labor-intensive task, so image synthesis methods have begun to be used to expand data sets.
Image synthesis is a common image processing operation: a region of interest is extracted from one image and pasted onto another image, generating a new synthesized image. Because the labeling information already exists in the synthesized image, no additional labeling is needed. The synthesized image can therefore be used for data augmentation and has wide application prospects.
However, composite images obtained by conventional image synthesis methods suffer from inharmonious image content; for example, the region of interest appears obtrusive in the composite image.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide an image synthesis method, an apparatus, a computer device, and a storage medium capable of harmonizing the image content of a synthesized image.
A method of image synthesis, the method comprising:
acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group;
calculating a domain of the first target area image and a domain of the first background image by a domain decision model;
converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image;
and synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
In one embodiment, the domain decision model includes a global decision model and a target background decision model, and the global decision model is used for assisting in training the domain conversion model; the calculating, by a domain decision model, a domain of the first target area image and a domain of the first background image includes:
calculating a domain of the first target area image and a domain of the first background image by the target background determination model.
In one embodiment, the training of the domain decision model and the domain conversion model comprises:
constructing a real sample set and a fake sample set;
training the domain decision model and the domain conversion model by using the real sample set and the fake sample set in a single alternating iterative training mode, wherein a domain verification loss of the target background decision model is calculated.
In one embodiment, the real sample set comprises a plurality of real images, and the fake sample set comprises a plurality of synthetic images; the calculation of the domain verification loss of the target background decision model comprises the following steps:
calculating a first similarity according to the feature map of the real image and the feature map of the target area image;
calculating a second similarity according to the feature map of the synthetic image and the feature map of the target area image;
and determining the domain verification loss of the target background decision model according to the first similarity and the second similarity.
In one embodiment, the domain verification loss is calculated as follows:

L_dv = f(sim(I, M), sim(Î, M))

wherein L_dv is the domain verification loss, I is the feature map of the real image, Î is the feature map of the composite image, and M is the feature map of the target area image; sim(I, M) is the first similarity and sim(Î, M) is the second similarity.
In one embodiment, the converting, according to the domain of the first target area image and the domain of the first background image, the first target area image and the first background image into the same domain through a domain conversion model to obtain at least one of a second target area image and a second background image includes:
converting the domain of the first target area image to the domain of the first background image through a domain conversion model according to the domain of the first background image to obtain a second target area image; or
And converting the domain of the first background image into the domain of the first target area image through a domain conversion model according to the domain of the first target area image to obtain the second background image.
In one embodiment, before the acquiring the first target area image from the target area image group and the acquiring the first background image from the background image group, the method further includes:
acquiring a target image group;
and extracting the first target area image from each target image in the target image group to obtain a plurality of first target area images to form the target area image group.
An image synthesis apparatus, the apparatus comprising:
the acquisition module is used for acquiring a first target area image from the target area image group and acquiring a first background image from the background image group;
a calculation module, configured to calculate, through a domain determination model, a domain of the first target area image and a domain of the first background image;
the conversion module is used for converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image;
and the synthesis module is used for synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
A computer device comprising a memory storing a computer program and a processor implementing the steps of the image synthesis method of any of the embodiments described above when the computer program is executed.
A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the steps of the image synthesis method of any of the embodiments described above.
According to the image synthesis method, the image synthesis device, the computer equipment and the storage medium, firstly, a first target area image is obtained from the target area image group, and a first background image is obtained from the background image group; secondly, calculating the domain of the first target area image and the domain of the first background image through a domain judgment model; converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image, so that the foreground and the background are transferred into the same domain; and finally, synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image. Because the foreground and the background are in the same domain, the brightness, the contrast, the color and other parameters of the foreground and the background are matched, and the content of the synthesized image is harmonious.
Drawings
FIG. 1 is a diagram showing an example of an application environment of an image synthesis method;
FIG. 2 is a schematic flow chart diagram of an image synthesis method according to an embodiment;
FIG. 3 is a flowchart illustrating the steps of training a model in one embodiment;
FIG. 4 is a flow diagram illustrating the calculation of a domain authentication loss in another embodiment;
FIG. 5a is a schematic flow chart diagram illustrating an exemplary image synthesis method;
FIGS. 5b to 5g are images generated during annual inspection of a vehicle in one embodiment;
FIG. 6 is a block diagram showing the configuration of an image synthesizing apparatus according to an embodiment;
FIG. 7 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Image composition is a common operation in image processing; it can be used to acquire an image of interest and also for data augmentation, and it has wide application prospects. However, composite images obtained by conventional image synthesis methods have many problems; for example, the foreground and the background look discordant. Specifically, in a composite image the foreground and the background were photographed under different shooting conditions (such as time, season, illumination and weather), so there are significant mismatches in brightness, saturation, color, contrast, and the like.
Based on this, the present application provides an image synthesis method, which can be applied to the application environment shown in fig. 1. The application environment may include: a first computer device 110, a second computer device 120, and an image acquisition device 130. The first computer device 110 and the second computer device 120 are electronic devices with strong data storage and computation capabilities; for example, each may be a PC (Personal Computer) or a server. The annual inspection images of vehicles are acquired through the image acquisition device 130 to obtain a target image group, which is sent to the first computer device 110 over a network connection. Before image synthesis is performed, a technician constructs a domain decision model and a domain conversion model on the second computer device 120 and trains them through the second computer device 120.
The trained domain judgment model and domain conversion model can be issued from the second computer device 120 to the first computer device 110, and the first computer device 110 can extract a first target area image from any target image in the target image group to obtain a plurality of first target area images to form a target area image group; acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group; calculating a domain of the first target area image and a domain of the first background image through a domain judgment model; converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image; and synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image. It is understood that the first computer device 110 may also take the form of a terminal, which may be an electronic device such as a cell phone, a tablet, an e-book reader, a multimedia player device, a wearable device, a PC, etc. And the terminal completes the work of image synthesis through the domain judgment model and the domain conversion model.
In one embodiment, as shown in fig. 2, there is provided an image synthesis method, which is described by way of example as applied to the first computer device 110 in fig. 1, and includes the following steps:
s210, acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group.
The target area image group comprises a plurality of target area images. The target area image is an image corresponding to an area of interest in an image. The background image group comprises a plurality of background images, the background images refer to images without any interested areas, and the background images can provide different scenes. For example, the scene may be a scene of a vehicle annual inspection.
Specifically, the first computer device 110 acquires the target area image group and the background image group from a local computer device or a computer device in communication connection therewith, wherein the target area image group includes a plurality of first target area images, and the background image group includes a plurality of first background images, the first computer device 110 acquires any one of the first target area images from the target area image group, and acquires any one of the first background images from the background image group.
S220, calculating a domain of the first target area image and a domain of the first background image through a domain judgment model;
and S230, converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image.
The domain determination model is a machine learning model for calculating an image domain. The domain conversion model refers to a machine learning model for performing image domain conversion. The image domain is related to the photographing conditions of the image, including time of day, season, illumination, weather, and the like. The images obtained under different shooting conditions have different domains, and the shooting conditions are specifically expressed in the aspects of brightness, saturation, color, contrast and the like of the images. In order to ensure the content of the synthesized image is harmonious, it is necessary to ensure that the foreground and the background are obtained under the same shooting condition, and the harmony of the content of the synthesized foreground and the background can be ensured by adjusting parameters such as brightness, saturation, color, contrast and the like of the image. And the image domain may be a data set comprising values of brightness, saturation, color, contrast, etc. of the image. For example, the image domain may be a one-dimensional vector.
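As a rough illustration of what such a domain vector can capture, the sketch below computes a hand-crafted photometric descriptor (mean brightness, contrast, mean saturation) for an image given as a list of RGB tuples. The function name and the descriptor itself are illustrative assumptions; in the patent, the domain is produced by the learned domain decision model, not by fixed statistics.

```python
import math

def domain_descriptor(pixels):
    """Summarize an image's photometric 'domain' as a small vector.

    `pixels` is a list of (r, g, b) tuples in [0, 255]. Returns
    (mean brightness, contrast, mean saturation); a hypothetical
    stand-in for the learned domain representation.
    """
    lum = [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in pixels]
    mean_l = sum(lum) / len(lum)
    # contrast as the standard deviation of luminance
    contrast = math.sqrt(sum((v - mean_l) ** 2 for v in lum) / len(lum))
    # crude per-pixel saturation: (max - min) / max, guarding against black
    sat = [(max(p) - min(p)) / max(p) if max(p) else 0.0 for p in pixels]
    mean_s = sum(sat) / len(sat)
    return (mean_l, contrast, mean_s)

dark = [(10, 10, 10)] * 4
bright = [(240, 240, 240)] * 4
print(domain_descriptor(dark)[0] < domain_descriptor(bright)[0])  # True
```

Images shot under different illumination would map to clearly separated descriptors, which is the property the domain decision model exploits.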
Specifically, the shooting conditions of the first target area image and the first background image may be the same or different, and the domain of the first target area image obtained under different shooting conditions may be the same or different from the domain of the first background image. Firstly, a first target area image and a first background image are input into a domain judgment model, and image domains of the first target area image and the first background image are calculated by using the domain judgment model to obtain a domain of the first target area image and a domain of the first background image. Secondly, in order to ensure that the foreground in the synthesized image is harmonious with the background, domain conversion needs to be performed on the first target area image and the first background image, the first target area image and the first background image are migrated into the same domain, and at least one of the second target area image and the second background image is obtained, so that the second target area image and the second background image have the same domain. It should be noted that the structure of the domain determination model is well known to those skilled in the art, and the calculation of the domain of the first target area image and the domain of the first background image by the domain determination model is prior art and is not described herein again.
Further, according to the domain of the first target area image and the domain of the first background image, the first target area image and the first background image are converted into the same domain through a domain conversion model, so as to obtain at least one of a second target area image and a second background image, including the following three conditions:
and converting the domain of the first target area image into the domain of the first background image through a domain conversion model according to the domain of the first background image to obtain a second target area image. Or
And converting the domain of the first background image into the domain of the first target area image through a domain conversion model according to the domain of the first target area image to obtain a second background image. Or
And setting a new image domain according to the domain of the first background image and the domain of the first target area image, and transferring the first background image and the first target area image into the new image domain to obtain a second target area image and a second background image.
S240, synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
The domain of the first target area image is denoted as A, the domain of the first background image is denoted as B, and the two are migrated into the same domain C, which may be the same as domain A of the first target area image, the same as domain B of the first background image, or different from both A and B. Specifically, if domain C is the same as domain A of the first target area image, the domain B of the first background image needs to be converted into domain A of the first target area image; the second background image is obtained, and the first target area image and the second background image are synthesized. If domain C is the same as domain B of the first background image, the domain A of the first target area image needs to be converted into domain B of the first background image; the second target area image is obtained, and the second target area image and the first background image are synthesized. If domain C is different from both domain A and domain B, the first target area image in domain A and the first background image in domain B are both transferred into domain C to obtain the second target area image and the second background image, and the two are synthesized.
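The case analysis above can be summarized in a few lines of Python. Here `domain_of` stands in for the domain decision model and `convert(image, src, dst)` for the domain conversion model; both callables, and the dict-based images in the usage example, are hypothetical stand-ins rather than the patent's actual interfaces.

```python
def migrate_to_common_domain(fg, bg, domain_of, convert, target_domain=None):
    """Move the first target area image (fg) and the first background
    image (bg) into one common domain C before pasting them together.

    If target_domain is None, C defaults to the background's domain B,
    i.e. only the foreground is converted (second target area image)."""
    a, b = domain_of(fg), domain_of(bg)
    c = b if target_domain is None else target_domain
    fg2 = fg if c == a else convert(fg, a, c)  # second target area image
    bg2 = bg if c == b else convert(bg, b, c)  # second background image
    return fg2, bg2

# Usage with toy stand-ins: images are dicts carrying their domain label.
domain_of = lambda img: img["domain"]
convert = lambda img, src, dst: {**img, "domain": dst}
fg2, bg2 = migrate_to_common_domain({"domain": "A"}, {"domain": "B"},
                                    domain_of, convert)
print(fg2["domain"], bg2["domain"])  # foreground was converted into B
```

Passing `target_domain="C"` covers the third case, where both images are converted into a new domain.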
In this embodiment, first, a first target area image is acquired from a target area image group, and a first background image is acquired from a background image group, and sizes and numbers of the two groups of images are not required to be consistent, and the first target area image and the first background image are not required to be paired one by one, so that the workload of data preparation is reduced; secondly, calculating the domain of the first target area image and the domain of the first background image through a domain judgment model; converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image, so that the foreground and the background are transferred into the same domain; and finally, synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image. Because the foreground and the background are in the same domain, the brightness, the contrast, the color and other parameters of the foreground and the background are matched, and the content of the synthesized image is harmonious. Furthermore, data expansion is carried out through the more vivid synthetic image, so that the cost of data collection and labeling can be reduced, service can be better provided for a machine learning model, and the requirement of model training on data is met.
In one embodiment, the domain decision model includes a global decision model and a target background decision model. The global decision model is used to assist in training the domain conversion model. Calculating a domain of the first target area image and a domain of the first background image by a domain decision model, comprising: the domain of the first target area image and the domain of the first background image are calculated by the target background determination model.
Specifically, the global decision model may adopt the decision model (discriminator) of a conventional GAN (Generative Adversarial Network). The target background decision model may employ a generic classification network structure such as ResNet-18. The global decision model is used to assist in training the domain conversion model and does not participate in the calculation of the image domain. The first target area image and the first background image are input into the domain decision model, and the domain of the first target area image and the domain of the first background image are calculated by the target background decision model to obtain the brightness, saturation, color and contrast of the first target area image and the first background image.
In this embodiment, the training of the domain conversion model is assisted by the global decision model, so that the domain conversion model can be helped to generate a real and credible synthetic image. And calculating the domain of the first target area image and the domain of the first background image through the target background judgment model, laying a foundation for the migration of the first target area image and the domain of the first background image into the same domain, and being beneficial to generating a synthetic image with more harmonious content.
In one embodiment, before acquiring the first target area image from the target area image group and acquiring the first background image from the background image group, the method further comprises: acquiring a target image group; and extracting a first target area image from each target image in the target image group to obtain a plurality of first target area images to form a target area image group.
An image acquisition device shoots a plurality of target images under different shooting conditions (such as illumination and angle) to form the target image group. The first computer device obtains the target image group; a mask of the target area in each target image can be obtained by manual edge tracing, so that the first target area image can be extracted from the target image. Extracting a first target area image from each target image in the target image group yields a plurality of first target area images, which form the target area image group.
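A minimal sketch of the extraction step, assuming the mask from manual edge tracing is available as a 2-D list of 0/1 flags the same size as the image: the target area image is then the tight bounding-box crop of the mask. The function and the list-of-lists representation are illustrative assumptions, not interfaces from the patent.

```python
def extract_target_area(image, mask):
    """Crop the target area image: the tight bounding box of a binary mask.

    image: 2-D list of pixel values; mask: same-sized 2-D list of 0/1
    flags (e.g. from manual edge tracing). Returns the cropped image
    patch and the matching mask patch, which can later be used to paste
    the target onto a background."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    r0, r1 = min(rows), max(rows) + 1
    c0, c1 = min(cols), max(cols) + 1
    patch = [row[c0:c1] for row in image[r0:r1]]
    return patch, [row[c0:c1] for row in mask[r0:r1]]

img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
msk = [[0, 0, 0], [0, 1, 1], [0, 0, 0]]
print(extract_target_area(img, msk)[0])  # [[5, 6]]
```

Repeating this over every target image in the group produces the target area image group described above.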
In one embodiment, as shown in FIG. 3, the training of the domain decision model and the domain transformation model comprises the following steps:
s310, constructing a real sample set and a fake sample set;
and S320, training the domain decision model and the domain conversion model by utilizing the real sample set and the fake sample set in a single alternating iterative training mode, wherein a domain verification loss of the target background decision model is calculated.
The real sample set is a set of real images, and a real image is an image containing the target captured in a specific scene. The fake sample set is a set of synthetic images, and obtaining a synthetic image comprises: acquiring real images under different shooting conditions (such as illumination and angle), extracting the target area from a real image to obtain a target area image, and processing the extracted target area image, for example modifying at least one of brightness, saturation, color and contrast, so that the domain of the target area image differs from the domain of the real image. The target area image after the domain change is then pasted into the background of a real image to obtain a composite image. Further, a mask of the target area in the real image can be obtained by manual edge tracing, so that the target area image can be extracted from the real image.
Specifically, the real sample set is constructed from a series of real images containing the target. Pasting the target area images after the domain change onto background images yields a plurality of synthetic images, i.e., the fake sample set is constructed.
The domain decision model comprises a global decision model and a target background decision model arranged in parallel. The global decision model is the decision model of a conventional GAN, which judges real and fake samples as a whole using the standard decision loss to help the domain conversion model generate real and credible images. In addition to the global decision model, this embodiment also provides a target background decision model with a domain verification loss, which calculates the similarity between the target area image and the background image to determine whether they come from the same domain. The domain conversion model uses the structure of the U-Net convolutional neural network (proposed for biomedical image segmentation), because the network uses skip connections, so features of different levels can be fused. Of course, the domain conversion model may also use other scene segmentation model structures, such as SegNet, PSPNet or the DeepLab series, all of which produce an output image with the same size as the input image.
In this embodiment, the domain decision model and the domain conversion model are trained in a single alternating iterative training manner, which specifically includes: fixing the parameters of the domain decision model, training the domain conversion model with the real sample set and the fake sample set, and calculating the global decision loss of the global decision model and the domain verification loss of the target background decision model. After the domain conversion model has been trained to a certain degree, its parameters are fixed, the domain decision model is trained with the real sample set and the fake sample set, and the loss of the domain decision model is calculated. After the domain decision model has in turn been trained further, its parameters are fixed again, the domain conversion model is trained again with the real sample set and the fake sample set, and the global decision loss and the domain verification loss are calculated again. By analogy, the two models are trained in this single alternating iterative manner.
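The alternating schedule described above can be sketched as follows. The two `*_step` callables are hypothetical stand-ins for one optimizer step of the domain conversion model and the domain decision model respectively (each taken with the other model's parameters fixed); real training would use an actual GAN framework rather than this schematic.

```python
def train_single_alternating(conversion_step, decision_step, batches, rounds=2):
    """Single alternating iterative training: in each round, first train
    the conversion model with the decision model's parameters fixed, then
    train the decision model (global decision loss plus domain
    verification loss) with the conversion model's parameters fixed.

    conversion_step / decision_step take a batch and return a scalar loss.
    Returns a log of (phase, loss) pairs in execution order."""
    log = []
    for _ in range(rounds):
        for batch in batches:           # decision model frozen
            log.append(("conversion", conversion_step(batch)))
        for batch in batches:           # conversion model frozen
            log.append(("decision", decision_step(batch)))
    return log

log = train_single_alternating(lambda b: 0.5, lambda b: 0.7, ["b1", "b2"])
print([phase for phase, _ in log])
```

The log makes the strict alternation visible: a block of conversion-model steps, then a block of decision-model steps, repeated per round.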
In one embodiment, the real sample set includes a number of real images and the fake sample set includes a number of composite images. As shown in fig. 4, calculating the domain verification loss of the target background decision model includes the following steps:
S410, calculating a first similarity according to the feature map of the real image and the feature map of the target area image;
S420, calculating a second similarity according to the feature map of the synthetic image and the feature map of the target area image;
and S430, determining the domain verification loss of the target background decision model according to the first similarity and the second similarity.
A feature map is the result of processing an image with a convolutional neural network: it is the output of a convolution filter and represents the distribution of a particular feature over the preceding input. The similarity measures how close the contents of two images are, with the closeness judged from the magnitude of the computed value. Domain verification judges whether the foreground and the background belong to the same domain, and the domain verification loss is designed to pull the foreground and background domains closer so that the image content looks more harmonious. Specifically, the real image is processed with a convolutional neural network to obtain the feature map of the real image; the target area image is processed with the convolutional neural network to obtain the feature map of the target area image; and the synthetic image is processed with the convolutional neural network to obtain the feature map of the synthetic image. To improve the target background decision model's ability to distinguish real images from synthetic images, on the one hand a first similarity is calculated between the feature map of the real image and the feature map of the target area image; on the other hand a second similarity is calculated between the feature map of the synthetic image and the feature map of the target area image. Finally, the domain verification loss of the target background decision model is determined from the first similarity and the second similarity.
Further, the domain verification loss is calculated according to the following formula:

L_dv = D_V(Î, M) − D_V(I, M)

wherein L_dv is the domain verification loss, I is the feature map of the real image, Î is the feature map of the synthetic image, and M is the feature map of the target area image. D_V denotes calculating the similarity between the synthetic or real image and the target area image, with D_V(I, M) = I · M and D_V(Î, M) = Î · M, where "·" denotes the inner product.
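Taking D_V as an inner product of flattened feature maps, as stated above, the loss can be sketched in NumPy with stand-in feature maps (the constructed values below are illustrative, not outputs of a real network):

```python
import numpy as np

def d_v(feat, target_feat):
    """Similarity D_V: inner product of flattened feature maps."""
    return float(feat.ravel() @ target_feat.ravel())

def domain_verification_loss(real_feat, fake_feat, target_feat):
    """L_dv = D_V(fake, M) - D_V(real, M): the similarity of the real
    image to the target area and that of the synthetic image stand in
    opposition, as described in the text."""
    return d_v(fake_feat, target_feat) - d_v(real_feat, target_feat)

rng = np.random.default_rng(0)
M = np.ones((4, 4, 8))        # target area feature map (stand-in)
I = np.ones((4, 4, 8))        # real image features: same domain as M
F = rng.random((4, 4, 8))     # synthetic image features: values in [0, 1)
loss = domain_verification_loss(I, F, M)
print(loss < 0)  # True: the real image aligns with M better than the synthetic one
```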
In this embodiment, the domain verification loss is designed from the similarity between the real image and the target area image and the similarity between the synthetic image and the target area image, so that the real image and the synthetic image stand in opposition. Computing the verification loss from the domain characterization similarity of the target area and the background essentially determines whether the target area and the background come from the same domain.
In one embodiment, the present application provides an image synthesis method, as shown in fig. 5a, comprising the steps of:
and S510, acquiring a target image group.
S520, extracting a first target area image from each target image in the target image group to obtain a plurality of first target area images to form a target area image group.
S530, acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group.
And S540, calculating the domain of the first target area image and the domain of the first background image through the domain judgment model.
The domain judgment model comprises a global judgment model and a target background judgment model. The global decision model is used to assist in training the domain conversion model. The domain of the first target area image and the domain of the first background image are calculated by the target background determination model.
And S550, converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image.
And S560, synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
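The steps S510–S560 can be sketched end to end as follows; `judge_domain` and `convert` are toy brightness-based stand-ins, not the trained decision and conversion models:

```python
import numpy as np

def judge_domain(img):
    """Toy domain decision: classify an image by overall brightness."""
    return "bright" if img.mean() > 0.5 else "dark"

def convert(img, to_domain):
    """Toy domain conversion: shift brightness toward the target domain."""
    shift = 0.6 if to_domain == "bright" else -0.6
    return np.clip(img + shift, 0.0, 1.0)

def synthesize(target_img, background_img):
    """S540-S560: compute both domains, convert the target area image
    into the background's domain when they differ (the first branch of
    S550), then return the aligned pair ready for compositing."""
    d_t, d_b = judge_domain(target_img), judge_domain(background_img)
    if d_t != d_b:
        target_img = convert(target_img, to_domain=d_b)
    return target_img, background_img

target = np.full((2, 2, 3), 0.1)      # "dark" target area image
background = np.full((2, 2, 3), 0.9)  # "bright" background image
t2, b2 = synthesize(target, background)
print(judge_domain(t2) == judge_domain(b2))  # True: now in the same domain
```

The symmetric branch of S550 (converting the background into the target's domain) would simply apply `convert` to the background instead.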
Illustratively, the image synthesis method is applied to vehicle annual inspection. Referring to fig. 5b, a real image captured by the image capturing device during vehicle annual inspection contains a background and a target area image (such as the person in the figure). Referring to fig. 5c, the target area image is extracted from the real image. Referring to fig. 5d, a background image is captured by the image capturing device during vehicle annual inspection. Domain conversion is performed on the extracted target area image according to the domain of the background image; fig. 5e shows the target area image obtained after domain conversion. Referring to fig. 5f, the domain-converted target area image is synthesized with the background image to obtain a composite image. Referring to fig. 5g, the target area image without domain conversion is synthesized with the background image to obtain a composite image. Comparing the two composite images shows that directly combining the unconverted target area image with the background image produces a noticeably more obtrusive and unnatural result.
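Once the target area image and the background are in the same domain, the synthesis step of fig. 5f reduces to masked compositing. A minimal NumPy sketch, with a hypothetical binary mask standing in for the extracted target region:

```python
import numpy as np

def composite(background, target, mask):
    """Paste the (domain-converted) target area onto the background.
    `mask` is 1 inside the target area and 0 elsewhere; a real system
    would obtain it from the target-area extraction step."""
    mask = mask[..., None]  # broadcast over colour channels
    return mask * target + (1 - mask) * background

bg = np.zeros((4, 4, 3))            # dark background stand-in
fg = np.ones((4, 4, 3))             # bright target-area stand-in
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                # the extracted target region
out = composite(bg, fg, mask)
print(out[2, 2, 0], out[0, 0, 0])   # 1.0 inside the region, 0.0 outside
```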
It should be understood that, although the steps in the flowcharts of the above embodiments are displayed in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated otherwise herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least some of the steps in the above embodiments may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 6, there is provided an image synthesizing apparatus including: an obtaining module 610, a calculating module 620, a converting module 630 and a synthesizing module 640, wherein:
an obtaining module 610, configured to obtain a first target area image from the target area image group, and obtain a first background image from the background image group;
a calculating module 620, configured to calculate, through a domain determination model, a domain of the first target region image and a domain of the first background image;
a conversion module 630, configured to convert the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image, so as to obtain at least one of a second target area image and a second background image;
a synthesizing module 640, configured to synthesize the second target area image with the first background image or the second background image, or synthesize the first target area image with the second background image.
In one embodiment, the domain decision model comprises a global decision model and a target background decision model, wherein the global decision model is used for assisting in training the domain conversion model; the calculating module 620 is further configured to calculate a domain of the first target area image and a domain of the first background image through the target background determination model.
In one embodiment, the training of the domain decision model and the domain conversion model includes: constructing a real sample set and a fake sample set; and training the domain decision model and the domain conversion model with the real sample set and the fake sample set in an alternating iterative training manner, during which the domain verification loss of the target background decision model is calculated.
In one embodiment, the real sample set comprises a plurality of real images, and the fake sample set comprises a plurality of composite images; calculating a domain verification loss of the target background model, comprising: calculating a first similarity according to the characteristic diagram of the real image and the characteristic diagram of the target area image; calculating a second similarity according to the feature map of the synthetic image and the feature map of the target area image; and determining the domain verification loss of the target background model according to the first similarity and the second similarity.
In one embodiment, the domain verification loss is calculated according to the following formula:

L_dv = D_V(Î, M) − D_V(I, M)

wherein L_dv is the domain verification loss, I is the feature map of the real image, Î is the feature map of the synthetic image, and M is the feature map of the target area image.
In an embodiment, the converting module 630 is further configured to convert the domain of the first target area image into the domain of the first background image through a domain conversion model according to the domain of the first background image, so as to obtain a second target area image; or converting the domain of the first background image into the domain of the first target area image through the domain conversion model according to the domain of the first target area image to obtain a second background image.
In one embodiment, the synthesizing apparatus further comprises a target image group acquiring module and a target region image group composing module, wherein: the target image group acquisition module is used for acquiring a target image group; and the target area image group forming module is used for extracting a first target area image from each target image in the target image group to obtain a plurality of first target area images to form a target area image group.
For specific limitations of the image synthesis apparatus, reference may be made to the limitations of the image synthesis method above, which are not repeated here. Each module in the image synthesis apparatus may be implemented wholly or partly by software, by hardware, or by a combination of the two. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 7. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement an image composition method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the structure shown in fig. 7 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group; calculating a domain of the first target area image and a domain of the first background image through a domain judgment model; converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image; and synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
In one embodiment, the domain decision model comprises a global decision model and a target background decision model, wherein the global decision model is used for assisting in training the domain conversion model; the processor, when executing the computer program, further performs the steps of: the domain of the first target area image and the domain of the first background image are calculated by the target background determination model.
In one embodiment, the processor, when executing the computer program, further performs the steps of: constructing a real sample set and a fake sample set; and training the domain decision model and the domain conversion model with the real sample set and the fake sample set in an alternating iterative training manner, during which the domain verification loss of the target background decision model is calculated.
In one embodiment, the real sample set comprises a plurality of real images, and the fake sample set comprises a plurality of composite images; the processor, when executing the computer program, further performs the steps of: calculating a first similarity according to the characteristic diagram of the real image and the characteristic diagram of the target area image; calculating a second similarity according to the feature map of the synthetic image and the feature map of the target area image; and determining the domain verification loss of the target background model according to the first similarity and the second similarity.
In one embodiment, the processor, when executing the computer program, further performs the steps of: calculating the domain verification loss according to the following formula:

L_dv = D_V(Î, M) − D_V(I, M)

wherein L_dv is the domain verification loss, I is the feature map of the real image, Î is the feature map of the synthetic image, and M is the feature map of the target area image.
In one embodiment, the processor, when executing the computer program, further performs the steps of: converting the domain of the first target area image into the domain of the first background image through a domain conversion model according to the domain of the first background image to obtain a second target area image; or converting the domain of the first background image into the domain of the first target area image through the domain conversion model according to the domain of the first target area image to obtain a second background image.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring a target image group; and extracting a first target area image from each target image in the target image group to obtain a plurality of first target area images to form a target area image group.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group; calculating a domain of the first target area image and a domain of the first background image through a domain judgment model; converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image; and synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
In one embodiment, the domain decision model comprises a global decision model and a target background decision model, wherein the global decision model is used for assisting in training the domain conversion model; the computer program when executed by the processor further realizes the steps of: the domain of the first target area image and the domain of the first background image are calculated by the target background determination model.
In one embodiment, the computer program when executed by the processor further performs the steps of: constructing a real sample set and a fake sample set; and training the domain decision model and the domain conversion model with the real sample set and the fake sample set in an alternating iterative training manner, during which the domain verification loss of the target background decision model is calculated.
In one embodiment, the real sample set comprises a plurality of real images, and the fake sample set comprises a plurality of composite images; the computer program when executed by the processor further realizes the steps of: calculating a first similarity according to the characteristic diagram of the real image and the characteristic diagram of the target area image;
calculating a second similarity according to the feature map of the synthetic image and the feature map of the target area image;
and determining the domain verification loss of the target background model according to the first similarity and the second similarity.
In one embodiment, the computer program when executed by the processor further performs the steps of: calculating the domain verification loss according to the following formula:

L_dv = D_V(Î, M) − D_V(I, M)

wherein L_dv is the domain verification loss, I is the feature map of the real image, Î is the feature map of the synthetic image, and M is the feature map of the target area image.
In one embodiment, the computer program when executed by the processor further performs the steps of: converting the domain of the first target area image into the domain of the first background image through a domain conversion model according to the domain of the first background image to obtain a second target area image; or converting the domain of the first background image into the domain of the first target area image through the domain conversion model according to the domain of the first target area image to obtain a second background image.
In one embodiment, the computer program when executed by the processor further performs the steps of: acquiring a target image group; and extracting a first target area image from each target image in the target image group to obtain a plurality of first target area images to form a target area image group.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing relevant hardware; the program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, and the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above examples express only several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that several variations and modifications can be made by those skilled in the art without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. An image synthesis method, characterized in that the method comprises:
acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group;
calculating a domain of the first target area image and a domain of the first background image by a domain decision model;
converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image;
and synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
2. The method of claim 1, wherein the domain decision model comprises a global decision model and a target context decision model, and the global decision model is used to assist in training the domain transformation model; the calculating, by a domain decision model, a domain of the first target area image and a domain of the first background image includes:
calculating a domain of the first target area image and a domain of the first background image by the target background determination model.
3. The method of claim 2, wherein the training of the domain decision model and the domain transformation model comprises:
constructing a real sample set and a fake sample set;
and training the domain decision model and the domain conversion model with the real sample set and the fake sample set in an alternating iterative training manner.
4. The method of claim 3, wherein the set of real samples comprises a number of real images, and the set of counterfeit samples comprises a number of composite images; the calculation of the domain verification loss of the target background model comprises the following steps:
calculating a first similarity according to the characteristic diagram of the real image and the characteristic diagram of the target area image;
calculating a second similarity according to the feature map of the synthetic image and the feature map of the target area image;
and determining the domain verification loss of the target background model according to the first similarity and the second similarity.
5. The method of claim 4, wherein the domain verification loss is calculated according to the following formula:

L_dv = D_V(Î, M) − D_V(I, M)

wherein L_dv is the domain verification loss, I is the feature map of the real image, Î is the feature map of the synthetic image, and M is the feature map of the target area image.
6. The method according to any one of claims 1 to 5, wherein the converting the first target area image and the first background image into the same domain by a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image comprises:
converting the domain of the first target area image to the domain of the first background image through a domain conversion model according to the domain of the first background image to obtain a second target area image; or
And converting the domain of the first background image into the domain of the first target area image through a domain conversion model according to the domain of the first target area image to obtain the second background image.
7. The method according to any one of claims 1 to 5, wherein before the acquiring the first target area image from the target area image group and the acquiring the first background image from the background image group, the method further comprises:
acquiring a target image group;
and extracting the first target area image from each target image in the target image group to obtain a plurality of first target area images to form the target area image group.
8. An image synthesizing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring a first target area image from the target area image group and acquiring a first background image from the background image group;
a calculation module, configured to calculate, through a domain determination model, a domain of the first target area image and a domain of the first background image;
the conversion module is used for converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image;
and the synthesis module is used for synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202010559322.5A 2020-06-18 2020-06-18 Image synthesis method, image synthesis device, computer equipment and storage medium Pending CN111583168A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010559322.5A CN111583168A (en) 2020-06-18 2020-06-18 Image synthesis method, image synthesis device, computer equipment and storage medium


Publications (1)

Publication Number Publication Date
CN111583168A true CN111583168A (en) 2020-08-25

Family

ID=72121837

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010559322.5A Pending CN111583168A (en) 2020-06-18 2020-06-18 Image synthesis method, image synthesis device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111583168A (en)


Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101631189A (en) * 2008-07-15 2010-01-20 鸿富锦精密工业(深圳)有限公司 Image synthesis system and method
TW201007611A (en) * 2008-08-01 2010-02-16 Hon Hai Prec Ind Co Ltd Image systhesizing system and method thereof
CN106303250A (en) * 2016-08-26 2017-01-04 维沃移动通信有限公司 A kind of image processing method and mobile terminal
WO2017140182A1 (en) * 2016-02-15 2017-08-24 努比亚技术有限公司 Image synthesis method and apparatus, and storage medium
AU2017101166A4 (en) * 2017-08-25 2017-11-02 Lai, Haodong MR A Method For Real-Time Image Style Transfer Based On Conditional Generative Adversarial Networks
CN107968917A (en) * 2017-12-05 2018-04-27 广东欧珀移动通信有限公司 Image processing method and device, computer equipment, computer-readable recording medium
CN109727264A (en) * 2019-01-10 2019-05-07 南京旷云科技有限公司 Image generating method, the training method of neural network, device and electronic equipment
CN110288019A (en) * 2019-06-21 2019-09-27 北京百度网讯科技有限公司 Image labeling method, device and storage medium
CN110378432A (en) * 2019-07-24 2019-10-25 网易无尾熊(杭州)科技有限公司 Picture Generation Method, device, medium and electronic equipment
CN110956654A (en) * 2019-12-02 2020-04-03 Oppo广东移动通信有限公司 Image processing method, device, equipment and storage medium


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Fangneng Zhan et al.: "Hierarchy Composition GAN for High-fidelity Image Synthesis" *
Wenyan Cong et al.: "DoveNet: Deep Image Harmonization via Domain Verification" *
Dai Shuo: "Research on Image Translation Algorithms Based on CycleGAN" *
Li Junyi et al.: "Research on Image Style Transfer Methods Based on Perceptual Adversarial Networks" *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112836756A (en) * 2021-02-04 2021-05-25 Shanghai Mininglamp Artificial Intelligence (Group) Co., Ltd. Image recognition model training method, system and computer equipment
CN112836756B (en) * 2021-02-04 2024-02-27 Shanghai Mininglamp Artificial Intelligence (Group) Co., Ltd. Image recognition model training method, system and computer equipment

Similar Documents

Publication Publication Date Title
Ren et al. Low-light image enhancement via a deep hybrid network
CN106778928B (en) Image processing method and device
CN109886077B (en) Image recognition method and device, computer equipment and storage medium
CN110246181B (en) Anchor-point-based pose estimation model training method, pose estimation method and system
CN111199531A (en) Interactive data expansion method based on Poisson image fusion and image stylization
CN112528969B (en) Face image authenticity detection method and system, computer equipment and storage medium
CN108833785A (en) Multi-view image fusion method, device, computer equipment and storage medium
CN109871845B (en) Certificate image extraction method and terminal equipment
CN112132741B (en) Face photo image and sketch image conversion method and system
CN109977832B (en) Image processing method, device and storage medium
CN113673530A (en) Remote sensing image semantic segmentation method and device, computer equipment and storage medium
CN110298829A (en) Tongue diagnosis method, apparatus, system, computer equipment and storage medium
Li et al. Adaptive representation-based face sketch-photo synthesis
CN112651333B (en) Silence living body detection method, silence living body detection device, terminal equipment and storage medium
CN116701706B (en) Data processing method, device, equipment and medium based on artificial intelligence
CN111583168A (en) Image synthesis method, image synthesis device, computer equipment and storage medium
CN115115552B (en) Image correction model training method, image correction device and computer equipment
Zeng et al. 3D²Unet: 3D Deformable Unet for Low-Light Video Enhancement
Peng et al. MPIB: An MPI-based bokeh rendering framework for realistic partial occlusion effects
Wang et al. A multi-scale attentive recurrent network for image dehazing
CN112464924A (en) Method and device for constructing training set
CN113792807B (en) Skin disease classification model training method, system, medium and electronic equipment
CN110490950B (en) Image sample generation method and device, computer equipment and storage medium
Halperin et al. Clear Skies Ahead: Towards Real-Time Automatic Sky Replacement in Video
CN114758054A (en) Light spot adding method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination