CN111583168A - Image synthesis method, image synthesis device, computer equipment and storage medium


Info

Publication number
CN111583168A
CN111583168A (application CN202010559322.5A)
Authority
CN
China
Prior art keywords
domain
image
target area
area image
background
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010559322.5A
Other languages
Chinese (zh)
Inventor
周康明
高凯珺
Current Assignee
Shanghai Eye Control Technology Co Ltd
Original Assignee
Shanghai Eye Control Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Eye Control Technology Co Ltd filed Critical Shanghai Eye Control Technology Co Ltd
Priority to CN202010559322.5A priority Critical patent/CN111583168A/en
Publication of CN111583168A publication Critical patent/CN111583168A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/20221 Image fusion; Image merging

Abstract

The application relates to an image synthesis method, an image synthesis device, a computer device and a storage medium. First, a first target area image is acquired from a target area image group and a first background image is acquired from a background image group. Second, the domain of the first target area image and the domain of the first background image are calculated by a domain decision model. According to these two domains, the first target area image and the first background image are converted into the same domain by a domain conversion model to obtain at least one of a second target area image and a second background image, so that the foreground and the background are transferred into the same domain. Finally, the second target area image is synthesized with the first background image or the second background image, or the first target area image is synthesized with the second background image.

Description

Image synthesis method, image synthesis device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an image synthesis method and apparatus, a computer device, and a storage medium.
Background
Training a model requires a large amount of labeled data, and labeling is a time-consuming and labor-intensive task, so image synthesis methods have begun to be used to expand data sets.
Image synthesis is a common image processing operation: a region of interest is extracted from one image and pasted onto another image, generating a new synthesized image. Because the labeling information already exists in the synthesized image, no additional labeling is needed. The synthesized image can therefore be used for data augmentation and has wide application prospects.
However, composite images obtained by conventional image synthesis methods suffer from inharmonious image content; for example, the region of interest appears obtrusive in the composite image.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide an image synthesis method, an apparatus, a computer device, and a storage medium capable of harmonizing the image content of a synthesized image.
A method of image synthesis, the method comprising:
acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group;
calculating a domain of the first target area image and a domain of the first background image by a domain decision model;
converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image;
and synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
In one embodiment, the domain decision model includes a global decision model and a target background decision model, and the global decision model is used for assisting in training the domain conversion model; the calculating, by a domain decision model, a domain of the first target area image and a domain of the first background image includes:
calculating a domain of the first target area image and a domain of the first background image by the target background determination model.
In one embodiment, the training of the domain decision model and the domain conversion model comprises:
constructing a real sample set and a fake sample set;
training the domain decision model and the domain conversion model by using the real sample set and the fake sample set in a single alternating iterative training mode, wherein a domain verification loss of the target background decision model is calculated.
In one embodiment, the real sample set comprises a plurality of real images, and the fake sample set comprises a plurality of synthetic images; the calculation of the domain verification loss of the target background decision model comprises the following steps:
calculating a first similarity according to the feature map of the real image and the feature map of the target area image;
calculating a second similarity according to the feature map of the synthetic image and the feature map of the target area image;
and determining the domain verification loss of the target background decision model according to the first similarity and the second similarity.
In one embodiment, the domain verification loss is calculated as follows:

L_dv = f(sim(I, M), sim(Î, M))

wherein L_dv is the domain verification loss, I is the feature map of the real image, Î is the feature map of the composite image, and M is the feature map of the target area image; sim(I, M) is the first similarity and sim(Î, M) is the second similarity.
In one embodiment, the converting, according to the domain of the first target area image and the domain of the first background image, the first target area image and the first background image into the same domain through a domain conversion model to obtain at least one of a second target area image and a second background image includes:
converting the domain of the first target area image to the domain of the first background image through a domain conversion model according to the domain of the first background image to obtain a second target area image; or
And converting the domain of the first background image into the domain of the first target area image through a domain conversion model according to the domain of the first target area image to obtain the second background image.
In one embodiment, before the acquiring the first target area image from the target area image group and the acquiring the first background image from the background image group, the method further includes:
acquiring a target image group;
and extracting the first target area image from each target image in the target image group to obtain a plurality of first target area images to form the target area image group.
An image synthesis apparatus, the apparatus comprising:
the acquisition module is used for acquiring a first target area image from the target area image group and acquiring a first background image from the background image group;
a calculation module, configured to calculate, through a domain determination model, a domain of the first target area image and a domain of the first background image;
the conversion module is used for converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image;
and the synthesis module is used for synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
A computer device comprising a memory storing a computer program and a processor implementing the steps of the image synthesis method of any of the embodiments described above when the computer program is executed.
A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the steps of the image synthesis method of any of the embodiments described above.
According to the image synthesis method, the image synthesis device, the computer equipment and the storage medium, firstly, a first target area image is obtained from the target area image group, and a first background image is obtained from the background image group; secondly, calculating the domain of the first target area image and the domain of the first background image through a domain judgment model; converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image, so that the foreground and the background are transferred into the same domain; and finally, synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image. Because the foreground and the background are in the same domain, the brightness, the contrast, the color and other parameters of the foreground and the background are matched, and the content of the synthesized image is harmonious.
Drawings
FIG. 1 is a diagram showing an example of an application environment of an image synthesis method;
FIG. 2 is a schematic flow chart diagram of an image synthesis method according to an embodiment;
FIG. 3 is a flowchart illustrating the steps of training a model in one embodiment;
FIG. 4 is a flow diagram illustrating the calculation of a domain authentication loss in another embodiment;
FIG. 5a is a schematic flow chart diagram illustrating an exemplary image synthesis method;
FIGS. 5b to 5g are images generated during annual inspection of a vehicle in one embodiment;
FIG. 6 is a block diagram showing the configuration of an image synthesizing apparatus according to an embodiment;
FIG. 7 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Image composition is a common operation in image processing; it can be used to acquire an image of interest and also for data augmentation, and it has wide application prospects. However, composite images obtained by conventional image synthesis methods have many problems; for example, the foreground and the background look discordant. Specifically, in a composite image the foreground and the background were photographed under different shooting conditions (such as time, season, illumination and weather), so there are significant mismatches in brightness, saturation, color, contrast, and the like.
Based on this, the present application provides an image synthesis method, which can be applied to the application environment shown in fig. 1. The application environment may include: a first computer device 110, a second computer device 120, and an image acquisition device 130. The first computer device 110 and the second computer device 120 are electronic devices with strong data storage and computation capabilities; for example, each may be a PC (Personal Computer) or a server. The annual inspection images of vehicles are acquired through the image acquisition device 130 to obtain a target image group, which is sent to the first computer device 110 over a network connection. Before image synthesis is performed, a technician constructs a domain decision model and a domain conversion model on the second computer device 120 and trains them through the second computer device 120.
The trained domain judgment model and domain conversion model can be issued from the second computer device 120 to the first computer device 110, and the first computer device 110 can extract a first target area image from any target image in the target image group to obtain a plurality of first target area images to form a target area image group; acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group; calculating a domain of the first target area image and a domain of the first background image through a domain judgment model; converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image; and synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image. It is understood that the first computer device 110 may also take the form of a terminal, which may be an electronic device such as a cell phone, a tablet, an e-book reader, a multimedia player device, a wearable device, a PC, etc. And the terminal completes the work of image synthesis through the domain judgment model and the domain conversion model.
In one embodiment, as shown in fig. 2, there is provided an image synthesis method, which is described by way of example as applied to the first computer device 110 in fig. 1, and includes the following steps:
s210, acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group.
The target area image group comprises a plurality of target area images. The target area image is an image corresponding to an area of interest in an image. The background image group comprises a plurality of background images, the background images refer to images without any interested areas, and the background images can provide different scenes. For example, the scene may be a scene of a vehicle annual inspection.
Specifically, the first computer device 110 acquires the target area image group and the background image group from a local computer device or a computer device in communication connection therewith, wherein the target area image group includes a plurality of first target area images, and the background image group includes a plurality of first background images, the first computer device 110 acquires any one of the first target area images from the target area image group, and acquires any one of the first background images from the background image group.
S220, calculating a domain of the first target area image and a domain of the first background image through a domain judgment model;
and S230, converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image.
The domain determination model is a machine learning model for calculating an image domain. The domain conversion model refers to a machine learning model for performing image domain conversion. The image domain is related to the photographing conditions of the image, including time of day, season, illumination, weather, and the like. The images obtained under different shooting conditions have different domains, and the shooting conditions are specifically expressed in the aspects of brightness, saturation, color, contrast and the like of the images. In order to ensure the content of the synthesized image is harmonious, it is necessary to ensure that the foreground and the background are obtained under the same shooting condition, and the harmony of the content of the synthesized foreground and the background can be ensured by adjusting parameters such as brightness, saturation, color, contrast and the like of the image. And the image domain may be a data set comprising values of brightness, saturation, color, contrast, etc. of the image. For example, the image domain may be a one-dimensional vector.
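As a rough illustration of what such a domain vector can capture, the sketch below computes a hand-crafted photometric descriptor (mean brightness, contrast, mean saturation) for an image given as a list of RGB tuples. The function name and the descriptor itself are illustrative assumptions; in the patent, the domain is produced by the learned domain decision model, not by fixed statistics.

```python
import math

def domain_descriptor(pixels):
    """Summarize an image's photometric 'domain' as a small vector.

    `pixels` is a list of (r, g, b) tuples in [0, 255]. Returns
    (mean brightness, contrast, mean saturation); a hypothetical
    stand-in for the learned domain representation.
    """
    lum = [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in pixels]
    mean_l = sum(lum) / len(lum)
    # contrast as the standard deviation of luminance
    contrast = math.sqrt(sum((v - mean_l) ** 2 for v in lum) / len(lum))
    # crude per-pixel saturation: (max - min) / max, guarding against black
    sat = [(max(p) - min(p)) / max(p) if max(p) else 0.0 for p in pixels]
    mean_s = sum(sat) / len(sat)
    return (mean_l, contrast, mean_s)

dark = [(10, 10, 10)] * 4
bright = [(240, 240, 240)] * 4
print(domain_descriptor(dark)[0] < domain_descriptor(bright)[0])  # True
```

Images shot under different illumination would map to clearly separated descriptors, which is the property the domain decision model exploits.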
Specifically, the shooting conditions of the first target area image and the first background image may be the same or different, and the domain of the first target area image obtained under different shooting conditions may be the same or different from the domain of the first background image. Firstly, a first target area image and a first background image are input into a domain judgment model, and image domains of the first target area image and the first background image are calculated by using the domain judgment model to obtain a domain of the first target area image and a domain of the first background image. Secondly, in order to ensure that the foreground in the synthesized image is harmonious with the background, domain conversion needs to be performed on the first target area image and the first background image, the first target area image and the first background image are migrated into the same domain, and at least one of the second target area image and the second background image is obtained, so that the second target area image and the second background image have the same domain. It should be noted that the structure of the domain determination model is well known to those skilled in the art, and the calculation of the domain of the first target area image and the domain of the first background image by the domain determination model is prior art and is not described herein again.
Further, according to the domain of the first target area image and the domain of the first background image, the first target area image and the first background image are converted into the same domain through a domain conversion model, so as to obtain at least one of a second target area image and a second background image, including the following three conditions:
and converting the domain of the first target area image into the domain of the first background image through a domain conversion model according to the domain of the first background image to obtain a second target area image. Or
And converting the domain of the first background image into the domain of the first target area image through a domain conversion model according to the domain of the first target area image to obtain a second background image. Or
And setting a new image domain according to the domain of the first background image and the domain of the first target area image, and transferring the first background image and the first target area image into the new image domain to obtain a second target area image and a second background image.
S240, synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
The domain of the first target area image is denoted as A, the domain of the first background image is denoted as B, and the two are migrated into the same domain C, which may be the same as domain A of the first target area image, the same as domain B of the first background image, or different from both A and B. Specifically, if domain C is the same as domain A of the first target area image, the domain B of the first background image needs to be converted into domain A of the first target area image; the second background image is obtained, and the first target area image and the second background image are synthesized. If domain C is the same as domain B of the first background image, the domain A of the first target area image needs to be converted into domain B of the first background image; the second target area image is obtained, and the second target area image and the first background image are synthesized. If domain C is different from both domain A and domain B, the first target area image in domain A and the first background image in domain B are both transferred into domain C to obtain the second target area image and the second background image, and the two are synthesized.
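The case analysis above can be summarized in a few lines of Python. Here `domain_of` stands in for the domain decision model and `convert(image, src, dst)` for the domain conversion model; both callables, and the dict-based images in the usage example, are hypothetical stand-ins rather than the patent's actual interfaces.

```python
def migrate_to_common_domain(fg, bg, domain_of, convert, target_domain=None):
    """Move the first target area image (fg) and the first background
    image (bg) into one common domain C before pasting them together.

    If target_domain is None, C defaults to the background's domain B,
    i.e. only the foreground is converted (second target area image)."""
    a, b = domain_of(fg), domain_of(bg)
    c = b if target_domain is None else target_domain
    fg2 = fg if c == a else convert(fg, a, c)  # second target area image
    bg2 = bg if c == b else convert(bg, b, c)  # second background image
    return fg2, bg2

# Usage with toy stand-ins: images are dicts carrying their domain label.
domain_of = lambda img: img["domain"]
convert = lambda img, src, dst: {**img, "domain": dst}
fg2, bg2 = migrate_to_common_domain({"domain": "A"}, {"domain": "B"},
                                    domain_of, convert)
print(fg2["domain"], bg2["domain"])  # foreground was converted into B
```

Passing `target_domain="C"` covers the third case, where both images are converted into a new domain.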
In this embodiment, first, a first target area image is acquired from a target area image group, and a first background image is acquired from a background image group, and sizes and numbers of the two groups of images are not required to be consistent, and the first target area image and the first background image are not required to be paired one by one, so that the workload of data preparation is reduced; secondly, calculating the domain of the first target area image and the domain of the first background image through a domain judgment model; converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image, so that the foreground and the background are transferred into the same domain; and finally, synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image. Because the foreground and the background are in the same domain, the brightness, the contrast, the color and other parameters of the foreground and the background are matched, and the content of the synthesized image is harmonious. Furthermore, data expansion is carried out through the more vivid synthetic image, so that the cost of data collection and labeling can be reduced, service can be better provided for a machine learning model, and the requirement of model training on data is met.
In one embodiment, the domain decision model includes a global decision model and a target background decision model. The global decision model is used to assist in training the domain conversion model. Calculating a domain of the first target area image and a domain of the first background image by a domain decision model, comprising: the domain of the first target area image and the domain of the first background image are calculated by the target background determination model.
Specifically, the global decision model may adopt the decision model (discriminator) of a conventional GAN (Generative Adversarial Network). The target background decision model may employ a generic classification network structure such as ResNet-18. The global decision model is used to assist in training the domain conversion model and does not participate in the calculation of the image domain. The first target area image and the first background image are input into the domain decision model, and the domain of the first target area image and the domain of the first background image are calculated by the target background decision model to obtain the brightness, saturation, color and contrast of the first target area image and the first background image.
In this embodiment, the training of the domain conversion model is assisted by the global decision model, so that the domain conversion model can be helped to generate a real and credible synthetic image. And calculating the domain of the first target area image and the domain of the first background image through the target background judgment model, laying a foundation for the migration of the first target area image and the domain of the first background image into the same domain, and being beneficial to generating a synthetic image with more harmonious content.
In one embodiment, before acquiring the first target area image from the target area image group and acquiring the first background image from the background image group, the method further comprises: acquiring a target image group; and extracting a first target area image from each target image in the target image group to obtain a plurality of first target area images to form a target area image group.
An image acquisition device shoots a plurality of target images under different shooting conditions (such as illumination and angle) to form the target image group. The first computer device obtains the target image group; a mask of the target area in each target image can be obtained by manual edge tracing, so that the first target area image can be extracted from the target image. Extracting a first target area image from each target image in the target image group yields a plurality of first target area images, which form the target area image group.
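A minimal sketch of the extraction step, assuming the mask from manual edge tracing is available as a 2-D list of 0/1 flags the same size as the image: the target area image is then the tight bounding-box crop of the mask. The function and the list-of-lists representation are illustrative assumptions, not interfaces from the patent.

```python
def extract_target_area(image, mask):
    """Crop the target area image: the tight bounding box of a binary mask.

    image: 2-D list of pixel values; mask: same-sized 2-D list of 0/1
    flags (e.g. from manual edge tracing). Returns the cropped image
    patch and the matching mask patch, which can later be used to paste
    the target onto a background."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    r0, r1 = min(rows), max(rows) + 1
    c0, c1 = min(cols), max(cols) + 1
    patch = [row[c0:c1] for row in image[r0:r1]]
    return patch, [row[c0:c1] for row in mask[r0:r1]]

img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
msk = [[0, 0, 0], [0, 1, 1], [0, 0, 0]]
print(extract_target_area(img, msk)[0])  # [[5, 6]]
```

Repeating this over every target image in the group produces the target area image group described above.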
In one embodiment, as shown in FIG. 3, the training of the domain decision model and the domain transformation model comprises the following steps:
s310, constructing a real sample set and a fake sample set;
and S320, training the domain decision model and the domain conversion model by utilizing the real sample set and the fake sample set in a single alternating iterative training mode, wherein a domain verification loss of the target background decision model is calculated.
The real sample set is a set of real images, and a real image is an image containing the target captured in a specific scene. The fake sample set is a set of synthetic images, and obtaining a synthetic image comprises: acquiring real images under different shooting conditions (such as illumination and angle), extracting the target area from a real image to obtain a target area image, and processing the extracted target area image, for example modifying at least one of brightness, saturation, color and contrast, so that the domain of the target area image differs from the domain of the real image. The target area image after the domain change is then pasted into the background of a real image to obtain a composite image. Further, a mask of the target area in the real image can be obtained by manual edge tracing, so that the target area image can be extracted from the real image.
Specifically, the real sample set is constructed from a series of real images containing the target. Pasting the target area images after the domain change onto background images yields a plurality of synthetic images, i.e., the fake sample set is constructed.
The domain decision model comprises a global decision model and a target background decision model arranged in parallel. The global decision model is the decision model of a conventional GAN, which judges real and fake samples as a whole using the standard decision loss to help the domain conversion model generate real and credible images. In addition to the global decision model, this embodiment also provides a target background decision model with a domain verification loss, which calculates the similarity between the target area image and the background image to determine whether they come from the same domain. The domain conversion model uses the structure of the U-Net convolutional neural network (proposed for biomedical image segmentation), because the network uses skip connections, so features of different levels can be fused. Of course, the domain conversion model may also use other scene segmentation model structures, such as SegNet, PSPNet or the DeepLab series, all of which produce an output image with the same size as the input image.
In this embodiment, the domain decision model and the domain conversion model are trained in a single alternating iterative training manner, which specifically includes: fixing the parameters of the domain decision model, training the domain conversion model with the real sample set and the fake sample set, and calculating the global decision loss of the global decision model and the domain verification loss of the target background decision model. After the domain conversion model has been trained to a certain degree, its parameters are fixed, the domain decision model is trained with the real sample set and the fake sample set, and the loss of the domain decision model is calculated. After the domain decision model has in turn been trained further, its parameters are fixed again, the domain conversion model is trained again with the real sample set and the fake sample set, and the global decision loss and the domain verification loss are calculated again. By analogy, the two models are trained in this single alternating iterative manner.
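The alternating schedule described above can be sketched as follows. The two `*_step` callables are hypothetical stand-ins for one optimizer step of the domain conversion model and the domain decision model respectively (each taken with the other model's parameters fixed); real training would use an actual GAN framework rather than this schematic.

```python
def train_single_alternating(conversion_step, decision_step, batches, rounds=2):
    """Single alternating iterative training: in each round, first train
    the conversion model with the decision model's parameters fixed, then
    train the decision model (global decision loss plus domain
    verification loss) with the conversion model's parameters fixed.

    conversion_step / decision_step take a batch and return a scalar loss.
    Returns a log of (phase, loss) pairs in execution order."""
    log = []
    for _ in range(rounds):
        for batch in batches:           # decision model frozen
            log.append(("conversion", conversion_step(batch)))
        for batch in batches:           # conversion model frozen
            log.append(("decision", decision_step(batch)))
    return log

log = train_single_alternating(lambda b: 0.5, lambda b: 0.7, ["b1", "b2"])
print([phase for phase, _ in log])
```

The log makes the strict alternation visible: a block of conversion-model steps, then a block of decision-model steps, repeated per round.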
In one embodiment, the real sample set includes a number of real images and the fake sample set includes a number of composite images. As shown in fig. 4, calculating the domain verification loss of the target background decision model includes the following steps:
S410, calculating a first similarity according to the feature map of the real image and the feature map of the target area image;
S420, calculating a second similarity according to the feature map of the synthetic image and the feature map of the target area image;
and S430, determining the domain verification loss of the target background decision model according to the first similarity and the second similarity.
A feature map is the result of processing an image with a convolutional neural network: it is the output of a convolution filter and represents the distribution of a particular feature over the preceding input. The similarity measures how close the contents of two images are, with the closeness judged from the magnitude of the computed value. Domain verification judges whether the foreground and the background belong to the same domain, and the domain verification loss is designed to pull the foreground and background domains closer so that the image content looks more harmonious. Specifically, the real image is processed with a convolutional neural network to obtain the feature map of the real image; the target area image is processed with the convolutional neural network to obtain the feature map of the target area image; and the synthetic image is processed with the convolutional neural network to obtain the feature map of the synthetic image. To improve the target background decision model's ability to distinguish real images from synthetic images, on the one hand a first similarity is calculated between the feature map of the real image and the feature map of the target area image; on the other hand a second similarity is calculated between the feature map of the synthetic image and the feature map of the target area image. Finally, the domain verification loss of the target background decision model is determined from the first similarity and the second similarity.
Further, the domain verification loss is calculated according to the following formula:

L_dv = D_V(Î, M) − D_V(I, M)

wherein L_dv is the domain verification loss, I is the feature map of the real image, Î is the feature map of the synthetic image, and M is the feature map of the target area image. D_V denotes calculating the similarity between the synthetic or real image and the target area image, with D_V(I, M) = I · M and D_V(Î, M) = Î · M, where "·" denotes the inner product.
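Taking D_V as an inner product of flattened feature maps, as stated above, the loss can be sketched in NumPy with stand-in feature maps (the constructed values below are illustrative, not outputs of a real network):

```python
import numpy as np

def d_v(feat, target_feat):
    """Similarity D_V: inner product of flattened feature maps."""
    return float(feat.ravel() @ target_feat.ravel())

def domain_verification_loss(real_feat, fake_feat, target_feat):
    """L_dv = D_V(fake, M) - D_V(real, M): the similarity of the real
    image to the target area and that of the synthetic image stand in
    opposition, as described in the text."""
    return d_v(fake_feat, target_feat) - d_v(real_feat, target_feat)

rng = np.random.default_rng(0)
M = np.ones((4, 4, 8))        # target area feature map (stand-in)
I = np.ones((4, 4, 8))        # real image features: same domain as M
F = rng.random((4, 4, 8))     # synthetic image features: values in [0, 1)
loss = domain_verification_loss(I, F, M)
print(loss < 0)  # True: the real image aligns with M better than the synthetic one
```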
In this embodiment, the domain verification loss is designed from the similarity between the real image and the target area image and the similarity between the synthetic image and the target area image, so that the real image and the synthetic image stand in opposition. Computing the verification loss from the domain characterization similarity of the target area and the background essentially determines whether the target area and the background come from the same domain.
In one embodiment, the present application provides an image synthesis method, as shown in fig. 5a, comprising the steps of:
and S510, acquiring a target image group.
S520, extracting a first target area image from each target image in the target image group to obtain a plurality of first target area images to form a target area image group.
S530, acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group.
And S540, calculating the domain of the first target area image and the domain of the first background image through the domain judgment model.
The domain judgment model comprises a global judgment model and a target background judgment model. The global decision model is used to assist in training the domain conversion model. The domain of the first target area image and the domain of the first background image are calculated by the target background determination model.
And S550, converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image.
And S560, synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
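The steps S510–S560 can be sketched end to end as follows; `judge_domain` and `convert` are toy brightness-based stand-ins, not the trained decision and conversion models:

```python
import numpy as np

def judge_domain(img):
    """Toy domain decision: classify an image by overall brightness."""
    return "bright" if img.mean() > 0.5 else "dark"

def convert(img, to_domain):
    """Toy domain conversion: shift brightness toward the target domain."""
    shift = 0.6 if to_domain == "bright" else -0.6
    return np.clip(img + shift, 0.0, 1.0)

def synthesize(target_img, background_img):
    """S540-S560: compute both domains, convert the target area image
    into the background's domain when they differ (the first branch of
    S550), then return the aligned pair ready for compositing."""
    d_t, d_b = judge_domain(target_img), judge_domain(background_img)
    if d_t != d_b:
        target_img = convert(target_img, to_domain=d_b)
    return target_img, background_img

target = np.full((2, 2, 3), 0.1)      # "dark" target area image
background = np.full((2, 2, 3), 0.9)  # "bright" background image
t2, b2 = synthesize(target, background)
print(judge_domain(t2) == judge_domain(b2))  # True: now in the same domain
```

The symmetric branch of S550 (converting the background into the target's domain) would simply apply `convert` to the background instead.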
Illustratively, the image synthesis method is applied to vehicle annual inspection. Referring to fig. 5b, a real image captured by the image capturing device during vehicle annual inspection contains a background and a target area image (such as the person in the figure). Referring to fig. 5c, the target area image is extracted from the real image. Referring to fig. 5d, a background image is captured by the image capturing device during vehicle annual inspection. Domain conversion is performed on the extracted target area image according to the domain of the background image; fig. 5e shows the target area image obtained after domain conversion. Referring to fig. 5f, the domain-converted target area image is synthesized with the background image to obtain a composite image. Referring to fig. 5g, the target area image without domain conversion is synthesized with the background image to obtain a composite image. Comparing the two composite images shows that directly combining the unconverted target area image with the background image produces a noticeably more obtrusive and unnatural result.
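Once the target area image and the background are in the same domain, the synthesis step of fig. 5f reduces to masked compositing. A minimal NumPy sketch, with a hypothetical binary mask standing in for the extracted target region:

```python
import numpy as np

def composite(background, target, mask):
    """Paste the (domain-converted) target area onto the background.
    `mask` is 1 inside the target area and 0 elsewhere; a real system
    would obtain it from the target-area extraction step."""
    mask = mask[..., None]  # broadcast over colour channels
    return mask * target + (1 - mask) * background

bg = np.zeros((4, 4, 3))            # dark background stand-in
fg = np.ones((4, 4, 3))             # bright target-area stand-in
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                # the extracted target region
out = composite(bg, fg, mask)
print(out[2, 2, 0], out[0, 0, 0])   # 1.0 inside the region, 0.0 outside
```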
It should be understood that, although the steps in the flowcharts of the above embodiments are displayed in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated otherwise herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least some of the steps in the above embodiments may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 6, there is provided an image synthesizing apparatus including: an obtaining module 610, a calculating module 620, a converting module 630 and a synthesizing module 640, wherein:
an obtaining module 610, configured to obtain a first target area image from the target area image group, and obtain a first background image from the background image group;
a calculating module 620, configured to calculate, through a domain determination model, a domain of the first target region image and a domain of the first background image;
a conversion module 630, configured to convert the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image, so as to obtain at least one of a second target area image and a second background image;
a synthesizing module 640, configured to synthesize the second target area image with the first background image or the second background image, or synthesize the first target area image with the second background image.
In one embodiment, the domain decision model comprises a global decision model and a target background decision model, wherein the global decision model is used for assisting in training the domain conversion model; the calculating module 620 is further configured to calculate a domain of the first target area image and a domain of the first background image through the target background determination model.
In one embodiment, the training of the domain decision model and the domain conversion model includes: constructing a real sample set and a fake sample set; and training the domain decision model and the domain conversion model with the real sample set and the fake sample set in an alternating iterative training manner, during which the domain verification loss of the target background decision model is calculated.
In one embodiment, the real sample set comprises a plurality of real images, and the fake sample set comprises a plurality of composite images; calculating a domain verification loss of the target background model, comprising: calculating a first similarity according to the characteristic diagram of the real image and the characteristic diagram of the target area image; calculating a second similarity according to the feature map of the synthetic image and the feature map of the target area image; and determining the domain verification loss of the target background model according to the first similarity and the second similarity.
In one embodiment, the domain verification loss is calculated according to the following formula:

L_dv = D_V(Î, M) − D_V(I, M)

wherein L_dv is the domain verification loss, I is the feature map of the real image, Î is the feature map of the synthetic image, and M is the feature map of the target area image.
In an embodiment, the converting module 630 is further configured to convert the domain of the first target area image into the domain of the first background image through a domain conversion model according to the domain of the first background image, so as to obtain a second target area image; or converting the domain of the first background image into the domain of the first target area image through the domain conversion model according to the domain of the first target area image to obtain a second background image.
In one embodiment, the synthesizing apparatus further comprises a target image group acquiring module and a target region image group composing module, wherein: the target image group acquisition module is used for acquiring a target image group; and the target area image group forming module is used for extracting a first target area image from each target image in the target image group to obtain a plurality of first target area images to form a target area image group.
For specific limitations of the image synthesis apparatus, reference may be made to the limitations of the image synthesis method above, which are not repeated here. Each module in the image synthesis apparatus may be implemented wholly or partly by software, by hardware, or by a combination of the two. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 7. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement an image composition method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the structure shown in fig. 7 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group; calculating a domain of the first target area image and a domain of the first background image through a domain judgment model; converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image; and synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
In one embodiment, the domain decision model comprises a global decision model and a target background decision model, wherein the global decision model is used for assisting in training the domain conversion model; the processor, when executing the computer program, further performs the steps of: the domain of the first target area image and the domain of the first background image are calculated by the target background determination model.
In one embodiment, the processor, when executing the computer program, further performs the steps of: constructing a real sample set and a fake sample set; and training the domain decision model and the domain conversion model with the real sample set and the fake sample set in an alternating iterative training manner, during which the domain verification loss of the target background decision model is calculated.
In one embodiment, the real sample set comprises a plurality of real images, and the fake sample set comprises a plurality of composite images; the processor, when executing the computer program, further performs the steps of: calculating a first similarity according to the characteristic diagram of the real image and the characteristic diagram of the target area image; calculating a second similarity according to the feature map of the synthetic image and the feature map of the target area image; and determining the domain verification loss of the target background model according to the first similarity and the second similarity.
In one embodiment, the processor, when executing the computer program, further performs the steps of: calculating the domain verification loss according to the following formula:

L_dv = D_V(Î, M) − D_V(I, M)

wherein L_dv is the domain verification loss, I is the feature map of the real image, Î is the feature map of the synthetic image, and M is the feature map of the target area image.
In one embodiment, the processor, when executing the computer program, further performs the steps of: converting the domain of the first target area image into the domain of the first background image through a domain conversion model according to the domain of the first background image to obtain a second target area image; or converting the domain of the first background image into the domain of the first target area image through the domain conversion model according to the domain of the first target area image to obtain a second background image.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring a target image group; and extracting a first target area image from each target image in the target image group to obtain a plurality of first target area images to form a target area image group.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group; calculating a domain of the first target area image and a domain of the first background image through a domain judgment model; converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image; and synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
In one embodiment, the domain decision model comprises a global decision model and a target background decision model, wherein the global decision model is used for assisting in training the domain conversion model; the computer program when executed by the processor further realizes the steps of: the domain of the first target area image and the domain of the first background image are calculated by the target background determination model.
In one embodiment, the computer program when executed by the processor further performs the steps of: constructing a real sample set and a fake sample set; and training the domain decision model and the domain conversion model with the real sample set and the fake sample set in an alternating iterative training manner, during which the domain verification loss of the target background decision model is calculated.
In one embodiment, the real sample set comprises a plurality of real images, and the fake sample set comprises a plurality of composite images; the computer program when executed by the processor further realizes the steps of: calculating a first similarity according to the characteristic diagram of the real image and the characteristic diagram of the target area image;
calculating a second similarity according to the feature map of the synthetic image and the feature map of the target area image;
and determining the domain verification loss of the target background model according to the first similarity and the second similarity.
In one embodiment, the computer program when executed by the processor further performs the steps of: calculating the domain verification loss according to the following formula:

L_dv = D_V(Î, M) − D_V(I, M)

wherein L_dv is the domain verification loss, I is the feature map of the real image, Î is the feature map of the synthetic image, and M is the feature map of the target area image.
In one embodiment, the computer program when executed by the processor further performs the steps of: converting the domain of the first target area image into the domain of the first background image through a domain conversion model according to the domain of the first background image to obtain a second target area image; or converting the domain of the first background image into the domain of the first target area image through the domain conversion model according to the domain of the first target area image to obtain a second background image.
In one embodiment, the computer program when executed by the processor further performs the steps of: acquiring a target image group; and extracting a first target area image from each target image in the target image group to obtain a plurality of first target area images to form a target area image group.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing relevant hardware; the program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, and the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above examples express only several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that several variations and modifications can be made by those skilled in the art without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. An image synthesis method, characterized in that the method comprises:
acquiring a first target area image from the target area image group, and acquiring a first background image from the background image group;
calculating a domain of the first target area image and a domain of the first background image by a domain decision model;
converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image;
and synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
2. The method of claim 1, wherein the domain decision model comprises a global decision model and a target context decision model, and the global decision model is used to assist in training the domain transformation model; the calculating, by a domain decision model, a domain of the first target area image and a domain of the first background image includes:
calculating a domain of the first target area image and a domain of the first background image by the target background determination model.
3. The method of claim 2, wherein the training of the domain decision model and the domain transformation model comprises:
constructing a real sample set and a fake sample set;
and training the domain decision model and the domain conversion model with the real sample set and the fake sample set in an alternating iterative training manner.
4. The method of claim 3, wherein the set of real samples comprises a number of real images, and the set of counterfeit samples comprises a number of composite images; the calculation of the domain verification loss of the target background model comprises the following steps:
calculating a first similarity according to the characteristic diagram of the real image and the characteristic diagram of the target area image;
calculating a second similarity according to the feature map of the synthetic image and the feature map of the target area image;
and determining the domain verification loss of the target background model according to the first similarity and the second similarity.
5. The method of claim 4, wherein the domain verification loss is calculated according to the following formula:

L_dv = D_V(Î, M) − D_V(I, M)

wherein L_dv is the domain verification loss, I is the feature map of the real image, Î is the feature map of the synthetic image, and M is the feature map of the target area image.
6. The method according to any one of claims 1 to 5, wherein the converting the first target area image and the first background image into the same domain by a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image comprises:
converting the domain of the first target area image to the domain of the first background image through a domain conversion model according to the domain of the first background image to obtain a second target area image; or
And converting the domain of the first background image into the domain of the first target area image through a domain conversion model according to the domain of the first target area image to obtain the second background image.
7. The method according to any one of claims 1 to 5, wherein before the acquiring the first target area image from the target area image group and the acquiring the first background image from the background image group, the method further comprises:
acquiring a target image group;
and extracting the first target area image from each target image in the target image group to obtain a plurality of first target area images to form the target area image group.
8. An image synthesizing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring a first target area image from the target area image group and acquiring a first background image from the background image group;
a calculation module, configured to calculate, through a domain determination model, a domain of the first target area image and a domain of the first background image;
the conversion module is used for converting the first target area image and the first background image into the same domain through a domain conversion model according to the domain of the first target area image and the domain of the first background image to obtain at least one of a second target area image and a second background image;
and the synthesis module is used for synthesizing the second target area image with the first background image or the second background image, or synthesizing the first target area image with the second background image.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202010559322.5A 2020-06-18 2020-06-18 Image synthesis method, image synthesis device, computer equipment and storage medium Pending CN111583168A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010559322.5A CN111583168A (en) 2020-06-18 2020-06-18 Image synthesis method, image synthesis device, computer equipment and storage medium


Publications (1)

Publication Number Publication Date
CN111583168A true CN111583168A (en) 2020-08-25

Family

ID=72121837

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010559322.5A Pending CN111583168A (en) 2020-06-18 2020-06-18 Image synthesis method, image synthesis device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111583168A (en)


Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101631189A (en) * 2008-07-15 2010-01-20 鸿富锦精密工业(深圳)有限公司 Image synthesis system and method
TW201007611A (en) * 2008-08-01 2010-02-16 Hon Hai Prec Ind Co Ltd Image systhesizing system and method thereof
CN106303250A (en) * 2016-08-26 2017-01-04 维沃移动通信有限公司 A kind of image processing method and mobile terminal
WO2017140182A1 (en) * 2016-02-15 2017-08-24 努比亚技术有限公司 Image synthesis method and apparatus, and storage medium
AU2017101166A4 (en) * 2017-08-25 2017-11-02 Lai, Haodong MR A Method For Real-Time Image Style Transfer Based On Conditional Generative Adversarial Networks
CN107968917A (en) * 2017-12-05 2018-04-27 广东欧珀移动通信有限公司 Image processing method and device, computer equipment, computer-readable recording medium
CN109727264A (en) * 2019-01-10 2019-05-07 南京旷云科技有限公司 Image generating method, the training method of neural network, device and electronic equipment
CN110288019A (en) * 2019-06-21 2019-09-27 北京百度网讯科技有限公司 Image labeling method, device and storage medium
CN110378432A (en) * 2019-07-24 2019-10-25 网易无尾熊(杭州)科技有限公司 Picture Generation Method, device, medium and electronic equipment
CN110956654A (en) * 2019-12-02 2020-04-03 Oppo广东移动通信有限公司 Image processing method, device, equipment and storage medium


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Fangneng Zhan et al.: "Hierarchy Composition GAN for High-fidelity Image Synthesis" *
Wenyan Cong et al.: "DoveNet: Deep Image Harmonization via Domain Verification" *
Dai Shuo: "Research on Image Translation Algorithms Based on CycleGAN" *
Li Junyi et al.: "Research on Image Style Transfer Methods Based on Perceptual Adversarial Networks" *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112836756A (en) * 2021-02-04 2021-05-25 Shanghai Mininglamp Artificial Intelligence (Group) Co., Ltd. Image recognition model training method, system and computer equipment
CN112836756B (en) * 2021-02-04 2024-02-27 Shanghai Mininglamp Artificial Intelligence (Group) Co., Ltd. Image recognition model training method, system and computer equipment

Similar Documents

Publication Publication Date Title
Ren et al. Low-light image enhancement via a deep hybrid network
CN106778928B (en) Image processing method and device
CN109886077B (en) Image recognition method and device, computer equipment and storage medium
CN110246181B (en) Anchor-point-based pose estimation model training method, pose estimation method and system
CN111199531A (en) Interactive data expansion method based on Poisson image fusion and image stylization
CN112528969B (en) Face image authenticity detection method and system, computer equipment and storage medium
CN108833785A (en) Multi-view image fusion method, device, computer equipment and storage medium
CN109871845B (en) Certificate image extraction method and terminal equipment
CN112132741B (en) Face photo image and sketch image conversion method and system
CN109977832B (en) Image processing method, device and storage medium
CN113673530A (en) Remote sensing image semantic segmentation method and device, computer equipment and storage medium
CN110298829A (en) Tongue diagnosis method, apparatus, system, computer equipment and storage medium
Li et al. Adaptive representation-based face sketch-photo synthesis
CN112651333B (en) Silence living body detection method, silence living body detection device, terminal equipment and storage medium
CN116701706B (en) Data processing method, device, equipment and medium based on artificial intelligence
CN111583168A (en) Image synthesis method, image synthesis device, computer equipment and storage medium
CN115115552B (en) Image correction model training method, image correction device and computer equipment
Zeng et al. 3D²Unet: 3D Deformable Unet for Low-Light Video Enhancement
Peng et al. MPIB: An MPI-based bokeh rendering framework for realistic partial occlusion effects
Wang et al. A multi-scale attentive recurrent network for image dehazing
CN112464924A (en) Method and device for constructing training set
CN113792807B (en) Skin disease classification model training method, system, medium and electronic equipment
CN110490950B (en) Image sample generation method and device, computer equipment and storage medium
Halperin et al. Clear Skies Ahead: Towards Real-Time Automatic Sky Replacement in Video
CN114758054A (en) Light spot adding method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination