CN110009044B - Model training method and device, and image processing method and device - Google Patents


Info

Publication number
CN110009044B
CN110009044B (application CN201910280851.9A)
Authority
CN
China
Prior art keywords
image
training
model
feature
training image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910280851.9A
Other languages
Chinese (zh)
Other versions
CN110009044A (en)
Inventor
聂凤梅
姚涛
黄通兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qixin Yiwei Shenzhen Technology Co ltd
Beijing 7Invensun Technology Co Ltd
Original Assignee
Qixin Yiwei Shenzhen Technology Co ltd
Beijing 7Invensun Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qixin Yiwei Shenzhen Technology Co ltd, Beijing 7Invensun Technology Co Ltd filed Critical Qixin Yiwei Shenzhen Technology Co ltd
Priority to CN201910280851.9A
Publication of CN110009044A
Application granted
Publication of CN110009044B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present application disclose a model training method and device, and an image processing method and device. A first generative model can process an image having a first feature into an image having a second feature, and a second generative model can process an image having the second feature into an image having the first feature. If the second generative model can restore an image processed by the first generative model to the original image, and the first generative model can restore an image processed by the second generative model to the original image, the two generative models can be considered to process images accurately. Training of the image processing model can therefore be constrained to minimize a first objective function and/or a second objective function, and the image processing model can be updated with a model optimization algorithm. The resulting image processing model processes images accurately and outputs high-quality images, improving the user experience.

Description

Model training method and device, and image processing method and device
Technical Field
The present application relates to the field of computers, and in particular, to a model training method and apparatus, and an image processing method and apparatus.
Background
In image processing, there is often a need to process a target position in an image, such as removing glasses or a beard from a portrait, adjusting a person's facial expression, or adding glasses or a beard to a portrait. To meet this requirement, an image processing model can be trained to perform specific processing on images. However, in the output images of existing image processing models, details are severely lost and image quality is poor, failing to meet users' requirements.
Disclosure of Invention
To solve the prior-art problem of poor output image quality in machine learning models, embodiments of the present application provide a model training method and device, and an image processing method and device.
The embodiment of the application provides a model training method, which comprises the following steps:
acquiring a training image, wherein the training image comprises a first training image and/or a second training image, the first training image is an image comprising a first feature, the second training image is an image comprising a second feature, and the first feature and the second feature are features of the same attribute with different presentations;
training an image processing model by using the training image, wherein the image processing model comprises a first generative model and a second generative model;
the training process comprises the following steps:
inputting the first training image into the first generation model to obtain a third training image, and inputting the third training image into the second generation model to obtain a fourth training image, wherein the third training image is an image including the second feature, and the fourth training image is an image including the first feature;
inputting the second training image into the second generation model to obtain a fifth training image, and inputting the fifth training image into the first generation model to obtain a sixth training image, wherein the fifth training image is an image including the first feature, and the sixth training image is an image including the second feature;
the training process further includes the following constraints: a first objective function minimization and/or a second objective function minimization;
the first objective function is used to express a difference between the fourth training image and the first training image, and the second objective function is used to express a difference between the sixth training image and the second training image.
Optionally, the first objective function is

$$l_{aid}^0 = \|x_0 - G_1(G_0(x_0))\|_1$$

where $x_0$ denotes the first training image, $G_0$ the first generative model, $G_0(x_0)$ the third training image, $G_1$ the second generative model, $G_1(G_0(x_0))$ the fourth training image, and $\|x_0 - G_1(G_0(x_0))\|_1$ denotes the L1 norm of $x_0 - G_1(G_0(x_0))$.
Optionally, the second objective function is

$$l_{aid}^1 = \|x_1 - G_0(G_1(x_1))\|_1$$

where $x_1$ denotes the second training image, $G_1$ the second generative model, $G_1(x_1)$ the fifth training image, $G_0$ the first generative model, $G_0(G_1(x_1))$ the sixth training image, and $\|x_1 - G_0(G_1(x_1))\|_1$ denotes the L1 norm of $x_1 - G_0(G_1(x_1))$.
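For illustration only, the two objective functions translate directly into a few lines of PyTorch. This is a minimal sketch, not part of the claimed method: the generator architectures, image sizes, and variable names are assumptions, and `nn.L1Loss` computes a mean-reduced L1 distance rather than an unnormalized norm.

```python
import torch
import torch.nn as nn

# Stand-in generators; the patent does not fix an architecture, so a single
# convolution serves as a placeholder for each generative model.
G0 = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # first generative model: first feature -> second
G1 = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # second generative model: second feature -> first

x0 = torch.randn(4, 3, 64, 64)  # batch of first training images (first feature)
x1 = torch.randn(4, 3, 64, 64)  # batch of second training images (second feature)

l1 = nn.L1Loss()                 # mean-reduced L1 distance
l_aid_0 = l1(G1(G0(x0)), x0)     # first objective:  ||x0 - G1(G0(x0))||_1
l_aid_1 = l1(G0(G1(x1)), x1)     # second objective: ||x1 - G0(G1(x1))||_1
```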
Optionally, the first feature is that there is a light spot on the target object's face, and the second feature is that there is no light spot on the target object's face.
Optionally, the first feature is that the target object wears glasses on the face, and the second feature is that the target object does not wear glasses on the face.
An embodiment of the present application further provides a model training device, the device includes:
a training image obtaining unit, configured to obtain a training image, where the training image includes a first training image and/or a second training image, where the first training image is an image including a first feature, the second training image is an image including a second feature, and the first feature and the second feature are features that are directed to a same attribute and have different presentations;
the model training unit is used for training an image processing model by using the training image, wherein the image processing model comprises a first generation model and a second generation model;
the training process comprises the following steps:
inputting the first training image into the first generation model to obtain a third training image, and inputting the third training image into the second generation model to obtain a fourth training image, wherein the third training image is an image including the second feature, and the fourth training image is an image including the first feature;
inputting the second training image into the second generation model to obtain a fifth training image, and inputting the fifth training image into the first generation model to obtain a sixth training image, wherein the fifth training image is an image including the first feature, and the sixth training image is an image including the second feature;
the training process further includes the following constraints: a first objective function minimization and/or a second objective function minimization;
the first objective function is used to express a difference between the fourth training image and the first training image, and the second objective function is used to express a difference between the sixth training image and the second training image.
Optionally, the first objective function is

$$l_{aid}^0 = \|x_0 - G_1(G_0(x_0))\|_1$$

where $x_0$ denotes the first training image, $G_0$ the first generative model, $G_0(x_0)$ the third training image, $G_1$ the second generative model, $G_1(G_0(x_0))$ the fourth training image, and $\|x_0 - G_1(G_0(x_0))\|_1$ denotes the L1 norm of $x_0 - G_1(G_0(x_0))$.
Optionally, the second objective function is

$$l_{aid}^1 = \|x_1 - G_0(G_1(x_1))\|_1$$

where $x_1$ denotes the second training image, $G_1$ the second generative model, $G_1(x_1)$ the fifth training image, $G_0$ the first generative model, $G_0(G_1(x_1))$ the sixth training image, and $\|x_1 - G_0(G_1(x_1))\|_1$ denotes the L1 norm of $x_1 - G_0(G_1(x_1))$.
Optionally, the first feature is that there is a light spot on the target object's face, and the second feature is that there is no light spot on the target object's face.
Optionally, the first feature is that the target object wears glasses on the face, and the second feature is that the target object does not wear glasses on the face.
The embodiment of the application also provides an image processing method, which comprises the following steps:
acquiring a first image, wherein the first image comprises a first feature;
inputting the first image into a first generative model to obtain a second image, wherein the second image includes a second feature, the first feature and the second feature are features of the same attribute with different presentations, and the first generative model is trained using the model training method provided in the embodiments of the present application.
An embodiment of the present application further provides an image processing apparatus, including:
a first image acquisition unit for acquiring a first image, the first image including a first feature;
the second image obtaining unit is configured to input the first image into a first generation model to obtain a second image, where the second image includes a second feature, the first feature and the second feature are features that are specific to the same attribute and have different presentations, and the first generation model is obtained by training using a model training method provided in an embodiment of the present application.
The embodiment of the application also provides another image processing method, which comprises the following steps:
acquiring a third image, wherein the third image comprises a second feature;
inputting the third image into a second generative model to obtain a fourth image, wherein the fourth image includes a first feature, the first feature and the second feature are features of the same attribute with different presentations, and the second generative model is trained using the model training method provided in the embodiments of the present application.
An embodiment of the present application further provides another image processing apparatus, including:
a third image acquisition unit configured to acquire a third image, the third image including a second feature;
a fourth image obtaining unit, configured to input the third image into a second generative model to obtain a fourth image, where the fourth image includes a first feature, the first feature and the second feature are features of the same attribute with different presentations, and the second generative model is trained using the model training method provided in an embodiment of the present application.
Embodiments of the present application provide a model training method and device. A training image is first acquired, where the training image includes a first training image and/or a second training image; the first training image is an image including a first feature, the second training image is an image including a second feature, and the first feature and the second feature are features of the same attribute with different presentations (for example, wearing glasses and not wearing glasses can serve as the first feature and the second feature, respectively). An image processing model including a first generative model and a second generative model is then trained with the training image: inputting the first training image into the first generative model yields a third training image with the second feature, and inputting the third training image into the second generative model yields a fourth training image with the first feature; correspondingly, inputting the second training image into the second generative model yields a fifth training image with the first feature, and inputting the fifth training image into the first generative model yields a sixth training image with the second feature. In embodiments of the present application, the difference between the first training image and the fourth training image can be expressed by a first objective function, and the difference between the second training image and the sixth training image by a second objective function; minimizing the first objective function and/or the second objective function serves as a constraint for training the image processing model, yielding the trained image processing model.
Because the first generative model can process an image with the first feature into an image with the second feature, and the second generative model can process an image with the second feature into an image with the first feature, if the second generative model can restore an image processed by the first generative model to the original image, and the first generative model can restore an image processed by the second generative model to the original image, the two generative models can be considered to process images accurately. On this basis, minimizing the first objective function and/or the second objective function can be set as a constraint for training the image processing model, and the model can be updated with a model optimization algorithm, so that the resulting image processing model processes images more accurately, outputs higher-quality images, and improves the user experience.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments described in the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a model training method provided in an embodiment of the present application;
fig. 2 is an image processing schematic diagram of an image processing model according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of image processing of another image processing model according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of image processing of another image processing model provided in an embodiment of the present application;
fig. 5 is a flowchart of an image processing method according to an embodiment of the present application;
FIG. 6 is a flowchart of another image processing method provided in the embodiments of the present application;
fig. 7 is a block diagram illustrating a structure of a model training apparatus according to an embodiment of the present disclosure;
fig. 8 is a block diagram of an image processing apparatus according to an embodiment of the present application;
fig. 9 is a block diagram of another image processing apparatus according to an embodiment of the present application.
Detailed Description
To help those skilled in the art better understand the technical solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
In the prior art, an image processing model can be trained with a model optimization algorithm so that it can process a target position in an image, for example, removing glasses or a beard from a portrait, adjusting a person's facial expression, or adding glasses or a beard to a portrait. However, in the output images of existing image processing models, the loss of detail is severe and image quality is poor, failing to meet users' requirements; for example, in removing glasses from a portrait, other parts of the portrait may be changed, so that the processed portrait no longer looks like the same person as the portrait before processing.
For example, in some scenes a face image includes glasses, and light reflected by the glasses produces light spots. If such a face image is used for gaze analysis, the light spots reflected by the glasses affect the analysis result, so the light spots can be removed with an image processing model and gaze analysis then performed on the processed image, improving the accuracy of the gaze analysis result. However, an existing image processing model not only removes the light spots reflected by the glasses but also changes other positions of the face image; for example, changes in the shape, size, or spacing of the eyes and the like also affect the accuracy of the gaze analysis result.
Based on this, embodiments of the present application provide a model training method and device. A training image may first be acquired, where the training image includes a first training image and/or a second training image; the first training image is an image including a first feature, the second training image is an image including a second feature, and the first feature and the second feature are features of the same attribute with different presentations (for example, wearing glasses and not wearing glasses can serve as the first feature and the second feature, respectively). An image processing model including a first generative model and a second generative model is then trained with the training image: inputting the first training image into the first generative model yields a third training image with the second feature, and inputting the third training image into the second generative model yields a fourth training image with the first feature; correspondingly, inputting the second training image into the second generative model yields a fifth training image with the first feature, and inputting the fifth training image into the first generative model yields a sixth training image with the second feature. In embodiments of the present application, the difference between the first training image and the fourth training image can be expressed by a first objective function, and the difference between the second training image and the sixth training image by a second objective function; minimizing the first objective function and/or the second objective function serves as a constraint for training the image processing model, yielding the trained image processing model.
Because the first generative model can process an image with the first feature into an image with the second feature, and the second generative model can process an image with the second feature into an image with the first feature, if the second generative model can restore an image processed by the first generative model to the original image, and the first generative model can restore an image processed by the second generative model to the original image, the two generative models can be considered to process images accurately. On this basis, minimizing the first objective function and/or the second objective function can be set as a constraint for training the image processing model, and the model can be updated with a model optimization algorithm, so that the resulting image processing model processes images more accurately, outputs higher-quality images, and improves the user experience.
Embodiment One
referring to fig. 1, the figure is a flowchart of a model training method provided in an embodiment of the present application. The model training method provided by the embodiment comprises the following steps:
and S101, acquiring a training image.
The training images are used to train the image processing model, and the image processing model is used to process images. The image processing model can therefore be trained in a targeted manner according to actual image processing requirements, so that the trained image processing model has the corresponding functions, and the training images used to train it are selected according to those requirements. For example, if an image processing model is to process a first feature in an image into a second feature, the training images used to train it may be images including the first feature and images including the second feature. For instance, if the model's function is to remove glasses, the training images are data sets of faces with glasses and faces without glasses.
In embodiments of the present application, the image processing model may be an unsupervised model, such as a model based on Generative Adversarial Networks (GAN), and the training images used to train the image processing model need not be manually recognized and labeled.
The training images may include only the first training image, only the second training image, or both the first training image and the second training image. The first training image is an image including a first feature, the second training image is an image including a second feature, and the first feature and the second feature are features of the same attribute with different presentations.
For example, if the attribute is glasses, the first feature may be that the target object wears glasses on the face and the second feature that it does not; if the attribute is a light spot, the first feature may be that there is a light spot on the target object's face and the second feature that there is none; if the attribute is a beard, the first feature may be that there is a beard on the target object's face and the second feature that there is none; if the attribute is the mouth, the first feature may be that the target object is smiling with teeth showing and the second feature that it is not, and so on. Of course, the first feature and the second feature may be interchanged; that is, the first feature may be that the target object does not wear glasses on the face and the second feature that it does, and so on.
In the scenario of removing light spots reflected by glasses with the image processing model, the first feature may be that the target object wears glasses with light spots on them, and the second feature may be that the target object wears glasses without light spots. Of course, when the first feature is that the target object wears glasses with light spots on them, the corresponding second feature may also be that the target object wears no glasses at all, and images with either of these second features can be used to train the image processing model.
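Because the training is unsupervised, assembling the training data can amount to collecting two unlabeled image folders, one per feature. The sketch below assumes a folder-per-feature layout (for example, `glasses/` and `no_glasses/`); the class name and layout are illustrative, not prescribed by the method.

```python
import os
from PIL import Image
from torch.utils.data import Dataset

class UnlabeledFeatureImages(Dataset):
    """All images in one folder share one feature (e.g. faces with glasses).

    No manual recognition or tagging is required, matching the unsupervised
    GAN-based setting described above.
    """
    def __init__(self, root, transform=None):
        self.paths = sorted(
            os.path.join(root, name) for name in os.listdir(root)
        )
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        image = Image.open(self.paths[idx]).convert("RGB")
        return self.transform(image) if self.transform else image

# e.g. first_images = UnlabeledFeatureImages("glasses/", transform=to_tensor)
```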
S102: Train the image processing model using the training image.
In embodiments of the present application, the image processing model may include a first generative model and a second generative model. Inputting the first training image into the first generative model yields a third training image including the second feature; that is, the first generative model has the capability of processing the first feature into the second feature. Inputting the second training image into the second generative model yields a fifth training image including the first feature; that is, the second generative model has the capability of processing the second feature into the first feature.
Theoretically, the first generative model and the second generative model perform exactly opposite processes: the fourth training image, obtained by the second generative model processing the third training image output by the first generative model, should be identical to the first training image, and the sixth training image, obtained by the first generative model processing the fifth training image output by the second generative model, should be identical to the second training image.
In practice, however, image processing does not reach this ideal state. Therefore, in embodiments of the present application, the difference between the fourth training image and the first training image can be expressed by a first objective function, and the difference between the sixth training image and the second training image by a second objective function. During training of the image processing model, minimizing the first objective function and/or minimizing the second objective function can serve as constraints; that is, training reduces the difference between the fourth and first training images and/or the difference between the sixth and second training images, bringing the image processing model closer to the ideal state. It will be appreciated that the closer the image processing model is to the ideal, the more accurate its image processing will be, and the smaller the changes to features other than the first or second feature.
Referring to FIG. 2, an image processing schematic diagram of an image processing model is shown. The first training image $x_0$ is input into the first generative model $G_0$ to obtain the third training image $G_0(x_0)$, and the third training image $G_0(x_0)$ is input into the second generative model $G_1$ to obtain the fourth training image $G_1(G_0(x_0))$. The first objective function may be expressed as:

$$l_{aid}^0 = \|x_0 - G_1(G_0(x_0))\|_1 \tag{1}$$

where $\|x_0 - G_1(G_0(x_0))\|_1$ denotes the L1 norm of $x_0 - G_1(G_0(x_0))$.
Minimizing the first objective function may specifically involve initializing the weight parameters of the image processing model, setting the hyper-parameters of the model, and adjusting the weight parameters of the first generative model and the second generative model by gradient descent so as to minimize the first objective function. The hyper-parameters may include the number of training rounds $n$, the learning rate $lr$, the batch size $bn$, and so on.
Referring to FIG. 3, an image processing schematic diagram of another image processing model is shown. The second training image $x_1$ is input into the second generative model $G_1$ to obtain the fifth training image $G_1(x_1)$, and the fifth training image $G_1(x_1)$ is input into the first generative model $G_0$ to obtain the sixth training image $G_0(G_1(x_1))$. The second objective function may be expressed as:

$$l_{aid}^1 = \|x_1 - G_0(G_1(x_1))\|_1 \tag{2}$$

where $\|x_1 - G_0(G_1(x_1))\|_1$ denotes the L1 norm of $x_1 - G_0(G_1(x_1))$.
Minimizing the second objective function may similarly involve initializing the weight parameters of the image processing model, setting the hyper-parameters of the model, and adjusting the weight parameters of the first generative model and the second generative model by gradient descent so as to minimize the second objective function. The hyper-parameters may include the number of training rounds $n$, the learning rate $lr$, the batch size $bn$, and so on.
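A minimal training loop consistent with this description might look as follows. It reuses the `G0`, `G1`, and `l1` names from the earlier sketch; the hyper-parameter values and the `loader_x0`/`loader_x1` data loaders are assumptions for illustration.

```python
import itertools
import torch

# Hyper-parameters named as in the text; the values are illustrative only.
n, lr, bn = 100, 1e-4, 16  # training rounds, learning rate, batch size

optimizer = torch.optim.SGD(  # plain gradient descent on both generators' weights
    itertools.chain(G0.parameters(), G1.parameters()), lr=lr
)

# loader_x0 / loader_x1: assumed DataLoaders over the two image sets, batch size bn.
for epoch in range(n):
    for x0, x1 in zip(loader_x0, loader_x1):
        l_aid_0 = l1(G1(G0(x0)), x0)   # first objective function, equation (1)
        l_aid_1 = l1(G0(G1(x1)), x1)   # second objective function, equation (2)
        loss = l_aid_0 + l_aid_1       # minimize both constraints jointly
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```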
With minimizing the first objective function and/or the second objective function as constraints in the training process, the image processing model can be updated with a model optimization algorithm so that the updated image processing model processes images more accurately. The model optimization algorithm adjusts the parameters of the image processing model, thereby updating it.
In embodiments of the present application, because the first generative model can process an image with the first feature into an image with the second feature, and the second generative model can process an image with the second feature into an image with the first feature, if the second generative model can restore an image processed by the first generative model to the original image, and the first generative model can restore an image processed by the second generative model to the original image, the two generative models can be considered to process images accurately. On this basis, minimizing the first objective function and/or the second objective function can be set as a constraint for training the image processing model, and the model can be updated with a model optimization algorithm, so that the resulting image processing model processes images more accurately, outputs higher-quality images, and improves the user experience.
Based on the model training method provided in the above embodiment, the embodiment of the present application may further have other constraint conditions, and specifically, may further minimize at least one of the following functions: a loss function of the discriminant model, a first loss function of the first generative model, a second loss function of the first generative model, a third loss function of the second generative model, and a fourth loss function of the second generative model.
Specifically, referring to FIG. 4, an image processing schematic diagram of another image processing model provided in an embodiment of the present application is shown. This image processing model includes a first generative model, a second generative model, and a discriminant model, where the discriminant model is used to judge the probability that an input image is an original image and the probability that it is an image processed by the first or second generative model; the result of the judgment is a probability value. The loss function of the discriminant model can be expressed as:
$$l_d = l_{cls}(p, t) = -\log(p_t), \quad t = 0, 1, 2 \tag{3}$$

where $p_t$ denotes the probability that the image belongs to class $t$: class 0 denotes an input image of the first generative model, class 1 denotes an input image of the second generative model, and class 2 denotes an output image of the first generative model or the second generative model.
In embodiments of the present application, minimizing the loss function of the discriminant model may also serve as a constraint in the training process of the image processing model, so as to reduce the difference, as judged by the discriminant model, between original images and processed images.
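Equation (3) is an ordinary three-class cross-entropy. The sketch below shows one way it could be applied, with an assumed toy discriminator head and with `x0`, `x1`, `G0`, and `G1` carried over from the earlier sketches.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy three-class discriminant model D; the real architecture is not specified.
D = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=4, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 3),
)

def l_d(images, t):
    """Equation (3): mean of -log(p_t) over a batch that belongs to class t."""
    target = torch.full((images.size(0),), t, dtype=torch.long)
    return F.cross_entropy(D(images), target)

# class 0: inputs of G0; class 1: inputs of G1; class 2: generated outputs
loss_d = (l_d(x0, 0) + l_d(x1, 1)
          + l_d(G0(x0).detach(), 2) + l_d(G1(x1).detach(), 2))
```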
In a specific implementation, the first training image $x_0$ is input into the first generative model $G_0$ to obtain the third training image $\tilde{x}_0$, which may also be denoted $G_0(x_0)$, and the first and third training images are input into the discriminant model $D$; the second training image $x_1$ is input into the second generative model $G_1$ to obtain the fifth training image $\tilde{x}_1$, which may also be denoted $G_1(x_1)$, and the second and fifth training images are input into the discriminant model $D$. The discriminant model thus judges the first training image $x_0$, the second training image $x_1$, the third training image $\tilde{x}_0$, and the fifth training image $\tilde{x}_1$, where the probability that the first training image is of class 0 should be high, the probability that the second training image is of class 1 should be high, and the probabilities that the third and fifth training images are of class 2 should be high.

In addition, the third training image $\tilde{x}_0$ may be input into the second generative model $G_1$ to obtain the fourth training image $G_1(G_0(x_0))$, and the fifth training image $\tilde{x}_1$ may be input into the first generative model $G_0$ to obtain the sixth training image $G_0(G_1(x_1))$.
Thus, the first loss function $l_g^{01}$ of the first generative model can be expressed as:

$$l_g^{01} = \gamma l_{aid}^0 + l_o^{01} = \gamma l_{aid}^0 + l_{GAN}^0 + \alpha l_{pix}^0 + \beta l_{per}^0 \tag{4}$$

where $l_o^{01}$ is the third objective function of the first generative model, $l_{GAN}^0$ denotes the GAN loss of the first generative model, $l_{pix}^0$ denotes the residual regularization value of the first generative model, $l_{per}^0$ denotes the perceptual loss of the first generative model, and $\alpha$, $\beta$, and $\gamma$ are weight parameters of the loss function, which can be set empirically by a person skilled in the art; for $l_{aid}^0$, refer to equation (1). Specifically,

$$l_{GAN}^0 = -\log(D(G_0(x_0))) \tag{5}$$

$$l_{pix}^0 = \|r_0\|_1 \tag{6}$$

$$l_{per}^0 = \|\phi(x_0) - \phi(\tilde{x}_0)\|_1 \tag{7}$$

where $r_0$ is the residual image obtained when $x_0$ is input into the first generative model $G_0$, and $\phi(x_0)$ and $\phi(\tilde{x}_0)$ denote the values of $x_0$ and $\tilde{x}_0$ mapped according to a certain rule.
The second loss function $l_g^{02}$ of the first generative model can be expressed as:

$$l_g^{02} = \gamma l_{aid}^0 + l_o^{02} = \gamma l_{aid}^0 + l_{GAN}^0 + l_{dual}^0 + \alpha l_{pix}^0 + \beta l_{per}^0 \tag{8}$$

where $l_o^{02}$ is the fourth objective function of the first generative model, $l_{GAN}^0$ denotes the GAN loss of the first generative model, $l_{dual}^0$ denotes the first loss function in the dual learning process of the first and second generative models, $l_{pix}^0$ denotes the residual regularization value of the first generative model, $l_{per}^0$ denotes the perceptual loss of the first generative model, and $\alpha$, $\beta$, and $\gamma$ are weight parameters of the loss function, which can be set empirically by a person skilled in the art; for $l_{GAN}^0$, $l_{pix}^0$, and $l_{per}^0$, refer to equations (5), (6), and (7), and for $l_{aid}^0$, refer to equation (1). Specifically, $l_{dual}^0$ is given by equation (9).
In embodiments of the present application, minimizing the first loss function and/or the second loss function of the first generative model may also be used as a constraint condition in the training process of the image processing model.
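Under the reconstructions above, the second loss function of equation (8) could be assembled as below. This is a sketch under stated assumptions: the mapping $\phi$ of equation (7) is taken as the identity, the residual $r_0$ is taken as $\tilde{x}_0 - x_0$, the probability used in equation (5) is read as the class-1 probability output by $D$, and $l_{dual}^0$ of equation (9), which is not legible in the source, is left as a zero placeholder.

```python
import torch
import torch.nn.functional as F

alpha, beta, gamma = 1.0, 1.0, 10.0          # empirical weight parameters (illustrative)

x0_tilde = G0(x0)                            # third training image
r0 = x0_tilde - x0                           # residual image of the first generative model (assumed form)
p = F.softmax(D(x0_tilde), dim=1)            # class probabilities from the discriminant model

l_aid_0 = F.l1_loss(G1(x0_tilde), x0)        # equation (1)
l_gan_0 = -torch.log(p[:, 1] + 1e-8).mean()  # equation (5); the class index is an assumption
l_pix_0 = r0.abs().mean()                    # equation (6): ||r0||_1, mean-reduced
l_per_0 = F.l1_loss(x0_tilde, x0)            # equation (7) with phi = identity (assumption)
l_dual_0 = torch.zeros(())                   # equation (9): placeholder, real form not given

l_g_02 = (gamma * l_aid_0 + l_gan_0 + l_dual_0
          + alpha * l_pix_0 + beta * l_per_0)  # equation (8)
```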
Correspondingly, the third loss function $l_g^{11}$ of the second generative model can be expressed as:

$$l_g^{11} = \gamma l_{aid}^1 + l_o^{11} = \gamma l_{aid}^1 + l_{GAN}^1 + \alpha l_{pix}^1 + \beta l_{per}^1 \tag{10}$$

where $l_o^{11}$ is the fifth objective function of the second generative model, $l_{GAN}^1$ denotes the GAN loss of the second generative model, $l_{pix}^1$ denotes the residual regularization value of the second generative model, $l_{per}^1$ denotes the perceptual loss of the second generative model, and $\alpha$, $\beta$, and $\gamma$ are weight parameters of the loss function, which can be set empirically by a person skilled in the art; for $l_{aid}^1$, refer to equation (2). Specifically,

$$l_{GAN}^1 = -\log(1 - D(G_1(x_1))) \tag{11}$$

$$l_{pix}^1 = \|r_1\|_1 \tag{12}$$

$$l_{per}^1 = \|\phi(x_1) - \phi(\tilde{x}_1)\|_1 \tag{13}$$

where $r_1$ is the residual image obtained when $x_1$ is input into the second generative model $G_1$, and $\phi(x_1)$ and $\phi(\tilde{x}_1)$ denote the values of $x_1$ and $\tilde{x}_1$ mapped according to a certain rule.
The fourth loss function $l_g^{12}$ of the second generative model can be expressed as:

$$l_g^{12} = \gamma l_{aid}^1 + l_o^{12} = \gamma l_{aid}^1 + l_{GAN}^1 + l_{dual}^1 + \alpha l_{pix}^1 + \beta l_{per}^1 \tag{14}$$

where $l_o^{12}$ is the sixth objective function of the second generative model, $l_{GAN}^1$ denotes the GAN loss of the second generative model, $l_{dual}^1$ denotes the second loss function in the dual learning process of the first and second generative models, $l_{pix}^1$ denotes the residual regularization value of the second generative model, $l_{per}^1$ denotes the perceptual loss of the second generative model, and $\alpha$, $\beta$, and $\gamma$ are weight parameters of the loss function, which can be set empirically by a person skilled in the art; for $l_{GAN}^1$, $l_{pix}^1$, and $l_{per}^1$, refer to equations (11), (12), and (13), and for $l_{aid}^1$, refer to equation (2). Specifically, $l_{dual}^1$ is given by equation (15).
in the embodiment of the present application, minimizing the third loss function and/or the fourth loss function of the second generative model may also be used as a constraint condition in the training process of the image processing model.
Embodiment Two
Based on the model training method provided in the foregoing embodiment, as shown in fig. 5, an embodiment of the present application further provides an image processing method, where the trained image processing model is used to process a first image, and specifically, the method may include the following steps:
S201: Acquire a first image.
The trained image processing model performs image processing for a certain attribute of an image, so the first image input into the image processing model may include the first feature; for the first feature, refer to the description in Embodiment One, which is not repeated here.
In the scenario where light spots reflected by glasses are removed by the image processing model, the first feature may be that the target object wears glasses on the face and light spots are present on the glasses.
S202: Input the first image into the first generative model to obtain a second image.
The first generative model is trained using the model training method of Embodiment One of the present application and can process the first feature into the second feature, so the second image includes the second feature; the first feature and the second feature are features of the same attribute with different presentations.
For example, if the attribute is glasses and the first feature is that the target object wears glasses on the face, the second feature may be that it does not; if the attribute is a light spot and the first feature is that there is a light spot on the target object's face, the second feature may be that there is none; if the attribute is a beard and the first feature is that there is a beard on the target object's face, the second feature may be that there is none; if the attribute is the mouth and the first feature is that the target object is smiling with teeth showing, the second feature may be that it is not, and so on.
In the scenario of removing light spots reflected by glasses with the image processing model, the first feature may be that the target object wears glasses with light spots on them, and the second feature may be that the target object wears glasses without light spots.
Since the first generative model is trained with the model training method provided in embodiments of the present application, it can process the first image in a targeted manner with high accuracy, and the output image is of high quality, improving the user experience.
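A minimal inference sketch for this embodiment, assuming a trained `G0` as in the earlier sketches; the random tensor stands in for a loaded and normalized face image.

```python
import torch

G0.eval()                                    # trained first generative model
with torch.no_grad():
    first_image = torch.randn(1, 3, 64, 64)  # stands in for a preprocessed face image
    second_image = G0(first_image)           # second image: first feature replaced by the second
```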
Embodiment Three
Based on the model training method provided in the foregoing embodiment, as shown in fig. 6, an embodiment of the present application further provides another image processing method, where the trained image processing model is used to process a third image, and specifically, the method may include the following steps:
s301, acquiring a third image.
The trained image processing model performs image processing for a certain attribute of an image, so the third image input into the image processing model may include the second feature; for the second feature, refer to the description in Embodiment One, which is not repeated here.
S302: Input the third image into the second generative model to obtain a fourth image.
The second generative model is trained using the model training method of Embodiment One of the present application and can process the second feature into the first feature, so the fourth image includes the first feature; the first feature and the second feature are features of the same attribute with different presentations.
For example, if the attribute is glasses and the second feature is that the target object does not wear glasses on the face, the first feature may be that it does; if the attribute is a light spot and the second feature is that there is no light spot on the target object's face, the first feature may be that there is one; if the attribute is a beard and the second feature is that there is no beard on the target object's face, the first feature may be that there is one; if the attribute is the mouth and the second feature is that the target object is not smiling with teeth showing, the first feature may be that it is, and so on.
Since the second generative model is trained with the model training method provided in embodiments of the present application, it can process the third image in a targeted manner with high accuracy, and the output image is of high quality, improving the user experience.
Based on the model training method provided by the above embodiment, the embodiment of the present application further provides a model training device, and the working principle of the model training device is described in detail below with reference to the accompanying drawings.
Embodiment Four
Referring to fig. 7, this figure is a block diagram of a model training apparatus according to a fourth embodiment of the present application. The model training device provided by the embodiment comprises:
a training image obtaining unit 110, configured to obtain a training image, where the training image includes a first training image and/or a second training image, where the first training image is an image including a first feature, the second training image is an image including a second feature, and the first feature and the second feature are features that are directed to a same attribute and have different presentations;
a model training unit 120, configured to train an image processing model with the training image, where the image processing model includes a first generative model and a second generative model;
the training process includes the following constraints: a first objective function minimization and/or a second objective function minimization;
the first objective function is used for expressing a difference between a third training image obtained by inputting the first training image into the first generation model and then inputting the third training image into the second generation model and a fourth training image obtained by inputting the first training image into the second generation model, wherein the third training image is an image including the second feature, and the fourth training image is an image including the first feature;
the second objective function is used for expressing a difference between a fifth training image obtained by inputting the second training image into the second generation model and then inputting the fifth training image into a sixth training image obtained by the first generation model, and the difference is between the fifth training image and the second training image, wherein the fifth training image is an image including the first feature, and the sixth training image is an image including the second feature.
Optionally, the first objective function is

$$l_{aid}^0 = \|x_0 - G_1(G_0(x_0))\|_1$$

where $x_0$ denotes the first training image, $G_0$ the first generative model, $G_0(x_0)$ the third training image, $G_1$ the second generative model, $G_1(G_0(x_0))$ the fourth training image, and $\|x_0 - G_1(G_0(x_0))\|_1$ denotes the L1 norm of $x_0 - G_1(G_0(x_0))$.
Optionally, the second objective function is

$$l_{aid}^1 = \|x_1 - G_0(G_1(x_1))\|_1$$

where $x_1$ denotes the second training image, $G_1$ the second generative model, $G_1(x_1)$ the fifth training image, $G_0$ the first generative model, $G_0(G_1(x_1))$ the sixth training image, and $\|x_1 - G_0(G_1(x_1))\|_1$ denotes the L1 norm of $x_1 - G_0(G_1(x_1))$.
Optionally, the first feature is that there is a light spot on the target object's face, and the second feature is that there is no light spot on the target object's face.
Optionally, the first feature is that the target object wears glasses on the face, and the second feature is that the target object does not wear glasses on the face.
In embodiments of the present application, because the first generative model can process an image with the first feature into an image with the second feature, and the second generative model can process an image with the second feature into an image with the first feature, if the second generative model can restore an image processed by the first generative model to the original image, and the first generative model can restore an image processed by the second generative model to the original image, the two generative models can be considered to process images accurately. On this basis, minimizing the first objective function and/or the second objective function can be set as a constraint for training the image processing model, and the model can be updated with a model optimization algorithm, so that the resulting image processing model processes images more accurately, outputs higher-quality images, and improves the user experience.
Based on the image processing method provided by the above embodiment, the embodiment of the present application further provides an image processing apparatus, and the working principle of the image processing apparatus is described in detail below with reference to the accompanying drawings.
Embodiment Five
Referring to fig. 8, this figure is a block diagram of an image processing apparatus according to a fifth embodiment of the present application. The image processing apparatus provided by the embodiment includes:
a first image obtaining unit 210 for obtaining a first image, the first image including a first feature;
the second image obtaining unit 220 is configured to input the first image into a first generation model to obtain a second image, where the second image includes a second feature, the first feature and the second feature are features that are specific to the same attribute and have different presentations, and the first generation model is obtained by training using a model training method provided in an embodiment of the present application.
The first generation model is trained according to the model training method provided by the embodiment of the application, so that the first image can be processed in a targeted manner, the image processing accuracy is high, and the quality of the output image is high, so that the user experience is improved.
Embodiment Six
Referring to fig. 9, this figure is a block diagram of another image processing apparatus according to a sixth embodiment of the present application. The image processing apparatus provided by the embodiment includes:
a third image obtaining unit 310, configured to obtain a third image, where the third image includes a second feature;
a fourth image obtaining unit 320, configured to input the third image into a second generative model to obtain a fourth image, where the image includes a first feature, and the first feature and the second feature are features that are specific to a same attribute and have different presentations, and the second generative model is obtained by training using a model training method provided in an embodiment of the present application.
The second generation model is trained according to the model training method provided by the embodiment of the application, so that the third image can be processed in a targeted manner, the image processing accuracy is high, and the quality of the output image is high, so that the user experience is improved.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part contributing over the prior art, or in whole or in part, may be embodied in a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
The above-described embodiments are intended to explain the objects, aspects and advantages of the present invention in further detail, and it should be understood that the above-described embodiments are merely exemplary embodiments of the present invention.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A method of model training, the method comprising:
acquiring a training image, wherein the training image comprises a first training image and/or a second training image, the first training image is an image comprising a first feature, and the second training image is an image comprising a second feature;
training an image processing model by using the training image, wherein the image processing model comprises a first generation model and a second generation model;
the training process comprises the following steps:
inputting the first training image into the first generation model to obtain a third training image, and inputting the third training image into the second generation model to obtain a fourth training image, wherein the third training image is an image including the second feature, and the fourth training image is an image including the first feature;
the training process further includes the following constraints: minimizing a first objective function for expressing a difference between the fourth training image and the first training image;
and/or, the training process comprises:
inputting the second training image into the second generation model to obtain a fifth training image, and inputting the fifth training image into the first generation model to obtain a sixth training image, wherein the fifth training image is an image including the first feature, and the sixth training image is an image including the second feature;
the training process further includes the following constraints: minimizing a second objective function;
the second objective function is used for expressing the difference between the sixth training image and the second training image;
the image processing model further comprises a discrimination model, wherein the discrimination model is used for judging the probability that an input image is an original image and the probability that the input image is an image processed by the first generation model or the second generation model;
the penalty function of the discriminant model is expressed as: ld=lcls(p,t)=-log(pt) T is 0,1,2, wherein ptRepresenting the probability that an image belongs to class t, class 0 representing the input image of the first generative model, class 1 representing the input image of the second generative model, class 2 representing the output image of the first generative model or the second generative model.
2. The method of claim 1, wherein the first feature is that a light spot is present on the face of a target object, and the second feature is that no light spot is present on the face of the target object.
3. The method of claim 1 or 2, wherein the first feature is that a target object is wearing glasses on the face, and the second feature is that the target object is not wearing glasses on the face.
4. A model training apparatus, the apparatus comprising:
a training image obtaining unit, configured to obtain a training image, where the training image includes a first training image and/or a second training image, where the first training image is an image including a first feature, the second training image is an image including a second feature, and the first feature and the second feature are features that are directed to a same attribute and have different presentations;
a model training unit, configured to train an image processing model by using the training image, wherein the image processing model comprises a first generation model and a second generation model;
the training process comprises the following steps:
inputting the first training image into the first generation model to obtain a third training image, and inputting the third training image into the second generation model to obtain a fourth training image, wherein the third training image is an image including the second feature, and the fourth training image is an image including the first feature;
inputting the second training image into the second generation model to obtain a fifth training image, and inputting the fifth training image into the first generation model to obtain a sixth training image, wherein the fifth training image is an image including the first feature, and the sixth training image is an image including the second feature;
the training process further includes the following constraints: minimizing a first objective function and/or minimizing a second objective function;
the first objective function is used for expressing the difference between the fourth training image and the first training image, and the second objective function is used for expressing the difference between the sixth training image and the second training image;
the image processing model further comprises a discrimination model, wherein the discrimination model is used for judging the probability that an input image is an original image and the probability that the input image is an image processed by the first generation model or the second generation model;
the penalty function of the discriminant model is expressed as: ld=lcls(p,t)=-log(pt) T is 0,1,2, wherein ptRepresenting the probability that an image belongs to class t, class 0 representing the input image of the first generative model, class 1 representing the input image of the second generative model, class 2 representing the output image of the first generative model or the second generative model.
5. The apparatus of claim 4, wherein the first feature is that a light spot is present on the face of a target object, and the second feature is that no light spot is present on the face of the target object.
6. The apparatus of claim 4 or 5, wherein the first feature is that a target object is wearing glasses on the face, and the second feature is that the target object is not wearing glasses on the face.
7. An image processing method, characterized in that the method comprises:
acquiring a first image, wherein the first image comprises a first feature;
inputting the first image into a first generation model to obtain a second image, wherein the second image comprises a second feature, the first feature and the second feature are features that are directed to a same attribute and have different presentations, and the first generation model is obtained by training using the model training method according to any one of claims 1 to 3.
8. An image processing apparatus, characterized in that the apparatus comprises:
a first image acquisition unit for acquiring a first image, the first image including a first feature;
a second image obtaining unit, configured to input the first image into a first generation model to obtain a second image, where the second image includes a second feature, and the first feature and the second feature are features that are directed to a same attribute and have different presentations, and the first generation model is obtained by training using the model training method according to any one of claims 1 to 3.
9. An image processing method, characterized in that the method comprises:
acquiring a third image, wherein the third image comprises a second feature;
inputting the third image into a second generation model to obtain a fourth image, wherein the fourth image comprises a first feature, the first feature and the second feature are features that are directed to a same attribute and have different presentations, and the second generation model is obtained by training using the model training method according to any one of claims 1 to 3.
10. An image processing apparatus, characterized in that the apparatus comprises:
a third image acquisition unit configured to acquire a third image, the third image including a second feature;
a fourth image obtaining unit, configured to input the third image into a second generation model to obtain a fourth image, where the fourth image includes a first feature, the first feature and the second feature are features that are directed to a same attribute and have different presentations, and the second generation model is obtained by training using the model training method according to any one of claims 1 to 3.
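By way of a further non-limiting illustration, once the models are trained the image processing methods of claims 7 to 10 reduce to two feed-forward mappings; the function names below are hypothetical, and G1 and G2 are the trained generation models from the training sketch above.

```python
# Non-limiting sketch of inference for claims 7-10 (hypothetical names).
import torch

@torch.no_grad()
def to_second_feature(G1, first_image):
    # Claims 7-8: map an image with the first feature (e.g., a face with a
    # light spot or glasses) to an image with the second feature.
    return G1(first_image)

@torch.no_grad()
def to_first_feature(G2, third_image):
    # Claims 9-10: the reverse mapping, from the second feature to the first.
    return G2(third_image)
```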
CN201910280851.9A 2019-04-09 2019-04-09 Model training method and device, and image processing method and device Active CN110009044B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910280851.9A CN110009044B (en) 2019-04-09 2019-04-09 Model training method and device, and image processing method and device


Publications (2)

Publication Number Publication Date
CN110009044A CN110009044A (en) 2019-07-12
CN110009044B (en) 2021-09-03

Family

ID=67170545

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910280851.9A Active CN110009044B (en) 2019-04-09 2019-04-09 Model training method and device, and image processing method and device

Country Status (1)

Country Link
CN (1) CN110009044B (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11003995B2 (en) * 2017-05-19 2021-05-11 Huawei Technologies Co., Ltd. Semi-supervised regression with generative adversarial networks
US11734955B2 (en) * 2017-09-18 2023-08-22 Board Of Trustees Of Michigan State University Disentangled representation learning generative adversarial network for pose-invariant face recognition

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107220929A (en) * 2017-06-23 2017-09-29 深圳市唯特视科技有限公司 A kind of non-paired image method for transformation using the consistent confrontation network of circulation
CN107577985A (en) * 2017-07-18 2018-01-12 南京邮电大学 The implementation method of the face head portrait cartooning of confrontation network is generated based on circulation
CN109426858A (en) * 2017-08-29 2019-03-05 京东方科技集团股份有限公司 Neural network, training method, image processing method and image processing apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks; Jun-Yan Zhu et al.; 2017 IEEE International Conference on Computer Vision; 20170330; abstract and sections 1-3 *

Also Published As

Publication number Publication date
CN110009044A (en) 2019-07-12

Similar Documents

Publication Publication Date Title
JP7309116B2 (en) Gaze direction identification method, device, electronic device, and storage medium
CN108932693B (en) Face editing and completing method and device based on face geometric information
CN110147744B (en) Face image quality assessment method, device and terminal
EP3579187A1 (en) Facial tracking method, apparatus, storage medium and electronic device
CN103839250B (en) The method and apparatus processing for face-image
JP2020522285A (en) System and method for whole body measurement extraction
CN111738160A (en) Video micro-expression recognition method and device, computer equipment and storage medium
MX2013002904A (en) Person image processing apparatus and person image processing method.
KR102400609B1 (en) A method and apparatus for synthesizing a background and a face by using deep learning network
CN108734078B (en) Image processing method, image processing apparatus, electronic device, storage medium, and program
TWI691937B (en) Method and device for filtering light spot, computer readable storage medium, processor, gaze tracking equipment
WO2023178906A1 (en) Liveness detection method and apparatus, and electronic device, storage medium, computer program and computer program product
CN111160229A (en) Video target detection method and device based on SSD (solid State disk) network
CN112069887A (en) Face recognition method, face recognition device, terminal equipment and storage medium
CN107368817B (en) Face recognition method and device
CN115512014A (en) Method for training expression driving generation model, expression driving method and device
CN113344796A (en) Image processing method, device, equipment and storage medium
WO2022262209A1 (en) Neural network training method and apparatus, computer device, and storage medium
CN110009044B (en) Model training method and device, and image processing method and device
CN114399424A (en) Model training method and related equipment
KR20200019282A (en) Apparatus and method for creating information related to facial expression and apparatus for creating facial expression
CN113269719A (en) Model training method, image processing method, device, equipment and storage medium
CN113139603A (en) Federal learning method based on EMD distance fusion multi-source heterogeneous data
CN114529962A (en) Image feature processing method and device, electronic equipment and storage medium
CN116091891A (en) Image recognition method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant