CN110009044B - Model training method and device, and image processing method and device - Google Patents


Info

Publication number
CN110009044B
CN110009044B (application CN201910280851.9A)
Authority
CN
China
Prior art keywords
image
training
model
feature
training image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910280851.9A
Other languages
Chinese (zh)
Other versions
CN110009044A (en)
Inventor
聂凤梅
姚涛
黄通兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qixin Yiwei Shenzhen Technology Co ltd
Beijing 7Invensun Technology Co Ltd
Original Assignee
Qixin Yiwei Shenzhen Technology Co ltd
Beijing 7Invensun Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qixin Yiwei Shenzhen Technology Co ltd, Beijing 7Invensun Technology Co Ltd filed Critical Qixin Yiwei Shenzhen Technology Co ltd
Priority to CN201910280851.9A
Publication of CN110009044A
Application granted
Publication of CN110009044B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present application disclose a model training method and device, and an image processing method and device. A first generative model can process an image having a first feature into an image having a second feature, and a second generative model can process an image having the second feature into an image having the first feature. If the second generative model can restore an image processed by the first generative model to the original image, and the first generative model can restore an image processed by the second generative model to the original image, the two generative models can be considered to process images accurately. Training of the image processing model can therefore be constrained to minimize a first objective function and/or a second objective function, and the image processing model can be updated with a model optimization algorithm. The resulting image processing model processes images accurately and outputs high-quality images, improving the user experience.

Description

Model training method and device, and image processing method and device
Technical Field
The present application relates to the field of computers, and in particular, to a model training method and apparatus, and an image processing method and apparatus.
Background
In image processing, there is often a need to process a target position in an image, such as removing glasses or a beard from a portrait, adjusting a person's facial expression, or adding glasses or a beard to a portrait. To meet this requirement, an image processing model can be trained to perform specific processing on images. However, in the output images of existing image processing models, details are severely lost and image quality is poor, failing to meet users' requirements.
Disclosure of Invention
To solve the prior-art problem of poor output image quality in machine learning models, embodiments of the present application provide a model training method and device, and an image processing method and device.
The embodiment of the application provides a model training method, which comprises the following steps:
acquiring a training image, wherein the training image comprises a first training image and/or a second training image, the first training image is an image comprising a first feature, the second training image is an image comprising a second feature, and the first feature and the second feature are features of the same attribute with different presentations;
training an image processing model by using the training image, wherein the image processing model comprises a first generative model and a second generative model;
the training process comprises the following steps:
inputting the first training image into the first generation model to obtain a third training image, and inputting the third training image into the second generation model to obtain a fourth training image, wherein the third training image is an image including the second feature, and the fourth training image is an image including the first feature;
inputting the second training image into the second generation model to obtain a fifth training image, and inputting the fifth training image into the first generation model to obtain a sixth training image, wherein the fifth training image is an image including the first feature, and the sixth training image is an image including the second feature;
the training process further includes the following constraints: a first objective function minimization and/or a second objective function minimization;
the first objective function is used to express a difference between the fourth training image and the first training image, and the second objective function is used to express a difference between the sixth training image and the second training image.
Optionally, the first objective function is

$$l_{aid}^0 = \|x_0 - G_1(G_0(x_0))\|_1$$

where $x_0$ denotes the first training image, $G_0$ the first generative model, $G_0(x_0)$ the third training image, $G_1$ the second generative model, $G_1(G_0(x_0))$ the fourth training image, and $\|x_0 - G_1(G_0(x_0))\|_1$ denotes the L1 norm of $x_0 - G_1(G_0(x_0))$.
Optionally, the second objective function is

$$l_{aid}^1 = \|x_1 - G_0(G_1(x_1))\|_1$$

where $x_1$ denotes the second training image, $G_1$ the second generative model, $G_1(x_1)$ the fifth training image, $G_0$ the first generative model, $G_0(G_1(x_1))$ the sixth training image, and $\|x_1 - G_0(G_1(x_1))\|_1$ denotes the L1 norm of $x_1 - G_0(G_1(x_1))$.
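For illustration only, the two objective functions translate directly into a few lines of PyTorch. This is a minimal sketch, not part of the claimed method: the generator architectures, image sizes, and variable names are assumptions, and `nn.L1Loss` computes a mean-reduced L1 distance rather than an unnormalized norm.

```python
import torch
import torch.nn as nn

# Stand-in generators; the patent does not fix an architecture, so a single
# convolution serves as a placeholder for each generative model.
G0 = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # first generative model: first feature -> second
G1 = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # second generative model: second feature -> first

x0 = torch.randn(4, 3, 64, 64)  # batch of first training images (first feature)
x1 = torch.randn(4, 3, 64, 64)  # batch of second training images (second feature)

l1 = nn.L1Loss()                 # mean-reduced L1 distance
l_aid_0 = l1(G1(G0(x0)), x0)     # first objective:  ||x0 - G1(G0(x0))||_1
l_aid_1 = l1(G0(G1(x1)), x1)     # second objective: ||x1 - G0(G1(x1))||_1
```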
Optionally, the first feature is that there is a light spot on the target object's face, and the second feature is that there is no light spot on the target object's face.
Optionally, the first feature is that the target object wears glasses on the face, and the second feature is that the target object does not wear glasses on the face.
An embodiment of the present application further provides a model training device, the device includes:
a training image obtaining unit, configured to obtain a training image, where the training image includes a first training image and/or a second training image, where the first training image is an image including a first feature, the second training image is an image including a second feature, and the first feature and the second feature are features that are directed to a same attribute and have different presentations;
the model training unit is used for training an image processing model by using the training image, wherein the image processing model comprises a first generation model and a second generation model;
the training process comprises the following steps:
inputting the first training image into the first generation model to obtain a third training image, and inputting the third training image into the second generation model to obtain a fourth training image, wherein the third training image is an image including the second feature, and the fourth training image is an image including the first feature;
inputting the second training image into the second generation model to obtain a fifth training image, and inputting the fifth training image into the first generation model to obtain a sixth training image, wherein the fifth training image is an image including the first feature, and the sixth training image is an image including the second feature;
the training process further includes the following constraints: a first objective function minimization and/or a second objective function minimization;
the first objective function is used to express a difference between the fourth training image and the first training image, and the second objective function is used to express a difference between the sixth training image and the second training image.
Optionally, the first objective function is

$$l_{aid}^0 = \|x_0 - G_1(G_0(x_0))\|_1$$

where $x_0$ denotes the first training image, $G_0$ the first generative model, $G_0(x_0)$ the third training image, $G_1$ the second generative model, $G_1(G_0(x_0))$ the fourth training image, and $\|x_0 - G_1(G_0(x_0))\|_1$ denotes the L1 norm of $x_0 - G_1(G_0(x_0))$.
Optionally, the second objective function is

$$l_{aid}^1 = \|x_1 - G_0(G_1(x_1))\|_1$$

where $x_1$ denotes the second training image, $G_1$ the second generative model, $G_1(x_1)$ the fifth training image, $G_0$ the first generative model, $G_0(G_1(x_1))$ the sixth training image, and $\|x_1 - G_0(G_1(x_1))\|_1$ denotes the L1 norm of $x_1 - G_0(G_1(x_1))$.
Optionally, the first feature is that there is a light spot on the target object's face, and the second feature is that there is no light spot on the target object's face.
Optionally, the first feature is that the target object wears glasses on the face, and the second feature is that the target object does not wear glasses on the face.
The embodiment of the application also provides an image processing method, which comprises the following steps:
acquiring a first image, wherein the first image comprises a first feature;
inputting the first image into a first generative model to obtain a second image, wherein the second image includes a second feature, the first feature and the second feature are features of the same attribute with different presentations, and the first generative model is trained using the model training method provided in the embodiments of the present application.
An embodiment of the present application further provides an image processing apparatus, including:
a first image acquisition unit for acquiring a first image, the first image including a first feature;
the second image obtaining unit is configured to input the first image into a first generation model to obtain a second image, where the second image includes a second feature, the first feature and the second feature are features that are specific to the same attribute and have different presentations, and the first generation model is obtained by training using a model training method provided in an embodiment of the present application.
The embodiment of the application also provides another image processing method, which comprises the following steps:
acquiring a third image, wherein the third image comprises a second feature;
inputting the third image into a second generative model to obtain a fourth image, wherein the fourth image includes a first feature, the first feature and the second feature are features of the same attribute with different presentations, and the second generative model is trained using the model training method provided in the embodiments of the present application.
An embodiment of the present application further provides another image processing apparatus, including:
a third image acquisition unit configured to acquire a third image, the third image including a second feature;
a fourth image obtaining unit, configured to input the third image into a second generative model to obtain a fourth image, where the fourth image includes a first feature, the first feature and the second feature are features of the same attribute with different presentations, and the second generative model is trained using the model training method provided in an embodiment of the present application.
Embodiments of the present application provide a model training method and device. A training image is first acquired, where the training image includes a first training image and/or a second training image; the first training image is an image including a first feature, the second training image is an image including a second feature, and the first feature and the second feature are features of the same attribute with different presentations (for example, wearing glasses and not wearing glasses can serve as the first feature and the second feature, respectively). An image processing model including a first generative model and a second generative model is then trained with the training image: inputting the first training image into the first generative model yields a third training image with the second feature, and inputting the third training image into the second generative model yields a fourth training image with the first feature; correspondingly, inputting the second training image into the second generative model yields a fifth training image with the first feature, and inputting the fifth training image into the first generative model yields a sixth training image with the second feature. In embodiments of the present application, the difference between the first training image and the fourth training image can be expressed by a first objective function, and the difference between the second training image and the sixth training image by a second objective function; minimizing the first objective function and/or the second objective function serves as a constraint for training the image processing model, yielding the trained image processing model.
Because the first generative model can process an image with the first feature into an image with the second feature, and the second generative model can process an image with the second feature into an image with the first feature, if the second generative model can restore an image processed by the first generative model to the original image, and the first generative model can restore an image processed by the second generative model to the original image, the two generative models can be considered to process images accurately. On this basis, minimizing the first objective function and/or the second objective function can be set as a constraint for training the image processing model, and the model can be updated with a model optimization algorithm, so that the resulting image processing model processes images more accurately, outputs higher-quality images, and improves the user experience.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments described in the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a model training method provided in an embodiment of the present application;
fig. 2 is an image processing schematic diagram of an image processing model according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of image processing of another image processing model according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of image processing of another image processing model provided in an embodiment of the present application;
fig. 5 is a flowchart of an image processing method according to an embodiment of the present application;
FIG. 6 is a flowchart of another image processing method provided in the embodiments of the present application;
fig. 7 is a block diagram illustrating a structure of a model training apparatus according to an embodiment of the present disclosure;
fig. 8 is a block diagram of an image processing apparatus according to an embodiment of the present application;
fig. 9 is a block diagram of another image processing apparatus according to an embodiment of the present application.
Detailed Description
To help those skilled in the art better understand the technical solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
In the prior art, an image processing model can be trained with a model optimization algorithm so that it can process a target position in an image, for example, removing glasses or a beard from a portrait, adjusting a person's facial expression, or adding glasses or a beard to a portrait. However, in the output images of existing image processing models, the loss of detail is severe and image quality is poor, failing to meet users' requirements; for example, in removing glasses from a portrait, other parts of the portrait may be changed, so that the processed portrait no longer looks like the same person as the portrait before processing.
For example, in some scenes a face image includes glasses, and light reflected by the glasses produces light spots. If such a face image is used for gaze analysis, the light spots reflected by the glasses affect the analysis result, so the light spots can be removed with an image processing model and gaze analysis then performed on the processed image, improving the accuracy of the gaze analysis result. However, an existing image processing model not only removes the light spots reflected by the glasses but also changes other positions of the face image; for example, changes in the shape, size, or spacing of the eyes and the like also affect the accuracy of the gaze analysis result.
Based on this, embodiments of the present application provide a model training method and device. A training image may first be acquired, where the training image includes a first training image and/or a second training image; the first training image is an image including a first feature, the second training image is an image including a second feature, and the first feature and the second feature are features of the same attribute with different presentations (for example, wearing glasses and not wearing glasses can serve as the first feature and the second feature, respectively). An image processing model including a first generative model and a second generative model is then trained with the training image: inputting the first training image into the first generative model yields a third training image with the second feature, and inputting the third training image into the second generative model yields a fourth training image with the first feature; correspondingly, inputting the second training image into the second generative model yields a fifth training image with the first feature, and inputting the fifth training image into the first generative model yields a sixth training image with the second feature. In embodiments of the present application, the difference between the first training image and the fourth training image can be expressed by a first objective function, and the difference between the second training image and the sixth training image by a second objective function; minimizing the first objective function and/or the second objective function serves as a constraint for training the image processing model, yielding the trained image processing model.
Because the first generative model can process an image with the first feature into an image with the second feature, and the second generative model can process an image with the second feature into an image with the first feature, if the second generative model can restore an image processed by the first generative model to the original image, and the first generative model can restore an image processed by the second generative model to the original image, the two generative models can be considered to process images accurately. On this basis, minimizing the first objective function and/or the second objective function can be set as a constraint for training the image processing model, and the model can be updated with a model optimization algorithm, so that the resulting image processing model processes images more accurately, outputs higher-quality images, and improves the user experience.
Embodiment One
referring to fig. 1, the figure is a flowchart of a model training method provided in an embodiment of the present application. The model training method provided by the embodiment comprises the following steps:
and S101, acquiring a training image.
The training images are used to train the image processing model, and the image processing model is used to process images. The image processing model can therefore be trained in a targeted manner according to actual image processing requirements, so that the trained image processing model has the corresponding functions, and the training images used to train it are selected according to those requirements. For example, if an image processing model is to process a first feature in an image into a second feature, the training images used to train it may be images including the first feature and images including the second feature. For instance, if the model's function is to remove glasses, the training images are data sets of faces with glasses and faces without glasses.
In embodiments of the present application, the image processing model may be an unsupervised model, such as a model based on Generative Adversarial Networks (GAN), and the training images used to train the image processing model need not be manually recognized and labeled.
The training images may include only the first training image, only the second training image, or both the first training image and the second training image. The first training image is an image including a first feature, the second training image is an image including a second feature, and the first feature and the second feature are features of the same attribute with different presentations.
For example, if the attribute is glasses, the first feature may be that the target object wears glasses on the face and the second feature that it does not; if the attribute is a light spot, the first feature may be that there is a light spot on the target object's face and the second feature that there is none; if the attribute is a beard, the first feature may be that there is a beard on the target object's face and the second feature that there is none; if the attribute is the mouth, the first feature may be that the target object is smiling with teeth showing and the second feature that it is not, and so on. Of course, the first feature and the second feature may be interchanged; that is, the first feature may be that the target object does not wear glasses on the face and the second feature that it does, and so on.
In the scenario of removing light spots reflected by glasses with the image processing model, the first feature may be that the target object wears glasses with light spots on them, and the second feature may be that the target object wears glasses without light spots. Of course, when the first feature is that the target object wears glasses with light spots on them, the corresponding second feature may also be that the target object wears no glasses at all, and images with either of these second features can be used to train the image processing model.
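Because the training is unsupervised, assembling the training data can amount to collecting two unlabeled image folders, one per feature. The sketch below assumes a folder-per-feature layout (for example, `glasses/` and `no_glasses/`); the class name and layout are illustrative, not prescribed by the method.

```python
import os
from PIL import Image
from torch.utils.data import Dataset

class UnlabeledFeatureImages(Dataset):
    """All images in one folder share one feature (e.g. faces with glasses).

    No manual recognition or tagging is required, matching the unsupervised
    GAN-based setting described above.
    """
    def __init__(self, root, transform=None):
        self.paths = sorted(
            os.path.join(root, name) for name in os.listdir(root)
        )
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        image = Image.open(self.paths[idx]).convert("RGB")
        return self.transform(image) if self.transform else image

# e.g. first_images = UnlabeledFeatureImages("glasses/", transform=to_tensor)
```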
S102: Train the image processing model using the training image.
In embodiments of the present application, the image processing model may include a first generative model and a second generative model. Inputting the first training image into the first generative model yields a third training image including the second feature; that is, the first generative model has the capability of processing the first feature into the second feature. Inputting the second training image into the second generative model yields a fifth training image including the first feature; that is, the second generative model has the capability of processing the second feature into the first feature.
Theoretically, the first generative model and the second generative model perform exactly opposite processes: the fourth training image, obtained by the second generative model processing the third training image output by the first generative model, should be identical to the first training image, and the sixth training image, obtained by the first generative model processing the fifth training image output by the second generative model, should be identical to the second training image.
In practice, however, image processing does not reach this ideal state. Therefore, in embodiments of the present application, the difference between the fourth training image and the first training image can be expressed by a first objective function, and the difference between the sixth training image and the second training image by a second objective function. During training of the image processing model, minimizing the first objective function and/or minimizing the second objective function can serve as constraints; that is, training reduces the difference between the fourth and first training images and/or the difference between the sixth and second training images, bringing the image processing model closer to the ideal state. It will be appreciated that the closer the image processing model is to the ideal, the more accurate its image processing will be, and the smaller the changes to features other than the first or second feature.
Referring to FIG. 2, an image processing schematic diagram of an image processing model is shown. The first training image $x_0$ is input into the first generative model $G_0$ to obtain the third training image $G_0(x_0)$, and the third training image $G_0(x_0)$ is input into the second generative model $G_1$ to obtain the fourth training image $G_1(G_0(x_0))$. The first objective function may be expressed as:

$$l_{aid}^0 = \|x_0 - G_1(G_0(x_0))\|_1 \tag{1}$$

where $\|x_0 - G_1(G_0(x_0))\|_1$ denotes the L1 norm of $x_0 - G_1(G_0(x_0))$.
Minimizing the first objective function may specifically involve initializing the weight parameters of the image processing model, setting the hyper-parameters of the model, and adjusting the weight parameters of the first generative model and the second generative model by gradient descent so as to minimize the first objective function. The hyper-parameters may include the number of training rounds $n$, the learning rate $lr$, the batch size $bn$, and so on.
Referring to FIG. 3, an image processing schematic diagram of another image processing model is shown. The second training image $x_1$ is input into the second generative model $G_1$ to obtain the fifth training image $G_1(x_1)$, and the fifth training image $G_1(x_1)$ is input into the first generative model $G_0$ to obtain the sixth training image $G_0(G_1(x_1))$. The second objective function may be expressed as:

$$l_{aid}^1 = \|x_1 - G_0(G_1(x_1))\|_1 \tag{2}$$

where $\|x_1 - G_0(G_1(x_1))\|_1$ denotes the L1 norm of $x_1 - G_0(G_1(x_1))$.
Minimizing the second objective function may similarly involve initializing the weight parameters of the image processing model, setting the hyper-parameters of the model, and adjusting the weight parameters of the first generative model and the second generative model by gradient descent so as to minimize the second objective function. The hyper-parameters may include the number of training rounds $n$, the learning rate $lr$, the batch size $bn$, and so on.
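A minimal training loop consistent with this description might look as follows. It reuses the `G0`, `G1`, and `l1` names from the earlier sketch; the hyper-parameter values and the `loader_x0`/`loader_x1` data loaders are assumptions for illustration.

```python
import itertools
import torch

# Hyper-parameters named as in the text; the values are illustrative only.
n, lr, bn = 100, 1e-4, 16  # training rounds, learning rate, batch size

optimizer = torch.optim.SGD(  # plain gradient descent on both generators' weights
    itertools.chain(G0.parameters(), G1.parameters()), lr=lr
)

# loader_x0 / loader_x1: assumed DataLoaders over the two image sets, batch size bn.
for epoch in range(n):
    for x0, x1 in zip(loader_x0, loader_x1):
        l_aid_0 = l1(G1(G0(x0)), x0)   # first objective function, equation (1)
        l_aid_1 = l1(G0(G1(x1)), x1)   # second objective function, equation (2)
        loss = l_aid_0 + l_aid_1       # minimize both constraints jointly
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```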
With minimizing the first objective function and/or the second objective function as constraints in the training process, the image processing model can be updated with a model optimization algorithm so that the updated image processing model processes images more accurately. The model optimization algorithm adjusts the parameters of the image processing model, thereby updating it.
In embodiments of the present application, because the first generative model can process an image with the first feature into an image with the second feature, and the second generative model can process an image with the second feature into an image with the first feature, if the second generative model can restore an image processed by the first generative model to the original image, and the first generative model can restore an image processed by the second generative model to the original image, the two generative models can be considered to process images accurately. On this basis, minimizing the first objective function and/or the second objective function can be set as a constraint for training the image processing model, and the model can be updated with a model optimization algorithm, so that the resulting image processing model processes images more accurately, outputs higher-quality images, and improves the user experience.
Based on the model training method provided in the above embodiment, the embodiment of the present application may further have other constraint conditions, and specifically, may further minimize at least one of the following functions: a loss function of the discriminant model, a first loss function of the first generative model, a second loss function of the first generative model, a third loss function of the second generative model, and a fourth loss function of the second generative model.
Specifically, referring to FIG. 4, an image processing schematic diagram of another image processing model provided in an embodiment of the present application is shown. This image processing model includes a first generative model, a second generative model, and a discriminant model, where the discriminant model is used to judge the probability that an input image is an original image and the probability that it is an image processed by the first or second generative model; the result of the judgment is a probability value. The loss function of the discriminant model can be expressed as:
$$l_d = l_{cls}(p, t) = -\log(p_t), \quad t = 0, 1, 2 \tag{3}$$

where $p_t$ denotes the probability that the image belongs to class $t$: class 0 denotes an input image of the first generative model, class 1 denotes an input image of the second generative model, and class 2 denotes an output image of the first generative model or the second generative model.
In embodiments of the present application, minimizing the loss function of the discriminant model may also serve as a constraint in the training process of the image processing model, so as to reduce the difference, as judged by the discriminant model, between original images and processed images.
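Equation (3) is an ordinary three-class cross-entropy. The sketch below shows one way it could be applied, with an assumed toy discriminator head and with `x0`, `x1`, `G0`, and `G1` carried over from the earlier sketches.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy three-class discriminant model D; the real architecture is not specified.
D = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=4, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 3),
)

def l_d(images, t):
    """Equation (3): mean of -log(p_t) over a batch that belongs to class t."""
    target = torch.full((images.size(0),), t, dtype=torch.long)
    return F.cross_entropy(D(images), target)

# class 0: inputs of G0; class 1: inputs of G1; class 2: generated outputs
loss_d = (l_d(x0, 0) + l_d(x1, 1)
          + l_d(G0(x0).detach(), 2) + l_d(G1(x1).detach(), 2))
```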
In a specific implementation, the first training image $x_0$ is input into the first generative model $G_0$ to obtain the third training image $\tilde{x}_0$, which may also be denoted $G_0(x_0)$, and the first and third training images are input into the discriminant model $D$; the second training image $x_1$ is input into the second generative model $G_1$ to obtain the fifth training image $\tilde{x}_1$, which may also be denoted $G_1(x_1)$, and the second and fifth training images are input into the discriminant model $D$. The discriminant model thus judges the first training image $x_0$, the second training image $x_1$, the third training image $\tilde{x}_0$, and the fifth training image $\tilde{x}_1$, where the probability that the first training image is of class 0 should be high, the probability that the second training image is of class 1 should be high, and the probabilities that the third and fifth training images are of class 2 should be high.

In addition, the third training image $\tilde{x}_0$ may be input into the second generative model $G_1$ to obtain the fourth training image $G_1(G_0(x_0))$, and the fifth training image $\tilde{x}_1$ may be input into the first generative model $G_0$ to obtain the sixth training image $G_0(G_1(x_1))$.
Thus, the first loss function $l_g^{01}$ of the first generative model can be expressed as:

$$l_g^{01} = \gamma l_{aid}^0 + l_o^{01} = \gamma l_{aid}^0 + l_{GAN}^0 + \alpha l_{pix}^0 + \beta l_{per}^0 \tag{4}$$

where $l_o^{01}$ is the third objective function of the first generative model, $l_{GAN}^0$ denotes the GAN loss of the first generative model, $l_{pix}^0$ denotes the residual regularization value of the first generative model, $l_{per}^0$ denotes the perceptual loss of the first generative model, and $\alpha$, $\beta$, and $\gamma$ are weight parameters of the loss function, which can be set empirically by a person skilled in the art; for $l_{aid}^0$, refer to equation (1). Specifically,

$$l_{GAN}^0 = -\log(D(G_0(x_0))) \tag{5}$$

$$l_{pix}^0 = \|r_0\|_1 \tag{6}$$

$$l_{per}^0 = \|\phi(x_0) - \phi(\tilde{x}_0)\|_1 \tag{7}$$

where $r_0$ is the residual image obtained when $x_0$ is input into the first generative model $G_0$, and $\phi(x_0)$ and $\phi(\tilde{x}_0)$ denote the values of $x_0$ and $\tilde{x}_0$ mapped according to a certain rule.
The second loss function $l_g^{02}$ of the first generative model can be expressed as:

$$l_g^{02} = \gamma l_{aid}^0 + l_o^{02} = \gamma l_{aid}^0 + l_{GAN}^0 + l_{dual}^0 + \alpha l_{pix}^0 + \beta l_{per}^0 \tag{8}$$

where $l_o^{02}$ is the fourth objective function of the first generative model, $l_{GAN}^0$ denotes the GAN loss of the first generative model, $l_{dual}^0$ denotes the first loss function in the dual learning process of the first and second generative models, $l_{pix}^0$ denotes the residual regularization value of the first generative model, $l_{per}^0$ denotes the perceptual loss of the first generative model, and $\alpha$, $\beta$, and $\gamma$ are weight parameters of the loss function, which can be set empirically by a person skilled in the art; for $l_{GAN}^0$, $l_{pix}^0$, and $l_{per}^0$, refer to equations (5), (6), and (7), and for $l_{aid}^0$, refer to equation (1). Specifically, $l_{dual}^0$ is given by equation (9).
In embodiments of the present application, minimizing the first loss function and/or the second loss function of the first generative model may also be used as a constraint condition in the training process of the image processing model.
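Under the reconstructions above, the second loss function of equation (8) could be assembled as below. This is a sketch under stated assumptions: the mapping $\phi$ of equation (7) is taken as the identity, the residual $r_0$ is taken as $\tilde{x}_0 - x_0$, the probability used in equation (5) is read as the class-1 probability output by $D$, and $l_{dual}^0$ of equation (9), which is not legible in the source, is left as a zero placeholder.

```python
import torch
import torch.nn.functional as F

alpha, beta, gamma = 1.0, 1.0, 10.0          # empirical weight parameters (illustrative)

x0_tilde = G0(x0)                            # third training image
r0 = x0_tilde - x0                           # residual image of the first generative model (assumed form)
p = F.softmax(D(x0_tilde), dim=1)            # class probabilities from the discriminant model

l_aid_0 = F.l1_loss(G1(x0_tilde), x0)        # equation (1)
l_gan_0 = -torch.log(p[:, 1] + 1e-8).mean()  # equation (5); the class index is an assumption
l_pix_0 = r0.abs().mean()                    # equation (6): ||r0||_1, mean-reduced
l_per_0 = F.l1_loss(x0_tilde, x0)            # equation (7) with phi = identity (assumption)
l_dual_0 = torch.zeros(())                   # equation (9): placeholder, real form not given

l_g_02 = (gamma * l_aid_0 + l_gan_0 + l_dual_0
          + alpha * l_pix_0 + beta * l_per_0)  # equation (8)
```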
Correspondingly, the third loss function $l_g^{11}$ of the second generative model can be expressed as:

$$l_g^{11} = \gamma l_{aid}^1 + l_o^{11} = \gamma l_{aid}^1 + l_{GAN}^1 + \alpha l_{pix}^1 + \beta l_{per}^1 \tag{10}$$

where $l_o^{11}$ is the fifth objective function of the second generative model, $l_{GAN}^1$ denotes the GAN loss of the second generative model, $l_{pix}^1$ denotes the residual regularization value of the second generative model, $l_{per}^1$ denotes the perceptual loss of the second generative model, and $\alpha$, $\beta$, and $\gamma$ are weight parameters of the loss function, which can be set empirically by a person skilled in the art; for $l_{aid}^1$, refer to equation (2). Specifically,

$$l_{GAN}^1 = -\log(1 - D(G_1(x_1))) \tag{11}$$

$$l_{pix}^1 = \|r_1\|_1 \tag{12}$$

$$l_{per}^1 = \|\phi(x_1) - \phi(\tilde{x}_1)\|_1 \tag{13}$$

where $r_1$ is the residual image obtained when $x_1$ is input into the second generative model $G_1$, and $\phi(x_1)$ and $\phi(\tilde{x}_1)$ denote the values of $x_1$ and $\tilde{x}_1$ mapped according to a certain rule.
The fourth loss function $l_g^{12}$ of the second generative model can be expressed as:

$$l_g^{12} = \gamma l_{aid}^1 + l_o^{12} = \gamma l_{aid}^1 + l_{GAN}^1 + l_{dual}^1 + \alpha l_{pix}^1 + \beta l_{per}^1 \tag{14}$$

where $l_o^{12}$ is the sixth objective function of the second generative model, $l_{GAN}^1$ denotes the GAN loss of the second generative model, $l_{dual}^1$ denotes the second loss function in the dual learning process of the first and second generative models, $l_{pix}^1$ denotes the residual regularization value of the second generative model, $l_{per}^1$ denotes the perceptual loss of the second generative model, and $\alpha$, $\beta$, and $\gamma$ are weight parameters of the loss function, which can be set empirically by a person skilled in the art; for $l_{GAN}^1$, $l_{pix}^1$, and $l_{per}^1$, refer to equations (11), (12), and (13), and for $l_{aid}^1$, refer to equation (2). Specifically, $l_{dual}^1$ is given by equation (15).
in the embodiment of the present application, minimizing the third loss function and/or the fourth loss function of the second generative model may also be used as a constraint condition in the training process of the image processing model.
Embodiment Two
Based on the model training method provided in the foregoing embodiment, as shown in fig. 5, an embodiment of the present application further provides an image processing method, where the trained image processing model is used to process a first image, and specifically, the method may include the following steps:
S201: Acquire a first image.
The trained image processing model performs image processing for a certain attribute of an image, so the first image input into the image processing model may include the first feature; for the first feature, refer to the description in Embodiment One, which is not repeated here.
In the scenario where light spots reflected by glasses are removed by the image processing model, the first feature may be that the target object wears glasses on the face and light spots are present on the glasses.
S202: Input the first image into the first generative model to obtain a second image.
The first generative model is trained using the model training method of Embodiment One of the present application and can process the first feature into the second feature, so the second image includes the second feature; the first feature and the second feature are features of the same attribute with different presentations.
For example, if the attribute is glasses and the first feature is that the target object wears glasses on the face, the second feature may be that it does not; if the attribute is a light spot and the first feature is that there is a light spot on the target object's face, the second feature may be that there is none; if the attribute is a beard and the first feature is that there is a beard on the target object's face, the second feature may be that there is none; if the attribute is the mouth and the first feature is that the target object is smiling with teeth showing, the second feature may be that it is not, and so on.
In the scenario of removing light spots reflected by glasses with the image processing model, the first feature may be that the target object wears glasses with light spots on them, and the second feature may be that the target object wears glasses without light spots.
Since the first generative model is trained with the model training method provided in embodiments of the present application, it can process the first image in a targeted manner with high accuracy, and the output image is of high quality, improving the user experience.
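A minimal inference sketch for this embodiment, assuming a trained `G0` as in the earlier sketches; the random tensor stands in for a loaded and normalized face image.

```python
import torch

G0.eval()                                    # trained first generative model
with torch.no_grad():
    first_image = torch.randn(1, 3, 64, 64)  # stands in for a preprocessed face image
    second_image = G0(first_image)           # second image: first feature replaced by the second
```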
Embodiment Three
Based on the model training method provided in the foregoing embodiment, as shown in fig. 6, an embodiment of the present application further provides another image processing method, where the trained image processing model is used to process a third image, and specifically, the method may include the following steps:
s301, acquiring a third image.
The trained image processing model performs image processing for a certain attribute of an image, so the third image input into the image processing model may include the second feature; for the second feature, refer to the description in Embodiment One, which is not repeated here.
S302: Input the third image into the second generative model to obtain a fourth image.
The second generative model is trained using the model training method of Embodiment One of the present application and can process the second feature into the first feature, so the fourth image includes the first feature; the first feature and the second feature are features of the same attribute with different presentations.
For example, if the attribute is glasses and the second feature is that the target object does not wear glasses on the face, the first feature may be that it does; if the attribute is a light spot and the second feature is that there is no light spot on the target object's face, the first feature may be that there is one; if the attribute is a beard and the second feature is that there is no beard on the target object's face, the first feature may be that there is one; if the attribute is the mouth and the second feature is that the target object is not smiling with teeth showing, the first feature may be that it is, and so on.
Since the second generative model is trained with the model training method provided in embodiments of the present application, it can process the third image in a targeted manner with high accuracy, and the output image is of high quality, improving the user experience.
Based on the model training method provided by the above embodiment, the embodiment of the present application further provides a model training device, and the working principle of the model training device is described in detail below with reference to the accompanying drawings.
Embodiment Four
Referring to fig. 7, this figure is a block diagram of a model training apparatus according to a fourth embodiment of the present application. The model training device provided by the embodiment comprises:
a training image obtaining unit 110, configured to obtain a training image, where the training image includes a first training image and/or a second training image, where the first training image is an image including a first feature, the second training image is an image including a second feature, and the first feature and the second feature are features that are directed to a same attribute and have different presentations;
a model training unit 120, configured to train an image processing model with the training image, where the image processing model includes a first generative model and a second generative model;
the training process includes the following constraints: a first objective function minimization and/or a second objective function minimization;
the first objective function is used for expressing a difference between a third training image obtained by inputting the first training image into the first generation model and then inputting the third training image into the second generation model and a fourth training image obtained by inputting the first training image into the second generation model, wherein the third training image is an image including the second feature, and the fourth training image is an image including the first feature;
the second objective function is used for expressing a difference between a fifth training image obtained by inputting the second training image into the second generation model and then inputting the fifth training image into a sixth training image obtained by the first generation model, and the difference is between the fifth training image and the second training image, wherein the fifth training image is an image including the first feature, and the sixth training image is an image including the second feature.
Optionally, the first objective function is

$$l_{aid}^0 = \|x_0 - G_1(G_0(x_0))\|_1$$

where $x_0$ denotes the first training image, $G_0$ the first generative model, $G_0(x_0)$ the third training image, $G_1$ the second generative model, $G_1(G_0(x_0))$ the fourth training image, and $\|x_0 - G_1(G_0(x_0))\|_1$ denotes the L1 norm of $x_0 - G_1(G_0(x_0))$.
Optionally, the second objective function is

$$l_{aid}^1 = \|x_1 - G_0(G_1(x_1))\|_1$$

where $x_1$ denotes the second training image, $G_1$ the second generative model, $G_1(x_1)$ the fifth training image, $G_0$ the first generative model, $G_0(G_1(x_1))$ the sixth training image, and $\|x_1 - G_0(G_1(x_1))\|_1$ denotes the L1 norm of $x_1 - G_0(G_1(x_1))$.
Optionally, the first feature is that there is a light spot on the target object's face, and the second feature is that there is no light spot on the target object's face.
Optionally, the first feature is that the target object wears glasses on the face, and the second feature is that the target object does not wear glasses on the face.
In embodiments of the present application, because the first generative model can process an image with the first feature into an image with the second feature, and the second generative model can process an image with the second feature into an image with the first feature, if the second generative model can restore an image processed by the first generative model to the original image, and the first generative model can restore an image processed by the second generative model to the original image, the two generative models can be considered to process images accurately. On this basis, minimizing the first objective function and/or the second objective function can be set as a constraint for training the image processing model, and the model can be updated with a model optimization algorithm, so that the resulting image processing model processes images more accurately, outputs higher-quality images, and improves the user experience.
Based on the image processing method provided by the above embodiment, the embodiment of the present application further provides an image processing apparatus, and the working principle of the image processing apparatus is described in detail below with reference to the accompanying drawings.
Embodiment Five
Referring to fig. 8, this figure is a block diagram of an image processing apparatus according to a fifth embodiment of the present application. The image processing apparatus provided by the embodiment includes:
a first image obtaining unit 210 for obtaining a first image, the first image including a first feature;
the second image obtaining unit 220 is configured to input the first image into a first generation model to obtain a second image, where the second image includes a second feature, the first feature and the second feature are features that are specific to the same attribute and have different presentations, and the first generation model is obtained by training using a model training method provided in an embodiment of the present application.
The first generation model is trained according to the model training method provided by the embodiment of the application, so that the first image can be processed in a targeted manner, the image processing accuracy is high, and the quality of the output image is high, so that the user experience is improved.
Embodiment Six
Referring to fig. 9, this figure is a block diagram of another image processing apparatus according to a sixth embodiment of the present application. The image processing apparatus provided by the embodiment includes:
a third image obtaining unit 310, configured to obtain a third image, where the third image includes a second feature;
a fourth image obtaining unit 320, configured to input the third image into a second generative model to obtain a fourth image, where the image includes a first feature, and the first feature and the second feature are features that are specific to a same attribute and have different presentations, and the second generative model is obtained by training using a model training method provided in an embodiment of the present application.
The second generation model is trained according to the model training method provided by the embodiment of the application, so that the third image can be processed in a targeted manner, the image processing accuracy is high, and the quality of the output image is high, so that the user experience is improved.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part contributing over the prior art, or in whole or in part, may be embodied in a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
The above-described embodiments are intended to explain the objects, aspects and advantages of the present invention in further detail, and it should be understood that the above-described embodiments are merely exemplary embodiments of the present invention.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A method of model training, the method comprising:
acquiring a training image, wherein the training image comprises a first training image and/or a second training image, the first training image is an image comprising a first feature, and the second training image is an image comprising a second feature;
training an image processing model by using the training image, wherein the image processing model comprises a first generation model and a second generation model;
the training process comprises the following steps:
inputting the first training image into the first generation model to obtain a third training image, and inputting the third training image into the second generation model to obtain a fourth training image, wherein the third training image is an image including the second feature, and the fourth training image is an image including the first feature;
the training process further includes the following constraints: minimizing a first objective function for expressing a difference between the fourth training image and the first training image;
and/or, the training process comprises:
inputting the second training image into the second generation model to obtain a fifth training image, and inputting the fifth training image into the first generation model to obtain a sixth training image, wherein the fifth training image is an image including the first feature, and the sixth training image is an image including the second feature;
the training process further includes the following constraints: minimizing a second objective function;
the second objective function is used for expressing the difference between the sixth training image and the second training image;
the image processing model further comprises a discrimination model, wherein the discrimination model is used for judging the probability that an input image is an original image and the probability that the input image is an image processed by the first generation model or the second generation model;
the penalty function of the discriminant model is expressed as: ld=lcls(p,t)=-log(pt) T is 0,1,2, wherein ptRepresenting the probability that an image belongs to class t, class 0 representing the input image of the first generative model, class 1 representing the input image of the second generative model, class 2 representing the output image of the first generative model or the second generative model.
2. The method of claim 1, wherein the first feature is that a light spot is present on the face of a target object, and the second feature is that no light spot is present on the face of the target object.
3. The method of claim 1 or 2, wherein the first feature is that a target object is wearing glasses on the face, and the second feature is that the target object is not wearing glasses on the face.
4. A model training apparatus, the apparatus comprising:
a training image obtaining unit, configured to obtain a training image, where the training image includes a first training image and/or a second training image, where the first training image is an image including a first feature, the second training image is an image including a second feature, and the first feature and the second feature are features that are directed to a same attribute and have different presentations;
a model training unit, configured to train an image processing model by using the training image, wherein the image processing model comprises a first generation model and a second generation model;
the training process comprises the following steps:
inputting the first training image into the first generation model to obtain a third training image, and inputting the third training image into the second generation model to obtain a fourth training image, wherein the third training image is an image including the second feature, and the fourth training image is an image including the first feature;
inputting the second training image into the second generation model to obtain a fifth training image, and inputting the fifth training image into the first generation model to obtain a sixth training image, wherein the fifth training image is an image including the first feature, and the sixth training image is an image including the second feature;
the training process further includes the following constraints: minimizing a first objective function and/or minimizing a second objective function;
the first objective function is used for expressing the difference between the fourth training image and the first training image, and the second objective function is used for expressing the difference between the sixth training image and the second training image;
the image processing model further comprises a discrimination model, wherein the discrimination model is used for judging the probability that an input image is an original image and the probability that the input image is an image processed by the first generation model or the second generation model;
the penalty function of the discriminant model is expressed as: ld=lcls(p,t)=-log(pt) T is 0,1,2, wherein ptRepresenting the probability that an image belongs to class t, class 0 representing the input image of the first generative model, class 1 representing the input image of the second generative model, class 2 representing the output image of the first generative model or the second generative model.
5. The apparatus of claim 4, wherein the first feature is that a light spot is present on the face of a target object, and the second feature is that no light spot is present on the face of the target object.
6. The apparatus of claim 4 or 5, wherein the first feature is that a target object is wearing glasses on the face, and the second feature is that the target object is not wearing glasses on the face.
7. An image processing method, characterized in that the method comprises:
acquiring a first image, wherein the first image comprises a first feature;
inputting the first image into a first generation model to obtain a second image, wherein the second image comprises a second feature, the first feature and the second feature are features that are directed to a same attribute and have different presentations, and the first generation model is obtained by training using the model training method according to any one of claims 1 to 3.
8. An image processing apparatus, characterized in that the apparatus comprises:
a first image acquisition unit for acquiring a first image, the first image including a first feature;
a second image obtaining unit, configured to input the first image into a first generation model to obtain a second image, where the second image includes a second feature, and the first feature and the second feature are features that are directed to a same attribute and have different presentations, and the first generation model is obtained by training using the model training method according to any one of claims 1 to 3.
9. An image processing method, characterized in that the method comprises:
acquiring a third image, wherein the third image comprises a second feature;
inputting the third image into a second generation model to obtain a fourth image, wherein the fourth image comprises a first feature, the first feature and the second feature are features that are directed to a same attribute and have different presentations, and the second generation model is obtained by training using the model training method according to any one of claims 1 to 3.
10. An image processing apparatus, characterized in that the apparatus comprises:
a third image acquisition unit configured to acquire a third image, the third image including a second feature;
a fourth image obtaining unit, configured to input the third image into a second generation model to obtain a fourth image, where the fourth image includes a first feature, the first feature and the second feature are features that are directed to a same attribute and have different presentations, and the second generation model is obtained by training using the model training method according to any one of claims 1 to 3.
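By way of a further non-limiting illustration, once the models are trained the image processing methods of claims 7 to 10 reduce to two feed-forward mappings; the function names below are hypothetical, and G1 and G2 are the trained generation models from the training sketch above.

```python
# Non-limiting sketch of inference for claims 7-10 (hypothetical names).
import torch

@torch.no_grad()
def to_second_feature(G1, first_image):
    # Claims 7-8: map an image with the first feature (e.g., a face with a
    # light spot or glasses) to an image with the second feature.
    return G1(first_image)

@torch.no_grad()
def to_first_feature(G2, third_image):
    # Claims 9-10: the reverse mapping, from the second feature to the first.
    return G2(third_image)
```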
CN201910280851.9A 2019-04-09 2019-04-09 Model training method and device, and image processing method and device Active CN110009044B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910280851.9A CN110009044B (en) 2019-04-09 2019-04-09 Model training method and device, and image processing method and device


Publications (2)

Publication Number Publication Date
CN110009044A CN110009044A (en) 2019-07-12
CN110009044B (en) 2021-09-03

Family

ID=67170545

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910280851.9A Active CN110009044B (en) 2019-04-09 2019-04-09 Model training method and device, and image processing method and device

Country Status (1)

Country Link
CN (1) CN110009044B (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11003995B2 (en) * 2017-05-19 2021-05-11 Huawei Technologies Co., Ltd. Semi-supervised regression with generative adversarial networks
US11734955B2 (en) * 2017-09-18 2023-08-22 Board Of Trustees Of Michigan State University Disentangled representation learning generative adversarial network for pose-invariant face recognition

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107220929A (en) * 2017-06-23 2017-09-29 深圳市唯特视科技有限公司 A kind of non-paired image method for transformation using the consistent confrontation network of circulation
CN107577985A (en) * 2017-07-18 2018-01-12 南京邮电大学 The implementation method of the face head portrait cartooning of confrontation network is generated based on circulation
CN109426858A (en) * 2017-08-29 2019-03-05 京东方科技集团股份有限公司 Neural network, training method, image processing method and image processing apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks; Jun-Yan Zhu et al.; 2017 IEEE International Conference on Computer Vision; 20170330; abstract and sections 1-3 *

Also Published As

Publication number Publication date
CN110009044A (en) 2019-07-12

Similar Documents

Publication Publication Date Title
JP7309116B2 (en) Gaze direction identification method, device, electronic device, and storage medium
CN108932693B (en) Face editing and completing method and device based on face geometric information
CN110147744B (en) Face image quality assessment method, device and terminal
EP3579187A1 (en) Facial tracking method, apparatus, storage medium and electronic device
CN103839250B (en) The method and apparatus processing for face-image
JP2020522285A (en) System and method for whole body measurement extraction
CN111738160A (en) Video micro-expression recognition method and device, computer equipment and storage medium
MX2013002904A (en) Person image processing apparatus and person image processing method.
KR102400609B1 (en) A method and apparatus for synthesizing a background and a face by using deep learning network
CN108734078B (en) Image processing method, image processing apparatus, electronic device, storage medium, and program
TWI691937B (en) Method and device for filtering light spot, computer readable storage medium, processor, gaze tracking equipment
WO2023178906A1 (en) Liveness detection method and apparatus, and electronic device, storage medium, computer program and computer program product
CN111160229A (en) Video target detection method and device based on SSD (solid State disk) network
CN112069887A (en) Face recognition method, face recognition device, terminal equipment and storage medium
CN107368817B (en) Face recognition method and device
CN115512014A (en) Method for training expression driving generation model, expression driving method and device
CN113344796A (en) Image processing method, device, equipment and storage medium
WO2022262209A1 (en) Neural network training method and apparatus, computer device, and storage medium
CN110009044B (en) Model training method and device, and image processing method and device
CN114399424A (en) Model training method and related equipment
KR20200019282A (en) Apparatus and method for creating information related to facial expression and apparatus for creating facial expression
CN113269719A (en) Model training method, image processing method, device, equipment and storage medium
CN113139603A (en) Federal learning method based on EMD distance fusion multi-source heterogeneous data
CN114529962A (en) Image feature processing method and device, electronic equipment and storage medium
CN116091891A (en) Image recognition method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant