WO2021258920A1 - Generative adversarial network training method, image face-changing and video face-changing method and apparatus - Google Patents

Generative adversarial network training method, image face-changing and video face-changing method and apparatus

Info

Publication number
WO2021258920A1
WO2021258920A1 · PCT/CN2021/094257 · CN2021094257W
Authority
WO
WIPO (PCT)
Prior art keywords
face
image
changing
original image
network
Prior art date
Application number
PCT/CN2021/094257
Other languages
English (en)
French (fr)
Inventor
李玉乐
陈德健
项伟
颜乐驹
Original Assignee
百果园技术(新加坡)有限公司
李玉乐
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百果园技术(新加坡)有限公司 and 李玉乐
Publication of WO2021258920A1 publication Critical patent/WO2021258920A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Definitions

  • The embodiments of the present application relate to the field of image processing technology, for example, to a generative adversarial network training method, an image face-changing method, a video face-changing method, and apparatus.
  • Face-changing is an important technology in the field of computer vision and is widely used in content production, movie production, and entertainment video production.
  • Face-changing means that, given an original image and a target image, the identity features of the target image are transferred to the original image to obtain a face-changing image, so that the face-changing image maintains the identity features of the target image while keeping attribute features of the original image such as facial pose and facial expression. In addition, the face-changing image is required to be real and natural. The related art includes the following three face-changing methods:
  • Face fusion based on facial key points: this method first obtains the facial key points of the original image and the target image, extracts the face region of the original image through the original image's key points, and then fuses the face region of the original image into the target image according to the target image's key points. This method tends to make the face in the face-changing image look unreal and unnatural.
  • Face-changing based on 3D face modeling: this method reconstructs 3D models of the original image and the target image respectively, extracts identity features from the 3D model of the target image, and combines them with the attribute features of the original image's 3D model to generate the face-changing image. Face-changing images generated this way are likewise not real and natural.
  • Face-changing based on generative adversarial networks: this method extracts attribute features from the original image and identity features from the target image through neural networks, combines the two features, and decodes the combined features through a decoder to obtain the face-changing image. Face-changing images generated by this method are real and natural, but it is difficult to maintain the attribute features of the original image and the identity features of the target image at the same time.
  • In summary, it is difficult for the face-changing technology in the related art to obtain real and natural face-changing images, and the face-changing image cannot simultaneously maintain the attribute features of the original image and the identity features of the target image.
  • The embodiments of the present application provide a generative adversarial network training method, an image face-changing method, a video face-changing method, apparatuses, an electronic device, and a storage medium, to improve the situation in the related art where real and natural face-changing images cannot be obtained and the attribute features of the original image and the identity features of the target image cannot be maintained at the same time.
  • An embodiment of the present application provides a generative adversarial network training method, including:
  • acquiring an original image containing a first face and a target image containing a second face, and initializing the generator and the discriminator of a generative adversarial network;
  • inputting the original image and the target image into the generator for training to obtain a face-changing image, the generator being configured to extract the attribute feature map of the first face from the original image, extract the identity feature of the second face from the target image, inject the identity feature into the attribute feature map to generate a hybrid feature map, and decode the hybrid feature map according to the identity feature and the attribute feature map to obtain the face-changing image after the second face replaces the first face;
  • inputting the original image and the face-changing image into the discriminator for training to obtain a judgment value;
  • adjusting the parameters of the generator and the discriminator according to the judgment value, the face-changing image, the original image, and the target image.
  • An embodiment of the present application provides an image face-changing method, including: acquiring an original image containing a first face and a target image containing a second face, and inputting the original image and the target image into the generator of a generative adversarial network to obtain the face-changing image of the original image after the second face replaces the first face, where the generator is trained by the generative adversarial network training method described in any embodiment of the present application.
  • An embodiment of the present application provides a video face-changing method, including: acquiring video data whose face is to be changed; extracting a video image containing a first face from the video data as an original image; acquiring a target image containing a second face; inputting the original image and the target image into the generator of a generative adversarial network to obtain the face-changing image of the original image after the second face replaces the first face; and generating face-changed video data based on the face-changing image, where the generator is trained by the generative adversarial network training method described in any embodiment of the present application.
  • An embodiment of the present application provides a generative adversarial network training apparatus, including:
  • an original image and target image acquisition module, configured to acquire the original image containing the first face and the target image containing the second face;
  • a generator training module, configured to input the original image and the target image into the generator for training to obtain a face-changing image, the generator being configured to extract the attribute feature map of the first face from the original image, extract the identity feature of the second face from the target image, inject the identity feature into the attribute feature map to generate a hybrid feature map, and decode the hybrid feature map according to the identity feature and the attribute feature map to obtain the face-changing image after the second face replaces the first face;
  • a discriminator training module, configured to train the discriminator using the original image and the face-changing image to obtain a judgment value;
  • a parameter adjustment module, configured to adjust the parameters of the generator and the discriminator according to the judgment value, the face-changing image, the original image, and the target image.
  • An embodiment of the present application provides an image face-changing apparatus, including:
  • an original image and target image acquisition module, configured to acquire the original image containing the first face and the target image containing the second face;
  • an image face-changing module, configured to input the original image and the target image into the generator of the generative adversarial network to obtain the face-changing image of the original image after the second face replaces the first face;
  • where the generator is trained by the generative adversarial network training method described in any embodiment of the present application.
  • An embodiment of the present application provides a video face-changing apparatus, including:
  • a to-be-changed video data acquisition module, configured to acquire the video data whose face is to be changed;
  • an original image extraction module, configured to extract a video image containing the first face from the video data as the original image;
  • a target image acquisition module, configured to acquire a target image containing the second face;
  • a video face-changing module, configured to input the original image and the target image into the generator of the generative adversarial network to obtain the face-changing image of the original image after the second face replaces the first face;
  • a face-changing video data generation module, configured to generate face-changed video data based on the face-changing image;
  • where the generator is trained by the generative adversarial network training method described in any embodiment of the present application.
  • An embodiment of the present application provides an electronic device, including:
  • one or more processors;
  • a storage apparatus, configured to store one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the generative adversarial network training method described in any embodiment of the present application, and/or the image face-changing method, and/or the video face-changing method.
  • An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the generative adversarial network training method described in any embodiment of the present application is implemented, and/or the image face-changing method, and/or the video face-changing method.
  • FIG. 1 is a flowchart of the steps of a generative adversarial network training method provided by an embodiment of the present application;
  • FIG. 2A is a flowchart of the steps of a generative adversarial network training method provided by an embodiment of the present application;
  • FIG. 2B is a schematic diagram of a generator of an embodiment of the present application;
  • FIG. 3 is a flowchart of the steps of an image face-changing method provided by an embodiment of the present application;
  • FIG. 4 is a flowchart of the steps of a video face-changing method provided by an embodiment of the present application;
  • FIG. 5 is a structural block diagram of a generative adversarial network training apparatus provided by an embodiment of the present application;
  • FIG. 6 is a structural block diagram of an image face-changing apparatus provided by an embodiment of the present application;
  • FIG. 7 is a structural block diagram of a video face-changing apparatus provided by an embodiment of the present application;
  • FIG. 8 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 1 is a flowchart of the steps of the generative adversarial network training method provided by an embodiment of this application; the embodiment is applicable to training a generative adversarial network so that the generator of the trained network can be used to change faces in images or videos.
  • The method can be executed by the generative adversarial network training apparatus of the embodiments of the present application, which can be implemented in hardware or software and integrated into the electronic device provided by the embodiments of the present application.
  • The generative adversarial network training method of the embodiment of the present application may include the following steps:
  • the original image and the target image are images including human faces, where the original image is an image that needs to be replaced with a human face, and the target image is an image used to change the face of the original image.
  • the original image may be an image containing the first face extracted from the video data
  • the target image may be an image containing the second face.
  • the first face and the second face are different faces.
  • a large number of images of different faces can be obtained, two images are randomly selected as the original image and the target image, and an image pair is formed by the original image and the target image as a training sample.
  • Generative adversarial networks (GANs) include a generator and a discriminator; the generator is configured to generate a new image from input data, and the discriminator is configured to output the probability that the new image is real.
  • The generator and the discriminator may be neural networks, and the network parameters of the generator and the discriminator of the generative adversarial network can be initialized.
  • The generator may include an encoding network, a decoding network, an identity extraction network, and a residual network, where the identity extraction network and the residual network may be pre-trained networks; initializing the generator may mean initializing the network parameters of the generator's encoding network and decoding network.
  • In the generator, the attribute feature map of the first face can be extracted from the original image through the encoding network, the identity feature of the second face can be extracted from the target image through the identity extraction network, the identity feature is injected into the attribute feature map through the residual network to generate a hybrid feature map, and the hybrid feature map is decoded in the decoding network according to the identity feature and the attribute feature map to obtain the face-changing image after the second face replaces the first face.
  • The attribute feature map can be a feature map expressing attributes of the first face such as facial pose and facial expression, and the identity feature can be information capable of identifying the identity of the second face. A minimal sketch of this generator structure follows.
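The following is a minimal PyTorch sketch of the generator structure described above. It is an illustration under stated assumptions, not the patent's actual implementation: the class names, layer counts, and channel sizes are invented for the example, and the identity extraction network, residual blocks, and decoder are passed in as pre-built modules.

```python
# Minimal sketch of the generator described above (assumptions: layer counts,
# channel sizes, and module interfaces are illustrative, not from the patent).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Stack of down-sampling conv layers; the last output is the
    attribute feature map F_{HxWxD} of the first face."""
    def __init__(self, in_ch=3, base=64, n_layers=4):
        super().__init__()
        layers, ch = [], in_ch
        for i in range(n_layers):
            out_ch = base * (2 ** i)
            layers.append(nn.Sequential(
                nn.Conv2d(ch, out_ch, 4, stride=2, padding=1),
                nn.InstanceNorm2d(out_ch),
                nn.ReLU(inplace=True)))
            ch = out_ch
        self.layers = nn.ModuleList(layers)

    def forward(self, x):
        feats = []                       # keep each F_down^i for skip connections
        for layer in self.layers:
            x = layer(x)
            feats.append(x)
        return x, feats                  # x is the attribute feature map

class Generator(nn.Module):
    def __init__(self, id_net, res_blocks, decoder):
        super().__init__()
        self.encoder = Encoder()
        self.id_net = id_net             # pre-trained identity extraction network
        self.res_blocks = res_blocks     # AdaIN residual block(s), sketched later
        self.decoder = decoder           # up-sampling network with skip inputs

    def forward(self, src, tgt):
        attr, skips = self.encoder(src)              # attribute feature map of src
        identity = self.id_net(tgt)                  # 1-D identity vector of tgt
        mixed = self.res_blocks(attr, identity)      # inject identity -> hybrid map
        return self.decoder(mixed, identity, skips)  # face-changing image
```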
  • the purpose of the generator is to generate real images as much as possible to deceive the discriminator.
  • the purpose of the discriminator is to distinguish the image generated by the generator from the real image.
  • By alternately training the generator and the discriminator, the two form a dynamic "game process"; finally, the images generated by the trained generator are realistic enough to pass for real, i.e., infinitely close to real images.
  • The original image and the face-changing image are input into the discriminator to train it, and a judgment value of the face-changing image is obtained after each input, where the judgment value may be the probability that the face-changing image is a real image.
  • In one embodiment, the total loss can be calculated according to the judgment value, the face-changing image, the original image, and the target image; training of the generator and the discriminator stops when the total loss is less than a preset value; otherwise, the network parameters of the discriminator and the generator are adjusted according to the total loss, and a new round of iterative training starts until the total loss is less than the preset value.
  • After training stops, a trained generative adversarial network is obtained; once the original image and the target image are input into its generator, the generator automatically outputs the face-changing image.
  • In an example embodiment, the adversarial loss and the key point loss can be calculated according to the judgment value, the original image, and the face-changing image; the identity feature loss can be calculated according to the target image and the face-changing image; two original images are input into the generator to obtain a self-face-changing image of the original image, and the reconstruction loss is calculated according to the original image and the self-face-changing image; the sum of the adversarial loss, the reconstruction loss, the key point loss, and the identity feature loss gives the total loss; gradients are computed from the total loss to adjust the parameters of the generator's encoding and decoding networks, and from the adversarial loss to adjust the parameters of the discriminator.
  • The generative adversarial network in the embodiments of the application includes a generator and a discriminator. The generator extracts the attribute feature map of the first face from the original image, extracts the identity feature of the second face from the target image, injects the identity feature into the attribute feature map to generate a hybrid feature map, and decodes the hybrid feature map according to the identity feature and the attribute feature map to obtain the face-changing image after the second face replaces the first face.
  • The original image and the face-changing image are input into the discriminator for training to obtain the judgment value; the parameters of the generator and the discriminator are adjusted according to the judgment value, the face-changing image, the original image, and the target image until a trained generative adversarial network is obtained. This realizes combining the attribute features of the original image and the identity features of the target image to decode the hybrid feature map during decoding, so that the face-changing image better retains attribute features of the original image such as facial pose and facial expression, while the identity features of the target image are better blended into the face-changing image, enhancing the transfer of the target image's identity features.
  • When the trained generator of the generative adversarial network is used to change the face of an image or video, the resulting face-changing image or video is real and natural and can maintain the attribute features of the original image and the identity features of the target image.
  • FIG. 2A is a flowchart of the steps of a generative adversarial network training method provided by an embodiment of this application; this embodiment is a refinement of the foregoing embodiment.
  • The generative adversarial network training method may include the following steps:
  • S202: Initialize the parameters of the discriminator of the generative adversarial network and the parameters of the encoding network and decoding network of the generator, and obtain the trained residual network and identity extraction network used in the generator.
  • The generative adversarial network includes a discriminator and a generator, and the generator may include an encoding network, a decoding network, a residual network, and an identity extraction network, where the residual network and the identity extraction network may be pre-trained.
  • The initialization referred to in the embodiments of this application can be initializing the parameters of the discriminator, the encoding network, and the decoding network; in one embodiment, it can be constructing the network structures of the discriminator, the encoding network, and the decoding network and setting the network parameters of those structures.
  • The discriminator, the encoding network, and the decoding network may be various kinds of neural networks.
  • FIG. 2B is a schematic diagram of the generator. In FIG. 2B, the generator 30 includes an encoding network 301, a decoding network 302, an identity extraction network 303, and a residual module 304, where the encoding network 301 and the decoding network 302 may be symmetrical convolutional and deconvolutional neural networks; the residual module 304 is connected between the encoding network 301 and the decoding network 302. After the original image 10 and the target image 20 are input into the generator 30, the face-changing image 40 is obtained.
  • In one embodiment, the original image may first be preprocessed, and the preprocessed original image is then input into the encoding network to obtain the down-sampling feature map output by each down-sampling convolutional layer.
  • The preprocessing includes adjusting the image size, and the down-sampling feature map output by the last down-sampling convolutional layer of the encoding network is the attribute feature map of the first face.
  • As shown in FIG. 2B, the encoding network 301 may be a network containing multiple down-sampling convolutional layers. After the original image 10 is cropped to an image of a specified size, the cropped original image is input into the down-sampling convolutional layers; each down-sampling convolutional layer samples and encodes its input and outputs a down-sampling feature map, which is fed to the next down-sampling convolutional layer.
  • The down-sampling feature map output by the last down-sampling convolutional layer of the encoding network is the attribute feature map F_{H×W×D} of the first face, where H and W are the height and width of the attribute feature map and D is the number of channels; each down-sampling convolutional layer i outputs a down-sampling feature map F_down^i. As shown in FIG. 2B, the encoding network 301 finally outputs the attribute feature map 50 of the first face.
  • The identity feature may refer to information that can distinguish the identities of two faces belonging to different persons.
  • The identity extraction network may be a pre-trained network, for example, a pre-trained convolutional neural network (CNN), recurrent neural network (RNN), deep neural network (DNN), and so on.
  • After the target image is input into the identity extraction network, the identity feature F_ID of the second face can be extracted; F_ID can be a one-dimensional vector containing the identity information of the face. As shown in FIG. 2B, the identity feature 60 is obtained after the target image 20 is input into the identity extraction network 303.
  • In an example embodiment, the identity feature can first be transformed to obtain its identity feature mean and identity feature variance; the identity feature mean, identity feature variance, and attribute feature map are input into the residual network, which transfers the identity feature onto the attribute feature map to obtain a hybrid feature map.
  • As shown in FIG. 2B, the identity feature 60 can pass through a fully connected layer 305 to output the identity feature mean μ and identity feature variance σ, and the mean μ, the variance σ, and the attribute feature map 50 are input together into the residual network 304 to obtain the hybrid feature map 70.
  • In an example embodiment, the residual network may be an adaptive instance normalization residual module (AdaIN ResBlk). The residual network can describe a style image by the mean and variance of its feature map and inject the style by changing the mean and variance of the content feature map. Denoting the content input by x and the injected style statistics by μ_y and σ_y, the AdaIN formula is:
  • AdaIN(x, y) = σ_y · (x − μ_x) / σ_x + μ_y
  • where μ_x and σ_x are the mean and variance of x, and μ_y and σ_y are the identity feature mean and identity feature variance to be injected. In the embodiments of the present application, the identity feature supplies the injected statistics, the attribute feature map is the content input, and AdaIN(x, y) is the hybrid feature map.
  • In this way, the identity feature of the second face can be injected into the attribute feature map of the first face through the residual network, so that the identity feature of the second face replaces that of the first face while the posture, expression, and other information of the first face in the original image is retained, realizing the combination of the attribute features of the first face in the original image and the identity features of the second face in the target image. A sketch of such a block follows.
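The following is a sketch of an AdaIN residual block consistent with the formula above: a fully connected layer maps the identity vector to a per-channel mean μ_y and standard deviation σ_y, which renormalize the attribute feature map. The single-block structure and tensor shapes are assumptions for illustration.

```python
import torch
import torch.nn as nn

class AdaINResBlock(nn.Module):
    def __init__(self, channels, id_dim):
        super().__init__()
        self.fc = nn.Linear(id_dim, channels * 2)       # -> (mu_y, sigma_y)
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def adain(self, x, id_vec):
        mu_y, sigma_y = self.fc(id_vec).chunk(2, dim=1)
        mu_y = mu_y[:, :, None, None]                   # broadcast per channel
        sigma_y = sigma_y[:, :, None, None]
        mu_x = x.mean(dim=(2, 3), keepdim=True)
        sigma_x = x.std(dim=(2, 3), keepdim=True) + 1e-5
        # AdaIN(x, y) = sigma_y * (x - mu_x) / sigma_x + mu_y
        return sigma_y * (x - mu_x) / sigma_x + mu_y

    def forward(self, x, id_vec):
        # residual connection around the identity-injected features
        return x + self.conv(self.adain(x, id_vec))
```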
  • In an example embodiment, after the hybrid feature map is obtained, the hybrid feature map and the identity feature are spliced to obtain a spliced feature, and the spliced feature is input into the decoding network and processed by multiple up-sampling convolutional layers to obtain the face-changing image. For each up-sampling convolutional layer in the decoding network, the corresponding down-sampling convolutional layer in the encoding network is determined, the down-sampling feature map output by that layer is obtained, the up-sampling feature output by the previous up-sampling convolutional layer is obtained, and the down-sampling feature map and the up-sampling feature are spliced as the decoding object of the up-sampling convolutional layer.
  • In one embodiment, as shown in FIG. 2B, after the residual network 304 outputs the hybrid feature map 70, the identity feature extracted by the identity extraction network 303 is spliced onto the hybrid feature map 70 to obtain the spliced feature, thereby improving the transfer of the identity feature of the second face; after decoding the spliced feature, the decoding network 302 can better maintain the identity information of the second face.
  • In one embodiment, the intermediate features of the encoding network are connected to the feature layers of the decoding network through skip (cross-layer) connections. The decoding network and the encoding network are a symmetrical up-sampling convolutional neural network and down-sampling convolutional neural network: in the encoding network, each down-sampling convolutional layer outputs a down-sampling feature F_down^i, and in the decoding network each up-sampling convolutional layer receives an up-sampling feature F_up^i.
  • For an up-sampling convolutional layer i in the decoding network, the corresponding down-sampling convolutional layer in the encoding network can be determined and its output down-sampling feature F_down^i obtained; F_up^i and F_down^i are added, up-sampling is performed on the sum, and the resulting up-sampling feature F_up^{i+1} is output as the input of the next up-sampling convolutional layer. A sketch of one such step follows.
  • During decoding, the intermediate features output by the down-sampling convolutional layers of the encoding network are fed into the up-sampling convolutional layers of the decoding network through this skip-connection operation, so that the attribute features of the first face in the original image blend better into the face-changing image and the face-changing image is more real and natural.
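As a sketch, one skip-connected decoding step could look as follows; the shape match between F_up^i and F_down^i and the choice of a transposed convolution for up-sampling are assumptions.

```python
import torch.nn as nn

def decode_step(up_layer: nn.Module, f_up_i, f_down_i):
    """f_up_i: output of the previous up-sampling layer;
    f_down_i: feature map of the matching encoder layer (same shape assumed)."""
    fused = f_up_i + f_down_i   # add the skip feature before up-sampling
    return up_layer(fused)      # e.g. an nn.ConvTranspose2d layer

# usage sketch: up = nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1)
#               f_next = decode_step(up, f_up_i, f_down_i)
```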
  • The embodiments of the present application train the generator and the discriminator alternately.
  • The generator is first trained to obtain a face-changing image, the face-changing image and the original image are then used to train the discriminator, and then the generator is trained again, alternating in this way to train the generative adversarial network; one pass of generator training plus one pass of discriminator training constitutes one round of training.
  • After each round of training, the generator generates a face-changing image, and the discriminator discriminates the face-changing image to obtain a judgment value, which can be the probability that the face-changing image is a real image.
  • The total loss may be the sum of the adversarial loss, the reconstruction loss, the key point loss, and the identity feature loss.
  • The adversarial loss, reconstruction loss, key point loss, and identity feature loss can be calculated first, and the total loss is then obtained by summing them, for example through the following sub-steps:
  • S2081: Calculate the adversarial loss and the key point loss according to the judgment value, the original image, and the face-changing image.
  • In one embodiment, the adversarial loss can be calculated according to the judgment value and a preset adversarial loss function; the face key points of the original image and of the face-changing image are obtained, and the distance between the two sets of key points gives the key point loss.
  • Exemplarily, the adversarial loss gan_loss is:
  • gan_loss = ∑ -log D(G(X_i))
  • where G(X_i) is the face-changing image generated by the generator and D(G(X_i)) is the discriminator's judgment value that the face-changing image G(X_i) is a real image.
  • Exemplarily, the face key points of the original image and of the face-changing image can be extracted through a pre-trained facial pose evaluation network, and the face key points of the face-changing image are then constrained to be similar to those of the original image.
  • In one embodiment, the face key point coordinates lmks_gen of the face-changing image and the face key point coordinates lmks_src of the original image can be obtained, and the key point loss lmks_loss is:
  • lmks_loss = ||lmks_gen − lmks_src||₂
  • The embodiments of the application constrain the face-changing image by calculating the key point loss, so that the face key points of the face-changing image are similar to those of the original image, allowing the face-changing image to better retain attribute features of the original image such as facial expression and facial pose.
  • S2082: Calculate the identity feature loss according to the target image and the face-changing image.
  • The identity extraction network is a pre-trained network; the target image and the face-changing image can each be input into the identity extraction network to extract the identity feature of the face in the target image and the identity feature of the face in the face-changing image, and the distance between the two identity features gives the identity feature loss.
  • Exemplarily, the identity feature of the face-changing image can be recorded as FeatID_gen and the identity feature of the target image as FeatID_target; then the identity feature loss ID_loss is:
  • ID_loss = ||FeatID_gen − FeatID_target||₂
  • By calculating the identity feature loss, the identity feature of the face-changing image is constrained to be more similar to that of the target image, so that the face-changing image better maintains the identity characteristics of the target image.
  • S2083: Input two original images into the generator to obtain a self-face-changing image of the original image, and calculate the reconstruction loss according to the original image and the self-face-changing image.
  • In one embodiment, the original image can be input into the encoding network and the identity extraction network of the generator at the same time, so that the generator generates a self-face-changing image of the original image, i.e., a face-changing image in which the face of one original image replaces the face of another original image; the reconstruction loss of this reconstructed original image is then calculated.
  • Exemplarily, denoting the original image as original_img and the generated self-face-changing image as src_img, the reconstruction loss recon_loss is:
  • recon_loss = ||src_img − original_img||₂
  • The above formula is the difference between the pixel values at the same positions of the original image and the self-face-changing image; calculating the generator's reconstruction loss constrains the adjustment of the generator's parameters, so that the face-changing images generated by the generator better retain the attribute features of the original image and are more real and natural.
  • S2084: Calculate the sum of the adversarial loss, the reconstruction loss, the key point loss, and the identity feature loss to obtain the total loss, i.e., the total loss total_loss is:
  • total_loss = recon_loss + ID_loss + gan_loss + lmks_loss
  • In practice, weights can also be set for the four losses and a weighted sum or weighted average used as the total loss; the embodiments of the present application do not restrict how the total loss is calculated. A hedged sketch of the loss computation follows.
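The following is a hedged sketch of the four losses and their sum, matching the formulas above; `d_fake` is assumed to be the discriminator's probability output for the face-changing image, and `id_net` and `landmark_net` stand in for the pre-trained identity extraction and facial pose evaluation networks (both names are assumptions, not from the patent).

```python
import torch

def compute_total_loss(d_fake, swapped, src, self_swapped, tgt,
                       id_net, landmark_net):
    # gan_loss = sum(-log D(G(X_i)))
    gan_loss = (-torch.log(d_fake + 1e-8)).sum()
    # lmks_loss = ||lmks_gen - lmks_src||_2
    lmks_loss = torch.norm(landmark_net(swapped) - landmark_net(src))
    # ID_loss = ||FeatID_gen - FeatID_target||_2
    id_loss = torch.norm(id_net(swapped) - id_net(tgt))
    # recon_loss = ||src_img - original_img||_2
    recon_loss = torch.norm(self_swapped - src)
    # total_loss = recon_loss + ID_loss + gan_loss + lmks_loss
    return recon_loss + id_loss + gan_loss + lmks_loss
```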
  • The generative adversarial network includes a generator and a discriminator, which can be trained alternately; the parameters of the generator and the discriminator are adjusted by calculating the total loss.
  • The total loss is calculated after one round of alternate training ends, and whether the total loss is less than a preset threshold is judged. If the total loss is less than the preset threshold, the accuracy of the generator is high enough and the face-changing images it generates suffice to deceive the discriminator, so training of the generator and the discriminator can stop.
  • If the total loss is greater than the preset threshold, the accuracy of the generator is insufficient and the discriminator can still identify whether the face-changing images generated by the generator are real or fake; in that case the parameters of the discriminator are adjusted according to the adversarial loss, the parameters of the encoder and the decoder in the generator are adjusted according to the total loss, and the process returns to S203 to alternately train the generator and the discriminator until the stop-iteration condition is met.
  • The parameters can be updated through a gradient descent algorithm, where the gradient descent algorithm can be stochastic gradient descent (SGD) or another gradient descent method; the embodiments of the present application do not restrict the gradient algorithm or the parameter update method. A minimal alternating-training sketch follows.
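The following is a minimal alternating-training sketch with SGD; `generator`, `discriminator`, `id_net`, `landmark_net`, `loader`, and `compute_total_loss` are assumed to be defined as in the sketches above, the discriminator is assumed to output a probability, and the stopping threshold is illustrative.

```python
import torch

opt_g = torch.optim.SGD(generator.parameters(), lr=1e-3)
opt_d = torch.optim.SGD(discriminator.parameters(), lr=1e-3)

for src, tgt in loader:                       # one round per image pair
    # 1) generator step
    swapped = generator(src, tgt)
    # self-face-changing image (the patent uses two original images;
    # one image is reused for both inputs here for brevity)
    self_swapped = generator(src, src)
    d_fake = discriminator(swapped)           # judgment value (probability)
    loss_g = compute_total_loss(d_fake, swapped, src, self_swapped, tgt,
                                id_net, landmark_net)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    # 2) discriminator step on real vs. generated images
    d_real = discriminator(src)
    d_fake = discriminator(swapped.detach())
    loss_d = -(torch.log(d_real + 1e-8)
               + torch.log(1.0 - d_fake + 1e-8)).mean()
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    if loss_g.item() < 1.0:                   # preset threshold (illustrative)
        break
```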
  • When training the generator, the embodiments of the present application encode the original image with the encoding network to obtain the attribute feature map of the first face, input the target image into the identity extraction network to extract the identity feature of the second face, inject the identity feature into the attribute feature map with the residual network to obtain the hybrid feature map, and decode the hybrid feature map with the decoding network to obtain the face-changing image.
  • The original image and the face-changing image are input into the discriminator for training to obtain the judgment value; the total loss is calculated according to the judgment value, the face-changing image, the original image, and the target image, and the parameters of the generator and the discriminator are adjusted according to the total loss, so that the attribute features of the original image and the identity features of the target image are combined in the decoding process to decode the hybrid feature map.
  • As a result, the face-changing image better retains attribute features of the original image such as facial pose and facial expression, the identity features of the target image are better blended into the face-changing image, and the transfer of the target image's identity features is enhanced; the resulting face-changing image or video is real and natural and can maintain the attribute features of the original image and the identity features of the target image.
  • In addition, the total loss includes the key point loss, which constrains the face key points of the face-changing image to be similar to those of the original image, so that the face-changing image better retains attribute features of the original image such as facial expression and facial pose.
  • The total loss also includes the identity feature loss, which constrains the identity feature of the face-changing image to be more similar to that of the target image, so that the face-changing image better maintains the identity characteristics of the target image.
  • FIG. 3 is a flowchart of the steps of an image face-changing method provided by an embodiment of this application.
  • The embodiment of this application can be applied to the situation of changing a human face in an image.
  • The method can be executed by the image face-changing apparatus of this embodiment, which can be implemented in hardware or software and integrated into the electronic device provided in the embodiment of the present application.
  • The image face-changing method of the embodiment of the present application may include the following steps:
  • In the image face-changing scenario, the user replaces the first face in the original image with the second face in the target image, so that the face-changed image maintains the identity characteristics of the second face together with attributes of the first face such as posture and expression.
  • The original image is the image whose face the user needs to change, and the target image can be an image containing the user's face.
  • In one embodiment, an interactive interface may be provided, which offers the user operations to determine the original image and the target image; the user can specify the original image and the target image in the interactive interface.
  • In one embodiment, the interactive interface may provide an image upload operation, and the user can upload the original image and the target image in the interactive interface; for example, the interactive interface first prompts the user to upload the original image and then prompts the user to upload the target image.
  • The original image and target image specified by the user can also be obtained through other interactive operations; the embodiments of this application impose no restrictions on this.
  • The generator may be a neural network that replaces the first face in the original image with the second face in the target image.
  • The generator can be obtained by training a generative adversarial network; in particular, it can be trained through the generative adversarial network training method provided by the foregoing embodiments. For training details, refer to the foregoing embodiments, which are not repeated here.
  • The original image can be input into the generator's encoding network to extract the attribute feature map of the first face, and the target image can be input into the generator's identity extraction network to extract the identity feature of the second face.
  • The residual network injects the identity feature into the attribute feature map to generate a hybrid feature map, and the generator's decoding network decodes the hybrid feature map according to the identity feature and the attribute feature map to obtain the face-changing image after the second face replaces the first face.
  • In the embodiments of the present application, the original image and the target image are input into the generator of the generative adversarial network to obtain the face-changing image of the original image after the second face replaces the first face.
  • The generator of the embodiments of this application combines the attribute features of the original image and the identity features of the target image to decode the hybrid feature map during decoding, so that the face-changing image better retains attribute features of the original image such as facial pose and facial expression, while the identity features of the target image are better blended into the face-changing image, enhancing the transfer of the target image's identity features.
  • With the trained generator of the generative adversarial network, the resulting face-changing image is real and natural and can maintain the attribute features of the original image and the identity features of the target image.
  • The video face-changing method of the embodiment of the present application may include the following steps:
  • The video data whose face is to be changed may be short-video data containing human faces, live-stream video data, movie video data, and so on, and may include one or more human faces.
  • In practice, the user can specify the to-be-changed video data in a provided face-changing editing interface, for example by uploading the video data or entering its address, which can be a local storage address or a network address of the video data.
  • S402: Extract a video image containing the first human face from the video data as the original image.
  • In one embodiment, face detection can be performed on the video data, and a video image containing the first face is extracted as the original image, where the first face may be a face specified by the user.
  • For example, the user may be prompted to designate a face in the video data as the first face.
  • The target image is an image used to replace a human face in the original image, and the target image contains the second human face.
  • In an example embodiment, the target image may be a self-portrait image of the user, or any other image specified by the user.
  • The first human face and the second human face are different human faces.
  • The generator may be a neural network that replaces the first face in the original image with the second face in the target image.
  • The generator can be obtained by training a generative adversarial network; in particular, it can be trained through the generative adversarial network training method provided by the foregoing embodiments. For training details, refer to the foregoing embodiments, which are not repeated here.
  • The original image can be input into the generator's encoding network to extract the attribute feature map of the first face, and the target image can be input into the generator's identity extraction network to extract the identity feature of the second face.
  • The residual network injects the identity feature into the attribute feature map to generate a hybrid feature map, and the generator's decoding network decodes the hybrid feature map according to the identity feature and the attribute feature map to obtain the face-changing image after the second face replaces the first face.
  • S405: Generate face-changed video data based on the face-changing image.
  • In one embodiment, the face-changing images of the original images can be video-encoded at a preset frame rate and bit rate to obtain the face-changed video data, in which the faces maintain the identity characteristics of the second face together with the posture, expression, and other attribute features of the first face. An illustrative frame-by-frame sketch follows.
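As an illustration of the video pipeline above, the following sketch decodes a video frame by frame, applies the trained generator, and re-encodes the result with OpenCV; `run_generator` is a hypothetical helper wrapping preprocessing and the generator call, and the codec and frame rate are assumptions.

```python
import cv2

def swap_video(generator, video_path, target_img, out_path, fps=25):
    cap = cv2.VideoCapture(video_path)
    writer = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # run_generator: hypothetical helper that crops/normalizes the frame,
        # runs the trained generator, and converts the output back to BGR
        swapped = run_generator(generator, frame, target_img)
        if writer is None:
            h, w = swapped.shape[:2]
            writer = cv2.VideoWriter(out_path,
                                     cv2.VideoWriter_fourcc(*"mp4v"),
                                     fps, (w, h))
        writer.write(swapped)
    cap.release()
    writer.release()
```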
  • In the embodiments of the present application, the original image containing the first face is extracted from the video data, the target image containing the second face is obtained, the original image and the target image are input into the generator of the generative adversarial network to obtain the face-changing image of the original image after the second face replaces the first face, and the face-changed video data is generated from the face-changing images.
  • The generator of the embodiments of the present application combines the attribute features of the original image and the identity features of the target image to decode the hybrid feature map during decoding, so that the face-changing image better retains the facial pose, facial expression, and other attributes of the original image, while the identity features of the target image blend better into the face-changing image, enhancing the transfer of the target image's identity features.
  • When the trained generator of the generative adversarial network is used to change the face of video data, the resulting face-changed video data is real and natural and can maintain the attribute features of the faces in the video data and the identity features of the face in the target image.
  • FIG. 5 is a structural block diagram of a generative adversarial network training apparatus provided by an embodiment of the present application; the apparatus may include the following modules:
  • an original image and target image acquisition module 501, configured to acquire the original image containing the first face and the target image containing the second face;
  • a generative adversarial network initialization module 502, configured to initialize the generator and the discriminator of the generative adversarial network;
  • a generator training module 503, configured to input the original image and the target image into the generator for training to obtain a face-changing image, the generator being configured to extract the attribute feature map of the first face from the original image, extract the identity feature of the second face from the target image, inject the identity feature into the attribute feature map to generate a hybrid feature map, and decode the hybrid feature map according to the identity feature and the attribute feature map to obtain the face-changing image after the second face replaces the first face;
  • a discriminator training module 504, configured to train the discriminator using the original image and the face-changing image to obtain a judgment value;
  • a parameter adjustment module 505, configured to adjust the parameters of the generator and the discriminator according to the judgment value, the face-changing image, the original image, and the target image.
  • The generative adversarial network training apparatus provided by the embodiment of this application can execute the generative adversarial network training method provided by the embodiment of this application, and has the functional modules and beneficial effects corresponding to the executed method.
  • FIG. 6 is a structural block diagram of an image face-changing apparatus provided by an embodiment of the present application. As shown in FIG. 6, the image face-changing apparatus of an embodiment of the present application may include the following modules:
  • an original image and target image acquisition module 601, configured to acquire the original image containing the first face and the target image containing the second face;
  • an image face-changing module 602, configured to input the original image and the target image into the generator of the generative adversarial network to obtain the face-changing image of the original image after the second face replaces the first face;
  • where the generator is trained by the generative adversarial network training method described in the embodiments of the present application.
  • The image face-changing apparatus provided by the embodiment of the present application can execute the image face-changing method provided by the embodiment of the present application, and has the functional modules and beneficial effects corresponding to the executed method.
  • FIG. 7 is a structural block diagram of a video face-changing apparatus provided by an embodiment of the present application. As shown in FIG. 7, the video face-changing apparatus of an embodiment of the present application may include the following modules:
  • a to-be-changed video data acquisition module 701, configured to acquire the video data whose face is to be changed;
  • an original image extraction module 702, configured to extract a video image containing the first human face from the video data as the original image;
  • a target image acquisition module 703, configured to acquire a target image containing the second face;
  • a video face-changing module 704, configured to input the original image and the target image into the generator of the generative adversarial network to obtain the face-changing image of the original image after the second face replaces the first face;
  • a face-changing video data generation module 705, configured to generate face-changed video data based on the face-changing image;
  • where the generator is trained by the generative adversarial network training method described in the embodiments of the present application.
  • The video face-changing apparatus provided by the embodiment of the present application can execute the video face-changing method provided by the embodiment of the present application, and has the functional modules and beneficial effects corresponding to the executed method.
  • the electronic device may include: a processor 801, a storage device 802, a display screen 803 with a touch function, an input device 804, an output device 805, and a communication device 806.
  • the number of processors 801 in the device may be one or more. In FIG. 8, one processor 801 is taken as an example.
  • the processor 801, the storage device 802, the display screen 803, the input device 804, the output device 805, and the communication device 806 of the device may be connected through a bus or other methods. In FIG. 8, the connection through a bus is taken as an example.
  • In one embodiment, the device is configured to execute the generative adversarial network training method provided in any embodiment of the present application, and/or the image face-changing method, and/or the video face-changing method.
  • An embodiment of the present application also provides a computer-readable storage medium; when instructions in the storage medium are executed by a processor of an electronic device, the electronic device can execute the generative adversarial network training method described in the foregoing method embodiments, and/or the image face-changing method, and/or the video face-changing method.


Abstract

The embodiments of the present application disclose a generative adversarial network training method, an image face-changing method, a video face-changing method, and apparatuses, including: acquiring an original image and a target image; initializing the generator and discriminator of a generative adversarial network; inputting the original image and the target image into the generator for training to obtain a face-changing image, where the generator extracts the attribute feature map of a first face from the original image, extracts the identity feature of a second face from the target image, injects the identity feature into the attribute feature map to generate a hybrid feature map, and decodes the hybrid feature map according to the identity feature and the attribute feature map to obtain the face-changing image in which the second face replaces the first face; inputting the original image and the face-changing image into the discriminator for training to obtain a judgment value; and adjusting the generator and the discriminator according to the judgment value, the face-changing image, the original image, and the target image.

Description

Generative adversarial network training method, image face-changing method, video face-changing method, and apparatus

This application claims priority to Chinese patent application No. 202010592443.X, filed with the Chinese Patent Office on June 24, 2020, the entire contents of which are incorporated herein by reference.

Technical Field

The embodiments of the present application relate to the field of image processing technology, for example, to a generative adversarial network training method, an image face-changing method, a video face-changing method, and apparatus.

Background

With the popularity of video applications such as short video and live streaming, face-changing has become an important technology in the field of computer vision; it is widely used in content production, movie production, entertainment video production, and so on.

Face-changing means that, given an original image and a target image, the identity features of the target image are transferred to the original image to obtain a face-changing image, so that the face-changing image maintains the identity features of the target image while keeping attribute features of the original image such as facial pose and facial expression; in addition, the face-changing image is required to be real and natural. The related art includes the following three face-changing approaches:

1) Face fusion based on facial key points. This approach first obtains the facial key points of the original image and the target image, extracts the face region of the original image through the original image's key points, and then fuses the face region of the original image into the target image according to the target image's key points; it tends to make the face in the face-changing image look unreal and unnatural.

2) Face-changing based on 3D face modeling. This approach reconstructs 3D models of the original image and the target image respectively, extracts identity features from the 3D model of the target image, and combines them with the attribute features of the original image's 3D model to generate the face-changing image; face-changing images generated this way are likewise not real and natural.

3) Face-changing based on generative adversarial networks. This approach extracts attribute features from the original image and identity features from the target image through neural networks, combines the two features, and decodes the combined features through a decoder to obtain the face-changing image. Face-changing images generated by this method are real and natural, but it is difficult to maintain the attribute features of the original image and the identity features of the target image at the same time.

In summary, it is difficult for the face-changing technology in the related art to obtain real and natural face-changing images, and the face-changing image cannot simultaneously maintain the attribute features of the original image and the identity features of the target image.
Summary

The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.

The embodiments of the present application provide a generative adversarial network training method, an image face-changing method, a video face-changing method, apparatuses, an electronic device, and a storage medium, to improve the situation in the related art where real and natural face-changing images cannot be obtained and the face-changing image cannot simultaneously maintain the attribute features of the original image and the identity features of the target image.

In a first aspect, an embodiment of the present application provides a generative adversarial network training method, including:

acquiring an original image containing a first face and a target image containing a second face;

initializing the generator and the discriminator of a generative adversarial network;

inputting the original image and the target image into the generator for training to obtain a face-changing image, the generator being configured to extract the attribute feature map of the first face from the original image, extract the identity feature of the second face from the target image, inject the identity feature into the attribute feature map to generate a hybrid feature map, and decode the hybrid feature map according to the identity feature and the attribute feature map to obtain the face-changing image after the second face replaces the first face;

inputting the original image and the face-changing image into the discriminator for training to obtain a judgment value; and

adjusting the parameters of the generator and the discriminator according to the judgment value, the face-changing image, the original image, and the target image.

In a second aspect, an embodiment of the present application provides an image face-changing method, including:

acquiring an original image containing a first face and a target image containing a second face; and

inputting the original image and the target image into the generator of a generative adversarial network to obtain the face-changing image of the original image after the second face replaces the first face;

where the generator is trained by the generative adversarial network training method described in any embodiment of the present application.

In a third aspect, an embodiment of the present application provides a video face-changing method, including:

acquiring video data whose face is to be changed;

extracting a video image containing a first face from the video data as an original image;

acquiring a target image containing a second face;

inputting the original image and the target image into the generator of a generative adversarial network to obtain the face-changing image of the original image after the second face replaces the first face; and

generating face-changed video data based on the face-changing image;

where the generator is trained by the generative adversarial network training method described in any embodiment of the present application.

In a fourth aspect, an embodiment of the present application provides a generative adversarial network training apparatus, including:

an original image and target image acquisition module, configured to acquire an original image containing a first face and a target image containing a second face;

a generative adversarial network initialization module, configured to initialize the generator and the discriminator of a generative adversarial network;

a generator training module, configured to input the original image and the target image into the generator for training to obtain a face-changing image, the generator being configured to extract the attribute feature map of the first face from the original image, extract the identity feature of the second face from the target image, inject the identity feature into the attribute feature map to generate a hybrid feature map, and decode the hybrid feature map according to the identity feature and the attribute feature map to obtain the face-changing image after the second face replaces the first face;

a discriminator training module, configured to train the discriminator using the original image and the face-changing image to obtain a judgment value; and

a parameter adjustment module, configured to adjust the parameters of the generator and the discriminator according to the judgment value, the face-changing image, the original image, and the target image.

In a fifth aspect, an embodiment of the present application provides an image face-changing apparatus, including:

an original image and target image acquisition module, configured to acquire an original image containing a first face and a target image containing a second face; and

an image face-changing module, configured to input the original image and the target image into the generator of a generative adversarial network to obtain the face-changing image of the original image after the second face replaces the first face;

where the generator is trained by the generative adversarial network training method described in any embodiment of the present application.

In a sixth aspect, an embodiment of the present application provides a video face-changing apparatus, including:

a to-be-changed video data acquisition module, configured to acquire video data whose face is to be changed;

an original image extraction module, configured to extract a video image containing a first face from the video data as an original image;

a target image acquisition module, configured to acquire a target image containing a second face;

a video face-changing module, configured to input the original image and the target image into the generator of a generative adversarial network to obtain the face-changing image of the original image after the second face replaces the first face; and

a face-changed video data generation module, configured to generate face-changed video data based on the face-changing image;

where the generator is trained by the generative adversarial network training method described in any embodiment of the present application.

In a seventh aspect, an embodiment of the present application provides an electronic device, including:

one or more processors; and

a storage apparatus, configured to store one or more programs,

where, when the one or more programs are executed by the one or more processors, the one or more processors implement the generative adversarial network training method described in any embodiment of the present application, and/or the image face-changing method, and/or the video face-changing method.

In an eighth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the generative adversarial network training method described in any embodiment of the present application, and/or the image face-changing method, and/or the video face-changing method.
Brief Description of the Drawings

FIG. 1 is a flowchart of the steps of a generative adversarial network training method provided by an embodiment of the present application;

FIG. 2A is a flowchart of the steps of a generative adversarial network training method provided by an embodiment of the present application;

FIG. 2B is a schematic diagram of a generator of an embodiment of the present application;

FIG. 3 is a flowchart of the steps of an image face-changing method provided by an embodiment of the present application;

FIG. 4 is a flowchart of the steps of a video face-changing method provided by an embodiment of the present application;

FIG. 5 is a structural block diagram of a generative adversarial network training apparatus provided by an embodiment of the present application;

FIG. 6 is a structural block diagram of an image face-changing apparatus provided by an embodiment of the present application;

FIG. 7 is a structural block diagram of a video face-changing apparatus provided by an embodiment of the present application;

FIG. 8 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed Description

The present application is described in detail below with reference to the drawings and embodiments. It should be understood that the example embodiments described here are only used to explain the present application, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present application rather than the entire structure. Where there is no conflict, the embodiments of the present application and the features in the embodiments can be combined with each other.

FIG. 1 is a flowchart of the steps of a generative adversarial network training method provided by an embodiment of the present application. The embodiment is applicable to training a generative adversarial network so that the generator of the trained network can change faces in images or videos. The method can be executed by the generative adversarial network training apparatus of the embodiments of the present application, which can be implemented in hardware or software and integrated into the electronic device provided by the embodiments of the present application. In one embodiment, as shown in FIG. 1, the generative adversarial network training method of the embodiment of the present application may include the following steps:

S101. Acquire an original image containing a first face and a target image containing a second face.

In the embodiments of the present application, the original image and the target image are images containing human faces, where the original image is the image whose face needs to be replaced and the target image is the image used to change the face of the original image. In one example of the present application, the original image may be an image containing the first face extracted from video data, the target image may be an image containing the second face, and the first face and the second face are different faces. In practice, a large number of images of different faces can be acquired, two images are randomly selected as the original image and the target image, and the original image and the target image form an image pair as a training sample.

S102. Initialize the generator and the discriminator of a generative adversarial network.

In the embodiments of the present application, generative adversarial networks (GANs) include a generator and a discriminator; the generator is configured to generate a new image from input data, and the discriminator is configured to output the probability that the new image is real. In the embodiments of the present application, the generator and the discriminator can be neural networks, and the network parameters of the generator and the discriminator of the generative adversarial network can be initialized.

In an example embodiment of the present application, the generator may include an encoding network, a decoding network, an identity extraction network, and a residual network, where the identity extraction network and the residual network may be pre-trained networks; initializing the generator may mean initializing the network parameters of the generator's encoding network and decoding network.

S103. Input the original image and the target image into the generator for training to obtain a face-changing image.

In the embodiments of the present application, the generator may include an encoding network, a decoding network, an identity extraction network, and a residual network. In the generator, the attribute feature map of the first face can be extracted from the original image through the encoding network, the identity feature of the second face can be extracted from the target image through the identity extraction network, the identity feature is injected into the attribute feature map through the residual network to generate a hybrid feature map, and the hybrid feature map is decoded in the decoding network according to the identity feature and the attribute feature map to obtain the face-changing image after the second face replaces the first face. The attribute feature map can be a feature map expressing attributes of the first face such as facial pose and facial expression, and the identity feature can be information capable of identifying the identity of the second face.

S104. Input the original image and the face-changing image into the discriminator for training to obtain a judgment value.

During training of the generative adversarial network, the purpose of the generator is to generate images as real as possible to deceive the discriminator, while the purpose of the discriminator is to distinguish images generated by the generator from real images. By alternately training the generator and the discriminator, the two form a dynamic "game process", and finally the images generated by the trained generator are realistic enough to pass for real, i.e., infinitely close to real images. In the embodiments of the present application, the original image and the face-changing image are input into the discriminator to train the discriminator, and a judgment value of the face-changing image is obtained after each input, where the judgment value may be the probability that the face-changing image is a real image.

S105. Adjust the parameters of the generator and the discriminator according to the judgment value, the face-changing image, the original image, and the target image.

In one embodiment, the total loss can be calculated according to the judgment value, the face-changing image, the original image, and the target image; training of the generator and the discriminator stops when the total loss is less than a preset value; otherwise, the network parameters of the discriminator and the generator are adjusted according to the total loss, and a new round of iterative training starts until the total loss is less than the preset value. After training stops, a trained generative adversarial network is obtained; once the original image and the target image are input into its generator, the generator automatically outputs the face-changing image.

In an example embodiment of the present application, the adversarial loss and the key point loss can be calculated according to the judgment value, the original image, and the face-changing image; the identity feature loss can be calculated according to the target image and the face-changing image; two original images are input into the generator to obtain a self-face-changing image of the original image, and the reconstruction loss is calculated according to the original image and the self-face-changing image; the sum of the adversarial loss, reconstruction loss, key point loss, and identity feature loss gives the total loss; gradients are computed from the total loss to adjust the parameters of the generator's encoding and decoding networks, and from the adversarial loss to adjust the parameters of the discriminator.

The generative adversarial network of the embodiments of the present application includes a generator and a discriminator. The generator extracts the attribute feature map of the first face from the original image, extracts the identity feature of the second face from the target image, injects the identity feature into the attribute feature map to generate a hybrid feature map, and decodes the hybrid feature map according to the identity feature and the attribute feature map to obtain the face-changing image after the second face replaces the first face; the original image and the face-changing image are input into the discriminator for training to obtain a judgment value; the parameters of the generator and the discriminator are adjusted according to the judgment value, the face-changing image, the original image, and the target image until a trained generative adversarial network is obtained. In this way, the attribute features of the original image and the identity features of the target image are combined during decoding to decode the hybrid feature map, so the face-changing image better retains attribute features of the original image such as facial pose and facial expression, while the identity features of the target image are better blended into the face-changing image, enhancing the transfer of the target image's identity features. When the trained generator is used to change faces in images or videos, the resulting face-changing images or videos are real and natural and maintain the attribute features of the original image and the identity features of the target image.
图2A为本申请一实施例提供的一种生成对抗网络训练方法的步骤流程图,本申请实施例在前述实施例的基础上进行细化,如图2A所示,本申请实施例的 生成对抗网络训练方法可以包括如下步骤:
S201、获取包含第一人脸的原图像和包含第二人脸的目标图像。
S202、初始化生成对抗网络的判别器的参数、生成器的编码网络和解码网络的参数,以及获取训练好的用于所述生成器中的残差网络和身份提取网络。
本申请实施例中,生成对抗网络包括判别器和生成器,生成器可以包括编码网络、解码网络、残差网络和身份提取网络,其中,残差网络和身份提取网络可以是预先训练好的多种神经网络。本申请实施例所指的初始化可以是初始化判别器、编码网络和解码网络的参数,在一实施例中,可以是构建判别器、编码网络和解码网络的网络结构,并设置网络结构的网络参数。在本申请实施例中,判别器、编码网络和解码网络可以是多种神经网络。
如图2B所示为生成器的示意图,在图2B中,生成器30包括编码网络301、解码网络302、身份提取网络303、残差模块304,其中编码网络301和解码网络302可以是对称的卷积神经网络和反卷积神经网络,残差模块304连接在编码网络301和解码网络302之间,原图像10和目标图像20输入生成器30后可以获得换脸图像40。
S203、采用所述编码网络对所述原图像进行编码处理得到所述第一人脸的属性特征图。
在本申请实施例中,可以先对原图像进行预处理,获得预处理后的原图像,再将预处理后的原图像输入编码网络中,获得每个下采样卷积层输出的下采样特征图。其中,预处理包括调整图像尺寸,编码网络的最后一层下采样卷积层输出的下采样特征图即为第一人脸的属性特征图。
示例性地,如图2B所示,编码网络301可以是一个包括多个下采样卷积层的网络,在将原图像10裁剪为指定大小尺寸的图像后,将裁剪后的原图像输入到下采样卷积层,每层下采样卷积层对裁剪后的原图像进行采样编码处理输出下采样特征图,并将该下采样特征图输入到下一下采样卷积层中,编码网络最后一层下采样卷积层输出的下采样特征图即为第一人脸的属性特征图F H×W×D,H和W分别为属性特征图的高和宽,D为通道数,而对于每个下采样卷积层均输出一个下采样特征图
Figure PCTCN2021094257-appb-000001
如图2B所示,编码网络301最终输出第一人脸的属性特征图50。
S204、将所述目标图像输入所述身份提取网络中提取所述第二人脸的身份特征。
在本申请实施例中,身份特征可以是指能够区分两个人脸属于不同人物的身份的信息,身份提取网络可以是预先训练好的网络,例如,可以是预先训练好的卷积神经网络CNN、循环神经网络RNN、深度神经网络DNN等。将目标图像输入该身份提取网络后可以提取第二人脸的身份特征F ID,身份特征F ID可以是一个一维向量,该一维向量包含了人脸的身份信息,如图2B所示,目标图像20输入身份提取网络303后得到身份特征60。
S205、采用所述残差网络将所述身份特征注入所述属性特征图中得到混合特征图。
在本申请的示例实施例中,可以先对身份特征进行转换,获得身份特征的身份特征均值和身份特征方差,将身份特征均值、身份特征方差以及属性特征图输入残差网络中,以通过残差网络将身份特征迁移到属性特征图上得到混合特征图。
如图2B所示,身份特征60可以经过一个全连接层305后输出身份特征均值μ和身份特征方差σ,身份特征均值μ、身份特征方差σ以及属性特征50一起输入到残差网络304中得到混合特征图70。
In an exemplary embodiment of the present application, the residual network may be an adaptive instance normalization residual module (AdaIN ResBlk). The residual network describes a style image by the mean and variance of its feature map, and achieves style injection by changing the mean and variance of the content feature map. Denoting the content feature map by x and the style by y, the residual network computes:
AdaIN(x, y) = σ_y × ((x - μ_x) / σ_x) + μ_y
In the above formula, μ_x and σ_x are the mean and variance of the content feature map, and μ_y and σ_y are the identity feature mean and identity feature variance to be injected. In the embodiments of the present application, x is the attribute feature map, y is the identity feature, and AdaIN(x, y) is the mixed feature map.
Through the residual network, the embodiments of the present application can inject the identity feature of the second face into the attribute feature map of the first face, replacing the identity feature of the first face with that of the second face while retaining information of the first face in the original image such as pose and expression, thereby combining the attribute features of the first face in the original image with the identity feature of the second face in the target image.
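Exemplarily, the AdaIN computation above can be written directly in code. Below is a minimal sketch of one AdaIN residual block, assuming the identity vector is mapped to per-channel (μ_y, σ_y) by a fully connected layer, as described for layer 305; the channel counts and the exact block structure are illustrative assumptions:

```python
import torch
import torch.nn as nn

def adain(x, mu_y, sigma_y, eps=1e-5):
    """AdaIN(x, y) = sigma_y * (x - mu_x) / sigma_x + mu_y,
    with mu_x, sigma_x computed per sample and per channel."""
    mu_x = x.mean(dim=(2, 3), keepdim=True)
    sigma_x = x.std(dim=(2, 3), keepdim=True) + eps
    return sigma_y * (x - mu_x) / sigma_x + mu_y

class AdaINResBlock(nn.Module):
    """Residual block injecting identity statistics into the
    attribute feature map (content x, style y = identity feature)."""
    def __init__(self, channels, id_dim):
        super().__init__()
        self.fc = nn.Linear(id_dim, channels * 2)  # -> (mu_y, sigma_y)
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x, identity):
        stats = self.fc(identity)                  # [B, 2C]
        mu_y, sigma_y = stats.chunk(2, dim=1)
        mu_y = mu_y[:, :, None, None]
        sigma_y = sigma_y[:, :, None, None]
        h = self.act(self.conv1(adain(x, mu_y, sigma_y)))
        h = self.conv2(adain(h, mu_y, sigma_y))
        return x + h                               # residual connection
```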
S206. Based on the attribute feature map and the identity feature, decode the mixed feature map with the decoding network to obtain the face-swapped image in which the second face replaces the first face.
In an exemplary embodiment of the present application, after the mixed feature map is obtained, the mixed feature map and the identity feature are concatenated to obtain a concatenated feature, and the concatenated feature is input into the decoding network, where multiple upsampling convolutional layers process it to produce the face-swapped image. For each upsampling convolutional layer of the decoding network, the corresponding downsampling convolutional layer in the encoding network is determined, the downsampled feature map output by that downsampling convolutional layer is obtained, the upsampled feature output by the preceding upsampling convolutional layer is obtained, and the downsampled feature map and the upsampled feature are combined as the decoding object of that upsampling convolutional layer.
In one embodiment, as shown in Fig. 2B, after the residual network 304 outputs the mixed feature map 70, the identity feature extracted by the identity extraction network 303 is concatenated onto the mixed feature map 70 to obtain the concatenated feature, thereby strengthening the transfer of the second face's identity feature; after the decoding network 302 decodes this concatenated feature, the identity information of the second face is better preserved.
In one embodiment, as shown in Fig. 2B, skip connections link the intermediate features of the encoding network to the feature layers of the decoding network. In one embodiment, the decoding network and the encoding network are a symmetric pair of upsampling and downsampling convolutional neural networks: in the encoding network, each downsampling convolutional layer outputs a downsampled feature F_enc^i, and in the decoding network each upsampling convolutional layer receives an upsampled feature F_dec^i as input. For a given upsampling convolutional layer i of the decoding network, the corresponding downsampling convolutional layer in the encoding network can be determined and the downsampled feature F_enc^j output by that layer obtained; F_enc^j and the upsampled feature F_dec^(i-1) output by the preceding upsampling convolutional layer are added, the sum is upsampled to give the upsampled feature F_dec^i, and F_dec^i serves as the input of the next upsampling convolutional layer.
During decoding, the embodiments of the present application use skip connections to feed the intermediate features output by the downsampling convolutional layers of the encoding network into the upsampling convolutional layers of the decoding network, so that the attribute features of the first face in the original image blend better into the face-swapped image, making the face-swapped image more real and natural.
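Exemplarily, this skip-connected decoding can be sketched as below: each upsampling layer adds the matching encoder feature to the running decoder feature before upsampling, mirroring the addition described above. The layer counts and channel widths mirror the encoder sketch earlier; the identity-vector concatenation described at S206 is omitted here for brevity, and the exact pairing of encoder and decoder layers is an assumption:

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Upsampling decoder mirroring the encoder; the encoder feature
    F_enc^j is added to the current feature before each upsampling."""
    def __init__(self, base_ch=64, n_layers=5, out_ch=3):
        super().__init__()
        ups = []
        for i in reversed(range(n_layers)):
            in_c = base_ch * (2 ** i)
            out_c = base_ch * (2 ** (i - 1)) if i > 0 else out_ch
            ups.append(nn.Sequential(
                nn.ConvTranspose2d(in_c, out_c, kernel_size=4, stride=2, padding=1),
                nn.InstanceNorm2d(out_c),
                nn.ReLU(inplace=True),
            ))
        self.ups = nn.ModuleList(ups)

    def forward(self, mixed, enc_feats):
        x = mixed
        # reversed(enc_feats) pairs the deepest encoder map with the first
        # upsampling layer, the next-deepest with the second, and so on
        for up, skip in zip(self.ups, reversed(enc_feats)):
            x = up(x + skip)      # add F_enc^j, then upsample
        return torch.tanh(x)      # swapped image in [-1, 1]
```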
S207. Input the original image and the face-swapped image into the discriminator for training to obtain a decision value.
The embodiments of the present application train the generator and the discriminator alternately: first the generator is trained to produce a face-swapped image, then the discriminator is trained with the face-swapped image and the original image, then the generator is trained again, and so on. Training the generator once and the discriminator once constitutes one round of training; after each round the generator produces a face-swapped image and the discriminator evaluates it to obtain a decision value, which may be the probability that the face-swapped image is a real image.
S208. Compute a total loss according to the decision value, the face-swapped image, the original image, and the target image.
In the embodiments of the present application, the total loss may be the sum of an adversarial loss, a reconstruction loss, a keypoint loss, and an identity feature loss. The four losses can first be computed separately and then summed to obtain the total loss, for example through the following sub-steps:
S2081. Compute the adversarial loss and the keypoint loss according to the decision value, the original image, and the face-swapped image.
In one embodiment, the adversarial loss can be computed from the decision value and a preset adversarial loss function; the facial keypoints of the original image and of the face-swapped image are obtained, and the distance between the facial keypoints of the original image and those of the face-swapped image yields the keypoint loss.
Exemplarily, the adversarial loss gan_loss is:
gan_loss = ∑ -log D(G(X_i))
In the above formula, G(X_i) is the face-swapped image generated by the generator, and D(G(X_i)) is the decision value with which the discriminator judges the face-swapped image G(X_i) to be a real image.
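Exemplarily, assuming the discriminator outputs a probability D(·) in (0, 1), this adversarial term for the generator can be sketched as follows (a minimal illustration of the formula above, not the reference implementation):

```python
import torch

def gan_loss(d_on_fake, eps=1e-8):
    """gan_loss = sum(-log D(G(X_i))) over the batch, where
    d_on_fake holds the decision values D(G(X_i))."""
    return (-torch.log(d_on_fake + eps)).sum()

# example: decision values for a batch of 4 face-swapped images
print(gan_loss(torch.tensor([0.2, 0.7, 0.5, 0.9])))
```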
Exemplarily, for the keypoint loss, the facial keypoints of the original image and of the face-swapped image can be extracted by a pre-trained face pose estimation network, and the facial keypoints of the face-swapped image are then constrained to be similar to those of the original image. In one embodiment, the facial keypoint coordinates lmks_gen of the face-swapped image and the facial keypoint coordinates lmks_src of the original image can be obtained, and the keypoint loss lmks_loss is:
lmks_loss = ||lmks_gen - lmks_src||^2
The embodiments of the present application constrain the face-swapped image by computing the keypoint loss, making its facial keypoints similar to those of the original image, so that the face-swapped image better preserves attribute features of the original image such as facial expression and face pose.
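Exemplarily, with an assumed pre-trained landmark detector `landmark_net` that maps an image batch to keypoint coordinates (a hypothetical interface, introduced only for illustration), the keypoint term can be sketched as:

```python
import torch

def lmks_loss(landmark_net, swapped, original):
    """lmks_loss = ||lmks_gen - lmks_src||^2: compare the facial
    keypoints of the swapped image with those of the original image."""
    lmks_gen = landmark_net(swapped)       # [B, K, 2] keypoint coordinates
    with torch.no_grad():
        lmks_src = landmark_net(original)  # original keypoints, no gradient
    return ((lmks_gen - lmks_src) ** 2).sum(dim=(1, 2)).mean()
```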
S2082. Compute the identity feature loss according to the target image and the face-swapped image.
In an exemplary embodiment of the present application, the identity extraction network is a pre-trained network. The target image and the face-swapped image can each be input into the identity extraction network to extract the identity feature of the face in the target image and the identity feature of the face in the face-swapped image, and the distance between these two identity features yields the identity feature loss. Exemplarily, denoting the identity feature of the face-swapped image FeatID_gen and that of the target image FeatID_target, the identity feature loss ID_loss is:
ID_loss = ||FeatID_gen - FeatID_target||^2
In the embodiments of the present application, computing the identity feature loss between the target image and the face-swapped image constrains the identity feature of the face-swapped image to be more similar to that of the target image, so that the face-swapped image better preserves the identity feature of the target image.
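Exemplarily, with the pre-trained identity extraction network `id_net`, the identity term can be sketched as below; the L2 normalization of the identity vectors is an added assumption commonly used with face-recognition features, not part of the formula above:

```python
import torch
import torch.nn.functional as F

def id_loss(id_net, swapped, target):
    """ID_loss = ||FeatID_gen - FeatID_target||^2."""
    feat_gen = F.normalize(id_net(swapped), dim=1)
    with torch.no_grad():
        feat_target = F.normalize(id_net(target), dim=1)
    return ((feat_gen - feat_target) ** 2).sum(dim=1).mean()
```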
S2083. Input two of the original images into the generator to obtain a self-swapped image of the original image, and compute the reconstruction loss according to the original image and the self-swapped image.
In one embodiment, the original image can be input simultaneously into the generator's encoding network and identity extraction network, so that the generator produces a self-swapped image of the original image, i.e. a face-swapped image in which the face of one original image replaces the face of the other original image, and the reconstruction loss of this reconstructed original image is then computed. Exemplarily, denoting the two original images original_img and the generated self-swapped image src_img, the reconstruction loss recon_loss is:
recon_loss = ||src_img - original_img||^2
The above formula is the squared difference of the pixel values at identical positions in the original image and the self-swapped image. Computing the generator's reconstruction loss constrains the adjustment of the generator's parameters, so that the face-swapped images the generator produces better preserve the attribute features of the original image and look more real and natural.
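Exemplarily, this self-swap reconstruction term is a pixelwise squared difference; a minimal sketch, assuming the generator interface from the earlier sketch:

```python
def recon_loss(generator, original):
    """recon_loss = ||src_img - original_img||^2: feed the original
    image as both inputs so the generator should reproduce it."""
    src_img = generator(original, original)   # self-swapped image
    return ((src_img - original) ** 2).mean()
```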
S2084. Sum the adversarial loss, the reconstruction loss, the keypoint loss, and the identity feature loss to obtain the total loss.
In one embodiment, the total loss is the sum of the adversarial loss, the reconstruction loss, the keypoint loss, and the identity feature loss, i.e. the total loss total_loss is:
total_loss = recon_loss + ID_loss + gan_loss + lmks_loss
Of course, in practice weights can also be assigned to the adversarial loss, the reconstruction loss, the keypoint loss, and the identity feature loss; each loss is multiplied by its weight and the weighted terms are summed to give the total loss. Those skilled in the art may also compute, for example, a weighted average of the losses as the total loss; the embodiments of the present application place no restriction on how the total loss is computed.
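Exemplarily, a weighted sum of the four terms can be sketched as below, with the weights as free hyperparameters (the default values are placeholders, not values fixed by the present application; equal weights recover the plain sum above):

```python
def total_loss(recon, id_, gan, lmks,
               w_recon=1.0, w_id=1.0, w_gan=1.0, w_lmks=1.0):
    """total_loss = w_recon*recon + w_id*id + w_gan*gan + w_lmks*lmks."""
    return w_recon * recon + w_id * id_ + w_gan * gan + w_lmks * lmks
```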
S209. Adjust the parameters of the generator and the discriminator according to the total loss.
In the embodiments of the present application, the generative adversarial network includes the generator and the discriminator, and it can be trained by training the two alternately, with the total loss ultimately used to adjust their parameters. In one embodiment, the total loss is computed at the end of each round of alternating training and compared with a preset threshold. If the total loss is smaller than the preset threshold, the generator is accurate enough and its face-swapped images suffice to fool the discriminator, so training of the generator and the discriminator can stop. If the total loss is greater than or equal to the preset threshold, the generator is not yet accurate enough and the discriminator can still tell the generated face-swapped images from real ones; in that case the parameters of the discriminator are adjusted according to the adversarial loss, the parameters of the encoder and decoder in the generator are adjusted according to the total loss, and the process returns to S203 to continue training the generator and the discriminator alternately until the stopping condition of the iterative training is met.
In one embodiment, the parameters of the discriminator, the encoder, or the decoder may be adjusted by updating them with a gradient descent algorithm, which may be stochastic gradient descent (SGD) or another gradient descent method; the embodiments of the present application place no restriction on the gradient algorithm or on the method of adjusting and updating the parameters.
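Exemplarily, one round of alternating training might look like the sketch below, reusing the loss helpers sketched above; the discriminator loss form, the optimizer handling, and the stopping test are assumptions introduced only for illustration:

```python
import torch

def train_round(gen, disc, opt_g, opt_d, id_net, landmark_net,
                original, target, threshold, eps=1e-8):
    """One alternating round: update the discriminator on real vs.
    generated images, then update the generator on the total loss."""
    # discriminator step: push D(original) -> 1 and D(fake) -> 0
    with torch.no_grad():
        fake = gen(original, target)
    d_loss = -(torch.log(disc(original) + eps)
               + torch.log(1.0 - disc(fake) + eps)).mean()
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # generator step: total loss = recon + id + gan + lmks
    fake = gen(original, target)
    g_loss = total_loss(recon_loss(gen, original),
                        id_loss(id_net, fake, target),
                        gan_loss(disc(fake)),
                        lmks_loss(landmark_net, fake, original))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return g_loss.item() < threshold  # True: stop training
```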
When training the generator, the embodiments of the present application encode the original image with the encoding network to obtain the attribute feature map of the first face, input the target image into the identity extraction network to extract the identity feature of the second face, inject the identity feature into the attribute feature map with the residual network to obtain the mixed feature map, and, based on the attribute feature map and the identity feature, decode the mixed feature map with the decoding network to obtain the face-swapped image. The original image and the face-swapped image are input into the discriminator for training to obtain the decision value, the total loss is computed from the decision value, the face-swapped image, the original image, and the target image, and the parameters of the generator and the discriminator are adjusted according to the total loss. Because the decoding stage combines the attribute features of the original image with the identity feature of the target image when decoding the mixed feature map, the face-swapped image better preserves attribute features of the original image such as face pose and facial expression, while the identity feature of the target image is also blended more thoroughly into the face-swapped image, strengthening the transfer of the target image's identity feature. When the generator of the trained generative adversarial network is used to swap faces in images or videos, the resulting face-swapped images or videos look real and natural while preserving the attribute features of the original image and the identity feature of the target image.
In one embodiment, the total loss includes the keypoint loss, which constrains the facial keypoints of the face-swapped image to be similar to those of the original image, so that the face-swapped image better preserves attribute features of the original image such as facial expression and face pose.
In one embodiment, the total loss includes the identity feature loss, which constrains the identity feature of the face-swapped image to be more similar to that of the target image, so that the face-swapped image better preserves the identity feature of the target image.
Fig. 3 is a flowchart of the steps of an image face-swapping method provided by an embodiment of the present application. This embodiment is applicable to replacing a face in an image. The method may be executed by the image face-swapping apparatus of the embodiments of the present application, which may be implemented in hardware or software and integrated into the electronic device provided by the embodiments of the present application. In one embodiment, as shown in Fig. 3, the image face-swapping method of this embodiment may include the following steps:
S301. Acquire an original image containing a first face and a target image containing a second face.
In one example of the present application, the user replaces the first face in the original image with the second face in the target image, so that the swapped image preserves the identity feature of the second face and attributes of the first face such as pose and expression. In one application scenario, the original image is the image on which the user wants to perform the face swap, and the target image may be an image containing the user's face.
In the embodiments of the present application, an interactive interface can be provided that offers operations for the user to specify the original image and the target image. Exemplarily, the interface may provide an image upload operation with which the user uploads the original image and the target image, e.g. the interface first prompts the user to upload the original image and then prompts the user to upload the target image. Of course, the original image and target image specified by the user may also be obtained through other interactive operations, which the embodiments of the present application do not restrict.
S302. Input the original image and the target image into the generator of a generative adversarial network to obtain the face-swapped image of the original image in which the second face replaces the first face.
In the embodiments of the present application, the generator may be a neural network that replaces the first face in the original image with the second face in the target image. The generator can be obtained by training a generative adversarial network; in particular, it may be trained with the generative adversarial network training method provided by the foregoing embodiments, and for the training details reference may be made to the foregoing embodiments, which are not repeated here.
For a trained generator, the original image can be input into the generator's encoding network to extract the attribute feature map of the first face, the target image can be input into the generator's identity extraction network to extract the identity feature of the second face, the generator's residual network injects the identity feature into the attribute feature map to generate the mixed feature map, and the generator's decoding network decodes the mixed feature map according to the identity feature and the attribute feature map to obtain the face-swapped image in which the second face replaces the first face.
After acquiring the original image containing the first face and the target image containing the second face, the embodiments of the present application input the original image and the target image into the generator of the generative adversarial network to obtain the face-swapped image of the original image in which the second face replaces the first face. The generator of the embodiments of the present application combines the attribute features of the original image with the identity feature of the target image when decoding the mixed feature map, so that the face-swapped image better preserves attribute features of the original image such as face pose and facial expression while the identity feature of the target image blends more thoroughly into the face-swapped image, strengthening the transfer of the target image's identity feature. When the generator of the trained generative adversarial network is used to swap a face in an image, the resulting face-swapped image looks real and natural and preserves the attribute features of the original image and the identity feature of the target image.
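Exemplarily, inference with the trained generator is a single forward pass; a minimal sketch, assuming a `load_image` helper that crops, resizes, and normalizes an image file to a [1, 3, H, W] tensor (`load_image` is hypothetical, not part of the present application):

```python
import torch

@torch.no_grad()
def swap_face(generator, original_path, target_path):
    """Replace the face in the original image with the target identity."""
    original = load_image(original_path)   # [1, 3, H, W]; hypothetical helper
    target = load_image(target_path)
    generator.eval()
    swapped = generator(original, target)  # face-swapped image in [-1, 1]
    return (swapped.clamp(-1, 1) + 1) / 2  # back to [0, 1] for saving
```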
Fig. 4 is a flowchart of the steps of a video face-swapping method provided by an embodiment of the present application. This embodiment is applicable to replacing a face in a video. The method may be executed by the video face-swapping apparatus of the embodiments of the present application, which may be implemented in hardware or software and integrated into the electronic device provided by the embodiments of the present application. In one embodiment, as shown in Fig. 4, the video face-swapping method of this embodiment may include the following steps:
S401. Acquire video data whose face is to be swapped.
In the embodiments of the present application, the video data to be face-swapped may be short-video data, live-streaming video data, film video data, or the like that contains faces, and it may include one or more faces. In practice, the user can specify the video data to be face-swapped in a provided face-swap editing interface, for example by uploading the video data or entering its address, which may be a local storage address or a network address of the video data.
S402. Extract a video image containing a first face from the video data as the original image.
In one embodiment, face detection can be performed on the faces in the video data while the video data is being decoded; when the first face is detected, the corresponding video image is extracted as the original image. The first face may be a face specified by the user; for example, when the video data is acquired, the user may be prompted to designate one face in the video data as the first face.
S403. Acquire a target image containing a second face.
In the embodiments of the present application, the target image is the image used to replace the face in the original image and contains the second face. Exemplarily, the target image may be a selfie of the user, or of course any other user-specified image containing the second face; in one embodiment, the first face and the second face are different faces.
S404. Input the original image and the target image into the generator of a generative adversarial network to obtain the face-swapped image of the original image in which the second face replaces the first face.
In the embodiments of the present application, the generator may be a neural network that replaces the first face in the original image with the second face in the target image. The generator can be obtained by training a generative adversarial network; in particular, it may be trained with the generative adversarial network training method provided by the foregoing embodiments, and for the training details reference may be made to the foregoing embodiments, which are not repeated here.
For a trained generator, the original image can be input into the generator's encoding network to extract the attribute feature map of the first face, the target image can be input into the generator's identity extraction network to extract the identity feature of the second face, the generator's residual network injects the identity feature into the attribute feature map to generate the mixed feature map, and the generator's decoding network decodes the mixed feature map according to the identity feature and the attribute feature map to obtain the face-swapped image in which the second face replaces the first face.
S405. Generate face-swapped video data based on the face-swapped image.
After the face-swapped images of the original images are obtained, the face-swapped image of every original image can be video-encoded at a preset frame rate and bit rate to obtain the face-swapped video data, in which the faces preserve the identity feature of the second face and attribute features of the first face such as pose and expression.
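Exemplarily, the frame-level pipeline can be sketched with OpenCV: decode frames, swap the detected face, and re-encode at the source frame rate. The `detect_first_face` and `swap_face_in_frame` callables are hypothetical placeholders for the face-detection and generator steps, not interfaces defined by the present application:

```python
import cv2

def swap_video(in_path, out_path, swap_face_in_frame, detect_first_face):
    """Decode -> per-frame face swap -> re-encode at the source fps."""
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"),
                             fps, (w, h))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if detect_first_face(frame) is not None:  # frame contains the first face
            frame = swap_face_in_frame(frame)     # replace with the second face
        writer.write(frame)
    cap.release()
    writer.release()
```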
After acquiring the video data to be face-swapped, the embodiments of the present application extract the original image containing the first face from the video data, acquire the target image containing the second face, input the original image and the target image into the generator of the generative adversarial network to obtain the face-swapped image of the original image in which the second face replaces the first face, and generate the face-swapped video data based on the face-swapped image. The generator of the embodiments of the present application combines the attribute features of the original image with the identity feature of the target image when decoding the mixed feature map, so that the face-swapped image better preserves attribute features of the original image such as face pose and facial expression while the identity feature of the target image blends more thoroughly into the face-swapped image, strengthening the transfer of the target image's identity feature. When the generator of the trained generative adversarial network is used to swap faces in video data, the resulting face-swapped video data looks real and natural and preserves the attribute features of the faces in the video data and the identity feature of the face in the target image.
Fig. 5 is a structural block diagram of a generative adversarial network training apparatus provided by an embodiment of the present application. As shown in Fig. 5, the generative adversarial network training apparatus of this embodiment may include the following modules:
an original image and target image acquisition module 501, configured to acquire an original image containing a first face and a target image containing a second face;
a generative adversarial network initialization module 502, configured to initialize the generator and the discriminator of the generative adversarial network;
a generator training module 503, configured to input the original image and the target image into the generator for training to obtain a face-swapped image, the generator being configured to extract an attribute feature map of the first face from the original image, extract an identity feature of the second face from the target image, inject the identity feature into the attribute feature map to generate a mixed feature map, and decode the mixed feature map according to the identity feature and the attribute feature map to obtain the face-swapped image in which the second face replaces the first face;
a discriminator training module 504, configured to train the discriminator with the original image and the face-swapped image to obtain a decision value; and
a parameter adjustment module 505, configured to adjust the parameters of the generator and the discriminator according to the decision value, the face-swapped image, the original image, and the target image.
The generative adversarial network training apparatus provided by the embodiments of the present application can execute the generative adversarial network training method provided by the embodiments of the present application, and has the functional modules and beneficial effects corresponding to the executed method.
Fig. 6 is a structural block diagram of an image face-swapping apparatus provided by an embodiment of the present application. As shown in Fig. 6, the image face-swapping apparatus of this embodiment may include the following modules:
an original image and target image acquisition module 601, configured to acquire an original image containing a first face and a target image containing a second face; and
an image face-swapping module 602, configured to input the original image and the target image into the generator of a generative adversarial network to obtain the face-swapped image of the original image in which the second face replaces the first face;
where the generator is trained with the generative adversarial network training method described in the embodiments of the present application.
The image face-swapping apparatus provided by the embodiments of the present application can execute the image face-swapping method provided by the embodiments of the present application, and has the functional modules and beneficial effects corresponding to the executed method.
Fig. 7 is a structural block diagram of a video face-swapping apparatus provided by an embodiment of the present application. As shown in Fig. 7, the video face-swapping apparatus of this embodiment may include the following modules:
a to-be-swapped video data acquisition module 701, configured to acquire video data whose face is to be swapped;
an original image extraction module 702, configured to extract a video image containing a first face from the video data as the original image;
a target image acquisition module 703, configured to acquire a target image containing a second face;
a video face-swapping module 704, configured to input the original image and the target image into the generator of a generative adversarial network to obtain the face-swapped image of the original image in which the second face replaces the first face; and
a face-swapped video data generation module 705, configured to generate face-swapped video data based on the face-swapped image;
where the generator is trained with the generative adversarial network training method described in the embodiments of the present application.
The video face-swapping apparatus provided by the embodiments of the present application can execute the video face-swapping method provided by the embodiments of the present application, and has the functional modules and beneficial effects corresponding to the executed method.
Referring to Fig. 8, a structural schematic diagram of an electronic device in one example of the present application is shown. As shown in Fig. 8, the electronic device may include: a processor 801, a storage apparatus 802, a display screen 803 with a touch function, an input apparatus 804, an output apparatus 805, and a communication apparatus 806. The device may have one or more processors 801; one processor 801 is taken as an example in Fig. 8. The processor 801, the storage apparatus 802, the display screen 803, the input apparatus 804, the output apparatus 805, and the communication apparatus 806 of the device may be connected by a bus or in other ways; connection by a bus is taken as an example in Fig. 8. The device is configured to execute the generative adversarial network training method, and/or the image face-swapping method, and/or the video face-swapping method provided by any embodiment of the present application.
The embodiments of the present application further provide a computer-readable storage medium; when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the generative adversarial network training method, and/or the image face-swapping method, and/or the video face-swapping method described in the foregoing method embodiments.
It should be noted that, since the apparatus, electronic device, and storage medium embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, refer to the corresponding parts of the description of the method embodiments.
In the description of this specification, descriptions referring to the terms "one embodiment", "some embodiments", "example", "specific example", "some examples", and the like mean that the specific features, structures, materials, or characteristics described in conjunction with the embodiment or example are included in at least one embodiment or example of the present application. In this specification, the schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Note that the above are only exemplary embodiments of the present application and the technical principles applied. Those skilled in the art will understand that the present application is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made by those skilled in the art without departing from the protection scope of the present application. Therefore, although the present application has been described in some detail through the above embodiments, the present application is not limited to the above embodiments and may include more other equivalent embodiments without departing from the concept of the present application; the scope of the present application is determined by the scope of the appended claims.

Claims (18)

  1. A generative adversarial network training method, comprising:
    acquiring an original image containing a first face and a target image containing a second face;
    initializing a generator and a discriminator of a generative adversarial network;
    inputting the original image and the target image into the generator for training to obtain a face-swapped image, wherein the generator is configured to extract an attribute feature map of the first face from the original image, extract an identity feature of the second face from the target image, inject the identity feature into the attribute feature map to generate a mixed feature map, and decode the mixed feature map according to the identity feature and the attribute feature map to obtain the face-swapped image in which the second face replaces the first face;
    inputting the original image and the face-swapped image into the discriminator for training to obtain a decision value; and
    adjusting parameters of the generator and the discriminator according to the decision value, the face-swapped image, the original image, and the target image.
  2. The method according to claim 1, wherein initializing the generator and the discriminator of the generative adversarial network comprises:
    initializing parameters of the discriminator of the generative adversarial network and parameters of an encoding network and a decoding network of the generator, and obtaining a trained residual network and a trained identity extraction network for use in the generator.
  3. The method according to claim 2, wherein inputting the original image and the target image into the generator for training to obtain the face-swapped image comprises:
    encoding the original image with the encoding network to obtain the attribute feature map of the first face;
    inputting the target image into the identity extraction network to extract the identity feature of the second face;
    injecting the identity feature into the attribute feature map with the residual network to obtain the mixed feature map; and
    based on the attribute feature map and the identity feature, decoding the mixed feature map with the decoding network to obtain the face-swapped image in which the second face replaces the first face.
  4. The method according to claim 3, wherein the encoding network comprises multiple downsampling convolutional layers, and encoding the original image with the encoding network to obtain the attribute feature map of the first face comprises:
    preprocessing the original image to obtain a preprocessed original image; and
    inputting the preprocessed original image into the encoding network to obtain a downsampled feature map output by each downsampling convolutional layer;
    wherein the preprocessing comprises resizing the image, and the downsampled feature map output by the last downsampling convolutional layer of the encoding network is the attribute feature map of the first face.
  5. The method according to claim 3, wherein injecting the identity feature into the attribute feature map with the residual network to obtain the mixed feature map comprises:
    transforming the identity feature to obtain an identity feature mean and an identity feature variance of the identity feature; and
    inputting the identity feature mean, the identity feature variance, and the attribute feature map into the residual network, so as to transfer the identity feature onto the attribute feature map through the residual network to obtain the mixed feature map.
  6. The method according to claim 3 or 4, wherein the encoding network comprises multiple downsampling convolutional layers, the decoding network comprises multiple upsampling convolutional layers, and, based on the attribute feature map and the identity feature, decoding the mixed feature map with the decoding network to obtain the face-swapped image in which the second face replaces the first face comprises:
    concatenating the mixed feature map and the identity feature to obtain a concatenated feature; and
    inputting the concatenated feature into the decoding network for sampling processing by the multiple upsampling convolutional layers to obtain the face-swapped image;
    wherein, for each upsampling convolutional layer of the decoding network, the downsampling convolutional layer corresponding to the upsampling convolutional layer in the encoding network is determined, the downsampled feature map output by the downsampling convolutional layer is obtained, the upsampled feature output by the upsampling convolutional layer preceding the upsampling convolutional layer is obtained, and the downsampled feature map and the upsampled feature are combined as the decoding object of the upsampling convolutional layer.
  7. The method according to claim 2, wherein adjusting the parameters of the generator and the discriminator according to the decision value, the face-swapped image, the original image, and the target image comprises:
    computing a total loss according to the decision value, the face-swapped image, the original image, and the target image; and
    adjusting the parameters of the generator and the discriminator according to the total loss.
  8. The method according to claim 7, wherein computing the total loss according to the decision value, the face-swapped image, the original image, and the target image comprises:
    computing an adversarial loss and a keypoint loss according to the decision value, the original image, and the face-swapped image;
    computing an identity feature loss according to the target image and the face-swapped image;
    inputting two of the original images into the generator to obtain a self-swapped image of the original image;
    computing a reconstruction loss according to the original image and the self-swapped image; and
    summing the adversarial loss, the reconstruction loss, the keypoint loss, and the identity feature loss to obtain the total loss.
  9. The method according to claim 8, wherein computing the adversarial loss and the keypoint loss according to the decision value, the original image, and the face-swapped image comprises:
    computing the adversarial loss according to the decision value and a preset adversarial loss function;
    obtaining keypoints of the faces in the original image and the face-swapped image; and
    computing a distance between the keypoints of the faces in the original image and the face-swapped image to obtain the keypoint loss.
  10. The method according to claim 8, wherein computing the identity feature loss according to the target image and the face-swapped image comprises:
    inputting the target image and the face-swapped image separately into the identity extraction network to obtain the identity feature of the face in the target image and the identity feature of the face in the face-swapped image; and
    computing a distance between the identity feature of the face in the target image and the identity feature of the face in the face-swapped image to obtain the identity feature loss.
  11. The method according to claim 7, wherein adjusting the parameters of the generator and the discriminator according to the total loss comprises:
    judging whether the total loss is smaller than a preset threshold;
    based on a judgment result that the total loss is smaller than the preset threshold, stopping training the generator and the discriminator; and
    based on a judgment result that the total loss is greater than or equal to the preset threshold, adjusting the parameters of the discriminator according to the adversarial loss, adjusting parameters of the encoder and the decoder in the generator according to the total loss, and returning to the step of encoding the original image with the encoding network to obtain the attribute feature map of the first face.
  12. An image face-swapping method, comprising:
    acquiring an original image containing a first face and a target image containing a second face; and
    inputting the original image and the target image into a generator of a generative adversarial network to obtain a face-swapped image of the original image in which the second face replaces the first face;
    wherein the generator is trained with the generative adversarial network training method according to any one of claims 1-11.
  13. A video face-swapping method, comprising:
    acquiring video data whose face is to be swapped;
    extracting a video image containing a first face from the video data as an original image;
    acquiring a target image containing a second face;
    inputting the original image and the target image into a generator of a generative adversarial network to obtain a face-swapped image of the original image in which the second face replaces the first face; and
    generating face-swapped video data based on the face-swapped image;
    wherein the generator is trained with the generative adversarial network training method according to any one of claims 1-11.
  14. A generative adversarial network training apparatus, comprising:
    an original image and target image acquisition module, configured to acquire an original image containing a first face and a target image containing a second face;
    a generative adversarial network initialization module, configured to initialize a generator and a discriminator of a generative adversarial network;
    a generator training module, configured to input the original image and the target image into the generator for training to obtain a face-swapped image, the generator being configured to extract an attribute feature map of the first face from the original image, extract an identity feature of the second face from the target image, inject the identity feature into the attribute feature map to generate a mixed feature map, and decode the mixed feature map according to the identity feature and the attribute feature map to obtain the face-swapped image in which the second face replaces the first face;
    a discriminator training module, configured to train the discriminator with the original image and the face-swapped image to obtain a decision value; and
    a parameter adjustment module, configured to adjust parameters of the generator and the discriminator according to the decision value, the face-swapped image, the original image, and the target image.
  15. An image face-swapping apparatus, comprising:
    an original image and target image acquisition module, configured to acquire an original image containing a first face and a target image containing a second face; and
    an image face-swapping module, configured to input the original image and the target image into a generator of a generative adversarial network to obtain a face-swapped image of the original image in which the second face replaces the first face;
    wherein the generator is trained with the generative adversarial network training method according to any one of claims 1-11.
  16. A video face-swapping apparatus, comprising:
    a to-be-swapped video data acquisition module, configured to acquire video data whose face is to be swapped;
    an original image extraction module, configured to extract a video image containing a first face from the video data as an original image;
    a target image acquisition module, configured to acquire a target image containing a second face;
    a video face-swapping module, configured to input the original image and the target image into a generator of a generative adversarial network to obtain a face-swapped image of the original image in which the second face replaces the first face; and
    a face-swapped video data generation module, configured to generate face-swapped video data based on the face-swapped image;
    wherein the generator is trained with the generative adversarial network training method according to any one of claims 1-11.
  17. An electronic device, comprising:
    one or more processors; and
    a storage apparatus, configured to store one or more programs,
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement at least one of the following methods:
    the generative adversarial network training method according to any one of claims 1-11,
    the image face-swapping method according to claim 12, and
    the video face-swapping method according to claim 13.
  18. A computer-readable storage medium storing a computer program which, when executed by a processor, implements at least one of the following methods:
    the generative adversarial network training method according to any one of claims 1-11,
    the image face-swapping method according to claim 12, and
    the video face-swapping method according to claim 13.
PCT/CN2021/094257 2020-06-24 2021-05-18 Generative adversarial network training method, image face-swapping method, video face-swapping method, and apparatuses WO2021258920A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010592443.X 2020-06-24
CN202010592443.XA CN111783603A (zh) Generative adversarial network training method, image face-swapping method, video face-swapping method, and apparatuses

Publications (1)

Publication Number Publication Date
WO2021258920A1 true WO2021258920A1 (zh) 2021-12-30

Family

ID=72759820

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/094257 WO2021258920A1 (zh) Generative adversarial network training method, image face-swapping method, video face-swapping method, and apparatuses

Country Status (2)

Country Link
CN (1) CN111783603A (zh)
WO (1) WO2021258920A1 (zh)

Also Published As

Publication number Publication date
CN111783603A (zh) 2020-10-16

Legal Events

121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21829906; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 21829906; Country of ref document: EP; Kind code of ref document: A1)