WO2023160350A1 - Face processing method and apparatus, computer device, and storage medium - Google Patents

Face processing method and apparatus, computer device, and storage medium

Info

Publication number
WO2023160350A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
image
sample
trained
image feature
Prior art date
Application number
PCT/CN2023/074288
Other languages
English (en)
French (fr)
Inventor
温琦
温翔
蒋昊
周佳庆
杨司琪
熊峰
李宏亮
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Publication of WO2023160350A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/04Context-preserving transformations, e.g. by using an importance map
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/60Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/80Special adaptations for executing a specific game genre or game mode
    • A63F13/822Strategy games; Role-playing games
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/60Methods for processing data by generating or executing the game program
    • A63F2300/6009Methods for processing data by generating or executing the game program for importing or creating game content, e.g. authoring tools during game development, adapting content to different platforms, use of a scripting language to create content
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/80Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game specially adapted for executing a specific type of game
    • A63F2300/807Role playing or strategy games
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Definitions

  • the present disclosure relates to the field of computer technology, in particular, to a face processing method, device, computer equipment and storage medium.
  • RPG (Role-Playing Game)
  • Embodiments of the present disclosure at least provide a face processing method, device, computer equipment, and storage medium.
  • an embodiment of the present disclosure provides a face processing method, including:
  • the trained face processing model is used to obtain the face-pinching parameter information of a corresponding virtual character after a to-be-processed face image of a real face is input.
  • in an optional implementation, after the face image sample and the character face screenshot sample are separately input into the face processing model to be trained to obtain the first face-pinching parameter information corresponding to the face image sample, the method further includes:
  • the second face-pinching parameter information of the character face screenshot sample is input into a pre-trained imitator to obtain a first face generated image; based on the first image information of the character face screenshot sample and the second image information of the first face generated image, the face restoration loss information is obtained;
  • the adjusting of the weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information to obtain a trained face processing model includes:
  • adjusting the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information and the face restoration loss information, to obtain the trained face processing model.
  • in an optional implementation, after the face image sample and the character face screenshot sample are separately input into the face processing model to be trained to obtain the first face-pinching parameter information corresponding to the face image sample, the method further includes:
  • the first face-pinching parameter information of the face image sample is input into a pre-trained imitator to obtain a second face generated image;
  • the face image sample and the second face generated image are input into a pre-trained first face recognition model to obtain a third image feature of the face image sample and a fourth image feature of the second face generated image respectively.
  • the adjusting of the weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information to obtain a trained face processing model includes:
  • adjusting the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information, the first face consistency loss information and the second face consistency loss information, to obtain the trained face processing model.
  • the face image sample includes a plurality of face image pairs; each face image pair includes two different face images of the same person;
  • for each face image pair, the first face-pinching parameter information corresponding to the face image sample includes third face-pinching parameter information and fourth face-pinching parameter information respectively corresponding to the two face images in the face image pair, and fifth face-pinching parameter information corresponding to any other face image;
  • the method further includes:
  • the adjusting of the weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information to obtain a trained face processing model includes:
  • adjusting the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information and the face contrast loss information, to obtain the trained face processing model.
  • in an optional implementation, the adjusting of the weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information to obtain a trained face processing model includes:
  • the current weight parameter information is updated to obtain updated weight parameter information.
  • the inputting of the face image sample and the character face screenshot sample separately into the face processing model to be trained, to obtain the first image feature and the first face-pinching parameter information corresponding to the face image sample and the second image feature and the second face-pinching parameter information corresponding to the character face screenshot sample, includes:
  • the face image sample and the character face screenshot sample are separately input into the encoder of the face processing model to be trained and respectively down-sampled, to obtain the seventh image feature and the eighth image feature of the face image sample and the character face screenshot sample respectively in the first preset dimension;
  • the seventh image feature and the eighth image feature are separately input into the decoder of the face processing model to be trained and up-sampled, to obtain the ninth image feature and the tenth image feature of the face image sample and the character face screenshot sample respectively in the second preset dimension;
  • in an optional implementation, before the second face-pinching parameter information of the character face screenshot sample is input into the pre-trained imitator to obtain the first face generated image, the method further includes:
  • the imitator to be trained is trained to obtain a trained imitator.
  • the embodiment of the present disclosure also provides a face processing method, including:
  • the embodiment of the present disclosure also provides a facial processing device, including:
  • a first acquisition module, configured to acquire a face image sample of a real face and a character face screenshot sample corresponding to a virtual character;
  • a first input module, configured to input the face image sample and the character face screenshot sample separately into a face processing model to be trained, to obtain a first image feature and first face-pinching parameter information corresponding to the face image sample, and a second image feature and second face-pinching parameter information corresponding to the character face screenshot sample;
  • a first determination module, configured to determine image feature loss information based on the first image feature and the second image feature, and determine face-pinching parameter loss information based on the first face-pinching parameter information and the second face-pinching parameter information;
  • an adjustment module, configured to adjust weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information, to obtain a trained face processing model; the trained face processing model is used to obtain face-pinching parameter information of a corresponding virtual character after a to-be-processed face image of a real face is input.
  • the embodiment of the present disclosure also provides a facial processing device, including:
  • an obtaining module, configured to obtain a to-be-processed face image of a real face;
  • an input module, configured to input the to-be-processed face image into the face processing model obtained after training according to the first aspect, or any possible implementation of the first aspect, to obtain face-pinching parameter information of the to-be-processed face image;
  • the face-pinching parameter information is used for rendering to obtain the virtual character in the virtual scene.
  • the embodiment of the present disclosure also provides a computer device, including a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor.
  • the processor communicates with the memory through the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the method in the first aspect or the second aspect.
  • the embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps in the above first aspect, or in any possible implementation of the first aspect, or the steps in the above second aspect, are executed.
  • the embodiments of the present disclosure further provide a computer program product; the computer program product includes a computer program/instruction, and when the computer program/instruction is executed by a processor, the method described in the first aspect or the second aspect is implemented.
  • the face processing model is trained by combining image feature loss information and face-pinching parameter loss information, where the image feature loss information is determined based on the first image feature from the real domain and the second image feature from the virtual domain, and the face-pinching parameter loss information is determined based on the first face-pinching parameter information from the real domain and the second face-pinching parameter information from the virtual domain. Using the image feature loss information and the face-pinching parameter loss information as a domain loss, and adjusting the weight parameter information of the face processing model based on this domain loss, can reduce the domain difference between the rendered virtual character and the real face image to a certain extent. In this way, after a to-be-processed face image of a real face is input into the trained face processing model, the virtual character rendered based on the obtained face-pinching parameter information can have a high similarity with the to-be-processed face image.
  • FIG. 1 shows a flow chart of a face processing method provided by an embodiment of the present disclosure
  • FIG. 2 shows a flow chart of another face processing method provided by an embodiment of the present disclosure
  • FIG. 3 shows a flow chart of another face processing method provided by an embodiment of the present disclosure
  • Fig. 4 shows a schematic diagram of a face processing device provided by an embodiment of the present disclosure
  • Fig. 5 shows a schematic diagram of another face processing device provided by an embodiment of the present disclosure
  • Fig. 6 shows a schematic diagram of a computer device provided by an embodiment of the present disclosure.
  • the real face image is usually input into a pre-trained face recognition network or face segmentation network to obtain face-pinching parameters, and a game character is then rendered by adjusting the face-pinching parameters.
  • however, because the face recognition network or face segmentation network is not trained with regard to the difference between the real domain where the real face image is located and the virtual domain where the game character is located, there are obvious domain differences between the rendered game character and the real face image.
  • the present disclosure provides a face processing method.
  • the face processing model is trained by combining image feature loss information and face-pinching parameter loss information, where the image feature loss information is determined based on the first image feature from the real domain and the second image feature from the virtual domain, and the face-pinching parameter loss information is determined based on the first face-pinching parameter information from the real domain and the second face-pinching parameter information from the virtual domain. Using the image feature loss information and the face-pinching parameter loss information as a domain loss, and adjusting the weight parameter information of the face processing model based on this domain loss, can reduce the domain difference between the rendered virtual character and the real face image to a certain extent. In this way, after a to-be-processed face image of a real face is input into the trained face processing model, the virtual character rendered based on the obtained face-pinching parameter information can have a high similarity with the to-be-processed face image.
  • the execution subject of the face processing method provided in the embodiments of the present disclosure is generally a computer device with certain computing capabilities.
  • the face processing method provided by the embodiment of the present disclosure will be described below by taking the execution subject as a server as an example.
  • FIG. 1 is a flow chart of a face processing method provided by an embodiment of the present disclosure, the method includes S101-S104, wherein:
  • S101 Obtain a face image sample of a real face and a screenshot sample of a character face corresponding to a virtual character.
  • the face image samples of real faces may be face photos, video frame images, video screenshots, etc. containing real people existing in the real world, such as face photos of celebrities or the user's own face photos.
  • the screenshot sample of the character face corresponding to the virtual character may refer to an image obtained by taking a screenshot of the face of the virtual character in the virtual scene, for example, including a screenshot of the face of the game character.
  • S102: Input the face image sample and the character face screenshot sample separately into the face processing model to be trained, to obtain the first image feature and the first face-pinching parameter information corresponding to the face image sample, and the second image feature and the second face-pinching parameter information corresponding to the character face screenshot sample.
  • the first image feature can refer to the features of the face in the face image sample; the first face-pinching parameter information can correspond to multi-dimensional information of the face in the face image sample, and can specifically include facial information such as the eyes, nose, mouth, and eyebrows.
  • the second image feature can refer to the features of the face in the character face screenshot sample; the second face-pinching parameter information can correspond to multi-dimensional information of the face in the character face screenshot sample, and can likewise include facial information such as the eyes, nose, mouth, and eyebrows.
  • the face image sample is input into the face processing model to be trained, and the coding model in the face processing model to be trained can extract the first image feature and the first face-pinching parameter information corresponding to the face image sample; likewise, the character face screenshot sample is input into the face processing model to be trained, and the coding model can extract the second image feature and the second face-pinching parameter information corresponding to the character face screenshot sample.
  • specifically, the face image sample and the character face screenshot sample can be separately input into the encoder of the face processing model to be trained and respectively down-sampled, to obtain the seventh image feature and the eighth image feature of the face image sample and the character face screenshot sample respectively in the first preset dimension; then, the seventh image feature and the eighth image feature are separately input into the decoder of the face processing model to be trained and up-sampled, to obtain the ninth image feature and the tenth image feature of the face image sample and the character face screenshot sample respectively in the second preset dimension; finally, the seventh image feature together with the ninth image feature, and the eighth image feature together with the tenth image feature, are separately input into the regressor of the face processing model to be trained, to obtain the first image feature and the first face-pinching parameter information corresponding to the face image sample, and the second image feature and the second face-pinching parameter information corresponding to the character face screenshot sample.
  • the codec structure composed of the encoder and the decoder can be used as a generator in the face processing model to be trained to obtain the first image feature of the face image sample and the second image feature of the character face screenshot sample.
  • the codec structure may be a VQVAE or VQ-VAE-2 structure.
  • the encoder can down-sample the face image sample and the character face screenshot sample respectively, and perform dimensionality reduction processing on the face image sample or the character face screenshot sample to obtain the seventh image feature of the face image sample in the first preset dimension , the eighth image feature of the screenshot sample of the character face in the first preset dimension.
  • the seventh image feature may be quantized to obtain a discretized seventh image feature; meanwhile, the eighth image feature may be quantized to obtain a discretized eighth image feature.
  • the decoder has a symmetrical structure similar to the encoder; it can up-sample the dimension-reduced face image samples and character face screenshot samples so that they return to the original size, yielding the ninth image feature and the tenth image feature of the face image sample and the character face screenshot sample respectively in the second preset dimension.
  • the ninth image feature may be quantized to obtain a discretized ninth image feature; the tenth image feature may be quantized to obtain a discretized tenth image feature.
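  • as a rough illustration of this quantization step, the sketch below snaps each spatial feature vector to its nearest codebook entry, in the style of VQ-VAE; the codebook size, feature dimension, and the straight-through gradient trick are assumptions of the sketch, not details given by the disclosure.

```python
import torch

def vector_quantize(features: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """Discretize encoder features by nearest-neighbor lookup in a codebook.

    features: (B, C, H, W) continuous features from the encoder.
    codebook: (K, C) learned embedding vectors.
    """
    b, c, h, w = features.shape
    flat = features.permute(0, 2, 3, 1).reshape(-1, c)        # (B*H*W, C)
    # Squared Euclidean distance from every feature vector to every codebook entry.
    dists = (flat.pow(2).sum(1, keepdim=True)
             - 2.0 * flat @ codebook.t()
             + codebook.pow(2).sum(1))
    indices = dists.argmin(dim=1)                             # nearest entry per vector
    quantized = codebook[indices].reshape(b, h, w, c).permute(0, 3, 1, 2)
    # Straight-through estimator: forward pass uses the quantized values,
    # backward pass routes gradients to the continuous features.
    return features + (quantized - features).detach()
```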
  • the seventh image feature is connected with the ninth image feature to obtain the first connected image feature
  • the eighth image feature is connected with the tenth image feature to obtain the second connected image feature.
  • the first connected image features and the second connected image features may be stereoscopic multi-dimensional image features.
  • the regressor can be located at the last layer of the decoder, and the regressor can consist of a preset number of convolutional layers.
  • the first image feature of the face image sample can be obtained at the penultimate layer of the regressor, and the first face-pinching parameter information corresponding to the face image sample can be obtained at the last layer of the regressor.
  • similarly, the second image feature of the character face screenshot sample can be obtained at the penultimate layer of the regressor, and the second face-pinching parameter information corresponding to the character face screenshot sample can be obtained at the last layer of the regressor.
  • S103: Determine image feature loss information based on the first image feature and the second image feature; and determine face-pinching parameter loss information based on the first face-pinching parameter information and the second face-pinching parameter information.
  • MMD (maximum mean discrepancy) may be used to determine the loss information here.
  • specifically, MMD may be used to determine the difference between the vector mean value of the first face-pinching parameter information and the vector mean value of the second face-pinching parameter information, and the face-pinching parameter loss information is determined based on the difference.
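  • a minimal sketch of such an MMD loss over batches of face-pinching parameter vectors is shown below; the Gaussian kernel bandwidth and the biased batch estimate are illustrative choices, not values fixed by the disclosure.

```python
import torch

def gaussian_kernel(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)) for all row pairs of x and y."""
    sq_dists = torch.cdist(x, y).pow(2)
    return torch.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd_loss(real_params: torch.Tensor, virtual_params: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Squared MMD between the real-domain and virtual-domain parameter batches.

    real_params:    (n, d) first face-pinching parameter vectors.
    virtual_params: (m, d) second face-pinching parameter vectors.
    """
    k_xx = gaussian_kernel(real_params, real_params, sigma).mean()
    k_yy = gaussian_kernel(virtual_params, virtual_params, sigma).mean()
    k_xy = gaussian_kernel(real_params, virtual_params, sigma).mean()
    return k_xx + k_yy - 2.0 * k_xy
```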
  • S104: Based on the image feature loss information and the face-pinching parameter loss information, adjust the weight parameter information in the face processing model to be trained to obtain a trained face processing model; the trained face processing model is used to obtain the face-pinching parameter information of the corresponding virtual character after a to-be-processed face image of a real face is input.
  • the weight parameter information in the embodiments of the present disclosure may refer to the weight parameters in the codec structure.
  • the weight parameter information can be adjusted by minimizing the sum of the square of image feature loss information and the square of pinch face parameter loss information.
  • specifically, the sum of the image feature loss information and the face-pinching parameter loss information may be differentiated to obtain a gradient function of the weight parameter information in the face processing model to be trained.
  • the current weight parameter information is substituted into the gradient function to obtain the iterated gradient vector; the gradient vector indicates the direction along which the directional derivative of the loss information attains its maximum value at the current point.
  • based on the iterated gradient vector, the current weight parameter information is updated to obtain updated weight parameter information.
  • the preset training cut-off condition may be a preset number of iterations, or that the loss information difference between two adjacent iterations is smaller than a set threshold, which is not specifically limited here.
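  • a minimal sketch of this iteration is shown below, assuming the two losses are computed per batch by a hypothetical `compute_losses` helper; the optimizer choice, iteration budget, and threshold are illustrative.

```python
import torch

def train(model, data_loader, compute_losses, max_iters: int = 10000, tol: float = 1e-6):
    """Iteratively update the model weights until a preset cut-off condition is met."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    prev_loss = float("inf")
    for step, batch in zip(range(max_iters), data_loader):
        feature_loss, param_loss = compute_losses(model, batch)  # hypothetical helper
        total = feature_loss + param_loss                        # combined domain loss
        optimizer.zero_grad()
        total.backward()   # gradient of the loss w.r.t. the weight parameters
        optimizer.step()   # update the current weights along the negative gradient
        if abs(prev_loss - total.item()) < tol:                  # loss-change cut-off
            break
        prev_loss = total.item()
    return model
```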
  • after the training cut-off condition is met, a trained face processing model is obtained. After a to-be-processed face image of a real face is input into the trained face processing model, the face-pinching parameter information of the corresponding virtual character can be obtained, and the corresponding virtual character can be rendered based on that face-pinching parameter information.
  • in this way, the rendered virtual character differs little from the real face and has a high similarity with it, and the user's game experience is better.
  • to ensure that, when the trained face processing model is used to obtain the face-pinching parameter information of a character face screenshot, the virtual character rendered from that face-pinching parameter information differs little from the input character face screenshot, in one implementation this can be achieved by adding the face restoration loss information during the training of the face processing model to be trained and adjusting the weight parameter information in the model accordingly.
  • specifically, the second face-pinching parameter information of the character face screenshot sample can be input into a pre-trained imitator to obtain the first face generated image; then, the face restoration loss information is obtained based on the first image information of the character face screenshot sample and the second image information of the first face generated image; finally, the weight parameter information in the face processing model to be trained is adjusted based on the image feature loss information, the face-pinching parameter loss information and the face restoration loss information, to obtain a trained face processing model.
  • the pre-trained imitator can reproduce the function of the game engine: based on the second face-pinching parameter information, it renders the first face generated image.
  • the first image information of the character face screenshot sample can refer to the color information of the character face screenshot sample in the red, green and blue channels; the second image information of the first face generated image can refer to the color information of the first face generated image in the red, green and blue channels.
  • in this way, there is less difference between the virtual character rendered from the face-pinching parameter information of the character face screenshot and the input character face screenshot.
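  • a minimal sketch of the face restoration loss as a squared red/green/blue pixel difference follows; the mean reduction is an illustrative choice.

```python
import torch

def face_restoration_loss(screenshot: torch.Tensor, generated: torch.Tensor) -> torch.Tensor:
    """Squared difference between the RGB channels of the character face
    screenshot sample and the first face generated image from the imitator.

    screenshot: (B, 3, H, W) first image information.
    generated:  (B, 3, H, W) second image information.
    """
    return (screenshot - generated).pow(2).mean()
```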
  • the imitator can be obtained through the following training steps:
  • obtain the sixth face-pinching parameter information of the character face screenshot sample; use the sixth face-pinching parameter information of the character face screenshot sample as input and the character face screenshot sample as output to train the imitator to be trained, obtaining a trained imitator.
  • the sixth face-pinching parameter information may correspond to the multi-dimensional vector of the face in the character face screenshot sample, which may specifically include facial information such as the nose, eyes, and mouth.
  • the sixth face-pinching parameter information can be exported from the client where the virtual character corresponding to the character face screenshot sample is located.
  • if instead the sixth face-pinching parameter information were obtained through the face processing model to be trained, it might not match the character face screenshot sample because that model is not yet fully trained, and the imitator would then be poorly trained.
  • to make the virtual character rendered from the face-pinching parameter information of a real face image more consistent with that face, face consistency loss information can be added during the training of the face processing model to be trained, and the weight parameter information in the model adjusted accordingly.
  • specifically, the first face-pinching parameter information of the face image sample can be input into a pre-trained imitator to obtain a second face generated image. Then, the face image sample and the second face generated image are input into the pre-trained first face recognition model to obtain the third image feature of the face image sample and the fourth image feature of the second face generated image respectively; and the face image sample and the second face generated image are input into the pre-trained second face recognition model to obtain the fifth image feature of the face image sample and the sixth image feature of the second face generated image respectively. Next, the first face consistency loss information is determined based on the third image feature and the fourth image feature, and the second face consistency loss information is determined based on the fifth image feature and the sixth image feature. Finally, the weight parameter information in the face processing model to be trained is adjusted based on the image feature loss information, the face-pinching parameter loss information, the first face consistency loss information and the second face consistency loss information, to obtain a trained face processing model.
  • the imitator here may be the same as the imitator described above, so its training process is not repeated here; for the specific training process, refer to the description above.
  • Two face recognition models are used here, namely the first face recognition model and the second face recognition model.
  • the two face recognition models may be different face recognition models.
  • the first face recognition model may be a mesh renderer
  • the second face recognition model may be a face recognition network LightCNN-29v2.
  • the first face consistency loss information can represent the loss between the third image feature of the face image sample and the fourth image feature of the second face generated image; based on the third image feature and the fourth image feature in vector form, the sum of the squares of the differences between the two vectors can be determined, that is, the first face consistency loss information.
  • the second face consistency loss information can represent the loss between the fifth image feature of the face image sample and the sixth image feature of the second face generated image; based on the fifth image feature and the sixth image feature in vector form, the sum of the squares of the differences between the two vectors can be determined, that is, the second face consistency loss information.
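  • both consistency losses therefore share one form, sketched below as a sum of squared differences between the two feature vectors; the feature extractors themselves are the pre-trained recognition models described above.

```python
import torch

def face_consistency_loss(feat_sample: torch.Tensor, feat_generated: torch.Tensor) -> torch.Tensor:
    """Sum of squared differences between a face image sample's features and
    the features of the corresponding second face generated image."""
    return (feat_sample - feat_generated).pow(2).sum()
```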
  • in order that virtual characters rendered from different face images of the same real person have a higher similarity with each other, the weight parameter information in the face processing model to be trained can be adjusted as follows.
  • the face image samples may include multiple face image pairs; each face image pair may contain two different face images of the same person.
  • the first face-pinching parameter information corresponding to the face image sample includes the third face-pinching parameter information and the fourth face-pinching parameter information respectively corresponding to the two face images in the face image pair, and the fifth face-pinching parameter information corresponding to any other face image.
  • specifically, the face contrast loss information can be determined based on the third face-pinching parameter information, the fourth face-pinching parameter information and the fifth face-pinching parameter information; then, the weight parameter information in the face processing model to be trained is adjusted based on the image feature loss information, the face-pinching parameter loss information and the face contrast loss information, to obtain a trained face processing model.
  • specifically, the first product of the third face-pinching parameter information and the fourth face-pinching parameter information, and the second product of the third face-pinching parameter information and the fifth face-pinching parameter information, can be computed; then the ratio of the first product to the sum of the first product and the second product is calculated; finally, the negative of the logarithm of the ratio is taken, giving the face contrast loss information.
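  • a minimal sketch of this computation is given below; reading "product" as the inner product of the parameter vectors is an assumption of the sketch.

```python
import torch

def face_contrast_loss(p: torch.Tensor, p_pos: torch.Tensor, p_neg: torch.Tensor) -> torch.Tensor:
    """-log of the ratio of the first product to the sum of both products.

    p, p_pos: third/fourth face-pinching parameters (two images, same person).
    p_neg:    fifth face-pinching parameters (any other face image).
    """
    first_product = torch.dot(p, p_pos)
    second_product = torch.dot(p, p_neg)
    return -torch.log(first_product / (first_product + second_product))
```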
  • the image feature loss information, face pinching parameter loss information, face restoration loss information, face consistency loss information and face comparison loss information are used to adjust the weight parameter information in the face processing model to be trained.
  • the method includes S201-S202, wherein:
  • S202: Input the to-be-processed face image into the face processing model obtained after training according to the face processing method described in the above embodiments, to obtain the face-pinching parameter information of the to-be-processed face image; the face-pinching parameter information is used for rendering to obtain the virtual character in the virtual scene.
  • the training process of the face processing model can refer to the above-mentioned embodiments, which will not be repeated here.
  • FIG. 3 is a flowchart of another face processing method provided by an embodiment of the present disclosure.
  • in this method, image feature loss information, face-pinching parameter loss information, face restoration loss information, face consistency loss information and face contrast loss information are used to train the face processing model.
  • the following description takes, as an example, the case where the face processing model includes a codec and the codec is the VQ-VAE-2 model.
  • the codec includes an encoder and a decoder. It can be defined that the encoder (Encoder) is represented by the symbol E, and the decoder (Decoder) is represented by the symbol D.
  • Step 1: Obtain training samples.
  • the training samples include multiple face image pairs of real faces and screenshot samples of character faces corresponding to virtual characters.
  • Multiple face image pairs of real faces include unlabeled images in the Source Domain (ie, the real domain).
  • Each face image pair contains two different face image samples of the same person.
  • the character face screenshot samples corresponding to the virtual characters include labeled images in the Target Domain (ie, the virtual domain).
  • during training, any face image pair, any other face image, and a character face screenshot sample corresponding to any virtual character can be used as input.
  • one face image sample in the face image pair can be denoted I_s, the other face image sample in the pair can be denoted I_s⁺, any other face image can be denoted I_s⁻, and the character face screenshot sample corresponding to any virtual character can be denoted I_t.
  • Step 2: Input the training samples into the encoder to obtain the first image features corresponding to each image in the training samples.
  • the encoder E can down-sample each image to obtain a 64×64-dimensional bottom-layer latent map, and then further reduce the 64×64-dimensional bottom-layer latent map to obtain a quantized 32×32-dimensional top-layer latent map, that is, the first image feature.
  • in this way, the 32×32-dimensional first image features corresponding to any face image pair, any other face image, and the character face screenshot sample corresponding to any virtual character can be respectively obtained.
  • Step 3: Input the first image features corresponding to each image in the training samples into the decoder to obtain the second image features corresponding to each image.
  • the decoder D has a symmetrical structure similar to the encoder E; it can first up-sample the first image features corresponding to each image to obtain image features of the original dimension, that is, 256×256-dimensional image features, and then perform quantization to obtain a quantized 64×64-dimensional latent map, that is, the second image feature, which has the same dimension as the previous 64×64-dimensional bottom-layer latent map.
  • Step 4: Input the second image features into the regressor of the face processing model to obtain the third image feature corresponding to each image in the training samples and the face-pinching parameter information corresponding to each image.
  • the regressor (Regressor) is defined to be represented by the symbol R.
  • the regressor R is set at the last layer of the decoder D.
  • the regressor R is composed of 6 convolutional layers.
  • the 32×32-dimensional top-level latent map obtained in step 2 is connected with the 64×64-dimensional latent map obtained in step 3 to obtain a 64×64×128-dimensional stereo latent map.
  • the 64×64×128-dimensional stereo latent map is used as the input to the regressor R; after the stereo latent map has been processed by each convolutional layer in the regressor R, the 2×2×256-dimensional third image feature can be obtained at layer L5 of the regressor R, that is, its fifth layer, and the 1×1×267-dimensional face-pinching parameter information is then obtained at the last layer of the regressor R.
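  • a minimal sketch of such a six-layer convolutional regressor is given below; only the 64×64×128 input, the 2×2×256 feature at the fifth layer, and the 1×1×267 parameter output come from the description, while the kernel sizes, strides, and intermediate channel widths are assumptions.

```python
import torch
import torch.nn as nn

class Regressor(nn.Module):
    """Maps a 64x64x128 stereo latent map to 2x2x256 features and 267 parameters."""

    def __init__(self):
        super().__init__()
        channels = [128, 128, 128, 256, 256, 256]     # illustrative widths
        blocks = []
        for cin, cout in zip(channels[:-1], channels[1:]):
            # Each stride-2 convolution halves the spatial size: 64 -> 32 -> ... -> 2.
            blocks.append(nn.Sequential(
                nn.Conv2d(cin, cout, kernel_size=3, stride=2, padding=1),
                nn.ReLU(inplace=True),
            ))
        self.body = nn.Sequential(*blocks)              # layers 1-5
        self.head = nn.Conv2d(256, 267, kernel_size=2)  # layer 6: 2x2 -> 1x1

    def forward(self, latent: torch.Tensor):
        feats = self.body(latent)   # (B, 256, 2, 2): third image feature (layer L5)
        params = self.head(feats)   # (B, 267, 1, 1): face-pinching parameters
        return feats, params.flatten(1)
```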
  • the third image feature corresponding to the face image sample I_s can be denoted f_s, and the third image feature corresponding to the character face screenshot sample I_t corresponding to any virtual character can be denoted f_t.
  • the face-pinching parameter information corresponding to the face image sample I_s can be expressed as P, that corresponding to the face image sample I_s⁺ as P⁺, that corresponding to the face image sample I_s⁻ as P⁻, and that corresponding to the character face screenshot sample I_t as P_t.
  • Step 5: Input the face-pinching parameter information P corresponding to the face image sample I_s and the face-pinching parameter information P_t corresponding to the character face screenshot sample I_t separately into the pre-trained imitator, to generate a first face generated image I_s' and a second face generated image I_t'.
  • the imitator (Imitator) is defined to be represented by the symbol G.
  • the image information of the second face generated image I_t' is denoted G(P_t), and the image information of the character face screenshot sample I_t is denoted I_t.
  • the image information here may be red, green, and blue color channel information.
  • during the training of the imitator, the loss between the image information G(P_t) of the second face generated image I_t' and the image information I_t of the character face screenshot sample I_t can be calculated, and this loss used to train the imitator. Specifically, the square of the difference between the image information G(P_t) of the second face generated image I_t' and the image information I_t of the character face screenshot sample I_t may be calculated.
  • the loss L_imitation between the image information G(P_t) of the second face generated image I_t' and the image information I_t of the character face screenshot sample I_t can be:

    $$\mathcal{L}_{\text{imitation}} = \left\| G(P_t) - I_t \right\|_2^2$$
  • the imitator G is trained by minimizing the loss L_imitation; the trained imitator G can make the generated second face generated image I_t' more similar to the character face screenshot sample I_t.
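  • a minimal sketch of one imitator training step is shown below, assuming batches of (sixth face-pinching parameter, character face screenshot) pairs exported from the client; the optimizer is an illustrative choice.

```python
import torch

def imitator_train_step(imitator, optimizer, pinch_params, screenshots):
    """Minimize L_imitation = ||G(P_t) - I_t||^2 over one batch.

    pinch_params: (B, 267) sixth face-pinching parameter vectors (input).
    screenshots:  (B, 3, H, W) character face screenshot samples (target output).
    """
    generated = imitator(pinch_params)                # G(P_t)
    loss = (generated - screenshots).pow(2).mean()    # squared RGB error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```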
  • Step 6: Determine the image feature loss information based on the third image feature f_s corresponding to the face image sample I_s and the third image feature f_t corresponding to the character face screenshot sample I_t; determine the face-pinching parameter loss information based on the face-pinching parameter information P corresponding to the face image sample I_s and the face-pinching parameter information P_t corresponding to the character face screenshot sample I_t; and determine the first loss information based on the image feature loss information and the face-pinching parameter loss information.
  • both losses measure the distance between the real-face samples and the character face screenshot sample I_t, so the first loss information obtained here is the domain loss information. The maximum mean discrepancy can be used as this distance; for two sets of features $\{x_i\}_{i=1}^{n}$ and $\{y_j\}_{j=1}^{m}$, its squared form is

    $$\mathrm{MMD}^2(x, y) = \left\| \frac{1}{n}\sum_{i=1}^{n}\phi(x_i) - \frac{1}{m}\sum_{j=1}^{m}\phi(y_j) \right\|_{\mathcal{H}_k}^2$$

  • $\mathcal{H}_k$ is a reproducing kernel Hilbert space with kernel function $k$, and $\phi$ is the corresponding feature map.
  • the kernel function can be a Gaussian kernel function $k(x, y) = \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right)$, where $\sigma$ is the bandwidth, used to control the local scope of the Gaussian kernel function.
  • Step 7: Determine the second loss information based on the image information I_t of the character face screenshot sample I_t and the image information G(P_t) of the second face generated image I_t'.
  • specifically, the first sub-loss can be determined first, as the pixel-level restoration error between the character face screenshot sample and the second face generated image:

    $$\mathcal{L}_{\text{sub1}} = \left\| I_t - G(P_t) \right\|_2^2$$

  • the loss used in the VQ-VAE-2 model is adopted here as the second sub-loss, which can be:

    $$\mathcal{L}_{\text{sub2}} = \left\| \mathrm{sg}[E(I_t)] - e \right\|_2^2 + \beta \left\| E(I_t) - \mathrm{sg}[e] \right\|_2^2$$

  • e is the quantization code of the character face screenshot sample I_t;
  • sg is an operator used to indicate the stop-gradient operation, and β is a weighting coefficient.
  • Step 8: Input the face image sample I_s and the first face generated image I_s' into the pre-trained face reconstruction model, to obtain the fourth image feature of the face image sample I_s and the fifth image feature of the first face generated image I_s' respectively; and input the face image sample I_s and the first face generated image I_s' into the pre-trained face recognition model, to obtain the sixth image feature of the face image sample I_s and the seventh image feature of the first face generated image I_s' respectively. Determine the first face consistency loss information based on the fourth image feature and the fifth image feature; determine the second face consistency loss information based on the sixth image feature and the seventh image feature; and determine the third loss information based on the first face consistency loss information and the second face consistency loss information.
  • the face reconstruction model can be denoted F_3d;
  • the face recognition model can be denoted F_id.
  • the fourth image feature of the face image sample I_s can be expressed as F_3d(I_s);
  • the fifth image feature of the first face generated image I_s' can be expressed as F_3d(I_s');
  • the first face consistency loss information can then be obtained as:

    $$\mathcal{L}_{3d} = \left\| F_{3d}(I_s) - F_{3d}(I_{s'}) \right\|_2^2$$
  • the sixth image feature of the face image sample I_s can be expressed as F_id(I_s);
  • the seventh image feature of the first face generated image I_s' can be expressed as F_id(I_s');
  • the second face consistency loss information can then be obtained as:

    $$\mathcal{L}_{id} = \left\| F_{id}(I_s) - F_{id}(I_{s'}) \right\|_2^2$$
  • Step 9: Determine the fourth loss information based on the face-pinching parameter information P corresponding to the face image sample I_s, the face-pinching parameter information P⁺ corresponding to the face image sample I_s⁺, and the face-pinching parameter information P⁻ corresponding to the face image sample I_s⁻.
  • consistent with the computation described above, and taking the product of two parameter vectors as their inner product, the fourth loss information (the face contrast loss) can be:

    $$\mathcal{L}_{\text{contrast}} = -\log \frac{P \cdot P^{+}}{P \cdot P^{+} + P \cdot P^{-}}$$
  • Step 10: Determine the total loss information based on the first loss information, the second loss information, the third loss information, and the fourth loss information.
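  • one plausible way to combine the four terms is a weighted sum; the weights λ₁–λ₄ are assumptions, since the disclosure does not specify how the terms are balanced:

    $$\mathcal{L}_{\text{total}} = \lambda_1\,\mathrm{MMD}^2 + \lambda_2\,(\mathcal{L}_{\text{sub1}} + \mathcal{L}_{\text{sub2}}) + \lambda_3\,(\mathcal{L}_{3d} + \mathcal{L}_{id}) + \lambda_4\,\mathcal{L}_{\text{contrast}}$$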
  • Step 11: Based on the total loss information, adjust the weight parameter information in the face processing model to be trained to obtain a trained face processing model.
  • for the process of adjusting the weight parameter information in the face processing model to be trained, refer to the aforementioned adjustment process, which is not repeated here.
  • after a to-be-processed face image of a real face is input into the trained face processing model, the face-pinching parameter information of the corresponding virtual character can be obtained.
  • the obtained face-pinching parameter information of the virtual character can be rendered in the game engine to obtain the virtual character.
  • in the above method, the writing order of each step does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
  • based on the same inventive concept, the embodiment of the present disclosure also provides a face processing device corresponding to the face processing method; since the problem-solving principle of the device is similar to that of the above face processing method, the implementation of the device can refer to the implementation of the method, and repeated descriptions are omitted.
  • FIG. 4 is a schematic structural diagram of a face processing device provided by an embodiment of the present disclosure.
  • the device includes: a first acquisition module 401, a first input module 402, a first determination module 403, and an adjustment module 404, where:
  • the first obtaining module 401 is used to obtain the human face image sample of the real human face and the character face screenshot sample corresponding to the virtual character;
  • the first input module 402 is configured to respectively input the face image sample and the character face screenshot sample into the face processing model to be trained, to obtain the first image feature and the first face-pinching parameter information corresponding to the face image sample, and the second image feature and the second face-pinching parameter information corresponding to the character face screenshot sample;
  • the first determination module 403 is configured to determine image feature loss information based on the first image feature and the second image feature, and determine face-pinching parameter loss information based on the first face-pinching parameter information and the second face-pinching parameter information;
  • the adjustment module 404 is configured to adjust weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information, to obtain a trained face processing model; the trained face processing model is used to obtain the face-pinching parameter information of the corresponding virtual character after a to-be-processed face image of a real face is input.
  • the device further includes:
  • the second input module is used to input the second pinching face parameter information of the character face screenshot sample into the pre-trained imitator to obtain the first human face generation image;
  • the second determination module is used to obtain face restoration loss information based on the first image information of the character face screenshot sample and the second image information of the first face generation image;
  • the adjustment module 404 is specifically used for:
  • adjusting the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information and the face restoration loss information, to obtain a trained face processing model.
  • the device further includes:
  • the third input module is used to input the first pinch face parameter information of the human face image sample into the pre-trained imitator to obtain the second human face generation image;
  • the third determination module is used to input the face image sample and the second face generated image into the pre-trained first face recognition model, to obtain the third image feature of the face image sample and the fourth image feature of the second face generated image respectively; and to input the face image sample and the second face generated image into the pre-trained second face recognition model, to obtain the fifth image feature of the face image sample and the sixth image feature of the second face generated image respectively;
  • a fourth determining module configured to determine first face consistency loss information based on the third image feature and the fourth image feature; and, based on the fifth image feature and the sixth image feature, determine second face consistency loss information;
  • the adjustment module 404 is specifically used for:
  • adjusting the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information, the first face consistency loss information and the second face consistency loss information, to obtain a trained face processing model.
  • the face image sample includes a plurality of face image pairs; each face image pair includes two different face images of the same person;
  • for each face image pair, the first face-pinching parameter information corresponding to the face image sample includes third face-pinching parameter information and fourth face-pinching parameter information respectively corresponding to the two face images in the face image pair, and fifth face-pinching parameter information corresponding to any other face image;
  • the device also includes:
  • a fifth determining module configured to determine face contrast loss information based on the third face pinching parameter information, the fourth face pinching parameter information, and the fifth pinching face parameter information;
  • the adjustment module 404 is specifically used for:
  • adjusting the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information and the face contrast loss information, to obtain a trained face processing model.
  • the adjustment module 404 is specifically used for:
  • the current weight parameter information is updated to obtain updated weight parameter information.
  • the first input module 402 is specifically used for:
  • input the face image sample and the character face screenshot sample separately into the encoder of the face processing model to be trained;
  • down-sample the face image sample and the character face screenshot sample respectively, to obtain the seventh image feature and the eighth image feature of the face image sample and the character face screenshot sample respectively in the first preset dimension;
  • the seventh image feature and the eighth image feature are respectively input into the decoder of the face processing model to be trained, and the seventh image feature and the eighth image feature are up-sampled to obtain The ninth image feature and the tenth image feature of the human face image sample and the character face screenshot sample in the second preset dimension respectively;
  • the device further includes:
  • the second obtaining module is used to obtain the sixth face-pinching parameter information of the character face screenshot sample
  • the training module is used to use the sixth pinching parameter information of the character face screenshot sample as input and the character face screenshot sample as output to train the imitator to be trained to obtain a trained imitator.
  • FIG. 5 is a schematic structural diagram of another face processing device provided by an embodiment of the present disclosure.
  • the device includes: an acquisition module 501 and an input module 502, where:
  • the acquisition module 501 is used to obtain a to-be-processed face image of a real face;
  • the input module 502 is used to input the to-be-processed face image into the face processing model obtained after training according to the face processing method described in FIG. 1, to obtain the face-pinching parameter information of the to-be-processed face image;
  • the face-pinching parameter information is used for rendering to obtain the virtual character in the virtual scene.
  • FIG. 6 is a schematic structural diagram of a computer device 600 provided by an embodiment of the present disclosure, including a processor 601, a memory 602, and a bus 603.
  • the memory 602 is used to store execution instructions and includes an internal memory 6021 and an external memory 6022; the internal memory 6021, also called memory, temporarily stores computation data in the processor 601 and exchanges data with the external memory 6022, such as a hard disk.
  • the processor 601 exchanges data with the external memory 6022 through the memory 6021.
  • the processor 601 communicates with the memory 602 through the bus 603, so that the processor 601 executes the following instructions:
  • the trained face processing model is used to obtain the face-pinching parameter information of the corresponding virtual character after a to-be-processed face image of a real face is input.
  • in a possible implementation, the processor 601 further executes the following instructions:
  • Embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored. When the computer program is run by a processor, the steps of the face processing method described in the foregoing method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • Embodiments of the present disclosure also provide a computer program product; the computer program product carries program code, and the instructions included in the program code can be used to execute the steps of the face processing method described in the above method embodiments; for details, refer to the above method embodiments, which are not repeated here.
  • the above-mentioned computer program product may be specifically implemented by means of hardware, software or a combination thereof.
  • in one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK).
  • the disclosed devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative; for example, the division of the units is only a logical function division, and there may be other division manners in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some communication interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor.
  • based on this understanding, the technical solution of the present disclosure, in essence, or the part contributing to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage media include: U disk, mobile hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disc and other media that can store program codes. .

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a face processing method and apparatus, a computer device, and a storage medium. The method includes: inputting an acquired face image sample of a real face and a character-face screenshot sample corresponding to a virtual character separately into a face processing model to be trained, to obtain a first image feature and first face-pinching parameter information corresponding to the face image sample, and a second image feature and second face-pinching parameter information corresponding to the character-face screenshot sample (S102); determining image feature loss information based on the first image feature and the second image feature, and determining face-pinching parameter loss information based on the first face-pinching parameter information and the second face-pinching parameter information (S103); and obtaining a trained face processing model based on the image feature loss information and the face-pinching parameter loss information (S104). In the embodiments of the present disclosure, the image feature loss information and the face-pinching parameter loss information serve as a domain loss, which can narrow the domain gap between the virtual character and the real face image, so that the virtual character is more similar to the real face image.

Description

Face processing method and apparatus, computer device, and storage medium
The present disclosure claims priority to Chinese patent application No. 202210180995.9, filed on February 25, 2022 and entitled "Face processing method and apparatus, computer device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer technology, and in particular to a face processing method and apparatus, a computer device, and a storage medium.
Background
With the rapid development of online games, more and more role-playing games (RPGs) have emerged to meet users' demand for personalized, customized game characters. In an RPG, a game character is usually created through a face-pinching function, which helps the user obtain a game character whose appearance matches a real person.
When creating a game character based on a real face image, how to make the game character created through the face-pinching function resemble the real face more closely is a problem worth studying.
Summary
Embodiments of the present disclosure provide at least a face processing method and apparatus, a computer device, and a storage medium.
In a first aspect, an embodiment of the present disclosure provides a face processing method, including:
acquiring a face image sample of a real face and a character-face screenshot sample corresponding to a virtual character;
inputting the face image sample and the character-face screenshot sample separately into a face processing model to be trained, to obtain a first image feature and first face-pinching parameter information corresponding to the face image sample, and a second image feature and second face-pinching parameter information corresponding to the character-face screenshot sample;
determining image feature loss information based on the first image feature and the second image feature; and determining face-pinching parameter loss information based on the first face-pinching parameter information and the second face-pinching parameter information;
adjusting weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information, to obtain a trained face processing model, the trained face processing model being used to obtain face-pinching parameter information of a corresponding virtual character after a to-be-processed face image of a real face is input.
In an optional implementation, after the face image sample and the character-face screenshot sample are separately input into the face processing model to be trained to obtain the first face-pinching parameter information corresponding to the face image sample, the method further includes:
inputting the second face-pinching parameter information of the character-face screenshot sample into a pre-trained imitator, to obtain a first generated face image;
obtaining face restoration loss information based on first image information of the character-face screenshot sample and second image information of the first generated face image;
the adjusting weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information to obtain a trained face processing model includes:
adjusting the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information, and the face restoration loss information, to obtain the trained face processing model.
In an optional implementation, after the face image sample and the character-face screenshot sample are separately input into the face processing model to be trained to obtain the first face-pinching parameter information corresponding to the face image sample, the method further includes:
inputting the first face-pinching parameter information of the face image sample into a pre-trained imitator, to obtain a second generated face image;
inputting the face image sample and the second generated face image into a pre-trained first face recognition model, to obtain a third image feature of the face image sample and a fourth image feature of the second generated face image, respectively; and inputting the face image sample and the second generated face image into a pre-trained second face recognition model, to obtain a fifth image feature of the face image sample and a sixth image feature of the second generated face image, respectively;
determining first face consistency loss information based on the third image feature and the fourth image feature; and determining second face consistency loss information based on the fifth image feature and the sixth image feature;
the adjusting weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information to obtain a trained face processing model includes:
adjusting the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information, the first face consistency loss information, and the second face consistency loss information, to obtain the trained face processing model.
In an optional implementation, the face image sample includes multiple face image pairs, each face image pair containing two different face images of the same person;
for each face image pair, the first face-pinching parameter information corresponding to the face image sample includes third face-pinching parameter information and fourth face-pinching parameter information respectively corresponding to the two face images in the face image pair, and fifth face-pinching parameter information corresponding to any other face image;
after the face image sample and the character-face screenshot sample are separately input into the face processing model to be trained to obtain the first face-pinching parameter information corresponding to the face image sample, the method further includes:
determining face contrastive loss information based on the third face-pinching parameter information, the fourth face-pinching parameter information, and the fifth face-pinching parameter information;
the adjusting weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information to obtain a trained face processing model includes:
adjusting the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information, and the face contrastive loss information, to obtain the trained face processing model.
In an optional implementation, the adjusting weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information to obtain a trained face processing model includes:
differentiating the sum of the image feature loss information and the face-pinching parameter loss information, to obtain a gradient function of the weight parameters in the face processing model to be trained;
substituting the current weight parameter information into the gradient function, to obtain an iterated gradient vector;
updating the current weight parameter information based on the iterated gradient vector, to obtain updated weight parameter information.
In an optional implementation, the inputting the face image sample and the character-face screenshot sample separately into a face processing model to be trained, to obtain a first image feature and first face-pinching parameter information corresponding to the face image sample and a second image feature and second face-pinching parameter information corresponding to the character-face screenshot sample, includes:
inputting the face image sample and the character-face screenshot sample separately into an encoder of the face processing model to be trained, and downsampling the face image sample and the character-face screenshot sample respectively, to obtain a seventh image feature and an eighth image feature of the face image sample and the character-face screenshot sample in a first preset dimension, respectively;
inputting the seventh image feature and the eighth image feature separately into a decoder of the face processing model to be trained, and upsampling the seventh image feature and the eighth image feature, to obtain a ninth image feature and a tenth image feature of the face image sample and the character-face screenshot sample in a second preset dimension, respectively;
inputting the seventh image feature together with the ninth image feature, and the eighth image feature together with the tenth image feature, into a regressor of the face processing model to be trained, to obtain the first image feature and the first face-pinching parameter information corresponding to the face image sample and the second image feature and the second face-pinching parameter information corresponding to the character-face screenshot sample.
In an optional implementation, before the second face-pinching parameter information of the character-face screenshot sample is input into the pre-trained imitator to obtain the first generated face image, the method further includes:
acquiring sixth face-pinching parameter information of the character-face screenshot sample;
training an imitator to be trained with the sixth face-pinching parameter information of the character-face screenshot sample as input and the character-face screenshot sample as output, to obtain the trained imitator.
In a second aspect, an embodiment of the present disclosure further provides a face processing method, including:
acquiring a to-be-processed face image of a real face;
inputting the to-be-processed face image into a face processing model trained according to the steps of the first aspect or any possible implementation of the first aspect, to obtain face-pinching parameter information of the to-be-processed face image, the face-pinching parameter information being used for rendering a virtual character in a virtual scene.
In a third aspect, an embodiment of the present disclosure further provides a face processing apparatus, including:
a first acquisition module, configured to acquire a face image sample of a real face and a character-face screenshot sample corresponding to a virtual character;
a first input module, configured to input the face image sample and the character-face screenshot sample separately into a face processing model to be trained, to obtain a first image feature and first face-pinching parameter information corresponding to the face image sample, and a second image feature and second face-pinching parameter information corresponding to the character-face screenshot sample;
a first determination module, configured to determine image feature loss information based on the first image feature and the second image feature, and to determine face-pinching parameter loss information based on the first face-pinching parameter information and the second face-pinching parameter information;
an adjustment module, configured to adjust weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information, to obtain a trained face processing model, the trained face processing model being used to obtain face-pinching parameter information of a corresponding virtual character after a to-be-processed face image of a real face is input.
In a fourth aspect, an embodiment of the present disclosure further provides a face processing apparatus, including:
an acquisition module, configured to acquire a to-be-processed face image of a real face;
an input module, configured to input the to-be-processed face image into a face processing model trained according to the steps of the first aspect or any possible implementation of the first aspect, to obtain face-pinching parameter information of the to-be-processed face image, the face-pinching parameter information being used for rendering a virtual character in a virtual scene.
In a fifth aspect, an embodiment of the present disclosure further provides a computer device, including a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the computer device runs, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the first aspect or any possible implementation of the first aspect, or the steps of the second aspect, are executed.
In a sixth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the first aspect or any possible implementation of the first aspect, or the steps of the second aspect, are executed.
In a seventh aspect, an embodiment of the present disclosure further provides a computer program product including a computer program/instructions, where the computer program/instructions, when executed by a processor, implement the method of the first aspect or the second aspect.
In the face processing method provided by the embodiments of the present disclosure, the face processing model is trained with both image feature loss information and face-pinching parameter loss information. The image feature loss information is determined based on the first image feature from the real domain and the second image feature from the virtual domain, and the face-pinching parameter loss information is determined based on the first face-pinching parameter information from the real domain and the second face-pinching parameter information from the virtual domain. Taking the image feature loss information and the face-pinching parameter loss information as a domain loss and adjusting the weight parameter information of the face processing model based on the domain loss can, to a certain extent, narrow the domain gap between the rendered virtual character and the real face image. In this way, after a to-be-processed face image of a real face is input into the trained face processing model, the virtual character rendered based on the obtained face-pinching parameter information can have a high similarity to the to-be-processed face image.
To make the above objects, features, and advantages of the present disclosure more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present disclosure more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings here are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions of the present disclosure. It should be understood that the following drawings show only certain embodiments of the present disclosure and therefore should not be regarded as limiting its scope; a person of ordinary skill in the art may derive other related drawings from these drawings without creative effort.
FIG. 1 shows a flowchart of a face processing method provided by an embodiment of the present disclosure;
FIG. 2 shows a flowchart of another face processing method provided by an embodiment of the present disclosure;
FIG. 3 shows a flowchart of yet another face processing method provided by an embodiment of the present disclosure;
FIG. 4 shows a schematic diagram of a face processing apparatus provided by an embodiment of the present disclosure;
FIG. 5 shows a schematic diagram of another face processing apparatus provided by an embodiment of the present disclosure;
FIG. 6 shows a schematic diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure, as generally described and illustrated in the drawings herein, may be arranged and designed in various configurations. Therefore, the following detailed description of the embodiments of the present disclosure provided in the drawings is not intended to limit the claimed scope of the present disclosure, but merely represents selected embodiments of the present disclosure. All other embodiments obtained by a person skilled in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.
In related solutions that create a game character through a face-pinching function, a real face image is usually input into a pre-trained face recognition network or face segmentation network to obtain face-pinching parameters, and the game character is then rendered by adjusting these parameters. However, in such solutions the face recognition or face segmentation network is not trained on the difference between the real domain of the real face image and the virtual domain of the game character, so the game character rendered from the obtained face-pinching parameters shows an obvious domain gap from the real face image.
On this basis, the present disclosure provides a face processing method that trains the face processing model with both image feature loss information and face-pinching parameter loss information. The image feature loss information is determined based on the first image feature from the real domain and the second image feature from the virtual domain, and the face-pinching parameter loss information is determined based on the first face-pinching parameter information from the real domain and the second face-pinching parameter information from the virtual domain. Taking these two losses as a domain loss and adjusting the weight parameter information of the face processing model based on the domain loss can, to a certain extent, narrow the domain gap between the rendered virtual character and the real face image. In this way, after a to-be-processed face image of a real face is input into the trained face processing model, the virtual character rendered based on the obtained face-pinching parameter information can have a high similarity to the to-be-processed face image.
The defects of the above solutions and the proposed remedies are all results obtained by the inventors after practice and careful study; therefore, the discovery process of the above problems and the solutions proposed below in the present disclosure should all be regarded as the inventors' contribution to the present disclosure.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings.
To facilitate understanding of this embodiment, a face processing method disclosed in an embodiment of the present disclosure is first introduced in detail. The execution subject of the face processing method provided by the embodiments of the present disclosure is generally a computer device with certain computing capability.
The face processing method provided by the embodiments of the present disclosure is described below taking a server as the execution subject.
Referring to FIG. 1, which is a flowchart of a face processing method provided by an embodiment of the present disclosure, the method includes S101 to S104.
S101: Acquire a face image sample of a real face and a character-face screenshot sample corresponding to a virtual character.
Here, training samples need to be prepared before the face processing model is trained, namely a face image sample of a real face and a character-face screenshot sample corresponding to a virtual character. The face image sample of a real face may be a face photo, a frame image, a video screenshot, or the like containing a real person existing in the real world, such as a celebrity's face photo or the user's own face photo. The character-face screenshot sample corresponding to a virtual character may be an image obtained by taking a screenshot of the face of a virtual character in a virtual scene, for example a screenshot containing a game character's face.
S102: Input the face image sample and the character-face screenshot sample separately into a face processing model to be trained, to obtain a first image feature and first face-pinching parameter information corresponding to the face image sample, and a second image feature and second face-pinching parameter information corresponding to the character-face screenshot sample.
The first image feature may refer to the features of the face in the face image sample; the first face-pinching parameter information may correspond to multi-dimensional information of that face, and may specifically include facial information such as the eyes, nose, mouth, and eyebrows. The second image feature may refer to the features of the face in the character-face screenshot sample; the second face-pinching parameter information may correspond to multi-dimensional information of that face, and may likewise include facial information such as the eyes, nose, mouth, and eyebrows.
When the face image sample is input into the face processing model to be trained, the encoding model in the model can extract the first image feature and the first face-pinching parameter information corresponding to the face image sample; likewise, when the character-face screenshot sample is input, the encoding model can extract the second image feature and the second face-pinching parameter information corresponding to the character-face screenshot sample.
To capture the image features and face-pinching parameter information in the face image sample and the character-face screenshot sample more clearly and comprehensively, in one implementation the face image sample and the character-face screenshot sample may be separately input into the encoder of the face processing model to be trained and downsampled, to obtain a seventh image feature and an eighth image feature of the two samples in a first preset dimension, respectively. The seventh and eighth image features are then separately input into the decoder of the model and upsampled, to obtain a ninth image feature and a tenth image feature of the two samples in a second preset dimension, respectively. Finally, the seventh image feature together with the ninth image feature, and the eighth image feature together with the tenth image feature, are input into the regressor of the model, to obtain the first image feature and first face-pinching parameter information corresponding to the face image sample and the second image feature and second face-pinching parameter information corresponding to the character-face screenshot sample.
The codec structure composed of the encoder and the decoder can serve as the generator in the face processing model to be trained, producing the first image feature of the face image sample and the second image feature of the character-face screenshot sample. For example, the codec structure may be a VQ-VAE or VQ-VAE-2 structure.
The encoder downsamples the face image sample and the character-face screenshot sample separately, performing dimensionality reduction to obtain the seventh image feature of the face image sample and the eighth image feature of the character-face screenshot sample in the first preset dimension. In one implementation, the seventh image feature may further be quantized to obtain a discretized seventh image feature, and the eighth image feature may be quantized to obtain a discretized eighth image feature.
The decoder is a symmetric structure similar to the encoder; it upsamples the dimension-reduced face image sample and character-face screenshot sample back to the original size, yielding the ninth image feature and the tenth image feature of the two samples in the second preset dimension. In one implementation, the ninth and tenth image features may likewise be quantized to obtain discretized versions.
The seventh image feature is concatenated with the ninth image feature to obtain a first concatenated image feature, and the eighth image feature is concatenated with the tenth image feature to obtain a second concatenated image feature. Here, the first and second concatenated image features may be stereo, multi-dimensional image features.
The regressor may be located at the last layer of the decoder and may consist of a preset number of convolutional layers. Based on the first concatenated image feature, the first image feature of the face image sample is obtained at the second-to-last layer of the regressor, and the first face-pinching parameter information corresponding to the face image sample is obtained at the last layer. Based on the second concatenated image feature, the second image feature of the character-face screenshot sample is obtained at the second-to-last layer, and the second face-pinching parameter information corresponding to the character-face screenshot sample is obtained at the last layer.
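For illustration only, the encoder-decoder-regressor pipeline described above could be sketched in PyTorch-style Python as follows. All module names, layer sizes, and the use of PyTorch are assumptions made for this sketch; the disclosure does not prescribe a concrete implementation.

```python
import torch
import torch.nn as nn

class FaceParamRegressor(nn.Module):
    def __init__(self, feat_dim=128, param_dim=267):
        super().__init__()
        # Encoder: downsample the 256x256 input image to a low-resolution map.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 4, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: upsample the encoded map back to a higher resolution.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(feat_dim, feat_dim, 4, stride=2, padding=1), nn.ReLU(),
        )
        # Regressor: a conv stack whose penultimate output plays the role of
        # the image feature and whose last layer yields the pinching parameters.
        self.regressor_body = nn.Sequential(
            nn.Conv2d(feat_dim * 2, 256, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(2),                 # -> (B, 256, 2, 2)
        )
        self.param_head = nn.Conv2d(256, param_dim, 2)  # -> (B, param_dim, 1, 1)

    def forward(self, img):
        low = self.encoder(img)                       # downsampled feature
        high = self.decoder(low)                      # upsampled feature
        # "Connection" step: concatenate the two maps along the channel axis,
        # matching spatial sizes first.
        low_up = nn.functional.interpolate(low, size=high.shape[-2:])
        joint = torch.cat([low_up, high], dim=1)
        feat = self.regressor_body(joint)             # image feature
        params = self.param_head(feat).flatten(1)     # pinching parameters
        return feat.flatten(1), params
```

In such a sketch, the pooled map before the final layer stands in for the first/second image feature, and the final 1×1 output stands in for the first/second face-pinching parameter information.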
S103: Determine image feature loss information based on the first image feature and the second image feature; and determine face-pinching parameter loss information based on the first face-pinching parameter information and the second face-pinching parameter information.
Here, the maximum mean discrepancy (MMD) may be used to determine the difference between the vector mean of the first image feature and the vector mean of the second image feature, and the image feature loss information is determined based on this difference. Likewise, MMD may be used to determine the difference between the vector mean of the first face-pinching parameter information and the vector mean of the second face-pinching parameter information, and the face-pinching parameter loss information is determined based on this difference.
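As a hedged illustration of this step, the squared MMD with a Gaussian kernel could be computed as below; the bandwidth value and the simple V-statistic estimator are assumptions, since the disclosure only names the kernel and the bandwidth σ.

```python
import torch

def gaussian_kernel(x, y, sigma=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)), evaluated over all pairs
    d2 = torch.cdist(x, y) ** 2
    return torch.exp(-d2 / (2 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    # Simple V-statistic estimate: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean()
            - 2 * gaussian_kernel(x, y, sigma).mean())

# Domain loss sketch: MMD^2 over image-feature batches plus MMD^2 over
# parameter batches; each argument would be a (batch, dim) tensor.
```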
S104: Adjust the weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information, to obtain a trained face processing model. The trained face processing model is used to obtain the face-pinching parameter information of a corresponding virtual character after a to-be-processed face image of a real face is input.
The weight parameter information in the embodiments of the present disclosure may refer to the weight parameters in the codec structure. The weight parameter information can be adjusted by minimizing the sum of the squared image feature loss information and the squared face-pinching parameter loss information. Specifically, in one implementation, the sum of the image feature loss information and the face-pinching parameter loss information may be differentiated to obtain the gradient function of the weight parameter information in the face processing model to be trained. The current weight parameter information is then substituted into the gradient function to obtain an iterated gradient vector; the gradient vector indicates the direction along which the directional derivative of the loss information at that point attains its maximum. Finally, the current weight parameter information is updated based on the iterated gradient vector, to obtain updated weight parameter information.
Each iteration follows the above procedure, and so on, until a preset training stopping condition is reached, at which point iteration stops and the final updated weight parameter information is obtained, making the distance between the first and second image features sufficiently small and the distance between the first and second face-pinching parameter information sufficiently small. The preset stopping condition may be a preset number of iterations, or the loss difference between two adjacent iterations falling below a set threshold; no specific limitation is imposed here.
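A minimal sketch of this update rule, assuming plain gradient descent (the disclosure does not name a specific optimizer or learning rate):

```python
import torch

def update_weights(model, loss, lr=1e-4):
    # Differentiate the summed loss with respect to the weight parameters,
    # evaluate the gradient at the current weights, and step against it.
    grads = torch.autograd.grad(loss, list(model.parameters()))
    with torch.no_grad():
        for w, g in zip(model.parameters(), grads):
            w -= lr * g  # move along the negative gradient direction
```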
When the preset training stopping condition is reached, the trained face processing model is obtained. After a to-be-processed face image of a real face is input into the trained face processing model, the face-pinching parameter information of the corresponding virtual character can be obtained, and based on this information the corresponding virtual character can be rendered.
With the face-pinching parameter information obtained from the face processing model trained in the above embodiment, the rendered virtual character differs little from the real face and has high similarity to it, giving the user a better gaming experience.
To ensure that, when the trained face processing model produces face-pinching parameter information for a character-face screenshot, the virtual character rendered from that information is indistinguishable from the input screenshot, in one implementation face restoration loss information may be added during the training of the face processing model to adjust its weight parameter information.
Specifically, after the second face-pinching parameter information corresponding to the character-face screenshot sample is obtained, it may be input into a pre-trained imitator to obtain a first generated face image. Face restoration loss information is then obtained based on the first image information of the character-face screenshot sample and the second image information of the first generated face image. Finally, the weight parameter information in the face processing model to be trained is adjusted based on the image feature loss information, the face-pinching parameter loss information, and the face restoration loss information, to obtain the trained face processing model.
The pre-trained imitator can reproduce the function of the game engine: based on the second face-pinching parameter information, it renders the first generated face image.
The first image information of the character-face screenshot sample may refer to its color information in the red, green, and blue channels; the second image information of the first generated face image may likewise refer to its color information in the red, green, and blue channels.
Based on the first image information of the character-face screenshot sample in vector form and the second image information of the first generated face image in vector form, the sum of squared differences between the two vectors can be determined, namely the face restoration loss information. By minimizing this sum of squared differences and adjusting the weight parameter information in the face processing model to be trained, the difference between the virtual character rendered from the screenshot's face-pinching parameter information and the input character-face screenshot can be made smaller.
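A minimal sketch of the face restoration term under these definitions, assuming RGB tensors of shape (batch, 3, height, width); the tensor names are illustrative:

```python
import torch

def restoration_loss(screenshot, generated):
    # Sum of squared per-channel differences between the character-face
    # screenshot sample and the image the imitator renders from its params.
    return ((screenshot - generated) ** 2).sum(dim=(1, 2, 3)).mean()
```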
In one implementation, the imitator may be trained through the following steps:
acquiring sixth face-pinching parameter information of the character-face screenshot sample; and training an imitator to be trained with the sixth face-pinching parameter information as input and the character-face screenshot sample as output, to obtain the trained imitator.
The sixth face-pinching parameter information may correspond to a multi-dimensional vector of the face in the character-face screenshot sample, and may specifically include facial information such as the nose, eyes, and mouth. The sixth face-pinching parameter information may be exported from the client hosting the virtual character corresponding to the character-face screenshot sample; it may also be obtained through the aforementioned face processing model to be trained, i.e., be identical to the second face-pinching parameter information. Note, however, that sixth face-pinching parameter information obtained through the model to be trained may mismatch the character-face screenshot sample because that model is not yet well trained, which would degrade the imitator's training.
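A possible training loop for the imitator under this setup is sketched below; the optimizer choice and the data-loader interface are assumptions made for illustration:

```python
import torch

def train_imitator(imitator, loader, epochs=10, lr=1e-4):
    opt = torch.optim.Adam(imitator.parameters(), lr=lr)
    for _ in range(epochs):
        for params, screenshot in loader:   # (pinching params, face screenshot)
            pred = imitator(params)          # imitator stands in for the engine
            loss = ((pred - screenshot) ** 2).mean()  # squared-difference loss
            opt.zero_grad()
            loss.backward()
            opt.step()
    return imitator
```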
To ensure that different face image samples of the same real face, after their face-pinching parameter information is obtained, render similar virtual characters, in one implementation face consistency loss information may be added during the training of the face processing model to adjust its weight parameter information.
Specifically, after the first face-pinching parameter information corresponding to the face image sample is obtained, it may be input into a pre-trained imitator to obtain a second generated face image. The face image sample and the second generated face image are then input into a pre-trained first face recognition model, yielding a third image feature of the face image sample and a fourth image feature of the second generated face image, respectively; they are also input into a pre-trained second face recognition model, yielding a fifth image feature of the face image sample and a sixth image feature of the second generated face image, respectively. Next, first face consistency loss information is determined based on the third and fourth image features, and second face consistency loss information is determined based on the fifth and sixth image features. Finally, the weight parameter information in the face processing model to be trained is adjusted based on the image feature loss information, the face-pinching parameter loss information, the first face consistency loss information, and the second face consistency loss information, to obtain the trained face processing model.
The pre-trained imitator used here may be the same as the aforementioned imitator, so its training process is not repeated; see the description above for details.
Two face recognition models are used here, a first and a second, which may be different models. For example, the first face recognition model may be a mesh renderer, and the second may be the face recognition network LightCNN-29v2.
The first face consistency loss information may represent the loss between the third image feature of the face image sample and the fourth image feature of the second generated face image; based on the third and fourth image features in vector form, the sum of squared differences between the two vectors can be determined, namely the first face consistency loss information. The second face consistency loss information may represent the loss between the fifth image feature of the face image sample and the sixth image feature of the second generated face image; based on the fifth and sixth image features in vector form, the sum of squared differences between the two vectors can be determined, namely the second face consistency loss information.
By minimizing the sum of the first and second face consistency loss information and adjusting the weight parameter information in the face processing model to be trained, different face image samples of the same real face will render virtual characters with higher mutual similarity.
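A sketch of the two consistency terms, assuming the two pre-trained recognition models are available as callables that map an image batch to a feature batch:

```python
import torch

def consistency_loss(sample, generated, recog_a, recog_b):
    # First term: features from the first recognition model (e.g., a mesh
    # renderer); second term: features from the second model (e.g., LightCNN).
    l_a = ((recog_a(sample) - recog_a(generated)) ** 2).sum()
    l_b = ((recog_b(sample) - recog_b(generated)) ** 2).sum()
    return l_a + l_b  # minimized jointly during training
```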
To ensure that virtual characters rendered from face-pinching parameter information obtained from the same real face are more similar, while those obtained from different real faces are less similar, in one implementation face contrastive loss information may be added during the training of the face processing model to adjust its weight parameter information.
The face image sample may include multiple face image pairs, each containing two different face images of the same person.
For each face image pair, the first face-pinching parameter information corresponding to the face image sample includes third and fourth face-pinching parameter information respectively corresponding to the two face images in the pair, and fifth face-pinching parameter information corresponding to any other face image.
After the first face-pinching parameter information corresponding to the face image sample is obtained, face contrastive loss information may be determined based on the third, fourth, and fifth face-pinching parameter information; the weight parameter information in the face processing model to be trained is then adjusted based on the image feature loss information, the face-pinching parameter loss information, and the face contrastive loss information, to obtain the trained face processing model.
Here, the first product of the third and fourth face-pinching parameter information, and the second product of the third and fifth face-pinching parameter information, may first be computed separately; the sum of the first and second products is then computed; next, the ratio of the first product to the sum of the first and second products is computed; finally, the negative logarithm of this ratio is taken, yielding the face contrastive loss information.
By minimizing the face contrastive loss information and adjusting the weight parameter information in the face processing model to be trained, virtual characters rendered from the same real face become more similar, and those rendered from different real faces become less similar.
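Following the recipe above, the face contrastive term could be sketched as below; the exponential/temperature form is an assumption consistent with the preset coefficient τ that appears in the formulas later in this description:

```python
import torch

def contrastive_loss(p, p_pos, p_neg, tau=0.07):
    # p, p_pos: parameters of two images of the same face; p_neg: another face.
    s_pos = torch.exp(torch.dot(p, p_pos) / tau)  # "first product" term
    s_neg = torch.exp(torch.dot(p, p_neg) / tau)  # "second product" term
    return -torch.log(s_pos / (s_pos + s_neg))    # negative log of the ratio
```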
Further, to make the virtual character rendered from the obtained face-pinching parameter information resemble the real face even more closely, in one implementation the image feature loss information, face-pinching parameter loss information, face restoration loss information, face consistency loss information, and face contrastive loss information may all be used together during training to adjust the weight parameter information in the face processing model to be trained.
Referring to FIG. 2, which is a flowchart of another face processing method provided by an embodiment of the present disclosure, the method includes S201 to S202.
S201: Acquire a to-be-processed face image of a real face.
S202: Input the to-be-processed face image into the face processing model trained according to the face processing method of the above embodiment, to obtain face-pinching parameter information of the to-be-processed face image; the face-pinching parameter information is used for rendering a virtual character in a virtual scene.
The training process of the face processing model may follow the foregoing embodiments and is not repeated here.
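An illustrative inference-time usage, assuming a trained model with the two-output interface sketched earlier; the function names are placeholders, not an actual engine API:

```python
import torch

@torch.no_grad()
def pinch_face(model, face_image):
    # face_image: a (3, H, W) tensor of a real face; add a batch dimension.
    _, params = model(face_image.unsqueeze(0))
    return params.squeeze(0)  # hand these to the engine-side renderer
```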
Referring to FIG. 3, which is a flowchart of yet another face processing method provided by an embodiment of the present disclosure: in the method shown in FIG. 3, the image feature loss information, face-pinching parameter loss information, face restoration loss information, face consistency loss information, and face contrastive loss information are all used to train the face processing model. The face processing model contains a codec, described here taking a VQ-VAE-2 model as an example. The codec contains an encoder and a decoder; the encoder (Encoder) is denoted E and the decoder (Decoder) is denoted D.
Step 1: Acquire training samples.
The training samples contain multiple face image pairs of real faces and character-face screenshot samples corresponding to virtual characters. The face image pairs of real faces are unlabeled images from the source domain (the real domain); each pair contains two different face image samples of the same person. The character-face screenshot samples corresponding to virtual characters are labeled images from the target domain (the virtual domain). In each training round, any face image pair, any other face image, and a character-face screenshot sample corresponding to any virtual character can be taken as input. Here, either face image sample in a pair may be denoted Is, the other face image sample in the pair Is+, any other face image Is-, and the character-face screenshot sample corresponding to any virtual character It.
Step 2: Input the training samples into the encoder to obtain the first image feature corresponding to each image in the training samples.
Take each image in the training samples corresponding to 256×256-dimensional original image features as an example. The encoder E downsamples each image to a 64×64-dimensional bottom latent map, and then further reduces this map to obtain a quantized 32×32-dimensional top latent map, namely the first image feature.
Through the above downsampling process, the 32×32-dimensional first image features corresponding to any face image pair, any other face image, and the character-face screenshot sample of any virtual character can be obtained respectively.
Step 3: Input the first image feature corresponding to each image in the training samples into the decoder to obtain the second image feature corresponding to each image.
The decoder D is a symmetric structure similar to the encoder E. It first upsamples the first image feature of each image to obtain image features of the original dimension, i.e., 256×256-dimensional features. Quantization is then applied to obtain a quantized 64×64-dimensional top latent map, namely the second image feature, which has the same dimension as the earlier 64×64-dimensional bottom latent map.
Step 4: Input the second image feature into the regressor of the face processing model to obtain the third image feature corresponding to each image in the training samples and the face-pinching parameter information corresponding to each image.
Here, the regressor (Regressor) is denoted R. The regressor R is placed at the last layer of the decoder D and consists of six convolutional layers.
Here, the 32×32-dimensional top latent map from Step 2 and the 64×64-dimensional top latent map from Step 3 are concatenated to obtain a 64×64×128-dimensional stereo latent map. As shown in Table 1, this stereo latent map is taken as input to the regressor R; after the convolutional layers of R process the stereo latent map, the 2×2×256-dimensional third image feature is obtained at layer L5 (the fifth layer) of R, and the 1×1×267-dimensional face-pinching parameter information is obtained at the last layer of R.
Table 1
Here, the third image feature corresponding to the face image sample Is may be denoted fS, and the third image feature corresponding to the character-face screenshot sample It may be denoted ft. The face-pinching parameter information corresponding to the face image sample Is may be denoted P, that corresponding to the face image sample Is+ may be denoted P+, that corresponding to any other face image Is- may be denoted P-, and that corresponding to the character-face screenshot sample It may be denoted Pt.
Step 5: Input the face-pinching parameter information P corresponding to the face image sample Is in the face image pair and the face-pinching parameter information Pt corresponding to the character-face screenshot sample It separately into the pre-trained imitator, generating the first generated face image Is' corresponding to the face image sample and the second generated face image It' corresponding to the character-face screenshot sample It.
Here, the imitator (Imitator) is denoted G. The image information of the second generated face image It' is denoted G(Pt), and the image information of the character-face screenshot sample It is denoted It. The image information here may be the red, green, and blue color-channel information.
When training the imitator G, the loss between the image information G(Pt) of the second generated face image It' and the image information It of the character-face screenshot sample It can be computed and used to train the imitator. Specifically, the square of the difference between G(Pt) and It can be computed.
The loss Limitation between the image information G(Pt) of the second generated face image It' and the image information It of the character-face screenshot sample It may be: Limitation = ||G(Pt) - It||²
By minimizing the loss Limitation, the imitator G is trained so that the generated second face image It' is more similar to the character-face screenshot sample It.
Step 6: Determine the image feature loss information based on the third image feature fS corresponding to the face image sample and the third image feature ft corresponding to the character-face screenshot sample It; determine the face-pinching parameter loss information based on the face-pinching parameter information P corresponding to the face image sample and the face-pinching parameter information Pt corresponding to the character-face screenshot sample It; and determine the first loss information based on the image feature loss information and the face-pinching parameter loss information.
Here the maximum mean discrepancy (MMD) can be used to determine the distance between the face image sample Is and the character-face screenshot sample It, and the first loss information, namely the domain loss information, is obtained as: Ldomain = MMD²(fS, ft) + MMD²(P, Pt)
where MMD²(fS, ft) corresponds to the image feature loss information and MMD²(P, Pt) corresponds to the face-pinching parameter loss information; Hk is the reproducing kernel Hilbert space with kernel function k, and the kernel function may be the Gaussian kernel k(x, y) = exp(-||x - y||² / (2σ²)), where σ is the bandwidth controlling the local range of the Gaussian kernel.
Step 7: Determine the second loss information based on the image information It of the character-face screenshot sample It and the image information It' of the second generated face image It'.
Based on the image information of the character-face screenshot sample It and that of the second generated face image It', the first sub-loss can first be determined: Lparam = ||G(Pt) - It||²
The loss from the VQ-VAE-2 model is retained here and defined as the second sub-loss, which may be: Ldiffer = ||sg[E(It)] - e||² + α||sg[e] - E(It)||², where e is the quantized code of the character-face screenshot sample It, sg is an operator indicating the stop-gradient operation, and α is a hyperparameter, which may be set to α = 0.25 here.
The second loss information is: Lprestored = Lparam + βLdiffer, namely the face restoration loss information, where β is a weight coefficient, which may be set to β = 0.25 here.
Step 8: Input the face image sample Is and the first generated face image Is' into a pre-trained face reconstruction model, to obtain the fourth image feature of the face image sample and the fifth image feature of the first generated face image Is', respectively; and input the face image sample Is and the first generated face image Is' into a pre-trained face recognition model, to obtain the sixth image feature of the face image sample and the seventh image feature of the first generated face image Is', respectively. Determine the first face consistency loss information based on the fourth and fifth image features; determine the second face consistency loss information based on the sixth and seventh image features; and determine the third loss information based on the first and second face consistency loss information.
The face reconstruction model may be denoted F3d, and the face recognition model Fid.
The fourth image feature of the face image sample may be denoted F3d(IS) and the fifth image feature of the first generated face image Is' may be denoted F3d(IS'), giving the first face consistency loss information: L3d = ||F3d(IS) - F3d(IS')||²
The sixth image feature of the face image sample may be denoted Fid(IS) and the seventh image feature of the first generated face image Is' may be denoted Fid(IS'), giving the second face consistency loss information: Lid = ||Fid(IS) - Fid(IS')||²
Adding the first and second face consistency loss information gives the third loss information: Lconsistency = L3d + Lid, namely the face consistency loss information.
Step 9: Determine the fourth loss information based on the face-pinching parameter information P corresponding to the face image sample Is, the face-pinching parameter information P+ corresponding to the face image sample Is+, and the face-pinching parameter information P- corresponding to the face image sample Is-.
Here, the fourth loss information may be: Lcontrastive = -log( exp(P·P+/τ) / (exp(P·P+/τ) + exp(P·P-/τ)) )
namely the face contrastive loss information, where τ is a preset coefficient.
Step 10: Determine the total loss information based on the first, second, third, and fourth loss information.
The total loss information is: L = λ1·Ldomain + λ2·Lprestored + λ3·Lconsistency + λ4·Lcontrastive, where λ1 = 1, λ2 = 0.01, λ3 = 0.02, and λ4 = 0.02.
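Transcribing the total loss directly, with the λ weights above and the four terms assumed to be precomputed scalars:

```python
def total_loss(l_domain, l_prestored, l_consistency, l_contrastive):
    # lambda1..lambda4 from the description: 1, 0.01, 0.02, 0.02
    return (1.0 * l_domain + 0.01 * l_prestored
            + 0.02 * l_consistency + 0.02 * l_contrastive)
```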
Step 11: Adjust the weight parameter information in the face processing model to be trained based on the total loss information, to obtain the trained face processing model.
Adjusting the weight parameter information in the face processing model to be trained may follow the adjustment process described above and is not repeated here. The trained face processing model can be used to obtain the face-pinching parameter information of the corresponding virtual character after a to-be-processed face image of a real face is input; the obtained face-pinching parameter information can then be rendered in the game engine to produce the virtual character.
A person skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or impose any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, embodiments of the present disclosure also provide a face processing apparatus corresponding to the face processing method. Since the problem-solving principle of the apparatus in the embodiments of the present disclosure is similar to that of the above face processing method, the implementation of the apparatus may refer to the implementation of the method, and repeated descriptions are omitted.
Referring to FIG. 4, which is a schematic architectural diagram of a face processing apparatus provided by an embodiment of the present disclosure, the apparatus includes: a first acquisition module 401, a first input module 402, a first determination module 403, and an adjustment module 404, where:
the first acquisition module 401 is configured to acquire a face image sample of a real face and a character-face screenshot sample corresponding to a virtual character;
the first input module 402 is configured to input the face image sample and the character-face screenshot sample separately into a face processing model to be trained, to obtain a first image feature and first face-pinching parameter information corresponding to the face image sample, and a second image feature and second face-pinching parameter information corresponding to the character-face screenshot sample;
the first determination module 403 is configured to determine image feature loss information based on the first image feature and the second image feature, and to determine face-pinching parameter loss information based on the first face-pinching parameter information and the second face-pinching parameter information;
the adjustment module 404 is configured to adjust weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information, to obtain a trained face processing model; the trained face processing model is used to obtain the face-pinching parameter information of a corresponding virtual character after a to-be-processed face image of a real face is input.
In an optional implementation, the apparatus further includes:
a second input module, configured to input the second face-pinching parameter information of the character-face screenshot sample into a pre-trained imitator, to obtain a first generated face image;
a second determination module, configured to obtain face restoration loss information based on the first image information of the character-face screenshot sample and the second image information of the first generated face image;
the adjustment module 404 is specifically configured to:
adjust the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information, and the face restoration loss information, to obtain the trained face processing model.
In an optional implementation, the apparatus further includes:
a third input module, configured to input the first face-pinching parameter information of the face image sample into a pre-trained imitator, to obtain a second generated face image;
a third determination module, configured to input the face image sample and the second generated face image into a pre-trained first face recognition model, to obtain a third image feature of the face image sample and a fourth image feature of the second generated face image, respectively; and to input the face image sample and the second generated face image into a pre-trained second face recognition model, to obtain a fifth image feature of the face image sample and a sixth image feature of the second generated face image, respectively;
a fourth determination module, configured to determine first face consistency loss information based on the third image feature and the fourth image feature, and to determine second face consistency loss information based on the fifth image feature and the sixth image feature;
the adjustment module 404 is specifically configured to:
adjust the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information, the first face consistency loss information, and the second face consistency loss information, to obtain the trained face processing model.
In an optional implementation, the face image sample includes multiple face image pairs, each face image pair containing two different face images of the same person;
for each face image pair, the first face-pinching parameter information corresponding to the face image sample includes third and fourth face-pinching parameter information respectively corresponding to the two face images in the pair, and fifth face-pinching parameter information corresponding to any other face image;
the apparatus further includes:
a fifth determination module, configured to determine face contrastive loss information based on the third face-pinching parameter information, the fourth face-pinching parameter information, and the fifth face-pinching parameter information;
the adjustment module 404 is specifically configured to:
adjust the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information, and the face contrastive loss information, to obtain the trained face processing model.
In an optional implementation, the adjustment module 404 is specifically configured to:
differentiate the sum of the image feature loss information and the face-pinching parameter loss information, to obtain a gradient function of the weight parameters in the face processing model to be trained;
substitute the current weight parameter information into the gradient function, to obtain an iterated gradient vector;
update the current weight parameter information based on the iterated gradient vector, to obtain updated weight parameter information.
In an optional implementation, the first input module 402 is specifically configured to:
input the face image sample and the character-face screenshot sample separately into an encoder of the face processing model to be trained, and downsample the face image sample and the character-face screenshot sample respectively, to obtain a seventh image feature and an eighth image feature of the two samples in a first preset dimension, respectively;
input the seventh image feature and the eighth image feature separately into a decoder of the face processing model to be trained, and upsample the seventh and eighth image features, to obtain a ninth image feature and a tenth image feature of the two samples in a second preset dimension, respectively;
input the seventh image feature together with the ninth image feature, and the eighth image feature together with the tenth image feature, into a regressor of the face processing model to be trained, to obtain the first image feature and first face-pinching parameter information corresponding to the face image sample and the second image feature and second face-pinching parameter information corresponding to the character-face screenshot sample.
In an optional implementation, the apparatus further includes:
a second acquisition module, configured to acquire sixth face-pinching parameter information of the character-face screenshot sample;
a training module, configured to train an imitator to be trained with the sixth face-pinching parameter information of the character-face screenshot sample as input and the character-face screenshot sample as output, to obtain the trained imitator.
Referring to FIG. 5, which is a schematic architectural diagram of another face processing apparatus provided by an embodiment of the present disclosure, the apparatus includes: an acquisition module 501 and an input module 502, where:
the acquisition module 501 is configured to acquire a to-be-processed face image of a real face;
the input module 502 is configured to input the to-be-processed face image into the face processing model trained according to the face processing method described with reference to FIG. 1, to obtain face-pinching parameter information of the to-be-processed face image; the face-pinching parameter information is used for rendering a virtual character in a virtual scene.
For descriptions of the processing flow of each module in the apparatus and the interaction flow between modules, refer to the relevant descriptions in the above method embodiments; details are not repeated here.
Based on the same technical concept, an embodiment of the present disclosure further provides a computer device. Referring to FIG. 6, which is a schematic structural diagram of a computer device 600 provided by an embodiment of the present disclosure, the device includes a processor 601, a memory 602, and a bus 603. The memory 602 is used to store execution instructions and includes an internal memory 6021 and an external memory 6022; the internal memory 6021 is used to temporarily store operation data in the processor 601 and data exchanged with the external memory 6022 such as a hard disk, and the processor 601 exchanges data with the external memory 6022 through the internal memory 6021. When the computer device 600 runs, the processor 601 communicates with the memory 602 through the bus 603, causing the processor 601 to execute the following instructions:
acquire a face image sample of a real face and a character-face screenshot sample corresponding to a virtual character;
input the face image sample and the character-face screenshot sample separately into a face processing model to be trained, to obtain a first image feature and first face-pinching parameter information corresponding to the face image sample, and a second image feature and second face-pinching parameter information corresponding to the character-face screenshot sample;
determine image feature loss information based on the first image feature and the second image feature; and determine face-pinching parameter loss information based on the first face-pinching parameter information and the second face-pinching parameter information;
adjust weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information, to obtain a trained face processing model; the trained face processing model is used to obtain the face-pinching parameter information of a corresponding virtual character after a to-be-processed face image of a real face is input.
Alternatively, the processor 601 executes the following instructions:
acquire a to-be-processed face image of a real face;
input the to-be-processed face image into the face processing model trained according to the above face processing method, to obtain face-pinching parameter information of the to-be-processed face image; the face-pinching parameter information is used for rendering a virtual character in a virtual scene.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the face processing method described in the above method embodiments are executed. The storage medium may be a volatile or non-volatile computer-readable storage medium.
An embodiment of the present disclosure further provides a computer program product carrying program code; the instructions included in the program code can be used to execute the steps of the face processing method described in the above method embodiments. For details, refer to the above method embodiments, which are not repeated here.
The above computer program product may be implemented by hardware, software, or a combination thereof. In an optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, it is embodied as a software product, such as a software development kit (SDK).
A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method embodiments and is not repeated here. In the several embodiments provided in the present disclosure, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some communication interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the various embodiments of the present disclosure. The aforementioned storage media include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are only specific implementations of the present disclosure, used to illustrate rather than limit its technical solutions, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that anyone familiar with the art may still, within the technical scope disclosed by the present disclosure, modify the technical solutions recorded in the foregoing embodiments, readily conceive of variations, or make equivalent substitutions for some of the technical features therein; such modifications, variations, or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all be covered within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (13)

  1. A face processing method, comprising:
    acquiring a face image sample of a real face and a character-face screenshot sample corresponding to a virtual character;
    inputting the face image sample and the character-face screenshot sample separately into a face processing model to be trained, to obtain a first image feature and first face-pinching parameter information corresponding to the face image sample, and a second image feature and second face-pinching parameter information corresponding to the character-face screenshot sample;
    determining image feature loss information based on the first image feature and the second image feature; and determining face-pinching parameter loss information based on the first face-pinching parameter information and the second face-pinching parameter information;
    adjusting weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information, to obtain a trained face processing model, the trained face processing model being used to obtain face-pinching parameter information of a corresponding virtual character after a to-be-processed face image of a real face is input.
  2. The method according to claim 1, wherein after the face image sample and the character-face screenshot sample are separately input into the face processing model to be trained to obtain the first face-pinching parameter information corresponding to the face image sample, the method further comprises:
    inputting the second face-pinching parameter information of the character-face screenshot sample into a pre-trained imitator, to obtain a first generated face image;
    obtaining face restoration loss information based on first image information of the character-face screenshot sample and second image information of the first generated face image;
    wherein the adjusting weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information to obtain a trained face processing model comprises:
    adjusting the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information, and the face restoration loss information, to obtain the trained face processing model.
  3. The method according to claim 1, wherein after the face image sample and the character-face screenshot sample are separately input into the face processing model to be trained to obtain the first face-pinching parameter information corresponding to the face image sample, the method further comprises:
    inputting the first face-pinching parameter information of the face image sample into a pre-trained imitator, to obtain a second generated face image;
    inputting the face image sample and the second generated face image into a pre-trained first face recognition model, to obtain a third image feature of the face image sample and a fourth image feature of the second generated face image, respectively; and inputting the face image sample and the second generated face image into a pre-trained second face recognition model, to obtain a fifth image feature of the face image sample and a sixth image feature of the second generated face image, respectively;
    determining first face consistency loss information based on the third image feature and the fourth image feature; and determining second face consistency loss information based on the fifth image feature and the sixth image feature;
    wherein the adjusting weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information to obtain a trained face processing model comprises:
    adjusting the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information, the first face consistency loss information, and the second face consistency loss information, to obtain the trained face processing model.
  4. The method according to claim 1, wherein the face image sample comprises multiple face image pairs, each face image pair containing two different face images of the same person;
    for each face image pair, the first face-pinching parameter information corresponding to the face image sample comprises third face-pinching parameter information and fourth face-pinching parameter information respectively corresponding to the two face images in the face image pair, and fifth face-pinching parameter information corresponding to any other face image;
    after the face image sample and the character-face screenshot sample are separately input into the face processing model to be trained to obtain the first face-pinching parameter information corresponding to the face image sample, the method further comprises:
    determining face contrastive loss information based on the third face-pinching parameter information, the fourth face-pinching parameter information, and the fifth face-pinching parameter information;
    wherein the adjusting weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information to obtain a trained face processing model comprises:
    adjusting the weight parameter information in the face processing model to be trained based on the image feature loss information, the face-pinching parameter loss information, and the face contrastive loss information, to obtain the trained face processing model.
  5. The method according to claim 1, wherein the adjusting weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information to obtain a trained face processing model comprises:
    differentiating the sum of the image feature loss information and the face-pinching parameter loss information, to obtain a gradient function of the weight parameters in the face processing model to be trained;
    substituting the current weight parameter information into the gradient function, to obtain an iterated gradient vector;
    updating the current weight parameter information based on the iterated gradient vector, to obtain updated weight parameter information.
  6. The method according to claim 1, wherein the inputting the face image sample and the character-face screenshot sample separately into a face processing model to be trained, to obtain a first image feature and first face-pinching parameter information corresponding to the face image sample and a second image feature and second face-pinching parameter information corresponding to the character-face screenshot sample, comprises:
    inputting the face image sample and the character-face screenshot sample separately into an encoder of the face processing model to be trained, and downsampling the face image sample and the character-face screenshot sample respectively, to obtain a seventh image feature and an eighth image feature of the face image sample and the character-face screenshot sample in a first preset dimension, respectively;
    inputting the seventh image feature and the eighth image feature separately into a decoder of the face processing model to be trained, and upsampling the seventh image feature and the eighth image feature, to obtain a ninth image feature and a tenth image feature of the face image sample and the character-face screenshot sample in a second preset dimension, respectively;
    inputting the seventh image feature together with the ninth image feature, and the eighth image feature together with the tenth image feature, into a regressor of the face processing model to be trained, to obtain the first image feature and the first face-pinching parameter information corresponding to the face image sample and the second image feature and the second face-pinching parameter information corresponding to the character-face screenshot sample.
  7. The method according to claim 2, wherein before the second face-pinching parameter information of the character-face screenshot sample is input into the pre-trained imitator to obtain the first generated face image, the method further comprises:
    acquiring sixth face-pinching parameter information of the character-face screenshot sample;
    training an imitator to be trained with the sixth face-pinching parameter information of the character-face screenshot sample as input and the character-face screenshot sample as output, to obtain the trained imitator.
  8. A face processing method, comprising:
    acquiring a to-be-processed face image of a real face;
    inputting the to-be-processed face image into a face processing model trained according to the face processing method of any one of claims 1 to 7, to obtain face-pinching parameter information of the to-be-processed face image, the face-pinching parameter information being used for rendering a virtual character in a virtual scene.
  9. A face processing apparatus, comprising:
    a first acquisition module, configured to acquire a face image sample of a real face and a character-face screenshot sample corresponding to a virtual character;
    a first input module, configured to input the face image sample and the character-face screenshot sample separately into a face processing model to be trained, to obtain a first image feature and first face-pinching parameter information corresponding to the face image sample, and a second image feature and second face-pinching parameter information corresponding to the character-face screenshot sample;
    a first determination module, configured to determine image feature loss information based on the first image feature and the second image feature, and to determine face-pinching parameter loss information based on the first face-pinching parameter information and the second face-pinching parameter information;
    an adjustment module, configured to adjust weight parameter information in the face processing model to be trained based on the image feature loss information and the face-pinching parameter loss information, to obtain a trained face processing model, the trained face processing model being used to obtain face-pinching parameter information of a corresponding virtual character after a to-be-processed face image of a real face is input.
  10. A face processing apparatus, comprising:
    an acquisition module, configured to acquire a to-be-processed face image of a real face;
    an input module, configured to input the to-be-processed face image into a face processing model trained according to the face processing method of any one of claims 1 to 7, to obtain face-pinching parameter information of the to-be-processed face image, the face-pinching parameter information being used for rendering a virtual character in a virtual scene.
  11. A computer device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the computer device runs, the processor communicates with the memory through the bus; and when the machine-readable instructions are executed by the processor, the steps of the face processing method according to any one of claims 1 to 7, or the steps of the face processing method according to claim 8, are executed.
  12. A computer-readable storage medium on which a computer program is stored, wherein when the computer program is run by a processor, the steps of the face processing method according to any one of claims 1 to 7, or the steps of the face processing method according to claim 8, are executed.
  13. A computer program product comprising a computer program/instructions, wherein when the computer program/instructions are executed by a processor, the steps of the face processing method according to any one of claims 1 to 7, or the steps of the face processing method according to claim 8, are implemented.
PCT/CN2023/074288 2022-02-25 2023-02-02 Face processing method and apparatus, computer device, and storage medium WO2023160350A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210180995.9 2022-02-25
CN202210180995.9A CN115311127A (zh) 2022-02-25 2022-02-25 Face processing method and apparatus, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023160350A1 true WO2023160350A1 (zh) 2023-08-31

Family

ID=83855745

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/074288 WO2023160350A1 (zh) 2022-02-25 2023-02-02 Face processing method and apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN115311127A (zh)
WO (1) WO2023160350A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115311127A (zh) 2022-02-25 2022-11-08 北京字跳网络技术有限公司 Face processing method and apparatus, computer device, and storage medium

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200202111A1 (en) * 2018-12-19 2020-06-25 Netease (Hangzhou) Network Co.,Ltd. Image Processing Method and Apparatus, Storage Medium and Electronic Device
CN109902767A (zh) * 2019-04-11 2019-06-18 网易(杭州)网络有限公司 Model training method, image processing method and apparatus, device, and medium
CN110717977A (zh) * 2019-10-23 2020-01-21 网易(杭州)网络有限公司 Method and apparatus for face processing of a game character, computer device, and storage medium
CN111632374A (zh) * 2020-06-01 2020-09-08 网易(杭州)网络有限公司 Face processing method and apparatus for a virtual character in a game, and readable storage medium
CN111729314A (zh) * 2020-06-30 2020-10-02 网易(杭州)网络有限公司 Face-pinching processing method and apparatus for a virtual character, and readable storage medium
CN113052962A (zh) * 2021-04-02 2021-06-29 北京百度网讯科技有限公司 Model training and information output methods, apparatuses, devices, and storage medium
CN113409437A (zh) * 2021-06-23 2021-09-17 北京字节跳动网络技术有限公司 Method and apparatus for pinching a virtual character's face, electronic device, and storage medium
CN115311127A (zh) * 2022-02-25 2022-11-08 北京字跳网络技术有限公司 Face processing method and apparatus, computer device, and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116884077A (zh) * 2023-09-04 2023-10-13 上海任意门科技有限公司 Face image category determination method and apparatus, electronic device, and storage medium
CN116884077B (zh) * 2023-09-04 2023-12-08 上海任意门科技有限公司 Face image category determination method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN115311127A (zh) 2022-11-08

Similar Documents

Publication Publication Date Title
CN110717977B (zh) Method and apparatus for face processing of a game character, computer device, and storage medium
WO2023160350A1 (zh) Face processing method and apparatus, computer device, and storage medium
CN109902767B (zh) Model training method, image processing method and apparatus, device, and medium
US10885693B1 (en) Animating avatars from headset cameras
CN111354079B (zh) Method and apparatus for training a three-dimensional face reconstruction network and generating a virtual face image
US11087521B1 (en) Systems and methods for rendering avatars with deep appearance models
US11276231B2 (en) Semantic deep face models
US10839575B2 (en) User-guided image completion with image completion neural networks
CN111632374B (zh) Face processing method and apparatus for a virtual character in a game, and readable storage medium
CN113822437B (zh) Deep hierarchical variational autoencoder
CN109635745A (zh) Method for generating multi-angle face images based on a generative adversarial network model
CN113272870A (zh) System and method for realistic real-time portrait animation
CN116109798B (zh) Image data processing method, apparatus, device, and medium
US11475608B2 (en) Face image generation with pose and expression control
CN110213521A (zh) Virtual instant messaging method
EP3991140A1 (en) Portrait editing and synthesis
WO2023185395A1 (zh) Facial expression capture method and apparatus, computer device, and storage medium
US11514638B2 (en) 3D asset generation from 2D images
US20220156987A1 (en) Adaptive convolutions in neural networks
CN115914505B (zh) Video generation method and system based on a speech-driven digital human model
Huang et al. Real-world automatic makeup via identity preservation makeup net
WO2022166840A1 (zh) Training method for a face attribute editing model, face attribute editing method, and device
CN113781324B (zh) Old photo restoration method
WO2023185398A1 (zh) Face processing method and apparatus, computer device, and storage medium
CN115880400A (zh) Image generation method and apparatus for a cartoon digital human, electronic device, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23758975

Country of ref document: EP

Kind code of ref document: A1