CN116862757A - Method, device, electronic equipment and medium for controlling face stylization degree

Method, device, electronic equipment and medium for controlling face stylization degree

Info

Publication number
CN116862757A
CN116862757A (application CN202310569269.0A)
Authority
CN
China
Prior art keywords
face
stylized
model
target
degree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310569269.0A
Other languages
Chinese (zh)
Inventor
刘思远
甘启
张辉
章子维
张璐
陶明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Renyimen Technology Co ltd
Original Assignee
Shanghai Renyimen Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Renyimen Technology Co ltd filed Critical Shanghai Renyimen Technology Co ltd
Priority to CN202310569269.0A
Publication of CN116862757A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method, an apparatus, an electronic device and a medium for controlling the degree of face stylization, in the field of computer vision. A target stylized image is acquired and used to train a real face StyleGAN model, yielding an initial stylized face StyleGAN model; the same initial vector is input to the real face StyleGAN model and the initial stylized face StyleGAN model to obtain first face data and second face data, respectively; and the initial stylized face StyleGAN model is trained according to the difference between the two batches of face data to obtain a target stylized face StyleGAN model. Because the face images generated by the target stylized face StyleGAN model are constrained by face identity information, the realism and controllability of the generated images are improved and artifacts caused by feature distortion are avoided.

Description

Method, device, electronic equipment and medium for controlling face stylization degree
Technical Field
The present application relates to the field of computer vision, and in particular to a method, an apparatus, an electronic device and a medium for controlling the degree of face stylization.
Background
Face stylization is a technique for converting an original face image into face images of different styles, used in portrait art creation, video entertainment, virtual reality, games and other fields. In the prior art, face stylization is mainly implemented with a StyleGAN model: a pre-trained real face StyleGAN model is fine-tuned with a certain number of target stylized face images to obtain a stylized face StyleGAN model. Because the pre-trained real face StyleGAN model contains a large amount of prior knowledge about faces, the fine-tuned stylized face StyleGAN model can also output stylized faces of high quality.
In practice, when the fine-tuned stylized face StyleGAN model is used to generate a stylized face, shallow fine-tuning leaves the stylization effect insufficient, while deep fine-tuning causes severe loss of the face's identity information.
Therefore, providing a method for controlling the degree of face stylization that balances the realism and the stylization of the resulting stylized face is a problem to be solved by those skilled in the art.
Disclosure of Invention
The application aims to provide a method, an apparatus, an electronic device and a medium for controlling the degree of face stylization. By constraining the face images generated by the target stylized face StyleGAN model with face identity information, the realism and controllability of the generated images can be improved and artifacts caused by feature distortion avoided.
In order to solve the technical problems, the application provides a method for controlling the degree of facial stylization, which comprises the following steps:
acquiring a target stylized image, and training a real face StyleGAN model according to the target stylized image to obtain an initial stylized face StyleGAN model;
inputting an initial vector to the real face StyleGAN model to obtain first face data;
inputting the initial vector to the initial stylized face StyleGAN model to obtain second face data;
training the initial stylized face StyleGAN model according to the difference between the first face data and the second face data to obtain a target stylized face StyleGAN model;
and acquiring the face to be stylized, and inputting the face to be stylized into the target stylized face StyleGAN model to obtain the target stylized face.
Preferably, training the initial stylized face StyleGAN model according to the difference between the first face data and the second face data to obtain a target stylized face StyleGAN model includes:
training the initial stylized face StyleGAN model according to the first face data, the second face data and the face identity information loss function to obtain the target stylized face StyleGAN model.
Preferably, training the initial stylized face StyleGAN model according to the first face data, the second face data and the face identity information loss function to obtain the target stylized face StyleGAN model includes:
extracting first identity information of the first face data by using a preset face identity information extraction model;
extracting second identity information of the second face data by using the preset face identity information extraction model;
calculating the face identity information loss function according to the first identity information and the second identity information;
training the initial stylized face StyleGAN model according to the face identity information loss function to obtain the target stylized face StyleGAN model.
Preferably, acquiring the target stylized image includes:
acquiring initial stylized images, and extracting key points of each initial stylized image by using a preset face key point model;
performing an alignment operation on the real face according to the key points to obtain the target stylized image.
Preferably, inputting the face to be stylized into the target stylized face StyleGAN model to obtain the target stylized face includes:
in the training process, acquiring a plurality of stylized faces with different stylized degrees output by the target stylized face StyleGAN model;
determining a preset area in each stylized face according to the style degree of each stylized face;
and fusing the preset areas to obtain the target stylized face.
Preferably, in the training process, acquiring a plurality of stylized faces with different stylized degrees output by the target stylized face StyleGAN model includes:
in the training process, determining a plurality of stylized faces with different stylized degrees output by the target stylized face StyleGAN model according to different iteration times;
the number of iterations is positively correlated with the stylized degree of the output stylized face.
Preferably, determining the preset area in each stylized face includes:
and determining a preset area in each stylized face by using a preset key point positioning model.
Preferably, when the stylized faces include two stylized faces of a first stylized degree and a second stylized degree, and the first stylized degree is smaller than the second stylized degree, determining a preset area in each stylized face according to the style degree of each stylized face includes:
positioning a first facial-feature region on the stylized face of the first stylized degree through a preset face key point positioning model;
positioning a second facial-feature region on the stylized face of the second stylized degree through the preset face key point positioning model;
and fusing the preset areas to obtain the target stylized face includes:
replacing the second facial-feature region on the stylized face of the second stylized degree with the first facial-feature region, and taking the stylized face of the second stylized degree after replacement as the target stylized face.
In order to solve the technical problem, the application also provides a device for controlling the facial stylization degree, which comprises:
the initial model generation unit is used for acquiring a target stylized image, training a real face StyleGAN model according to the target stylized image, and obtaining an initial stylized face StyleGAN model;
the first face data generating unit is used for inputting an initial vector to the real face StyleGAN model to obtain first face data;
the second face data generating unit is used for inputting the initial vector to the initial stylized face StyleGAN model to obtain second face data;
the target model generating unit is used for training the initial stylized face StyleGAN model according to the difference between the first face data and the second face data to obtain a target stylized face StyleGAN model;
the target stylized face generation unit is used for acquiring the face to be stylized and inputting the face to be stylized into the target stylized face StyleGAN model to obtain the target stylized face.
In order to solve the technical problem, the present application further provides an electronic device, including:
a memory for storing a computer program;
a processor for implementing the steps of the above method of controlling the degree of facial stylization when executing the computer program.
To solve the above technical problem, the present application further provides a computer readable storage medium, where a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method for controlling the facial stylization degree as described above.
The application provides a method, an apparatus, an electronic device and a medium for controlling the degree of face stylization, in the field of computer vision. A target stylized image is acquired and used to train a real face StyleGAN model, yielding an initial stylized face StyleGAN model; the same initial vector is input to the real face StyleGAN model and the initial stylized face StyleGAN model to obtain first face data and second face data, respectively; and the initial stylized face StyleGAN model is trained according to the difference between the two batches of face data to obtain a target stylized face StyleGAN model. Because the face images generated by the target stylized face StyleGAN model are constrained by face identity information, the realism and controllability of the generated images are improved and artifacts caused by feature distortion are avoided.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the prior art and the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the application, and other drawings can be obtained from them by a person skilled in the art without inventive effort.
Fig. 1 is a schematic flow chart of a method for controlling a facial stylization degree according to the present application;
FIG. 2 is a block diagram of a structure for adjusting an initial stylized face StyleGAN model provided by the application;
FIG. 3 is a block diagram of a device for controlling the stylization degree of a human face according to the present application;
fig. 4 is a block diagram of an electronic device according to the present application.
Detailed Description
The core of the application is to provide a method, an apparatus, an electronic device and a medium for controlling the degree of face stylization. By constraining the face images generated by the target stylized face StyleGAN model with face identity information, the realism and controllability of the generated images are improved and artifacts caused by feature distortion are avoided.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Referring to fig. 1, fig. 1 is a flow chart of a method for controlling a facial stylization degree according to the present application, where the method includes:
s11: acquiring a target stylized image, and training a real face StyleGAN model according to the target stylized image to obtain an initial stylized face StyleGAN model;
specifically, the real face style gan model is a generation model based on GAN (Generative Adversarial Networks) neural network structure, which is capable of generating a new image by learning real image data. The target stylized image is used as an input for training the real face StyleGAN model, where the target stylized image may be regarded as a constraint condition for adjusting parameters of the real face StyleGAN model to obtain a preliminary stylized face StyleGAN model (i.e., an initial stylized face StyleGAN model). For example, some optimization algorithms may be used to minimize the objective function so that the generated image is closest to the target stylized image. The initial stylized face StyleGAN model trained in this manner may be used to generate sufficient sample data.
Specifically, in this embodiment, some real face image data is first prepared for training a real face StyleGAN model (this step is the same as the manner in the prior art, and is not described here again). Then, a target stylized image is acquired, for example, an image is extracted from a certain painter work as a target style, or a certain cartoon character is taken as a target style. Next, an initial stylized face style gan model is obtained using a preset algorithm, which is capable of generating a sample image that approximates the target stylized image.
The step can provide enough sample data for the subsequent steps, and can also effectively improve the quality and accuracy of the generated image. By reasonably adjusting parameters, more realistic images can be obtained.
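For concreteness, a minimal PyTorch-style sketch of this fine-tuning step is given below. It is an illustration under stated assumptions, not the application's reference implementation: `G_real` and `D` are assumed to be a pre-trained StyleGAN generator (with a `z_dim` attribute, mapping a latent batch to images) and discriminator, `style_loader` is assumed to endlessly yield batches of aligned target-style images, and the standard non-saturating GAN losses are used.

```python
# A minimal fine-tuning sketch with the standard non-saturating GAN losses.
# Assumptions: G_real and D are a pre-trained StyleGAN generator and
# discriminator; G_real exposes a `z_dim` attribute and maps a latent batch
# to images; style_loader endlessly yields batches of aligned target-style
# images as float tensors in [-1, 1].
import copy
import torch
import torch.nn.functional as F

def finetune_on_style(G_real, D, style_loader, steps=2000, lr=2e-4, device="cuda"):
    """Fine-tune a copy of the real-face generator toward the target style."""
    G_style = copy.deepcopy(G_real).to(device).train()
    opt_g = torch.optim.Adam(G_style.parameters(), lr=lr, betas=(0.0, 0.99))
    opt_d = torch.optim.Adam(D.parameters(), lr=lr, betas=(0.0, 0.99))

    for _, real_style in zip(range(steps), style_loader):
        real_style = real_style.to(device)
        z = torch.randn(real_style.size(0), G_style.z_dim, device=device)

        # Discriminator step: target-style images count as "real".
        fake = G_style(z)
        loss_d = (F.softplus(D(fake.detach())) + F.softplus(-D(real_style))).mean()
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()

        # Generator step: push generated images toward the target style.
        loss_g = F.softplus(-D(G_style(z))).mean()
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    return G_style  # the "initial stylized face StyleGAN model"
```

Starting from a deep copy of the pre-trained generator preserves its prior knowledge of faces, which is what makes the fine-tuned model's stylized output high quality.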
S12: inputting the initial vector to a real face StyleGAN model to obtain first face data;
in the above embodiment, an initial stylized face StyleGAN model has been obtained, which is then used to generate the first batch of data. In order to obtain a better image, a random noise vector may be used as an initial vector, so that the diversity of samples and the degree of difference from the target stylized image may be increased.
Specifically, the already trained real face StyleGAN model is loaded first, and then we need to set some parameters, such as the size of the initial vector, the number of generated pictures, the size and file format, etc. Finally, the initial vector is input into the real face StyleGAN model to generate a batch of pictures (i.e. first face data).
Where the initial vector is typically a random noise vector, it should be noted that since the image corresponding to the same initial vector is not unique, we may need to generate multiple samples and filter in a later step.
The step can provide enough sample data for the subsequent steps, and can also effectively improve the quality and accuracy of the generated image.
S13: inputting the initial vector to an initial stylized face StyleGAN model to obtain second face data;
in the above embodiment, a preliminary stylized face StyleGAN model has been obtained, which is then used to generate a second batch of data. Specifically, the trained initial stylized face StyleGAN model and the initial vector are loaded first, and then the initial vector is mapped through the initial stylized face StyleGAN model, so that second face data are finally obtained.
The step can further increase the diversity and the difference degree of the samples and improve the quality and the accuracy of the generated image. By extracting the characteristic information among different styles, more stable and accurate image style transformation can be obtained.
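A short sketch of steps S12 and S13 together is given below, assuming the two generators share one latent space (a 512-dimensional latent is assumed here) and map a latent batch directly to images:

```python
# Sketch of S12 and S13 together: the same initial vectors are fed to both
# generators. Assumption: both models share a 512-dimensional latent space.
import torch

@torch.no_grad()
def paired_samples(G_real, G_style, n=16, z_dim=512, device="cuda", seed=0):
    torch.manual_seed(seed)                  # reproducible initial vectors
    z = torch.randn(n, z_dim, device=device)
    first_face_data = G_real(z)              # realistic faces
    second_face_data = G_style(z)            # their stylized counterparts
    return z, first_face_data, second_face_data
```

Because both images in a pair come from the same initial vector, their differences reflect only the stylization, which is exactly what the next step trains on.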
S14: training the initial stylized face StyleGAN model according to the difference between the first face data and the second face data to obtain a target stylized face StyleGAN model;
s15: and acquiring the face to be stylized, and inputting the face to be stylized into a target stylized face StyleGAN model to obtain the target stylized face.
In the above steps, two batches of face data (the first face data and the second face data) have been obtained from the real face StyleGAN model and the initial stylized face StyleGAN model, respectively. To train the target stylized face StyleGAN model, the outputs of the two models are compared to find differing features, and this difference information is fed back to the initial stylized face StyleGAN model to fine-tune its parameters and obtain a more accurate model.
The difference between the two batches of face data may be compared as follows: input the two batches into a pre-trained model or program and compute the degree of difference between the first face data and the second face data with various evaluation metrics, so as to optimize the initial stylized face StyleGAN model. Finally, the trained target stylized face StyleGAN model is saved for subsequent applications.
Comparing the degree of difference between the two batches of sample data further improves the quality and accuracy of the generated images while preserving the key characteristics of the target stylized image. By continuously adjusting the parameters of the initial stylized face StyleGAN model, more realistic and accurate images are obtained, balancing the realism and the stylization of the faces generated by the target stylized face StyleGAN model.
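The description leaves the concrete difference metric open. The sketch below shows one generic possibility, assuming the difference is measured in pixel space and in the feature space of a pretrained extractor (`perceptual` is an assumption, e.g. a VGG-style network); in practice this term would be balanced against the style objective, and the following embodiments replace it with a face identity information loss.

```python
# One generic difference-driven update (an illustration, not the prescribed
# metric): the stylized generator is pulled toward the real generator's
# output for the same latent, in pixel space and in the feature space of a
# pretrained extractor `perceptual` (an assumption, e.g. a VGG network).
import torch
import torch.nn.functional as F

def difference_step(G_real, G_style, perceptual, opt, z, lam=0.8):
    with torch.no_grad():
        ref = G_real(z)                   # first face data, used as fixed reference
    out = G_style(z)                      # second face data
    pixel_diff = F.l1_loss(out, ref)      # low-level difference
    feat_diff = F.l1_loss(perceptual(out), perceptual(ref))  # high-level difference
    loss = pixel_diff + lam * feat_diff   # would be balanced against a style term
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```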
In summary, in the method for controlling the degree of face stylization provided by the application, a target stylized image is acquired and used to train the real face StyleGAN model to obtain an initial stylized face StyleGAN model; the initial vector is input to the real face StyleGAN model and the initial stylized face StyleGAN model to obtain first face data and second face data, respectively; and the initial stylized face StyleGAN model is trained according to the difference between the two batches of face data to obtain a target stylized face StyleGAN model. Because the face images generated by the target stylized face StyleGAN model are constrained by face identity information, the realism and controllability of the generated images are improved and artifacts caused by feature distortion are avoided.
Based on the above embodiments:
as a preferred embodiment, training the initial stylized face style gan model according to the difference between the first face data and the second face data to obtain the target stylized face style gan model, including:
training the initial stylized face StyleGAN model according to the first face data, the second face data and the face identity information loss function to obtain a target stylized face StyleGAN model.
This embodiment defines how to perform a meaningful difference calculation between the two batches of generated face data during face stylization, and how to train the initial stylized face StyleGAN model with that difference to obtain the target stylized face StyleGAN model.
Specifically, based on the idea of style transfer, the style of a picture can be changed by fusing pictures of different styles. In this technique, a target stylized image is first acquired, and data is then generated with the real face StyleGAN model and the initial stylized face StyleGAN model: the initial vector is input into the real face StyleGAN model and the initial stylized face StyleGAN model to obtain the first face data and the second face data respectively, and a set of difference data is obtained by computing the difference between the two. A face identity information loss function is then added, and targeted training under the constraint of face identity information finally yields the target stylized face StyleGAN model.
It should be noted that the step of training and adjusting the initial stylized face StyleGAN model according to the difference and the face identity information loss function may be repeated until the number of training iterations reaches a preset count or the images output by the target stylized face StyleGAN model reach a preset effect.
This approach effectively realizes the generation of the target stylized image. Because both the original face data (the first face data) and the initial stylized data (the second face data) are considered during training, the images generated by the target stylized face StyleGAN model retain the characteristics of the target style while better preserving the identity information of the original face, thus better balancing the realism and the stylization of the generated faces.
Referring to fig. 2, fig. 2 is a block diagram illustrating a structure of a style gan model for adjusting an initial stylized face according to the present application.
As a preferred embodiment, training the initial stylized face StyleGAN model according to the first face data, the second face data and the face identity information loss function to obtain the target stylized face StyleGAN model includes:
extracting first identity information of first face data by using a preset face identity information extraction model;
extracting second identity information of the second face data by using a preset face identity information extraction model;
calculating a face identity information loss function according to the first identity information and the second identity information;
training the initial stylized face StyleGAN model according to the face identity information loss function to obtain a target stylized face StyleGAN model.
Specifically, in face style transfer, the conversion to the target style is realized by training the StyleGAN model. However, matching only the outward style of the image does not guarantee that the converted image preserves the identity of the original person. The training process therefore needs to be constrained by a face identity information loss function to achieve a more accurate target style conversion.
Accordingly, in this embodiment, the training process is constrained by the face identity information loss function. By computing the difference between the first and second face data in terms of identity information, the loss guides the initial stylized face StyleGAN model to pay more attention to the preservation and conversion of identity information during training. The resulting target stylized face StyleGAN model can thus combine the target style with the original identity features more accurately, achieving higher-quality style transfer.
Concretely, the training process is constrained as follows: the trained preset face identity information extraction model extracts the first identity information from the first face data and the second identity information from the second face data; the face identity information loss function is then calculated from the first and second identity information, and the initial stylized face StyleGAN model is trained according to this loss function to obtain the target stylized face StyleGAN model.
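A minimal sketch of such a face identity information loss is given below, assuming `id_net` is any pretrained face-recognition embedding network (the application does not prescribe a specific extraction model) and taking the loss as the cosine distance between the two identity embeddings:

```python
# Minimal face identity information loss: cosine distance between the
# identity embeddings of the two batches. Assumption: id_net is a pretrained
# face-recognition embedding network (the application does not name one).
import torch
import torch.nn.functional as F

def identity_loss(id_net, first_face_data, second_face_data):
    e1 = F.normalize(id_net(first_face_data), dim=1)   # first identity information
    e2 = F.normalize(id_net(second_face_data), dim=1)  # second identity information
    # 0 when the identities match exactly, up to 2 when they are opposite.
    return (1.0 - (e1 * e2).sum(dim=1)).mean()
```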
In this way, the approach in this embodiment combines the target style with the original identity features and achieves higher-quality style transfer. For any original face image, only the corresponding initial vector needs to be input, and the trained target stylized face StyleGAN model produces the corresponding target stylized image, realizing style transfer while preserving the original identity features. In addition, because the model is optimized with the face identity information loss function, training is more accurate, improving the quality and stability of the style transfer.
As a preferred embodiment, acquiring the target stylized image includes:
acquiring initial stylized images, and extracting key points of each initial stylized image by using a preset face key point model;
performing an alignment operation on the real face according to the key points to obtain the target stylized image.
Specifically, to fuse a real face with the target style, the technical problem to be solved is how to acquire the target stylized images used to train the required initial stylized face StyleGAN model.
Therefore, in this embodiment, initial stylized images are first obtained, either selected from an existing dataset or generated by a StyleGAN model (not limited here). Key points of each initial stylized image are then extracted automatically with a preset face key point model. These key points are used to align the real face (the alignment operation may include, but is not limited to, rotation, translation and scaling) to facilitate subsequent stylization.
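A possible implementation of the alignment operation with OpenCV is sketched below; the five-point `REFERENCE` template and the 112x112 output size are assumptions, and `landmarks` stands for the key points returned by the preset face key point model:

```python
# Possible alignment step with OpenCV. Assumptions: `landmarks` are the five
# key points (eyes, nose tip, mouth corners) returned by the preset face key
# point model, and REFERENCE is a hypothetical 112x112 five-point template.
import cv2
import numpy as np

REFERENCE = np.float32([[38.3, 51.7], [73.5, 51.5], [56.0, 71.7],
                        [41.5, 92.4], [70.7, 92.2]])

def align_face(image, landmarks, size=112):
    # Similarity transform (rotation/translation/scale) onto the template.
    M, _ = cv2.estimateAffinePartial2D(np.float32(landmarks), REFERENCE)
    return cv2.warpAffine(image, M, (size, size), flags=cv2.INTER_LINEAR)
```

Aligning every training image to one canonical template keeps the facial features at fixed positions, which is what StyleGAN-style training pipelines generally expect.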
For example, suppose a stylized face generation model for a cartoon character is required. First, some initial stylized images (of the cartoon character) are generated from an existing dataset or through a StyleGAN model. A preset face key point model then extracts key points from each initial stylized image, and the real faces are aligned to the initial stylized images. Next, an image with the target style is input and compared with the initial stylized images, and a convolutional neural network is trained to obtain images close to the target style; these serve as the target stylized images used later to train the real face StyleGAN model and obtain the target stylized face StyleGAN model.
In summary, the implementation of acquiring the target stylized image provided in this embodiment can effectively improve the quality and fidelity of the stylized images generated by the target stylized face StyleGAN model.
As a preferred embodiment, inputting a face to be stylized into a target stylized face StyleGAN model to obtain a target stylized face includes:
in the training process, acquiring a plurality of stylized faces with different stylized degrees output by a target stylized face StyleGAN model;
determining a preset area in each stylized face according to the style degree of each stylized face;
and fusing all preset areas to obtain the target stylized face.
Specifically, after the target stylized face StyleGAN model is obtained, the stylization effect can be further optimized so that it better meets the user's needs and expectations. In this embodiment, besides optimizing the overall style effect when training the target stylized face StyleGAN model, performing stylization operations on individual areas can further refine the result.
Therefore, the trained target stylized face StyleGAN model is used to generate a series of stylized faces with different stylized degrees; each preset area is determined by comparing and analyzing these stylized faces and is marked on the corresponding face; all preset areas are then fused to generate the synthesized target stylized face.
For example, suppose a user wants to stylize their own photograph into a cartoon character, and an initial target stylized face StyleGAN model has been obtained after the operations of S11-S14. Through the operations in this embodiment, a series of stylized faces with different stylized degrees is generated and compared, and the preset areas are determined, such as the colors, lines, facial-feature area, background area and face contour area of the picture. Finally, all preset areas are fused to generate the synthesized target stylized face, achieving a finer stylization effect.
It should be noted that the preset areas may be fused by alpha blending or by other methods; this is not limited here.
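A minimal sketch of such an alpha fusion over preset areas follows, assuming each area is supplied as a soft mask in [0, 1] (for example rasterized from the key points):

```python
# Alpha-fusion sketch over preset areas. Assumption: each area is given as a
# soft mask in [0, 1] over the image (e.g. rasterized from key points).
import numpy as np

def fuse_regions(base_face, region_faces, region_masks):
    """Blend region_faces[i] into base_face wherever region_masks[i] is high."""
    out = base_face.astype(np.float32)
    for face, mask in zip(region_faces, region_masks):
        alpha = mask.astype(np.float32)[..., None]   # broadcast over RGB channels
        out = alpha * face.astype(np.float32) + (1.0 - alpha) * out
    return np.clip(out, 0, 255).astype(np.uint8)
```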
In summary, in the manner of this embodiment, the stylized effect may be further optimized, so that the generated target stylized face is closer to the needs and expectations of the user. Meanwhile, more user-defined options can be provided for the user, so that the user can more flexibly adjust the stylized effect of different areas, and personalized requirements are met.
As a preferred embodiment, in the training process, acquiring a plurality of stylized faces with different stylized degrees output by the target stylized face StyleGAN model includes:
in the training process, determining a plurality of stylized faces with different stylized degrees output by the target stylized face StyleGAN model according to different iteration counts;
the number of iterations is positively correlated with the stylized degree of the output stylized face.
This embodiment defines how to efficiently obtain the desired degree of style during stylization and how to output stylized faces of different stylized degrees.
Specifically, as the target stylized face StyleGAN model is trained, the degree of style in the output stylized face increases with the number of iterations. Therefore, outputting stylized faces at different iteration counts achieves the effect of applying multiple degrees of stylization to the target.
For example, outputting a stylized face every additional 100 iterations yields a set of stylized faces with different style degrees. A preset area is then determined for each stylized face according to its stylized degree, and finally all preset areas are fused to obtain the target stylized face.
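A sketch of this snapshot scheme is given below, assuming `train_step` performs one fine-tuning iteration on the model in place (its exact contents depend on the losses described above):

```python
# Snapshot ladder: deep-copy the generator at increasing iteration counts so
# that later snapshots are more stylized. Assumption: `train_step` performs
# one fine-tuning iteration on the model in place.
import copy

def stylization_ladder(train_step, G_style, snapshot_every=100, n_snapshots=5):
    snapshots = []
    for i in range(snapshot_every * n_snapshots):
        train_step(G_style)
        if (i + 1) % snapshot_every == 0:
            snapshots.append(copy.deepcopy(G_style).eval())
    return snapshots  # snapshots[k] is more stylized than snapshots[k-1]
```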
It should be understood that in the same training process, the preset areas are not overlapped with each other, and the target style face obtained after the fusion of the preset areas is a complete image.
Taking the stylization of a real face with cartoon character styles as an example, stylized faces in the styles of different cartoon characters can be obtained from the outputs of the trained target stylized face StyleGAN model. By selecting distinctive preset areas such as the eyes and lips for fusion, a target stylized face combining the styles of several cartoon characters is obtained.
In summary, through the specific implementation manner in this embodiment, the style of the target stylized face may be more diversified, and the user may select according to his own preference, so as to achieve a free customization effect.
As a preferred embodiment, the process of determining the preset area in each stylized face includes:
and determining a preset area in each stylized face by using a preset key point positioning model.
Specifically, the preset keypoint location model may determine the positions of various constituent parts of the face, such as eyes, nose, mouth, face shape, etc., through a face keypoint detection algorithm or a face pose estimation algorithm. These keypoints may be used to define the location and size of various preset areas.
Therefore, in this embodiment, the generated stylized face images are preprocessed with the preset key point positioning model to obtain the position and size of each preset area; image fusion is then applied to each preset area to combine the corresponding areas of different stylized face images, and the combined parts are stitched together to generate the final target stylized face.
The method in the embodiment can improve the accuracy and naturalness of the generated result, reduce the time and workload required by manual adjustment and improve the generation efficiency.
As a preferred embodiment, when the stylized faces include two stylized faces of a first stylized degree and a second stylized degree, and the first stylized degree is smaller than the second stylized degree, determining a preset area in each stylized face according to the style degree of each stylized face includes:
positioning a first facial-feature region on the stylized face of the first stylized degree through a preset face key point positioning model;
positioning a second facial-feature region on the stylized face of the second stylized degree through the preset face key point positioning model;
and fusing all preset areas to obtain the target stylized face includes:
replacing the second facial-feature region on the stylized face of the second stylized degree with the first facial-feature region, and taking the stylized face of the second stylized degree after replacement as the target stylized face.
First, the preset face key point positioning model is described: it is a deep-learning model that, after training on a large amount of face data, can accurately identify the key points of a face, including the eyes, nose, mouth and other facial features. From the coordinates of these key points, a specific facial-feature region can be positioned and replaced.
The application provides a specific implementation for obtaining the target stylized face by fusing preset areas of face data with different stylized degrees: the first and second facial-feature regions are positioned with the preset face key point positioning model, and the replacement position and method are determined, so as to fuse the stylized faces of the two stylized degrees.
The specific steps are as follows. The first facial-feature region on the stylized face of the first stylized degree and the second facial-feature region on the stylized face of the second stylized degree are positioned with the preset face key point positioning model. Note that the two regions should be positioned on faces derived from the same source face, so that the result after replacement looks natural. The second facial-feature region on the stylized face of the second stylized degree is then replaced with the first facial-feature region; the replacement can be implemented with simple image processing such as image fusion or stitching, or with alpha blending. The stylized face of the second stylized degree after replacement is the target stylized face.
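A possible sketch of this replacement with alpha blending follows, assuming both stylized faces are generated from the same source face (and are therefore spatially aligned) and that `feature_points` are the facial-feature key points from the positioning model:

```python
# Possible facial-feature replacement with alpha blending. Assumptions: both
# faces come from the same source face (so the regions are aligned), and
# `feature_points` are the eye/nose/mouth key points from the positioning
# model, given as an (N, 2) array.
import cv2
import numpy as np

def replace_features(face_low_style, face_high_style, feature_points, blur=15):
    mask = np.zeros(face_high_style.shape[:2], np.float32)
    hull = cv2.convexHull(np.int32(feature_points))    # convex facial-feature region
    cv2.fillConvexPoly(mask, hull, 1.0)
    mask = cv2.GaussianBlur(mask, (blur, blur), 0)[..., None]  # soften the seam
    out = (mask * face_low_style.astype(np.float32)
           + (1.0 - mask) * face_high_style.astype(np.float32))
    return np.clip(out, 0, 255).astype(np.uint8)
```

Feathering the mask with a Gaussian blur avoids a visible seam between the milder facial-feature region and the more strongly stylized surroundings.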
For example, during training, the stylized face of the first stylized degree generated by the target stylized face StyleGAN model from a male face image may be insufficiently stylized but clearly realistic, with wrinkles and beard still obvious; the stylized face of the second stylized degree may be strongly stylized but less realistic, with large changes to the hairstyle and background and a loss of likeness. With the facial-feature fusion method provided in this embodiment, the facial-feature regions of the stylized face of the second stylized degree are adaptively replaced with those of the stylized face of the first stylized degree. This preserves the identity of the face while improving the stylization of the image, giving the best result.
In summary, this embodiment lets the user achieve a personalized stylization effect by positioning and replacing specific facial features, better meeting their needs. Meanwhile, the preset face key point positioning model improves the accuracy and naturalness of the stylization effect and thus the user experience.
Referring to fig. 3, fig. 3 is a block diagram of a device for controlling a facial stylization degree according to the present application, where the device includes:
an initial model generating unit 31, configured to acquire a target stylized image, train a real face StyleGAN model according to the target stylized image, and obtain an initial stylized face StyleGAN model;
a first face data generating unit 32, configured to input an initial vector to the real face StyleGAN model to obtain first face data;
a second face data generating unit 33, configured to input the initial vector to the initial stylized face StyleGAN model to obtain second face data;
a target model generating unit 34, configured to train the initial stylized face StyleGAN model according to the difference between the first face data and the second face data to obtain a target stylized face StyleGAN model;
a target stylized face generating unit 35, configured to obtain a face to be stylized and input it to the target stylized face StyleGAN model to obtain the target stylized face.
For the description of the device for controlling the facial stylization degree, refer to the above embodiment, and the disclosure is not repeated here.
In order to solve the above technical problems, the present application further provides an electronic device, please refer to fig. 4, fig. 4 is a block diagram of the electronic device provided by the present application, the electronic device includes:
a memory 41 for storing a computer program;
a processor 42 for implementing the steps of the above method of controlling the degree of facial stylization when executing the computer program. For the description of the electronic device, refer to the above embodiments; details are not repeated here.
In order to solve the above technical problem, the present application further provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the method for controlling the facial stylization degree as described above. For the description of the computer-readable storage medium, refer to the above embodiments, and the disclosure is not repeated here.
It should also be noted that in this specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (11)

1. A method of controlling the degree of facial stylization, comprising:
acquiring a target stylized image, and training a real face StyleGAN model according to the target stylized image to obtain an initial stylized face StyleGAN model;
inputting an initial vector to the real face StyleGAN model to obtain first face data;
inputting the initial vector to the initial stylized face StyleGAN model to obtain second face data;
training the initial stylized face StyleGAN model according to the difference between the first face data and the second face data to obtain a target stylized face StyleGAN model;
and acquiring the face to be stylized, and inputting the face to be stylized into the target stylized face StyleGAN model to obtain the target stylized face.
2. The method of claim 1, wherein training the initial stylized face StyleGAN model according to the difference between the first face data and the second face data to obtain a target stylized face StyleGAN model comprises:
training the initial stylized face StyleGAN model according to the first face data, the second face data and the face identity information loss function to obtain the target stylized face StyleGAN model.
3. The method of controlling a degree of facial stylization of claim 2, wherein training the initial stylized face StyleGAN model according to the first face data, the second face data, and a face identity information loss function to obtain the target stylized face StyleGAN model comprises:
extracting first identity information of the first face data by using a preset face identity information extraction model;
extracting second identity information of the second face data by using the preset face identity information extraction model;
calculating the face identity information loss function according to the first identity information and the second identity information;
training the initial stylized face StyleGAN model according to the face identity information loss function to obtain the target stylized face StyleGAN model.
4. The method of controlling a degree of facial stylization of claim 1, wherein obtaining a target stylized image comprises:
acquiring initial stylized images, and extracting key points of each initial stylized image by using a preset face key point model;
and carrying out alignment operation on the real face according to the key points to obtain a target stylized image.
5. A method of controlling a degree of face stylization as recited in any one of claims 1-4, wherein inputting the face to be stylized into the target stylized face StyleGAN model to obtain the target stylized face includes:
in the training process, acquiring a plurality of stylized faces with different stylized degrees output by the target stylized face StyleGAN model;
determining a preset area in each stylized face according to the style degree of each stylized face;
and fusing the preset areas to obtain the target stylized face.
6. The method for controlling a degree of facial stylization of claim 5, wherein, in the training process, obtaining a plurality of stylized faces with different degrees of stylization output by the target stylized face StyleGAN model comprises:
in the training process, determining a plurality of stylized faces with different stylized degrees output by the target stylized face StyleGAN model according to different iteration times;
the number of iterations is positively correlated with the stylized degree of the output stylized face.
7. The method of controlling a degree of facial stylization of claim 5, wherein determining the preset area in each of the stylized faces comprises:
and determining a preset area in each stylized face by using a preset key point positioning model.
8. The method of controlling a degree of facial stylization of claim 5, wherein when the stylized faces comprise two stylized faces of a first stylized degree and a second stylized degree, and the first stylized degree is smaller than the second stylized degree, determining a preset area in each of the stylized faces according to the style degree of each of the stylized faces comprises:
positioning a first facial-feature region on the stylized face of the first stylized degree through a preset face key point positioning model;
positioning a second facial-feature region on the stylized face of the second stylized degree through the preset face key point positioning model;
and fusing the preset areas to obtain the target stylized face comprises:
replacing the second facial-feature region on the stylized face of the second stylized degree with the first facial-feature region, and taking the stylized face of the second stylized degree after replacement as the target stylized face.
9. A device for controlling the degree of facial stylization, comprising:
the initial model generation unit is used for acquiring a target stylized image, training a real face StyleGAN model according to the target stylized image, and obtaining an initial stylized face StyleGAN model;
the first face data generating unit is used for inputting an initial vector to the real face StyleGAN model to obtain first face data;
the second face data generating unit is used for inputting the initial vector to the initial stylized face StyleGAN model to obtain second face data;
the target model generating unit is used for training the initial stylized face StyleGAN model according to the difference between the first face data and the second face data to obtain a target stylized face StyleGAN model;
the target stylized face generation unit is used for acquiring the face to be stylized and inputting the face to be stylized into the target stylized face StyleGAN model to obtain the target stylized face.
10. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the method of controlling the degree of facial stylization of any one of claims 1-8 when executing the computer program.
11. A computer-readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the method of controlling the degree of facial stylization as claimed in any one of claims 1 to 8.
CN202310569269.0A 2023-05-19 2023-05-19 Method, device, electronic equipment and medium for controlling face stylization degree Pending CN116862757A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310569269.0A CN116862757A (en) 2023-05-19 2023-05-19 Method, device, electronic equipment and medium for controlling face stylization degree

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310569269.0A CN116862757A (en) 2023-05-19 2023-05-19 Method, device, electronic equipment and medium for controlling face stylization degree

Publications (1)

Publication Number Publication Date
CN116862757A

Family

ID=88217883

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310569269.0A Pending CN116862757A (en) 2023-05-19 2023-05-19 Method, device, electronic equipment and medium for controlling face stylization degree

Country Status (1)

Country Link
CN (1) CN116862757A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102013107A (en) * 2010-09-06 2011-04-13 浙江大学 Selective image stylizing method based on nonlinear filtering
WO2019128508A1 (en) * 2017-12-28 2019-07-04 Oppo广东移动通信有限公司 Method and apparatus for processing image, storage medium, and electronic device
CN110706151A (en) * 2018-09-13 2020-01-17 南京大学 Video-oriented non-uniform style migration method
CN112991358A (en) * 2020-09-30 2021-06-18 北京字节跳动网络技术有限公司 Method for generating style image, method, device, equipment and medium for training model
WO2022068451A1 (en) * 2020-09-30 2022-04-07 北京字节跳动网络技术有限公司 Style image generation method and apparatus, model training method and apparatus, device, and medium
US20220335114A1 (en) * 2021-04-20 2022-10-20 National Tsing Hua University Verification method and verification apparatus based on attacking image style transfer
CN115063513A (en) * 2022-05-09 2022-09-16 网易(杭州)网络有限公司 Image processing method and device
CN114913061A (en) * 2022-06-02 2022-08-16 北京字跳网络技术有限公司 Image processing method and device, storage medium and electronic equipment
CN115187450A (en) * 2022-06-27 2022-10-14 北京奇艺世纪科技有限公司 Image generation method, image generation device and related equipment

Similar Documents

Publication Publication Date Title
CN110717977B (en) Method, device, computer equipment and storage medium for processing game character face
CN106778928B (en) Image processing method and device
US11875221B2 (en) Attribute decorrelation techniques for image editing
CN111632374B (en) Method and device for processing face of virtual character in game and readable storage medium
US9734613B2 (en) Apparatus and method for generating facial composite image, recording medium for performing the method
CN113287118A (en) System and method for face reproduction
CN106682632B (en) Method and device for processing face image
CN111127309B (en) Portrait style migration model training method, portrait style migration method and device
CN111108508B (en) Face emotion recognition method, intelligent device and computer readable storage medium
WO2022166797A1 (en) Image generation model training method, generation method, apparatus, and device
CN114266695A (en) Image processing method, image processing system and electronic equipment
CN114913303A (en) Virtual image generation method and related device, electronic equipment and storage medium
US20240020810A1 (en) UNIVERSAL STYLE TRANSFER USING MULTl-SCALE FEATURE TRANSFORM AND USER CONTROLS
CN111340913B (en) Picture generation and model training method, device and storage medium
Ham et al. Cogs: Controllable generation and search from sketch and style
CN110598097B (en) Hair style recommendation system, method, equipment and storage medium based on CNN
US20220101122A1 (en) Energy-based variational autoencoders
CN112862672B (en) Liu-bang generation method, device, computer equipment and storage medium
CN113222841A (en) Image processing method, device, equipment and medium
CN115392216B (en) Virtual image generation method and device, electronic equipment and storage medium
CN116862757A (en) Method, device, electronic equipment and medium for controlling face stylization degree
US20220101145A1 (en) Training energy-based variational autoencoders
CN115018996A (en) Method and device for generating 3D face model according to real person face photo
CN114037644A (en) Artistic digital image synthesis system and method based on generation countermeasure network
CN114332119A (en) Method and related device for generating face attribute change image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination