WO2022100690A1 - Animal face style image generation method, model training method, apparatus and device - Google Patents
Animal face style image generation method, model training method, apparatus and device
- Publication number
- WO2022100690A1 (PCT/CN2021/130301; CN2021130301W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- face
- image
- animal
- style
- animal face
- Prior art date
Links
- 241001465754 Metazoa Species 0.000 title claims abstract description 672
- 238000000034 method Methods 0.000 title claims abstract description 97
- 238000012549 training Methods 0.000 title claims abstract description 83
- 230000000694 effects Effects 0.000 claims abstract description 42
- 230000001131 transforming effect Effects 0.000 claims description 28
- 238000004590 computer program Methods 0.000 claims description 17
- 230000011218 segmentation Effects 0.000 claims description 12
- 238000002372 labelling Methods 0.000 claims description 8
- 230000006870 function Effects 0.000 abstract description 19
- 230000002452 interceptive effect Effects 0.000 abstract description 7
- 230000009466 transformation Effects 0.000 description 13
- 239000011159 matrix material Substances 0.000 description 10
- 238000012545 processing Methods 0.000 description 10
- 230000008569 process Effects 0.000 description 9
- 230000004927 fusion Effects 0.000 description 8
- 230000003993 interaction Effects 0.000 description 8
- 241000282326 Felis catus Species 0.000 description 6
- 238000010586 diagram Methods 0.000 description 6
- 241000282472 Canis lupus familiaris Species 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 230000001815 facial effect Effects 0.000 description 3
- 241000282330 Procyon lotor Species 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 2
- 238000009499 grossing Methods 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 238000004891 communication Methods 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 210000000887 face Anatomy 0.000 description 1
- 238000007499 fusion processing Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 238000002156 mixing Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 239000013307 optical fiber Substances 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/02—Affine transformations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/094—Adversarial learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/60—Editing figures and text; Combining figures or text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/12—Edge-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/803—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of input or preprocessed data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Definitions
- the present disclosure relates to the technical field of image processing, and in particular, to a method for generating an animal face style image, a method for training a model, an apparatus and a device.
- Transforming image styles refers to transforming one or more images from one style to another.
- the types of style transformation supported by current video interaction applications are still limited and not very engaging, which leads to a poor user experience and makes it difficult to meet users' personalized image style transformation needs.
- the embodiments of the present disclosure provide a method for generating an animal face style image, a method for training a model, an apparatus and a device.
- an embodiment of the present disclosure provides a method for generating an animal face style image, including:
- the animal face style image refers to an image obtained by transforming the human face in the original face image into an animal face. The animal face style image generation model is trained based on a first human face sample image and a first animal face style sample image; the first animal face style sample image is generated by a pre-trained animal face generation model from the first human face sample image, and the animal face generation model is trained based on a second human face sample image and a first animal face sample image.
- an embodiment of the present disclosure also provides a method for training an animal face style image generation model, including:
- the image generation model is trained based on the second human face sample image and the first animal face sample image to obtain an animal face generation model
- a first animal face style sample image corresponding to the first human face sample image is obtained based on the animal face generation model; wherein the first animal face style sample image refers to the image obtained after the human face in the first human face sample image is transformed into an animal face;
- the animal face style image generation model is used to obtain an animal face style image corresponding to the original face image
- the animal face style image refers to an image obtained by transforming the human face in the original face image into an animal face.
- an embodiment of the present disclosure further provides a device for generating an animal face style image, including:
- the original face image acquisition module is used to obtain the original face image
- a style image generation module configured to input the original face image into a pre-trained animal face style image generation model to obtain an animal face style image corresponding to the original face image;
- the animal face style image refers to an image obtained by transforming the human face in the original face image into an animal face. The animal face style image generation model is trained based on a first human face sample image and a first animal face style sample image; the first animal face style sample image is generated by a pre-trained animal face generation model from the first human face sample image, and the animal face generation model is trained based on a second human face sample image and a first animal face sample image.
- an embodiment of the present disclosure further provides an apparatus for training an animal face style image generation model, including:
- the animal face generation model training module is used to train the image generation model based on the second human face sample image and the first animal face sample image to obtain the animal face generation model;
- a style sample image generation module configured to obtain a first animal face style sample image corresponding to the first human face sample image based on the animal face generation model; wherein the first animal face style sample image refers to the image obtained after the human face in the first human face sample image is transformed into an animal face;
- a style image generation model training module configured to train a style image generation model based on the first human face sample image and the first animal face style sample image to obtain an animal face style image generation model
- the animal face style image generation model is used to obtain an animal face style image corresponding to the original face image
- the animal face style image refers to an image obtained by transforming the human face in the original face image into an animal face.
- embodiments of the present disclosure further provide an electronic device, including a memory and a processor, wherein the memory stores a computer program, and when the computer program is executed by the processor, the processor performs any animal face style image generation method or animal face style image generation model training method provided in the embodiments of the present disclosure.
- an embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the processor performs any animal face style image generation method or animal face style image generation model training method provided in the embodiments of the present disclosure.
- the animal face style image generation model can be pre-trained on a server and then sent to the terminal, which calls it to generate the animal face style image corresponding to the original face image; this can enrich the image processing functions available on the terminal.
- taking a video interaction application as an example, calling the animal face style image generation model to obtain the animal face style image corresponding to the original face image can not only enrich the image editing functions of the application, but also make the application more engaging, providing users with more novel special effects and gameplay and thereby improving the user experience.
- with the animal face style image generation model, animal face style images adapted to each user's original face image can be generated dynamically, improving the intelligence of animal face style image generation and presenting better image effects, for example more realistic animal face style images.
- FIG. 1 is a flowchart of a method for generating an animal face style image according to an embodiment of the present disclosure;
- FIG. 2 is a flowchart of another method for generating an animal face style image according to an embodiment of the present disclosure;
- FIG. 3 is a flowchart of a method for training an animal face style image generation model according to an embodiment of the present disclosure;
- FIG. 4 is a schematic structural diagram of an apparatus for generating an animal face style image according to an embodiment of the present disclosure;
- FIG. 5 is a schematic structural diagram of an apparatus for training an animal face style image generation model according to an embodiment of the present disclosure;
- FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
- FIG. 1 is a flowchart of a method for generating an animal face style image according to an embodiment of the present disclosure.
- the animal face style image generation method can be executed by an animal face style image generation apparatus, which can be implemented by software and/or hardware and can be integrated on any electronic device with computing capabilities, such as a smartphone, tablet computer, laptop computer, or other terminal.
- the animal face style image generation apparatus can be implemented as an independent application or as an applet integrated on a public platform, or as a functional module integrated in an application or applet that provides a style image generation function. Such applications may include, but are not limited to, video interaction applications, and such applets may include, but are not limited to, video interaction applets.
- the method for generating an animal face style image can be applied to a scene in which an animal face style image is obtained.
- the animal face style image and the animal face style sample image both refer to images obtained by transforming a human face into an animal face, for example, transforming a human face into the face of a cat, a dog, or another animal to obtain an animal face style image.
- the expression on the human face can be consistent with the expression on the animal face, and the facial features of the human face can also be consistent with those of the animal face. For example, if there is a smile on the human face, the corresponding animal face also shows a smiling expression; if the eyes on the human face are open, the eyes on the corresponding animal face are also open.
- the method for generating an animal face style image may include:
- an image stored in the terminal may be acquired, or an image or video may be captured in real time by an image capture device of the terminal.
- the animal face style image generating apparatus acquires the original face image to be processed according to the user's image selection operation, image capturing operation or image uploading operation in the terminal.
- a photo-taking prompt can be displayed on the image capture interface.
- the photographing prompt information can be used to prompt the user to place the face at a preset position in the image capture interface (for example, the middle of the screen), to adjust the distance between the face and the terminal screen (adjusting this distance yields a face region of appropriate size in the capture interface, avoiding a face region that is too large or too small), and to adjust the rotation angle of the face (different rotation angles correspond to different face orientations, such as frontal or sideways).
- the user shoots an image according to the photographing prompt information, so that the video interaction application can conveniently obtain the original face image that meets the input requirements of the animal face-style image generation model.
- the input requirements of the animal face style image generation model may refer to the constraints on the input image, such as the position of the face on the input image, the size of the input image, etc.
- the video interaction application can also pre-store a photographing template according to the input requirements of the animal face style image generation model; the template predefines information such as the position of the user's face in the image, the size of the face region, the rotation angle of the face, and the image size. The video interaction application can then obtain the required original face image from the user's photographing operation using this template.
- the image captured by the user can also be cropped, scaled, rotated, and so on, to obtain an original face image that matches the model input.
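As a concrete illustration of this preprocessing step, the sketch below crops a detected face box with a margin and resizes it to a square model input. This is a minimal stand-in, not the patent's implementation: the `face_box` coordinates, the 256x256 input size, and the 25% margin are all assumptions, and a dependency-free nearest-neighbour resize is used.

```python
import numpy as np

def crop_to_model_input(image: np.ndarray, face_box, out_size: int = 256,
                        margin: float = 0.25) -> np.ndarray:
    """Crop a captured photo around a detected face box (x0, y0, x1, y1)
    and resize it to a square model input (hypothetical 256x256 layout)."""
    x0, y0, x1, y1 = face_box
    h, w = image.shape[:2]
    # Expand the box by a margin so the whole head fits in the crop.
    mx = int((x1 - x0) * margin)
    my = int((y1 - y0) * margin)
    x0, y0 = max(0, x0 - mx), max(0, y0 - my)
    x1, y1 = min(w, x1 + mx), min(h, y1 + my)
    crop = image[y0:y1, x0:x1]
    # Nearest-neighbour resize via index sampling (no external dependencies).
    ys = (np.arange(out_size) * crop.shape[0] / out_size).astype(int)
    xs = (np.arange(out_size) * crop.shape[1] / out_size).astype(int)
    return crop[ys][:, xs]
```

Rotation to a canonical orientation would follow the same pattern via an affine warp; a production pipeline would typically use a routine such as OpenCV's `warpAffine` instead of index sampling.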
- the animal face style image refers to an image obtained by transforming the human face on the original face image into an animal face
- the animal face style image generation model has the function of transforming a human face into an animal face.
- the animal face style image generation model is trained based on the first human face sample image and the first animal face style sample image, and the first animal face style sample image is generated by the pre-trained animal face generation model from the first human face sample image; that is, the animal face generation model has the function of generating a corresponding animal face style image for any face image, and the corresponding first animal face style sample image is obtained by transforming the human face in the first human face sample image into an animal face.
- the animal face generation model is trained based on the second human face sample image and the first animal face sample image.
- the first animal face sample image refers to an animal face image showing real animal face features.
- the second human face sample image and the first human face sample image may be the same face image or different face images, which is not specifically limited in the embodiments of the present disclosure.
- the plurality of first animal face sample images participating in the training of the animal face generation model correspond to the same animal type.
- the plurality of first animal face sample images participating in the training of the animal face generation model are all animal face images corresponding to cats or dogs.
- the multiple first animal face sample images participating in the training of the animal face generation model can also all be animal face images of the same breed under the same animal type; for example, they may all be animal face images of the raccoon cat breed or the Persian cat breed. That is, in the embodiments of the present disclosure, multiple animal face generation models can be trained separately for different animal types, or for different breeds under the same animal type, so that each animal face generation model can generate animal face images of a specific type or breed.
- the first animal face sample image may be obtained by collecting photographs of animals from the Internet.
- the above model training process may include: first, training an image generation model based on the second human face sample image and the first animal face sample image to obtain the animal face generation model, where the available image generation models may include, but are not limited to, the Generative Adversarial Network (GAN) model and the Style-Based Generative Adversarial Network (StyleGAN) model.
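To make the adversarial setup concrete, here is a deliberately tiny 1-D GAN written with NumPy only. It is a toy stand-in for the (Style)GAN training named above, not the patent's model: the real "animal face" data is replaced by samples from a 1-D Gaussian, and both the generator and the discriminator are single linear units.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    # Clip to avoid overflow in exp for extreme logits.
    return 1.0 / (1.0 + np.exp(-np.clip(x, -60, 60)))

class TinyGAN:
    """1-D toy of adversarial training: a linear generator maps noise to
    samples, a logistic discriminator scores real vs. generated samples."""

    def __init__(self, lr=0.02):
        self.g_w, self.g_b = rng.normal(), 0.0  # generator: x = g_w*z + g_b
        self.d_w, self.d_b = rng.normal(), 0.0  # discriminator logit weights
        self.lr = lr

    def g(self, z):
        return self.g_w * z + self.g_b

    def d(self, x):
        return sigmoid(self.d_w * x + self.d_b)

    def step(self, real):
        z = rng.normal(size=real.shape)
        fake = self.g(z)
        # Discriminator: gradient ascent on log d(real) + log(1 - d(fake)).
        dr, df = self.d(real), self.d(fake)
        self.d_w += self.lr * (np.mean((1 - dr) * real) - np.mean(df * fake))
        self.d_b += self.lr * (np.mean(1 - dr) - np.mean(df))
        # Generator: ascent on the non-saturating objective log d(g(z)).
        df = self.d(self.g(z))
        self.g_w += self.lr * np.mean((1 - df) * self.d_w * z)
        self.g_b += self.lr * np.mean((1 - df) * self.d_w)

gan = TinyGAN()
for _ in range(1000):
    # "Real animal face" data is replaced by a 1-D Gaussian for illustration.
    gan.step(rng.normal(loc=4.0, scale=1.0, size=64))
```

In the patent's setting the generator would be a convolutional network producing animal face images, but the update structure, alternating discriminator and generator steps on an adversarial objective, is the same.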
- next, a first animal face style sample image corresponding to the first human face sample image is obtained based on the animal face generation model, where the first animal face style sample image refers to the image obtained after transforming the human face in the first human face sample image into an animal face.
- finally, the style image generation model is trained based on the first human face sample image and the first animal face style sample image to obtain the animal face style image generation model, where the available style image generation models may include the Conditional Generative Adversarial Network (CGAN) model, the Cycle-Consistent Generative Adversarial Network (CycleGAN) model, and so on.
- the first animal face style sample image corresponding to the first human face sample image is obtained using the animal face generation model, and then the first human face sample image and the first animal face style sample image are used as a paired training sample to train the animal face style image generation model. This ensures the training effect of the animal face style image generation model, which in turn ensures that the generated animal face style image corresponding to the original face image has a better display effect, for example a more realistic animal face style image.
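The paired-sample scheme above is essentially a distillation pipeline: the animal face generation model acts as a teacher that labels face images with animal-face counterparts, and the style image generation model is then fit on those pairs. The sketch below mimics that two-step structure with stand-ins (a fixed linear "teacher" and a least-squares "student" on toy vectors); every name and transform here is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

# Stand-in "teacher": the trained animal face generation model, reduced
# here to a fixed linear transform on flattened toy face vectors.
def teacher(face_vec):
    return 0.8 * face_vec + 5.0

# Step 1: let the teacher produce the paired training set
# (first human face sample image -> first animal face style sample image).
faces = rng.normal(size=(200, 16))                  # toy flattened "images"
styled = np.stack([teacher(f) for f in faces])

# Step 2: fit the "student" (the animal face style image generation model)
# on the pairs; ordinary least squares stands in for the real generator.
X = np.hstack([faces, np.ones((len(faces), 1))])    # append a bias column
W, *_ = np.linalg.lstsq(X, styled, rcond=None)

def student(face_vec):
    return np.append(face_vec, 1.0) @ W
```

Because the pairs are aligned, the student can be trained with a straightforward supervised (or conditional-GAN) objective, which is the benefit the passage attributes to generating paired samples with the teacher first.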
- the first human face sample image is obtained by adjusting the face position in a first original human face sample image based on a first correspondence between the face key points in the first original human face sample image and the animal face key points in a first original animal face sample image;
- the second human face sample image is obtained by adjusting the face position in a second original human face sample image based on a second correspondence between the face key points in the second original human face sample image and the animal face key points in the first original animal face sample image;
- the first animal face sample image is obtained by adjusting the animal face position in the first original animal face sample image based on the first correspondence or the second correspondence.
- that is, the first correspondence between the face key points in the first original human face sample image and the animal face key points in the first original animal face sample image can be determined in advance, so that the face position of the first original human face sample image can be adjusted based on the first correspondence to obtain a first human face sample image that meets the input requirements of the animal face generation model or the animal face style image generation model (for example, the position of the face in the image, the image size, and so on); similarly, the first animal face sample image can be obtained in advance by adjusting the animal face position in the first original animal face sample image based on the first correspondence, so that it also meets the input requirements of the model.
- specifically, an affine transformation matrix for adjusting the face position in the first original human face sample image may be constructed based on the face key points participating in the first correspondence, and the face position of the first original human face sample image is adjusted based on this matrix to obtain the first human face sample image; likewise, an affine transformation matrix for adjusting the animal face position in the first original animal face sample image may be constructed based on the animal face key points participating in the first correspondence, and the animal face position is adjusted based on that matrix to obtain the first animal face sample image.
- the specific construction of the affine transformation matrix can refer to the principle of affine transformation.
- the affine transformation matrix may be related to parameters such as the scaling parameters and cropping ratios of the first original human face sample image or the first original animal face sample image; that is, in the process of adjusting the human face position or the animal face position, the image processing operations involved may include cropping, scaling, rotation, and so on, which may be determined according to image processing requirements.
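One way to realize such a matrix, assuming corresponding key points are available, is a least-squares affine fit mapping source key points to their target (canonical template) positions; this is a generic sketch, not the patent's specific construction.

```python
import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Least-squares 2x3 affine matrix A such that A @ [x, y, 1] ~= dst.
    src_pts / dst_pts: corresponding key points, shape (N, 2), N >= 3."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    X = np.hstack([src, np.ones((len(src), 1))])   # N x 3 homogeneous coords
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)    # 3 x 2 solution
    return A.T                                     # 2 x 3 affine matrix

def warp_points(A, pts):
    """Apply the 2x3 affine matrix to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    return np.hstack([pts, np.ones((len(pts), 1))]) @ A.T
```

With face key points as `src_pts` and canonical template positions as `dst_pts`, the resulting matrix encodes exactly the cropping/scaling/rotation adjustments discussed above, and could be passed to an image-warping routine such as OpenCV's `warpAffine`.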
- because the image adjustment is performed based on the same key point correspondence, the resulting first human face sample image and first animal face sample image have the same image size, and the face region in the first human face sample image and the animal face region in the first animal face sample image correspond to the same image position. For example, the human face region is located in the central area of the first human face sample image, and the animal face region is likewise located in the central area of the first animal face sample image. The difference between the area of the face region and the area of the animal face region is less than an area threshold (whose value can be set flexibly); that is, the area of the face region matches the area of the animal face region, which ensures that a first animal face style sample image with a better display effect can be generated based on the animal face generation model.
- this alignment also ensures a better training effect for the animal face style image generation model, and avoids mismatches that would degrade the display effect of the animal face style images it generates, for example an animal face region that is too large or too small compared with the human face region.
- before training to obtain the animal face generation model, the second correspondence between the face key points in the second original human face sample image and the animal face key points in the first original animal face sample image may likewise be determined first. Then, based on the second correspondence, the face position of the second original human face sample image is adjusted (the image processing operations involved may include cropping, scaling, rotation, and so on) to obtain a second human face sample image that meets the input image requirements of the image generation model.
- similarly, a first animal face sample image that meets the input requirements of the image generation model can be obtained in advance by adjusting the animal face position in the first original animal face sample image based on the second correspondence.
- specifically, an affine transformation matrix for adjusting the face position in the second original human face sample image may also be constructed based on the face key points participating in the second correspondence, and an affine transformation matrix for adjusting the animal face position in the first original animal face sample image may be constructed based on the animal face key points participating in the second correspondence.
- the resulting second human face sample image and first animal face sample image have the same image size, and the face region in the second human face sample image and the animal face region in the first animal face sample image correspond to the same image position; for example, the face region is located in the central area of the second human face sample image, and the animal face region is likewise located in the central area of the first animal face sample image. The difference between the area of the face region and the area of the animal face region is less than the area threshold (whose value can be set flexibly); that is, the area of the face region matches the area of the animal face region, ensuring a better model training effect based on high-quality training samples.
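The area-matching condition described above can be checked with a simple mask-based test; the binary masks, the fractional threshold, and its default value here are illustrative assumptions.

```python
import numpy as np

def face_areas_match(face_mask: np.ndarray, animal_mask: np.ndarray,
                     area_threshold: float = 0.1) -> bool:
    """Return True when the face region and the animal face region occupy
    areas that differ by less than `area_threshold` (expressed as a
    fraction of the image); both masks are binary arrays of one shape."""
    assert face_mask.shape == animal_mask.shape
    diff = abs(int(face_mask.sum()) - int(animal_mask.sum())) / face_mask.size
    return diff < area_threshold
```

A training pipeline could use such a check to discard sample pairs whose face and animal-face regions are too differently sized before they reach the model.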
- in an optional implementation, the animal face style image generation model is trained based on the first human face sample image and a second animal face style sample image, where the second animal face style sample image is obtained by replacing the background region of the first animal face style sample image with the background region of the first human face sample image.
- through this background replacement, the influence of the background region of the animal face style sample image on the model training effect can be minimized during training, ensuring a better training effect and, in turn, a better display effect for the generated animal face style images.
- the second animal face style sample image is obtained by fusing the first animal face style sample image and the first human face sample image based on the second animal face mask image;
- the second animal face mask image is obtained by the pre-trained animal face segmentation model based on the first animal face style sample image, and is used to determine the animal face area on the first animal face style sample image as the animal face area on the second animal face style sample image.
- the animal face segmentation model can be obtained by training based on the second animal face sample image and the position labeling result of the animal face region on the second animal face sample image. On the basis of ensuring that the animal face segmentation model has the function of generating a mask image corresponding to the animal face region on an image, those skilled in the art can use any available training method; this is not specifically limited in the embodiments of the present disclosure.
- the animal face style image generation model can be pre-trained in the server and then sent to the terminal, for the terminal to call to generate the animal face style image corresponding to the original face image, which can enrich the image editing functions in the terminal.
- taking a video interactive application as an example, calling the animal face style image generation model to obtain animal face style images not only enriches the image editing functions of the application, but also enhances its interest, providing users with relatively novel special-effect gameplay and thereby improving the user experience.
- using the animal face style image generation model, animal face style images adapted to the original face images of different users can be dynamically generated, improving the intelligence of animal face style image generation and presenting a better image effect.
- FIG. 2 is a flowchart of another method for generating an animal face style image according to an embodiment of the present disclosure, which is further optimized and expanded based on the above technical solution, and can be combined with each of the above optional embodiments.
- the method for generating an animal face style image may include:
- the application or applet can display an animal special effect type selection interface to the user. Animal special effect types may be distinguished by animal type, such as cat face effects or dog face effects, or by animal breed, such as raccoon cat face effects or Persian cat face effects;
- according to the animal special effect type selected by the user, the terminal determines which animal's special effect currently needs to be generated.
- the correspondence between the animal face key points and the human face key points can be pre-stored in the terminal, for the terminal to call according to the animal special effect type.
- the terminal may also establish a correspondence between the animal face key points and the human face key points after determining the animal face corresponding to the animal special effect type selected by the user and recognizing the human face key points on the user image.
- the user image may be an image obtained by the terminal according to an image selection operation, an image capturing operation or an image uploading operation performed by the user in the terminal.
- the face position of the user image is adjusted to obtain the original face image.
- the original face image meets the input requirements of the animal face style image generation model.
- when the correspondence is determined, the input requirements corresponding to the model (such as the face position on the image and the image size) are also determined. Therefore, the terminal identifies the human face key points on the user image using key point recognition technology, and then adjusts the face position of the user image based on the determined correspondence. For example, the terminal can construct an affine transformation matrix for adjusting the face position using the human face key points on the user image that participate in the correspondence, and apply the affine transformation matrix to the user image; the image processing operations involved may include cropping, scaling, rotation, etc., producing an original face image that meets the input requirements of the animal face style image generation model.
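A minimal sketch of applying such an affine transformation matrix by inverse mapping, covering the cropping, scaling and rotation cases mentioned above. The nearest-neighbour numpy implementation and the function name are assumptions for illustration; in practice an equivalent library warp routine would normally be used.

```python
import numpy as np

def warp_affine(image, M, out_shape):
    """Apply a 2x3 affine matrix M to an image by inverse mapping.

    For each output pixel we look up the source pixel that the inverse
    transform maps it to (nearest neighbour, zero padding outside the
    source image).
    """
    h_out, w_out = out_shape
    # Extend to 3x3, invert, and map output coordinates back to source.
    M3 = np.vstack([M, [0.0, 0.0, 1.0]])
    M_inv = np.linalg.inv(M3)
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1) @ M_inv.T
    src_x = np.round(coords[..., 0]).astype(int)
    src_y = np.round(coords[..., 1]).astype(int)
    h, w = image.shape[:2]
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros((h_out, w_out) + image.shape[2:], dtype=image.dtype)
    out[valid] = image[src_y[valid], src_x[valid]]
    return out
```

Cropping falls out of choosing `out_shape` smaller than the source; scaling and rotation are encoded in the 2x2 part of `M`.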
- the background area on the user image refers to the remaining image area on the user image except the face area.
- image processing technology can be used to extract the animal face area from the animal face style image and the background area from the user image, and then fuse (or blend) the two according to the positions of the background area and the face area on the user image. That is, on the target animal face style image finally displayed to the user, the user's facial features become animal face features while the image background retains the background area of the user image, which avoids changes to the background area of the user image during the generation of the animal face style image.
- the animal face area on the animal face style image is fused with the background area on the user image to obtain a target animal face style image corresponding to the user image, including:
- an intermediate result image with the same image size as the user image is obtained; wherein, the position of the animal face area on the intermediate result image is the same as the position of the human face area on the user image;
- based on the correspondence between the animal face key points and the human face key points on the user image, the animal face style image is mapped to the image coordinates corresponding to the user image to obtain the intermediate result image.
- a first animal face mask image corresponding to the animal effect type is determined.
- based on the first animal face mask image, the user image and the intermediate result image are fused to obtain the target animal face style image corresponding to the user image; wherein the first animal face mask image is used to determine the animal face region on the intermediate result image as the animal face region on the target animal face style image.
- by using the first animal face mask image to fuse the user image and the intermediate result image, on the basis of ensuring that the target animal face style image is successfully obtained, the efficiency of the image fusion processing is improved.
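The mask-based fusion described above can be sketched as a per-pixel blend. The function name and the convention that a mask value of 1 selects the intermediate (animal face) image are illustrative assumptions.

```python
import numpy as np

def fuse_with_mask(user_image, intermediate_image, face_mask):
    """Fuse the user image and the intermediate result image with a mask.

    face_mask has values in [0, 1]; 1 selects the animal-face region from
    the intermediate result image, 0 keeps the user image (the background).
    All three inputs share the same height and width.
    """
    mask = np.asarray(face_mask, dtype=float)
    if user_image.ndim == 3 and mask.ndim == 2:
        mask = mask[..., None]  # broadcast over colour channels
    fused = mask * intermediate_image + (1.0 - mask) * user_image
    return fused.astype(user_image.dtype)
```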
- the user image and the intermediate result image are fused to obtain a target animal face style image corresponding to the user image, which may include:
- in this way, a smooth transition between the background area on the user image and the animal face area on the intermediate result image can be achieved, optimizing the image fusion effect and ensuring the final presentation of the target animal face style image.
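One hypothetical way to realise such a smooth transition is to feather the binary mask before fusion, for example with a simple box blur; the function and its parameters below are assumptions for illustration, not the disclosure's prescribed method.

```python
import numpy as np

def feather_mask(mask, radius=2):
    """Soften a binary mask with a box blur so the boundary between the
    animal-face region and the background blends gradually.

    mask: 2-D array of 0/1 values; radius: half-width of the box filter.
    Returns a float mask in [0, 1] with a feathered edge.
    """
    mask = np.asarray(mask, dtype=float)
    padded = np.pad(mask, radius, mode="edge")
    out = np.zeros_like(mask)
    size = 2 * radius + 1
    # Sum the (2r+1) x (2r+1) neighbourhood of every pixel, then average.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out / (size * size)
```

Feeding the feathered mask into the fusion step turns the hard cut between regions into a gradual blend.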
- the special effect identifier selected by the user can also be determined according to the user's special effect selection operation on the image editing interface, and the special effect corresponding to the selected identifier can be added to the aforementioned target animal face style image or animal face style image, further enhancing the interest of image editing.
- the special effects selectable by the user may include any type of props or stickers, etc., which are not specifically limited in this embodiment of the present disclosure.
- after the user image is obtained, the face position of the user image is first adjusted according to the correspondence between the animal face key points corresponding to the animal face special effect type selected by the user and the human face key points, to obtain the original face image; the animal face style image generation model is then used to obtain the animal face style image corresponding to the original face image; finally, the animal face area on the animal face style image is fused with the background area on the user image to obtain the target animal face style image displayed to the user, which animalizes the user's facial features while retaining the original background of the user image, enriching the image editing functions in the terminal.
- calling the animal face style image generation model to obtain animal face style images not only enriches the image editing functions of the application, but also improves the fun of the application, providing users with relatively novel special-effect gameplay and thereby improving the user experience.
- FIG. 3 is a flowchart of a method for training an animal face style image generation model provided by an embodiment of the present disclosure, which is applied to the situation of how to train an animal face style image generation model with the function of transforming a human face into an animal face.
- the animal face style image generation model training method can be executed by an animal face style image generation model training device, which can be implemented by software and/or hardware, and can be integrated in a server.
- the animal face style image generation model training method provided by the embodiments of the present disclosure is executed in cooperation with the animal face style image generation method provided by the embodiments of the present disclosure.
- the method for training an animal face style image generation model may include:
- the first animal face style sample image refers to an image obtained by transforming the human face on the first human face sample image into an animal face.
- S303 Train a style image generation model based on the first human face sample image and the first animal face style sample image to obtain an animal face style image generation model.
- the animal face style image generation model is used to obtain an animal face style image corresponding to the original face image, and the animal face style image refers to an image obtained by transforming the human face on the original face image into an animal face.
- the model training method provided by the embodiment of the present disclosure further includes:
- the model training method provided by the embodiment of the present disclosure further includes: determining a first correspondence between the face key points on the first original face sample image and the animal face key points on the first original animal face sample image; and, based on the first correspondence, adjusting the face position on the first original face sample image to obtain the first face sample image.
- the model training method provided by the embodiment of the present disclosure further includes: replacing the background area in the first animal face style sample image with the background area in the first face sample image, to obtain the second animal face style sample image.
- training the style image generation model based on the first human face sample image and the first animal face style sample image to obtain the animal face style image generation model includes: training the style image generation model based on the first face sample image and the second animal face style sample image to obtain the animal face style image generation model.
- replacing the background area in the first animal face style sample image with the background area in the first human face sample image to obtain the second animal face style sample image includes: obtaining, based on a pre-trained animal face segmentation model, an animal face mask image corresponding to the first animal face style sample image; and fusing, based on the animal face mask image, the first animal face style sample image and the first human face sample image to obtain the second animal face style sample image; wherein the animal face mask image is used to determine the animal face area on the first animal face style sample image as the animal face area on the second animal face style sample image.
- the model training method provided by the embodiment of the present disclosure further includes: acquiring a second animal face sample image and a position labeling result of the animal face region on the second animal face sample image; based on the second animal face sample image and the animal face The position labeling result of the face area is trained to obtain an animal face segmentation model.
- the animal face style image generation model can be pre-trained in the server and then sent to the terminal, for the terminal to call to generate the animal face style image corresponding to the original face image, which can enrich the image editing functions in the terminal.
- taking a video interactive application as an example, calling the animal face style image generation model to obtain animal face style images not only enriches the image editing functions of the application, but also enhances its interest, providing users with relatively novel special-effect gameplay and thereby improving the user experience.
- FIG. 4 is a schematic structural diagram of an apparatus for generating an animal face style image according to an embodiment of the present disclosure, which is applied to the situation of how to transform a user's face into an animal face.
- the animal face style image generating apparatus can be implemented by software and/or hardware, and can be integrated on any electronic device with computing capabilities, such as user terminals such as smart phones, tablet computers, and notebook computers.
- the animal face style image generating apparatus 400 provided by the embodiment of the present disclosure includes an original face image obtaining module 401 and a style image generating module 402, wherein:
- an original face image acquisition module 401 configured to acquire an original face image
- a style image generation module 402 configured to use a pre-trained animal face style image generation model to obtain an animal face style image corresponding to the original face image;
- the animal face style image refers to an image obtained by transforming the human face on the original face image into an animal face, and the animal face style image generation model is trained based on the first face sample image and the first animal face style sample image.
- the first animal face style sample image is generated by a pre-trained animal face generation model based on the first human face sample image, and the animal face generation model is trained based on the second human face sample image and the first animal face sample image.
- the apparatus 400 provided in this embodiment of the present disclosure further includes:
- the corresponding relationship determination module is used to determine the corresponding relationship between the animal face key points corresponding to the animal special effect type and the human face key points according to the animal special effect type selected by the user;
- the face position adjustment module is used to adjust the face position of the user image based on the corresponding relationship between the animal face key points corresponding to the animal special effects type and the human face key points, so as to obtain the original face image;
- wherein the original face image meets the input requirements of the animal face style image generation model;
- the image fusion module is configured to fuse the animal face area on the animal face style image with the background area on the user image to obtain the target animal face style image corresponding to the user image.
- the image fusion module includes:
- an intermediate result image determining unit used for obtaining an intermediate result image with the same image size as the user image based on the animal face style image; wherein, the position of the animal face region on the intermediate result image is the same as the position of the human face region on the user image;
- a first animal face mask image determining unit for determining the first animal face mask image corresponding to the animal special effect type
- the image fusion unit is used to fuse the user image and the intermediate result image based on the first animal face mask image to obtain the target animal face style image corresponding to the user image; wherein, the first animal face mask image is used for The animal face region on the intermediate result image is determined as the animal face region on the target animal face style image.
- the first face sample image is obtained by adjusting the face position on the first original face sample image based on the first correspondence between the face key points on the first original face sample image and the animal face key points on the first original animal face sample image;
- the second face sample image is obtained by adjusting the face position on the second original face sample image based on the second correspondence between the face key points on the second original face sample image and the animal face key points on the first original animal face sample image;
- the first animal face sample image is obtained by adjusting the position of the animal face on the first original animal face sample image based on the first correspondence or the second correspondence.
- the animal face style image generation model is obtained by training based on the first human face sample image and the second animal face style sample image, and the second animal face style sample image is obtained by replacing the background area in the first animal face style sample image with the background area in the first human face sample image.
- the second animal face style sample image is obtained by fusing the first animal face style sample image and the first human face sample image based on the second animal face mask image;
- the second animal face mask image is obtained by the pre-trained animal face segmentation model based on the first animal face style sample image, and the second animal face mask image is used to determine the animal face area on the first animal face style sample image as the animal face region on the second animal face style sample image.
- the apparatus for generating an animal face style image provided by the embodiment of the present disclosure can execute any of the methods for generating an animal face style image provided by the embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the execution method.
- for content not described in detail in the apparatus embodiments of the present disclosure, reference may be made to the description in any method embodiment of the present disclosure.
- FIG. 5 is a schematic structural diagram of an apparatus for training an animal face style image generation model provided by an embodiment of the present disclosure, which is applied to the situation of how to train an animal face style image generation model with the function of transforming a human face into an animal face.
- the animal face style image generation model training device can be implemented by software and/or hardware, and can be integrated in a server.
- the animal face style image generation model training apparatus 500 may include an animal face generation model training module 501, a style sample image generation module 502, and a style image generation model training module 503, wherein:
- the animal face generation model training module 501 is used for training the image generation model based on the second human face sample image and the first animal face sample image to obtain the animal face generation model;
- the style sample image generation module 502 is used to obtain a first animal face style sample image corresponding to the first face sample image based on the animal face generation model; wherein, the first animal face style sample image refers to the first face sample image The image on which the human face is transformed into an animal face;
- the style image generation model training module 503 is used for training the style image generation model based on the first human face sample image and the first animal face style sample image to obtain the animal face style image generation model;
- the animal face style image generation model is used to obtain an animal face style image corresponding to the original face image, and the animal face style image refers to an image obtained by transforming the human face on the original face image into an animal face.
- the apparatus 500 provided in this embodiment of the present disclosure further includes:
- the second correspondence determination module is used to determine the second correspondence between the face key points in the second original face sample image and the animal face key points in the first original animal face sample image;
- a face position adjustment module for performing face position adjustment on the second original face sample image based on the second correspondence to obtain a second face sample image
- An animal face position adjustment module configured to perform an animal face position adjustment on the first original animal face sample image based on the second correspondence to obtain a first animal face sample image
- a first correspondence determination module configured to determine the first correspondence between the face key points on the first original face sample image and the animal face key points on the first original animal face sample image
- the face position adjustment module is used to adjust the face position on the first original face sample image based on the first correspondence, so as to obtain the first face sample image.
- the apparatus 500 provided in this embodiment of the present disclosure further includes:
- the background area replacement module is used to replace the background area in the first animal face style sample image with the background area in the first face sample image to obtain the second animal face style sample image;
- style image generation model training module 503 is specifically used for:
- the style image generation model is trained based on the first human face sample image and the second animal face style sample image to obtain an animal face style image generation model.
- the background area replacement module includes:
- the animal face mask image determining unit is configured to obtain the animal face mask image corresponding to the first animal face style sample image based on the pre-trained animal face segmentation model;
- the image fusion unit is used for fusing the first animal face style sample image and the first human face sample image based on the animal face mask image to obtain the second animal face style sample image;
- wherein the animal face mask image is used to determine the animal face region on the first animal face style sample image as the animal face region on the second animal face style sample image.
- the apparatus 500 provided in this embodiment of the present disclosure further includes:
- a sample image and labeling result obtaining module used for obtaining the second animal face sample image and the position labeling result of the animal face region on the second animal face sample image
- the animal face segmentation model training module is used for training an animal face segmentation model based on the second animal face sample image and the position labeling result of the animal face region.
- the animal face style image generation model training device provided by the embodiment of the present disclosure can execute any animal face style image generation model training method provided by the embodiment of the present disclosure, and has corresponding functional modules and beneficial effects of the execution method.
- FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure, which is used to exemplarily illustrate an electronic device that implements the animal face style image generation method or the animal face style image generation model training method provided by the embodiment of the present disclosure.
- the electronic devices in the embodiments of the present disclosure may include, but are not limited to, such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players), vehicle-mounted terminals (eg, Mobile terminals such as in-vehicle navigation terminals), etc., and stationary terminals such as digital TVs, desktop computers, servers, and the like.
- the electronic device shown in FIG. 6 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
- electronic device 600 includes one or more processors 601 and memory 602 .
- Processor 601 may be a central processing unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in electronic device 600 to perform desired functions.
- Memory 602 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
- Volatile memory may include, for example, random access memory (RAM) and/or cache memory, among others.
- Non-volatile memory may include, for example, read only memory (ROM), hard disk, flash memory, and the like.
- One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 601 may execute the program instructions to implement the animal face style image generation method or the animal face style image generation model training method provided by the embodiments of the present disclosure, and may further implement other desired functions.
- Various contents such as input signals, signal components, noise components, etc. may also be stored in the computer-readable storage medium.
- the method for generating an animal face style image may include: obtaining an original face image; and using a pre-trained animal face style image generation model to obtain an animal face style image corresponding to the original face image; wherein the animal face style image refers to an image obtained by transforming the human face on the original face image into an animal face.
- the animal face style image generation model is trained based on the first face sample image and the first animal face style sample image.
- the first animal face style sample image is generated by a pre-trained animal face generation model based on the first human face sample image, and the animal face generation model is trained based on the second human face sample image and the first animal face sample image.
- the training method for the animal face style image generation model may include: training an image generation model based on the second human face sample image and the first animal face sample image to obtain the animal face generation model; obtaining, based on the animal face generation model, the first animal face style sample image corresponding to the first face sample image, where the first animal face style sample image refers to an image obtained by transforming the human face on the first face sample image into an animal face; and training a style image generation model based on the first face sample image and the first animal face style sample image to obtain the animal face style image generation model; wherein the animal face style image generation model is used to obtain an animal face style image corresponding to an original face image, the animal face style image being an image obtained by transforming the human face on the original face image into an animal face.
- the electronic device 600 may also perform other optional implementations provided by the method embodiments of the present disclosure.
- the electronic device 600 may also include an input device 603 and an output device 604 interconnected by a bus system and/or other form of connection mechanism (not shown).
- the input device 603 may also include, for example, a keyboard, a mouse, and the like.
- the output device 604 can output various information to the outside, including the determined distance information, direction information, and the like.
- the output device 604 may include, for example, displays, speakers, printers, and communication networks and their connected remote output devices, among others.
- the electronic device 600 may also include any other suitable components according to the specific application.
- the embodiments of the present disclosure may also be computer program products, which include computer program instructions that, when executed by the processor, cause the processor to perform the animal face style image generation provided by the embodiments of the present disclosure method or animal face style image generation model training method.
- the computer program product may write program code for performing operations of embodiments of the present disclosure in any combination of one or more programming languages, including object-oriented programming languages, such as Java, C++, etc., as well as conventional procedural programming language, such as "C" language or similar programming language.
- the program code may execute entirely on the user electronic device, partly on the user electronic device, as a stand-alone software package, partly on the user electronic device and partly on the remote electronic device, or entirely on the remote electronic device execute on.
- embodiments of the present disclosure may further provide a computer-readable storage medium on which computer program instructions are stored, and when executed by the processor, the computer program instructions cause the processor to execute the animal face style image generation provided by the embodiments of the present disclosure method or animal face style image generation model training method.
- the method for generating an animal face style image may include: obtaining an original face image; and using a pre-trained animal face style image generation model to obtain an animal face style image corresponding to the original face image; wherein the animal face style image refers to an image obtained by transforming the human face on the original face image into an animal face, the animal face style image generation model is trained based on the first face sample image and the first animal face style sample image, the first animal face style sample image is generated by a pre-trained animal face generation model based on the first human face sample image, and the animal face generation model is trained based on the second human face sample image and the first animal face sample image.
- the training method for the animal face style image generation model may include: training an image generation model based on the second human face sample image and the first animal face sample image to obtain the animal face generation model; obtaining, based on the animal face generation model, the first animal face style sample image corresponding to the first face sample image, where the first animal face style sample image refers to an image obtained by transforming the human face on the first face sample image into an animal face; and training a style image generation model based on the first face sample image and the first animal face style sample image to obtain the animal face style image generation model; wherein the animal face style image generation model is used to obtain an animal face style image corresponding to an original face image, the animal face style image being an image obtained by transforming the human face on the original face image into an animal face.
- when executed by the processor, the computer program instructions may also cause the processor to perform other optional implementations provided by the method embodiments of the present disclosure.
- the computer-readable storage medium may employ any combination of one or more readable media.
- the readable medium may be a readable signal medium or a readable storage medium.
- the readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
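The two methods summarized above share a three-stage pipeline: train an animal face generation model on face/animal samples, use it to synthesize paired style samples, then train the style image generation model on those pairs. The sketch below illustrates only this data flow; all function names are illustrative assumptions, and the "training" steps are stubbed placeholders rather than the adversarial training an actual implementation would use.

```python
import numpy as np

# Illustrative placeholder pipeline. Each "model" is just a callable on
# float images in [0, 1]; real training is replaced by trivial stubs.

def train_animal_face_generator(second_face_samples, first_animal_samples):
    """Stage 1: train an image generation model on the second face sample
    images and first animal face sample images (stubbed)."""
    def generator(face_image):
        # Stand-in for a learned face-to-animal-face transform.
        return np.clip(face_image * 0.5 + 0.5, 0.0, 1.0)
    return generator

def make_style_training_pairs(generator, first_face_samples):
    """Stage 2: produce the first animal face style sample image paired
    with each first face sample image."""
    return [(face, generator(face)) for face in first_face_samples]

def train_style_model(pairs):
    """Stage 3: train the style image generation model on the pairs
    (stubbed); the result maps an original face image to its style image."""
    def style_model(original_face):
        return np.clip(original_face * 0.5 + 0.5, 0.0, 1.0)
    return style_model

rng = np.random.default_rng(0)
faces_1 = [rng.random((8, 8, 3)) for _ in range(2)]
faces_2 = [rng.random((8, 8, 3)) for _ in range(2)]
animals_1 = [rng.random((8, 8, 3)) for _ in range(2)]

gen = train_animal_face_generator(faces_2, animals_1)
pairs = make_style_training_pairs(gen, faces_1)
style_model = train_style_model(pairs)
out = style_model(rng.random((8, 8, 3)))
print(out.shape)  # (8, 8, 3)
```

The point of the split is that stage 2 turns an unpaired face/animal dataset into a paired one, so the final style model can be trained with direct supervision.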
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Mathematical Physics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Processing Or Creating Images (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims (17)
- An animal face style image generation method, comprising: obtaining an original face image; and using a pre-trained animal face style image generation model to obtain an animal face style image corresponding to the original face image; wherein the animal face style image is an image in which the face in the original face image has been transformed into an animal face, the animal face style image generation model is trained on a first face sample image and a first animal face style sample image, the first animal face style sample image is generated by a pre-trained animal face generation model based on the first face sample image, and the animal face generation model is trained on a second face sample image and a first animal face sample image.
- The method according to claim 1, further comprising: determining, according to an animal effect type selected by a user, a correspondence between animal face key points and human face key points for the animal effect type; and adjusting a face position in a user image based on that correspondence to obtain the original face image; wherein the original face image meets the input requirements of the animal face style image generation model.
- The method according to claim 2, further comprising: fusing the animal face region of the animal face style image with the background region of the user image to obtain a target animal face style image corresponding to the user image.
- The method according to claim 3, wherein fusing the animal face region of the animal face style image with the background region of the user image to obtain the target animal face style image corresponding to the user image comprises: obtaining, based on the animal face style image, an intermediate result image having the same image size as the user image, wherein the position of the animal face region in the intermediate result image is the same as the position of the face region in the user image; determining a first animal face mask image corresponding to the animal effect type; and fusing the user image with the intermediate result image based on the first animal face mask image to obtain the target animal face style image corresponding to the user image; wherein the first animal face mask image is used to designate the animal face region of the intermediate result image as the animal face region of the target animal face style image.
- The method according to claim 1, wherein: the first face sample image is obtained by adjusting the face position of a first original face sample image based on a first correspondence between face key points in the first original face sample image and animal face key points in a first original animal face sample image; the second face sample image is obtained by adjusting the face position of a second original face sample image based on a second correspondence between face key points in the second original face sample image and the animal face key points in the first original animal face sample image; and the first animal face sample image is obtained by adjusting the animal face position of the first original animal face sample image based on the first correspondence or the second correspondence.
- The method according to claim 1, wherein the animal face style image generation model is trained on the first face sample image and a second animal face style sample image, and the second animal face style sample image is obtained by replacing the background region of the first animal face style sample image with the background region of the first face sample image.
- The method according to claim 6, wherein: the second animal face style sample image is obtained by fusing the first animal face style sample image with the first face sample image based on a second animal face mask image; the second animal face mask image is obtained by a pre-trained animal face segmentation model based on the first animal face style sample image; and the second animal face mask image is used to designate the animal face region of the first animal face style sample image as the animal face region of the second animal face style sample image.
- An animal face style image generation model training method, comprising: training an image generation model on a second face sample image and a first animal face sample image to obtain an animal face generation model; obtaining, with the animal face generation model, a first animal face style sample image corresponding to a first face sample image, wherein the first animal face style sample image is an image in which the face in the first face sample image has been transformed into an animal face; and training a style image generation model on the first face sample image and the first animal face style sample image to obtain an animal face style image generation model, wherein the animal face style image generation model is used to obtain an animal face style image corresponding to an original face image, the animal face style image being an image in which the face in the original face image has been transformed into an animal face.
- The method according to claim 8, further comprising: determining a second correspondence between face key points in a second original face sample image and animal face key points in a first original animal face sample image; adjusting the face position of the second original face sample image based on the second correspondence to obtain the second face sample image; and adjusting the animal face position of the first original animal face sample image based on the second correspondence to obtain the first animal face sample image.
- The method according to claim 9, further comprising: determining a first correspondence between face key points in a first original face sample image and the animal face key points in the first original animal face sample image; and adjusting the face position of the first original face sample image based on the first correspondence to obtain the first face sample image.
- The method according to claim 8, further comprising: replacing the background region of the first animal face style sample image with the background region of the first face sample image to obtain a second animal face style sample image; wherein training the style image generation model on the first face sample image and the first animal face style sample image to obtain the animal face style image generation model comprises: training the style image generation model on the first face sample image and the second animal face style sample image to obtain the animal face style image generation model.
- The method according to claim 11, wherein replacing the background region of the first animal face style sample image with the background region of the first face sample image to obtain the second animal face style sample image comprises: obtaining, with a pre-trained animal face segmentation model, an animal face mask image corresponding to the first animal face style sample image; and fusing the first animal face style sample image with the first face sample image based on the animal face mask image to obtain the second animal face style sample image; wherein the animal face mask image is used to designate the animal face region of the first animal face style sample image as the animal face region of the second animal face style sample image.
- The method according to claim 12, further comprising: obtaining a second animal face sample image and a position labeling result of the animal face region in the second animal face sample image; and training the animal face segmentation model on the second animal face sample image and the position labeling result of the animal face region.
- An animal face style image generation apparatus, comprising: an original face image obtaining module configured to obtain an original face image; and a style image generation module configured to use a pre-trained animal face style image generation model to obtain an animal face style image corresponding to the original face image; wherein the animal face style image is an image in which the face in the original face image has been transformed into an animal face, the animal face style image generation model is trained on a first face sample image and a first animal face style sample image, the first animal face style sample image is generated by a pre-trained animal face generation model based on the first face sample image, and the animal face generation model is trained on a second face sample image and a first animal face sample image.
- An animal face style image generation model training apparatus, comprising: an animal face generation model training module configured to train an image generation model on a second face sample image and a first animal face sample image to obtain an animal face generation model; a style sample image generation module configured to obtain, with the animal face generation model, a first animal face style sample image corresponding to a first face sample image, wherein the first animal face style sample image is an image in which the face in the first face sample image has been transformed into an animal face; and a style image generation model training module configured to train a style image generation model on the first face sample image and the first animal face style sample image to obtain an animal face style image generation model, wherein the animal face style image generation model is used to obtain an animal face style image corresponding to an original face image, the animal face style image being an image in which the face in the original face image has been transformed into an animal face.
- An electronic device, comprising a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to perform the animal face style image generation method according to any one of claims 1-7, or the animal face style image generation model training method according to any one of claims 8-13.
- A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the animal face style image generation method according to any one of claims 1-7, or the animal face style image generation model training method according to any one of claims 8-13.
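Claims 4 and 12 above both describe fusing an animal face region into another image via a face mask. At its core this is per-pixel alpha blending; the following is a minimal NumPy sketch of that operation, where the function name and the toy images are illustrative assumptions rather than anything specified in the claims.

```python
import numpy as np

def fuse_with_mask(user_image, intermediate, mask):
    """Blend the animal face region of the intermediate result image into
    the user image: where mask == 1 the animal face is kept, where
    mask == 0 the user image's background is kept."""
    if mask.ndim == 2:                 # broadcast an HxW mask over RGB
        mask = mask[..., None]
    return mask * intermediate + (1.0 - mask) * user_image

# Tiny worked example: black "user image", white "intermediate result".
user = np.zeros((4, 4, 3))
intermediate = np.ones((4, 4, 3))
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                   # hypothetical animal face region
target = fuse_with_mask(user, intermediate, mask)
```

With a soft-edged (fractional) mask, the same formula also feathers the boundary between the animal face region and the background.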
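Several of the claims above (2, 5, 9, 10) adjust a face position according to a correspondence between face key points and animal face key points. One common way to realize such an adjustment is to estimate a least-squares affine transform from the corresponding key-point pairs and warp the image with it; the sketch below shows only the transform estimation and is an assumption about one possible implementation, not the method mandated by the claims.

```python
import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Least-squares 2x3 affine transform mapping source key points to
    destination key points (e.g. face key points to the positions of
    their corresponding animal face key points)."""
    n = len(src_pts)
    A = np.zeros((2 * n, 6))
    b = np.asarray(dst_pts, dtype=float).reshape(-1)
    for i, (x, y) in enumerate(src_pts):
        A[2 * i, 0:3] = [x, y, 1.0]        # row for x' = a*x + b*y + c
        A[2 * i + 1, 3:6] = [x, y, 1.0]    # row for y' = d*x + e*y + f
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params.reshape(2, 3)

# Hypothetical correspondences: a uniform scale by 2 plus a (2, 3) shift.
src = [(0, 0), (1, 0), (0, 1)]
dst = [(2, 3), (4, 3), (2, 5)]
M = estimate_affine(src, dst)
```

In practice the estimated matrix would then be applied to the whole image (e.g. with an image-warping routine) so that the face lands where the model expects it.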
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/252,855 US20240005466A1 (en) | 2020-11-13 | 2021-11-12 | Animal face style image generation method and apparatus, model training method and apparatus, and device |
JP2023528414A JP2023549810A (ja) | 2020-11-13 | 2021-11-12 | 動物顔スタイル画像の生成方法、モデルのトレーニング方法、装置及び機器 |
EP21891210.3A EP4246425A4 (en) | 2020-11-13 | 2021-11-12 | ANIMAL HEAD STYLE IMAGE GENERATION METHOD AND APPARATUS, MODEL LEARNING METHOD AND APPARATUS, AND DEVICE |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011269334.0 | 2020-11-13 | ||
CN202011269334.0A CN112330534A (zh) | 2020-11-13 | Animal face style image generation method, model training method, apparatus and device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022100690A1 true WO2022100690A1 (zh) | 2022-05-19 |
Family
ID=74318655
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/130301 WO2022100690A1 (zh) | 2020-11-13 | 2021-11-12 | Animal face style image generation method, model training method, apparatus and device |
Country Status (5)
Country | Link |
---|---|
US (1) | US20240005466A1 (zh) |
EP (1) | EP4246425A4 (zh) |
JP (1) | JP2023549810A (zh) |
CN (1) | CN112330534A (zh) |
WO (1) | WO2022100690A1 (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112330534A (zh) * | 2020-11-13 | 2021-02-05 | Beijing Zitiao Network Technology Co., Ltd. | Animal face style image generation method, model training method, apparatus and device |
CN113673422A (zh) * | 2021-08-19 | 2021-11-19 | Suzhou Zhongke Advanced Technology Research Institute Co., Ltd. | Pet species recognition method and recognition system |
CN113850890A (zh) * | 2021-09-29 | 2021-12-28 | Beijing Zitiao Network Technology Co., Ltd. | Animal image generation method, apparatus, device and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110930297A (zh) * | 2019-11-20 | 2020-03-27 | Migu Animation Co., Ltd. | Style transfer method and apparatus for face images, electronic device, and storage medium |
CN111783647A (zh) * | 2020-06-30 | 2020-10-16 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Training method for a face fusion model, face fusion method, apparatus, and device |
CN111968029A (zh) * | 2020-08-19 | 2020-11-20 | Beijing Bytedance Network Technology Co., Ltd. | Expression transformation method and apparatus, electronic device, and computer-readable medium |
CN112330534A (zh) * | 2020-11-13 | 2021-02-05 | Beijing Zitiao Network Technology Co., Ltd. | Animal face style image generation method, model training method, apparatus and device |
CN112989904A (zh) * | 2020-09-30 | 2021-06-18 | Beijing Bytedance Network Technology Co., Ltd. | Style image generation method, model training method, apparatus, device, and medium |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108846793B (zh) * | 2018-05-25 | 2022-04-22 | Shenzhen SenseTime Technology Co., Ltd. | Image processing method based on an image style transfer model, and terminal device |
US10489683B1 (en) * | 2018-12-17 | 2019-11-26 | Bodygram, Inc. | Methods and systems for automatic generation of massive training data sets from 3D models for training deep learning networks |
CN109816589B (zh) * | 2019-01-30 | 2020-07-17 | Beijing Bytedance Network Technology Co., Ltd. | Method and apparatus for generating a comic style transfer model |
CN109800732B (zh) * | 2019-01-30 | 2021-01-15 | Beijing Bytedance Network Technology Co., Ltd. | Method and apparatus for generating a cartoon avatar generation model |
CN111340865B (zh) * | 2020-02-24 | 2023-04-07 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and apparatus for generating images |
CN111833242A (zh) * | 2020-07-17 | 2020-10-27 | Beijing Bytedance Network Technology Co., Ltd. | Face transformation method and apparatus, electronic device, and computer-readable medium |
-
2020
- 2020-11-13 CN CN202011269334.0A patent/CN112330534A/zh active Pending
-
2021
- 2021-11-12 US US18/252,855 patent/US20240005466A1/en active Pending
- 2021-11-12 JP JP2023528414A patent/JP2023549810A/ja active Pending
- 2021-11-12 EP EP21891210.3A patent/EP4246425A4/en active Pending
- 2021-11-12 WO PCT/CN2021/130301 patent/WO2022100690A1/zh active Application Filing
Non-Patent Citations (2)
Title |
---|
CHOI YUNJEY; UH YOUNGJUNG; YOO JAEJUN; HA JUNG-WOO: "StarGAN v2: Diverse Image Synthesis for Multiple Domains", 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 13 June 2020 (2020-06-13), pages 8185 - 8194, XP033803575, DOI: 10.1109/CVPR42600.2020.00821 * |
LIANG ZI WEI: "Generating the Cat and Dog Versions of Trump, and Breaking the Face Editing Tool StarGANv2", 28 April 2020 (2020-04-28), XP055929385, Retrieved from the Internet <URL:https://ishare.ifeng.com/c/s/7w2keWci1Dq> * |
Also Published As
Publication number | Publication date |
---|---|
EP4246425A1 (en) | 2023-09-20 |
EP4246425A4 (en) | 2024-06-05 |
CN112330534A (zh) | 2021-02-05 |
JP2023549810A (ja) | 2023-11-29 |
US20240005466A1 (en) | 2024-01-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022100690A1 (zh) | Animal face style image generation method, model training method, apparatus and device | |
WO2022100680A1 (zh) | Mixed-race face image generation method, model training method, apparatus and device | |
CN105981368B (zh) | Photo composition and position guidance in an imaging device | |
WO2021082760A1 (zh) | Virtual image generation method, apparatus, terminal, and storage medium | |
WO2022083383A1 (zh) | Image processing method and apparatus, electronic device, and computer-readable storage medium | |
WO2021109678A1 (zh) | Video generation method and apparatus, electronic device, and storage medium | |
WO2022166897A1 (zh) | Face shape adjustment image generation method, model training method, apparatus and device | |
TWI255141B (en) | Method and system for real-time interactive video | |
JP7209851B2 (ja) | Image deformation control method, apparatus, and hardware device | |
WO2023051185A1 (zh) | Image processing method and apparatus, electronic device, and storage medium | |
CN111400518A (zh) | Work generation and editing method, apparatus, terminal, server, and system | |
US11367196B2 (en) | Image processing method, apparatus, and storage medium | |
WO2019020061A1 (zh) | Video dialogue processing method, client, server, and storage medium | |
WO2019227429A1 (zh) | Multimedia content generation method, apparatus, and device/terminal/server | |
CN111612842A (zh) | Method and apparatus for generating a pose estimation model | |
US20240119082A1 (en) | Method, apparatus, device, readable storage medium and product for media content processing | |
WO2021190625A1 (zh) | Photographing method and device | |
WO2022166908A1 (zh) | Style image generation method, model training method, apparatus and device | |
US20230057963A1 (en) | Video playing method, apparatus and device, storage medium, and program product | |
WO2022237633A1 (zh) | Image processing method, apparatus, device, and medium | |
WO2022170982A1 (zh) | Image processing method, image generation method, apparatus, device, and medium | |
JP2023545052A (ja) | Image processing model training method and apparatus, image processing method and apparatus, electronic device, and computer program | |
CN112785669B (zh) | Virtual image synthesis method, apparatus, device, and storage medium | |
WO2023246823A1 (zh) | Video playback method, apparatus, device, and storage medium | |
US20230215296A1 (en) | Method, computing device, and non-transitory computer-readable recording medium to translate audio of video into sign language through avatar |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21891210 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 18252855 Country of ref document: US Ref document number: 2023528414 Country of ref document: JP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2021891210 Country of ref document: EP Effective date: 20230612 |