WO2022222810A1 - Method, apparatus, device, and medium for generating an avatar - Google Patents

Method, apparatus, device, and medium for generating an avatar

Info

Publication number
WO2022222810A1
WO2022222810A1 (application PCT/CN2022/086518)
Authority
WO
WIPO (PCT)
Prior art keywords
image
sample image
sample
target
generator
Prior art date
Application number
PCT/CN2022/086518
Other languages
English (en)
French (fr)
Inventor
吕伟伟
黄奇伟
白须
陈朗
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Priority to EP22790918.1A (published as EP4207080A4)
Publication of WO2022222810A1
Priority to US18/069,024 (published as US12002160B2)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 3D [Three Dimensional] image rendering
    • G06T 15/005 General purpose rendering architectures
    • G06T 15/50 Lighting effects
    • G06T 15/506 Illumination models
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 17/20 Finite element generation, e.g. wire-frame surface description, tesselation
    • G06T 17/205 Re-meshing
    • G06T 13/00 Animation
    • G06T 13/20 3D [Three Dimensional] animation
    • G06T 13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/047 Probabilistic or stochastic networks
    • G06N 3/0475 Generative networks
    • G06N 3/08 Learning methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/82 Arrangements for image or video recognition or understanding using neural networks
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation
    • G06V 40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G06T 2200/00 Indexing scheme for image data processing or generation, in general
    • G06T 2200/24 Indexing scheme for image data processing or generation involving graphical user interfaces [GUIs]
    • G06T 2210/00 Indexing scheme for image generation or computer graphics
    • G06T 2210/12 Bounding box
    • G06T 2215/00 Indexing scheme for image rendering
    • G06T 2215/16 Using real world measurements to influence rendering
    • G06T 2219/00 Indexing scheme for manipulating 3D models or images for computer graphics
    • G06T 2219/20 Indexing scheme for editing of 3D models
    • G06T 2219/2021 Shape modification

Definitions

  • The present disclosure relates to the technical field of artificial intelligence, and in particular to a method, apparatus, device, and medium for generating an avatar.
  • Three-dimensional avatars in cartoon, anime, and similar styles are widely used in scenarios such as virtual streaming, e-commerce, and news media, and have attracted growing attention and affection from users.
  • At present, three-dimensional avatars are generally rendered using CG (Computer Graphics).
  • However, avatars generated by the CG approach are uniform and lack personalization. Realizing a variety of avatars requires starting over from modeling, which takes a long time and incurs high labor costs. Moreover, in terms of rendering quality such as image fidelity and lighting complexity, it is difficult to generate satisfactory avatars on hardware with limited performance (such as mobile phones).
  • The present disclosure provides a method, apparatus, device, and medium for generating an avatar.
  • An embodiment of the present disclosure provides a method for generating an avatar, including: acquiring a target image in response to a user input, and obtaining an avatar corresponding to the target image by using a first generator.
  • The first generator is trained based on a first sample image set and a second sample image set generated from three-dimensional models.
  • Optionally, before the first generator is used to obtain the avatar corresponding to the target image, the method further includes detecting key points of the face in the target image and determining a bounding box containing the key points.
  • The step of obtaining the avatar corresponding to the target image by using the first generator includes: inputting the bounding box and the key points at preset positions into the first generator; and generating a first image by the first generator, wherein the first image includes a target style feature and a content feature of the target image, and the target style feature is a style feature the first generator has learned from the images in the second sample image set.
  • The avatar corresponding to the target image is then determined based on the first image.
  • Optionally, determining the avatar corresponding to the target image based on the first image further includes: extracting illumination information and low-frequency information from the target image, and fusing them with the same-level information of the first image to obtain a second image.
  • The second image is determined as the avatar corresponding to the target image.
  • The second sample images included in the second sample image set are acquired by building multiple target three-dimensional models with different faces and rendering each of them to obtain multiple second sample images.
  • Building multiple target three-dimensional models with different faces includes building an initial three-dimensional model of a face and applying mesh deformation to different parts of it.
  • Optionally, the second sample image set further includes sample images containing three-dimensional models of faces,
  • and the first sample image set includes sample images containing real faces;
  • in that case, the method further includes:
  • generating, by a second generator, sample images containing three-dimensional models of faces and sample images containing real faces.
  • Generating these sample images includes:
  • generating, by the second generator, sample images containing three-dimensional models of faces and sample images containing real faces by alternately applying forward blending and reverse blending, wherein forward blending generates
  • a sample image containing a three-dimensional model of a face from a first sample image, and reverse blending generates a sample image containing a real face from a second sample image.
  • The training process of the first generator includes determining a target loss value and:
  • training the first generator based on the target loss value.
  • Determining the target loss value based on the image loss value, the content loss value, and the style loss value includes determining respective weight coefficients for the three losses;
  • the weight coefficients are used to adjust the similarity between the avatar sample image and the first sample image;
  • the weighted sum of the image loss value, the content loss value, and the style loss value is determined as the target loss value.
  • Acquiring the target image in response to a user input includes detecting an image acquisition instruction input by the user;
  • the image acquisition instruction includes a selection operation, a photographing operation, an upload operation, a gesture input, or a motion input;
  • the target image is acquired in response to the instruction.
  • The images in the first sample image set are images containing real faces,
  • and the images in the second sample image set are images generated from three-dimensional models containing faces.
  • Embodiments of the present disclosure also provide an apparatus for generating an avatar, including:
  • an image acquisition model for acquiring a target image in response to a user input;
  • an avatar generation model for obtaining the avatar corresponding to the target image by using the first generator;
  • wherein the first generator is trained based on the first sample image set and the second sample image set generated from three-dimensional models.
  • An embodiment of the present disclosure further provides an electronic device including a processor and a memory for storing instructions executable by the processor; the processor reads the executable instructions from the memory and executes them to implement the method for generating an avatar provided by the embodiments of the present disclosure.
  • An embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program, and the computer program is used to execute the method for generating an avatar provided by the embodiments of the present disclosure.
  • Embodiments of the present disclosure provide a method, apparatus, device, and medium for generating an avatar.
  • The solution acquires a target image in response to a user input and uses a first generator to obtain the avatar corresponding to the target image.
  • By using the first generator, the technical solution effectively simplifies avatar generation, improves generation efficiency, and can generate an avatar in one-to-one correspondence with each target image, making avatars more diverse;
  • at the same time, the first generator is easy to deploy in various production environments, reducing the performance requirements on hardware devices.
  • FIG. 1 is a schematic flowchart of a method for generating an avatar according to an embodiment of the present disclosure;
  • FIG. 2 is a schematic diagram of an initial three-dimensional model according to an embodiment of the present disclosure;
  • FIG. 3 is a schematic diagram of mesh deformation according to an embodiment of the present disclosure;
  • FIG. 4 is a schematic diagram of how an avatar sample image is generated according to an embodiment of the present disclosure;
  • FIG. 5 is a schematic diagram of the generation process of an avatar according to an embodiment of the present disclosure;
  • FIG. 6 is a structural block diagram of an apparatus for generating an avatar according to an embodiment of the present disclosure;
  • FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • The embodiments of the present disclosure provide a method, apparatus, device, and medium for generating an avatar; the technique can be used in games, live streaming, and other scenarios that need to generate avatars. For ease of understanding, the embodiments of the present disclosure are described in detail below.
  • FIG. 1 is a schematic flowchart of a method for generating an avatar according to an embodiment of the present disclosure.
  • The method may be executed by an apparatus for generating avatars, where the apparatus may be implemented in software and/or hardware and may generally be integrated into an electronic device.
  • The method includes:
  • Step S102: acquire a target image in response to a user input.
  • The target image is an image containing the face of at least one target object.
  • The target object may be, for example, a real object with a concrete face, such as a person or an animal.
  • The embodiments of the present disclosure take a person as the target object for description; correspondingly, the face is a human face.
  • An image acquisition instruction input by the user may be detected first, where the image acquisition instruction may include, but is not limited to, a selection operation, a photographing operation, an upload operation, a gesture input, or a motion input. Gesture inputs include, for example, tapping, long-pressing, short-pressing, and dragging on the screen, or patterns traced on the screen; motion inputs include hand movements such as extending a finger or waving, and facial movements such as blinking or opening the mouth. It can be understood that these are only examples of image acquisition instructions and should not be construed as limiting.
  • The target image is then acquired in response to the instruction.
  • The target image may be an image captured by an image capture device in response to a photographing operation, or an image manually uploaded from local storage in response to an upload operation.
  • The target image may contain the face of at least one target object. Since target images obtained in different ways may differ in size, an original image can be obtained first; then, according to a preset size, a region image containing the face of the target object is cropped from the original image, and the cropped region image is used as the target image.
  • Step S104: generate the avatar corresponding to the target image by means of the first generator.
  • The first generator can be, for example, a GAN (Generative Adversarial Network), trained based on the first sample image set and the second sample image set generated from three-dimensional models.
  • The images in the first sample image set are images containing real faces, and the images in the second sample image set are images generated from three-dimensional models containing faces.
  • The specific training procedure of the first generator is described below.
  • The trained first generator has the function of generating avatars; the avatar in this embodiment is a virtual three-dimensional figure in an anime style, an oil-painting style, or another style.
  • In summary, a target image is acquired in response to a user input, and the avatar corresponding to the target image is generated by the first generator.
  • Compared with the existing CG approach, this technical solution uses the first generator to effectively simplify avatar generation, reduce generation cost, improve generation efficiency, and generate an avatar in one-to-one correspondence with each target image, making avatars more diverse.
  • At the same time, the first generator is easy to deploy in various production environments, reducing the performance requirements on hardware devices.
  • This embodiment provides a way to acquire the second sample images included in the second sample image set, as follows.
  • An initial three-dimensional model of a face can be built with 3D modeling tools such as MAYA and 3ds Max, as shown in FIG. 2; mesh deformation is then applied to different parts of the initial model to obtain multiple target three-dimensional models with different faces.
  • The initial three-dimensional model has a mesh topology.
  • Mesh deformation means deforming the local mesh topology corresponding to a target part, so that the part represented by the deformed mesh meets the user's deformation requirements.
  • For example, mesh deformation is applied to one eye of the initial three-dimensional model so that the eye changes from open to closed, yielding a target three-dimensional model with one eye closed;
  • as another example, mesh deformation is applied to the mouth so that it changes from a normally closed state to a pouting state, yielding a target three-dimensional model with a pouting expression. It can be understood that FIG. 3 shows only two examples of mesh deformation;
  • many other forms of mesh deformation are possible, such as changing the width of the facial contour, which are not listed here one by one.
  • Through mesh deformation, a variety of target three-dimensional models with diverse faces can be obtained from an initial three-dimensional model with a single face.
  • The target three-dimensional models may then be rendered to obtain multiple second sample images.
  • For each target model, texture information such as hairstyle, hair color, skin color, face shape, and weight can be rendered to obtain second sample images containing different three-dimensional face models.
  • Examples of different second sample images: P1 is a smiling girl's face with blue shoulder-length hair, straight eyebrows, and phoenix eyes; P2 is a girl's face with gray shoulder-length hair, arched eyebrows, round eyes, and a slightly open mouth; P3 is a girl's face with a brown ponytail, bangs covering the eyebrows, eyes looking upward, and pouting lips; P4 is a boy's face with short hair and a beard.
  • Considering that the images generated from three-dimensional models may cover only a limited range of appearances, this embodiment also provides the following way of acquiring images for the second sample image set: generating sample images containing three-dimensional models of faces with a second generator and adding the generated sample images to the second sample image set.
  • The second generator in this embodiment is configured to generate sample images containing three-dimensional face models from images containing real faces; the images containing real faces may be the first sample images in the first sample image set. Examples of sample images generated by the second generator include a sample image with a closed-eye expression and a sample image of a face wearing glasses.
  • The second generator can produce rich and varied second sample images, increasing the diversity of images in the second sample image set.
  • The second sample images generated from the three-dimensional models and those generated by the second generator are used together as training data containing three-dimensional face models in the training of the first generator.
  • For the first sample image set, first sample images can be collected from the network or a local database, or captured with an image acquisition device. Since, in practice, first sample images with special facial features such as tightly closed eyes, glasses, or laughter are scarce or hard to collect, this embodiment can follow the same approach as above: generating sample images containing real faces with the second generator and adding them to the first sample image set, thereby increasing the diversity of the first sample images.
  • The specific way of generating first and second sample images with the second generator includes: inputting the first sample images in the first sample image set and the second sample images in the second sample image set
  • into the second generator, which alternately applies forward blending and reverse blending to generate sample images containing three-dimensional face models and sample images containing real faces; forward blending generates a sample image containing a three-dimensional face model from a first sample
  • image, and reverse blending generates a sample image containing a real face from a second sample image.
  • Based on the above first and second sample image sets, this embodiment provides a training process for the first generator, as shown in steps 1 to 7 below.
  • Step 1: obtain any first sample image from the first sample image set, and any second sample image from the second sample image set.
  • Step 2: using the first generator to be trained, generate an avatar sample image corresponding to the first sample image based on the second sample image.
  • The first generator to be trained is the generator of a GAN, which also includes a discriminator.
  • As shown in FIG. 4, the first generator extracts the sample content feature of the first sample image and the sample 3D style feature of the second sample image.
  • The sample content feature represents position information of the face in the first sample image, such as the coordinates of key points like the mouth, eyes, and nose, the face angle, and the hairstyle outline;
  • the sample 3D style feature represents high-level semantic features of the face in the second sample image,
  • such as hue, shape, and texture.
  • The sample content feature and the sample 3D style feature are then fused to obtain an avatar sample image; the avatar sample image contains the avatar corresponding to the real face in the first sample image.
  • In practice, the first generator can also extract the sample style feature of the first sample image and the sample 3D content feature of the second sample image; and generate a first restored image corresponding to the first sample image from the sample style feature and the sample content feature, and a second restored image corresponding to the second sample image from the sample 3D style feature and the sample 3D content feature.
  • To ensure the correlation between the first generator's output and input, i.e., to make the avatar resemble the real face, a target loss value representing the correlation between the avatar sample image and the first sample image can be added during training.
  • The calculation of this target loss value is shown in steps 3 to 6 below:
  • Step 3: calculate the image loss value between the first sample image and the avatar sample image; the image loss value is used to evaluate the correlation between the first sample image and the avatar sample image.
  • Step 4: calculate the content loss value between the first sample image and the avatar sample image.
  • Step 5: calculate the style loss value between the second sample image and the avatar sample image.
  • Step 6: determine the target loss value based on the image loss value, the content loss value, and the style loss value. Specifically, respective weight coefficients for the three losses are determined; the weight coefficients adjust the similarity between the avatar sample image and the first sample image and can be set according to actual needs. For example, when the user needs an avatar closer to the real face, the weight of the content loss can be set larger than that of the style loss; when the user needs an avatar with a more virtual style such as anime, the weight of the style loss can be set larger than that of the content loss. According to the weight coefficients, the weighted sum of the image loss value, the content loss value, and the style loss value is determined as the target loss value.
  • The target loss value in this embodiment jointly considers the correlation and content loss between the first sample image and the avatar sample image, and the style loss between the second sample image and the avatar sample image.
  • Training the first generator with this target loss value lets the first generator better produce avatars with a high degree of fit to the target object, improving user experience.
  • Step 7: train the first generator based on the target loss value.
  • The parameters of the first generator can be adjusted based on the target loss value and training continued; training ends when the target loss value converges to a preset value, yielding the trained first generator.
  • Through the above training, a first generator directly applicable to avatar generation is obtained; after compression, it can be deployed on hardware devices such as mobile phones and tablets.
  • Before generating the avatar with the first generator, this embodiment may first detect the key points of the face in the target image and determine the bounding box containing them; specifically, the key points can be detected with a face detection algorithm, and the bounding box containing the key points determined.
  • The specific process of generating the avatar corresponding to the target image with the first generator may include: inputting the bounding box and the key points at preset positions into the first generator, which then generates a first image.
  • The preset positions are a set of positions representing facial content, such as the position of the left eye, the positions of the head and tail of the right eyebrow, and the position to the lower left of the nose; other positions that can represent facial content may also be used.
  • The first image generated by the first generator includes the target style feature and the content feature of the target image; the target style feature is the style feature the first generator has learned from the images of the second sample image set, such as an anime style feature.
  • The avatar corresponding to the target image is determined based on the first image.
  • In one implementation, the first image may be directly determined as the avatar corresponding to the target image.
  • Another implementation can also be provided, namely: (1) extract the illumination information and low-frequency information from the target image; (2) fuse them with the same-level information of the first image to obtain a second image; (3) determine the second image as the avatar corresponding to the target image.
  • The frequency of an image indicates how sharply its grayscale changes; regions with slowly varying grayscale correspond to low-frequency information.
  • The low-frequency information can therefore represent the rough contour of the face in the target image.
  • The method for generating an avatar uses the first generator, which effectively simplifies avatar generation and improves generation efficiency.
  • The first sample image set and the second sample image set generated from the three-dimensional models and the second generator effectively increase the diversity of the training data.
  • In actual use of the trained first generator, obtaining a variety of avatars only requires changing the target image fed to the first generator; a one-to-one corresponding avatar can then be generated for each target image, making avatars more diverse.
  • At the same time, the first generator is easy to deploy in various production environments, reducing the performance requirements on hardware devices.
  • FIG. 6 is a schematic structural diagram of an apparatus for generating an avatar according to an embodiment of the present disclosure.
  • The apparatus may be implemented in software and/or hardware, may generally be integrated into an electronic device, and can generate the avatar of a target object by executing the method for generating an avatar.
  • The apparatus includes:
  • an avatar generation model 604, configured to obtain the avatar corresponding to the target image by using the first generator;
  • wherein the first generator is trained based on the first sample image set and the second sample image set generated from three-dimensional models.
  • The generation apparatus further includes a detection module configured to detect the key points of the face in the target image and determine the bounding box containing the key points;
  • correspondingly, the avatar generation model 604 includes:
  • an input unit, configured to input the bounding box and the key points at the preset positions into the first generator;
  • an image generation unit, configured to obtain a first image by using the first generator, where the first image includes the target style feature and the content feature of the target image, and the target style feature is the style feature the first generator has learned from the images of the second sample image set;
  • an avatar generation unit, configured to determine the avatar corresponding to the target image based on the first image.
  • The image generation unit is further configured to: extract the illumination information and low-frequency information from the target image, fuse them with the same-level information of the first image to obtain a second image, and determine the second image as the avatar corresponding to the target image.
  • The generation apparatus includes a second sample image acquisition module, which includes:
  • a modeling unit, configured to build multiple target three-dimensional models with different faces;
  • a rendering unit, configured to render the target three-dimensional models to obtain multiple second sample images.
  • The modeling unit is specifically configured to: build an initial three-dimensional model of a face; and apply mesh deformation to different parts of the initial three-dimensional model to obtain multiple target three-dimensional models with different faces.
  • The generation apparatus further includes a sample image generation module, configured to generate, by the second generator, sample images containing three-dimensional models of faces and sample images containing real faces, and to:
  • add the sample images containing three-dimensional models of faces to the second sample image set, and add the sample images containing real faces to the first sample image set.
  • The sample image generation module is specifically configured to:
  • input the first sample images in the first sample image set and the second sample images in the second sample image set into the second generator, which alternately applies forward blending and reverse blending to generate sample images containing three-dimensional face models and sample images containing real faces; forward blending generates a sample image containing a three-dimensional face model from a first sample
  • image, and reverse blending generates a sample image containing a real face from a second sample image.
  • The generation apparatus further includes a training module, configured to perform the training process described above (steps 1 to 7).
  • The training module is specifically configured to:
  • determine respective weight coefficients for the image loss value, the content loss value, and the style loss value, where the weight coefficients are used to adjust the similarity between the avatar sample image and the first sample image;
  • determine, according to the weight coefficients, the weighted sum of the image loss value, the content loss value, and the style loss value as the target loss value.
  • The image acquisition model 602 includes:
  • an instruction detection unit, configured to detect the image acquisition instruction input by the user, where the image acquisition instruction includes a selection operation, a photographing operation, an upload operation, a gesture input, or a motion input;
  • an image acquisition unit, configured to acquire the target image in response to the image acquisition instruction.
  • The apparatus for generating an avatar provided by the embodiments of the present disclosure can execute the method for generating an avatar provided by any embodiment of the present disclosure, and has the corresponding functional modules and beneficial effects of the executed method.
  • FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • The electronic device 700 includes: a processor 701; and a memory 702 for storing instructions executable by the processor 701. The processor 701 reads the executable instructions from the memory 702 and executes them to implement the method for generating an avatar in the above embodiments.
  • The processor 701 may be a central processing unit (CPU) or another form of processing unit with data processing and/or instruction execution capabilities, and may control other components in the electronic device 700 to perform desired functions.
  • The memory 702 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile and/or non-volatile memory.
  • The volatile memory may include, for example, random access memory (RAM) and/or cache memory.
  • The non-volatile memory may include, for example, read-only memory (ROM), hard disks, and flash memory.
  • One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 701 may execute the program instructions to implement the above-described method for generating avatars of the embodiments of the present disclosure and/or other desired functions.
  • Various contents such as input signals, signal components, and noise components may also be stored in the computer-readable storage medium.
  • The electronic device 700 may also include an input device 703 and an output device 704, interconnected by a bus system and/or another form of connection mechanism (not shown).
  • The input device 703 may include, for example, a keyboard and a mouse.
  • The output device 704 can output various information to the outside, including determined distance information and direction information.
  • The output device 704 may include, for example, displays, speakers, printers, and communication networks with their connected remote output devices.
  • Depending on the specific application, the electronic device 700 may also include any other suitable components.
  • Embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the method for generating an avatar described in the embodiments of the present disclosure.
  • The computer program product may write program code for performing the operations of the embodiments of the present disclosure in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar languages.
  • The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
  • Embodiments of the present disclosure may also be a computer-readable storage medium on which computer program instructions are stored; when executed by a processor, the instructions cause the processor to perform the method for generating an avatar provided by the embodiments of the present disclosure.
  • The computer-readable storage medium may employ any combination of one or more readable media.
  • The readable medium may be a readable signal medium or a readable storage medium.
  • The readable storage medium may include, for example, but is not limited to, electrical, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • Embodiments of the present disclosure also provide a computer program product, including computer programs/instructions, which implement the methods in the embodiments of the present disclosure when executed by a processor.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Computer Graphics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Geometry (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Multimedia (AREA)
  • Computer Hardware Design (AREA)
  • Architecture (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Computer Interaction (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Embodiments of the present disclosure relate to a method, apparatus, device, and medium for generating an avatar. The method includes: acquiring a target image in response to a user input; and obtaining an avatar corresponding to the target image by using a first generator, where the first generator is trained based on a first sample image set and a second sample image set generated from three-dimensional models. By using the first generator, the present disclosure effectively simplifies avatar generation, improves generation efficiency, and can generate an avatar in one-to-one correspondence with the target image, making avatars more diverse; at the same time, the first generator is easy to deploy in various production environments, reducing the performance requirements on hardware devices.

Description

Method, apparatus, device, and medium for generating an avatar
This application claims priority to Chinese Patent Application No. 202110433895.8, entitled "Method, apparatus, device, and medium for generating an avatar", filed with the China National Intellectual Property Administration on April 20, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the technical field of artificial intelligence, and in particular to a method, apparatus, device, and medium for generating an avatar.
Background
Three-dimensional avatars in cartoon, anime, and similar styles are widely used in scenarios such as virtual streaming, e-commerce, and news media, and have attracted growing attention and affection from users. At present, three-dimensional avatars are generally rendered using CG (Computer Graphics). However, avatars generated by CG are uniform and lack personalization; realizing a variety of avatars requires starting over from modeling, which takes a long time and incurs high labor costs. Moreover, the rendering process places extremely high demands on graphics hardware (such as graphics cards); in terms of rendering quality such as image fidelity and lighting complexity, it is difficult to generate satisfactory avatars on hardware with limited performance (such as mobile phones).
Summary
To solve the above technical problems, or at least partially solve them, the present disclosure provides a method, apparatus, device, and medium for generating an avatar.
An embodiment of the present disclosure provides a method for generating an avatar, including:
acquiring a target image in response to a user input;
obtaining an avatar corresponding to the target image by using a first generator;
wherein the first generator is trained based on a first sample image set and a second sample image set generated from three-dimensional models.
Optionally, before the obtaining the avatar corresponding to the target image by using the first generator, the method further includes:
detecting key points of a face in the target image, and determining a bounding box containing the key points;
the step of obtaining the avatar corresponding to the target image by using the first generator includes:
inputting the bounding box and the key points at preset positions into the first generator;
generating a first image by the first generator, wherein the first image includes a target style feature and a content feature of the target image, and the target style feature is a style feature learned by the first generator from the images in the second sample image set;
determining the avatar corresponding to the target image based on the first image.
Optionally, determining the avatar corresponding to the target image based on the first image further includes:
extracting illumination information and low-frequency information from the target image;
fusing the illumination information and the low-frequency information with same-level information of the first image to obtain a second image;
determining the second image as the avatar corresponding to the target image.
Optionally, the second sample images included in the second sample image set are acquired by:
building multiple target three-dimensional models with different faces;
rendering each of the target three-dimensional models to obtain multiple second sample images.
Optionally, the building multiple target three-dimensional models with different faces includes:
building an initial three-dimensional model of a face;
applying mesh deformation to different parts of the initial three-dimensional model to obtain multiple target three-dimensional models with different faces.
Optionally, the second sample image set further includes sample images containing three-dimensional models of faces, and the first sample image set includes sample images containing real faces, the method further including:
generating, by a second generator, sample images containing three-dimensional models of faces and sample images containing real faces.
Optionally, the generating, by the second generator, sample images containing three-dimensional models of faces and sample images containing real faces includes:
inputting first sample images in the first sample image set and second sample images in the second sample image set into the second generator;
generating, by the second generator, sample images containing three-dimensional models of faces and sample images containing real faces by alternately applying forward blending and reverse blending, wherein the forward blending generates a sample image containing a three-dimensional model of a face based on a first sample image, and the reverse blending generates a sample image containing a real face based on a second sample image.
Optionally, the training process of the first generator includes:
obtaining any first sample image from the first sample image set, and obtaining any second sample image from the second sample image set;
generating, by the first generator to be trained, an avatar sample image corresponding to the first sample image based on the second sample image;
calculating an image loss value between the first sample image and the avatar sample image, wherein the image loss value is used to measure the correlation between the first sample image and the avatar sample image;
calculating a content loss value between the first sample image and the avatar sample image;
calculating a style loss value between the second sample image and the avatar sample image;
determining a target loss value based on the image loss value, the content loss value, and the style loss value;
training the first generator based on the target loss value.
Optionally, the determining a target loss value based on the image loss value, the content loss value, and the style loss value includes:
determining respective weight coefficients of the image loss value, the content loss value, and the style loss value, wherein the weight coefficients are used to adjust the similarity between the avatar sample image and the first sample image;
determining, according to the weight coefficients, the weighted sum of the image loss value, the content loss value, and the style loss value as the target loss value.
Optionally, the acquiring a target image in response to a user input includes:
detecting an image acquisition instruction input by the user, wherein the image acquisition instruction includes a selection operation, a photographing operation, an upload operation, a gesture input, or a motion input;
acquiring the target image in response to the image acquisition instruction.
Optionally, the images in the first sample image set are images containing real faces, and the images in the second sample image set are images generated from three-dimensional models containing faces.
An embodiment of the present disclosure further provides an apparatus for generating an avatar, including:
an image acquisition model, configured to acquire a target image in response to a user input;
an avatar generation model, configured to obtain an avatar corresponding to the target image by using a first generator;
wherein the first generator is trained based on a first sample image set and a second sample image set generated from three-dimensional models.
An embodiment of the present disclosure further provides an electronic device, including: a processor; and a memory for storing instructions executable by the processor; the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method for generating an avatar provided by the embodiments of the present disclosure.
An embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program, the computer program being used to execute the method for generating an avatar provided by the embodiments of the present disclosure.
Compared with the prior art, the technical solutions provided by the embodiments of the present disclosure have the following advantages:
The embodiments of the present disclosure provide a method, apparatus, device, and medium for generating an avatar. The solution acquires a target image in response to a user input and obtains the avatar corresponding to the target image by using a first generator. Compared with the existing CG approach, this technical solution uses the first generator to effectively simplify avatar generation, improve generation efficiency, and generate an avatar in one-to-one correspondence with each target image, making avatars more diverse; at the same time, the first generator is easy to deploy in various production environments, reducing the performance requirements on hardware devices.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
To explain the technical solutions of the embodiments of the present disclosure or the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flowchart of a method for generating an avatar according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of an initial three-dimensional model according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of mesh deformation according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of how an avatar sample image is generated according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of the generation process of an avatar according to an embodiment of the present disclosure;
FIG. 6 is a structural block diagram of an apparatus for generating an avatar according to an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
In order that the above objects, features, and advantages of the present disclosure may be understood more clearly, the solutions of the present disclosure are further described below. It should be noted that, in the absence of conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with one another.
Many specific details are set forth in the following description to facilitate a full understanding of the present disclosure, but the present disclosure may also be implemented in ways other than those described here; obviously, the embodiments in this specification are only a part, not all, of the embodiments of the present disclosure.
The existing CG-based way of generating avatars suffers from uniform appearance, requires a great deal of time and labor to diversify the avatars, and places extremely high demands on graphics hardware during rendering. In view of this, the embodiments of the present disclosure provide a method, apparatus, device, and medium for generating an avatar, which can be used in games, live streaming, and other scenarios where avatars need to be generated. For ease of understanding, the embodiments of the present disclosure are described in detail below.
FIG. 1 is a schematic flowchart of a method for generating an avatar according to an embodiment of the present disclosure. The method may be executed by an apparatus for generating avatars, where the apparatus may be implemented in software and/or hardware and may generally be integrated into an electronic device. As shown in FIG. 1, the method includes:
Step S102: acquire a target image in response to a user input.
The target image is an image containing the face of at least one target object; the target object may be, for example, a real object with a concrete face, such as a person or an animal. For ease of understanding, as an example, the embodiments of the present disclosure take a person as the target object; correspondingly, the face is a human face.
In some embodiments, an image acquisition instruction input by the user may be detected first, where the image acquisition instruction may include, but is not limited to, a selection operation, a photographing operation, an upload operation, a gesture input, or a motion input. Gesture inputs include, for example, tapping, long-pressing, short-pressing, and dragging on the screen, or patterns traced on the screen; motion inputs include hand movements such as extending a finger or waving, and facial movements such as blinking or opening the mouth. It can be understood that these are only examples of image acquisition instructions and should not be construed as limiting.
The target image is then acquired in response to the image acquisition instruction. In some specific embodiments, the target image may be an image captured by an image capture device in response to a photographing operation, or an image manually uploaded from local storage in response to an upload operation. The target image may include the face of at least one target object. Considering that target images obtained in different ways may differ in size, an original image can be obtained first; then, according to a preset size, a region image containing the face of the target object is cropped from the original image, and the cropped region image is used as the target image.
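A minimal sketch of this cropping step, assuming the face box comes from some external detector and a 256×256 preset size (both are illustrative assumptions; the disclosure fixes neither a particular detector nor a size):

```python
from PIL import Image

PRESET_SIZE = (256, 256)  # assumed output size; the disclosure only says "preset size"

def crop_face_region(original_path: str, face_box: tuple) -> Image.Image:
    """Crop the region containing the target face and resize it to the preset size.

    face_box is (left, top, right, bottom) from any face detector.
    """
    original = Image.open(original_path).convert("RGB")
    left, top, right, bottom = face_box
    # Pad the detected box slightly so hairline and chin stay inside the crop.
    margin_w = int(0.2 * (right - left))
    margin_h = int(0.2 * (bottom - top))
    region = original.crop((
        max(0, left - margin_w),
        max(0, top - margin_h),
        min(original.width, right + margin_w),
        min(original.height, bottom + margin_h),
    ))
    return region.resize(PRESET_SIZE, Image.BILINEAR)
```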
Step S104: generate the avatar corresponding to the target image by means of a first generator.
The first generator can be, for example, a GAN (Generative Adversarial Network), trained based on the first sample image set and the second sample image set generated from three-dimensional models; the images in the first sample image set are images containing real faces, and the images in the second sample image set are images generated from three-dimensional models containing faces. The specific training procedure of the first generator is described below. The trained first generator has the function of generating avatars; the avatar in this embodiment is a virtual three-dimensional figure in an anime style, an oil-painting style, or another style.
The method for generating an avatar provided by the embodiments of the present disclosure acquires a target image in response to a user input and generates the avatar corresponding to the target image by means of the first generator. Compared with the existing CG approach, this technical solution uses the first generator to effectively simplify avatar generation, reduce generation cost, improve generation efficiency, and generate an avatar in one-to-one correspondence with each target image, making avatars more diverse; at the same time, the first generator is easy to deploy in various production environments, reducing the performance requirements on hardware devices.
To better understand the avatar generation method provided by the present disclosure, the embodiments of the present disclosure are described in detail below.
For the second sample image set used to train the first generator, this embodiment provides a way of acquiring the second sample images included in the second sample image set, as follows:
Build multiple target three-dimensional models with different faces. In one specific implementation, an initial three-dimensional model of a face may first be built with three-dimensional modeling tools such as MAYA and 3ds Max, as shown in FIG. 2; mesh deformation is then applied to different parts of the initial three-dimensional model to obtain multiple target three-dimensional models with different faces.
The initial three-dimensional model has a mesh topology; mesh deformation means deforming the local mesh topology corresponding to a target part, so that the part represented by the deformed mesh meets the user's deformation requirements. In a specific example, referring to FIG. 3, mesh deformation is applied to one eye of the initial three-dimensional model so that the eye changes from open to closed, yielding a target three-dimensional model with one eye closed; as another example, mesh deformation is applied to the mouth so that it changes from a normally closed state to a pouting state, yielding a target three-dimensional model with a pouting expression. It can be understood that FIG. 3 shows only two examples of mesh deformation; many other forms are possible, such as changing the width of the facial contour, which are not listed here one by one. Through mesh deformation, this embodiment can obtain a variety of target three-dimensional models with diverse faces from an initial three-dimensional model with a single face.
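In practice this deformation is done interactively in tools such as MAYA or 3ds Max; as a toy sketch of the underlying idea, assuming the model is reduced to an (N, 3) vertex array and the eyelid vertex indices are known (both assumptions for illustration only):

```python
import numpy as np

def deform_region(vertices: np.ndarray, region_idx: np.ndarray,
                  offset: np.ndarray, falloff: np.ndarray) -> np.ndarray:
    """Deform only the local mesh region (e.g. one eyelid) of an (N, 3) vertex array.

    region_idx : indices of the vertices belonging to the target part
    offset     : (3,) displacement, e.g. moving the upper eyelid down to close the eye
    falloff    : per-vertex weights in [0, 1] so the deformation blends smoothly
    """
    deformed = vertices.copy()
    deformed[region_idx] += falloff[:, None] * offset
    return deformed

# Example: close one eye by moving hypothetical upper-eyelid vertices downward.
verts = np.random.rand(5000, 3)               # stand-in for the initial 3D model
eyelid = np.arange(120, 180)                  # hypothetical eyelid vertex indices
weights = np.linspace(1.0, 0.2, eyelid.size)  # strongest at the lid centre
closed_eye_model = deform_region(verts, eyelid, np.array([0.0, -0.8, 0.0]), weights)
```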
Next, the target three-dimensional models may be rendered to obtain multiple second sample images. Specifically, for each target three-dimensional model, texture information such as hairstyle, hair color, skin color, face shape and weight, and lighting can be rendered to obtain second sample images containing different three-dimensional face models. Examples of different rendered second sample images: second sample image P1 is a smiling girl's face with blue shoulder-length hair, straight eyebrows, and phoenix eyes; second sample image P2 is a girl's face with gray shoulder-length hair, arched eyebrows, round eyes, and a slightly open mouth; second sample image P3 is a girl's face with a brown ponytail, bangs covering the eyebrows, eyes looking upward, and pouting lips; second sample image P4 is a boy's face with short hair and a beard.
Considering that the images in the second sample image set generated from three-dimensional models may cover only a limited range of model appearances, this embodiment can also provide the following way of acquiring images for the second sample image set: generating sample images containing three-dimensional models of faces with a second generator and adding the generated sample images to the second sample image set. The second generator in this embodiment is configured to generate sample images containing three-dimensional face models from images containing real faces; the images containing real faces may be the first sample images in the first sample image set. Examples of sample images containing three-dimensional face models generated by the second generator include a sample image with a closed-eye expression and a sample image of a face wearing glasses. The second generator can produce rich and varied second sample images, increasing the diversity of images in the second sample image set. In this embodiment, the second sample images generated from the three-dimensional models and those generated by the second generator are used together as training data containing three-dimensional face models in the training of the first generator.
For the first sample image set used to train the first generator, first sample images can usually be collected from the network or a local database, or captured with an image acquisition device. Since, in practical applications, first sample images with special facial features such as tightly closed eyes, glasses, or laughter are scarce or hard to collect, this embodiment can, following the way the second generator generates second sample images described above, generate sample images containing real faces with the second generator and add them to the first sample image set, thereby increasing the diversity of the first sample images.
In the above embodiment, the specific way of generating first and second sample images with the second generator includes: inputting the first sample images in the first sample image set and the second sample images in the second sample image set into the second generator; and generating, by the second generator, sample images containing three-dimensional face models and sample images containing real faces by alternately applying forward blending and reverse blending, where forward blending generates a sample image containing a three-dimensional face model based on a first sample image, and reverse blending generates a sample image containing a real face based on a second sample image.
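The disclosure does not specify the second generator's internals. One common way to realize such paired forward/reverse translation is a CycleGAN-style pair of generators; the sketch below rests on that assumption, with toy convolutional generators and the adversarial terms omitted for brevity:

```python
import torch
from torch import nn

# Hypothetical image-to-image translators; any encoder-decoder CNN would do here.
G_f = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(16, 3, 3, padding=1))  # real face -> 3D-model style
G_r = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(16, 3, 3, padding=1))  # 3D-model style -> real face
opt = torch.optim.Adam(list(G_f.parameters()) + list(G_r.parameters()), lr=2e-4)

def blending_step(first_sample: torch.Tensor, second_sample: torch.Tensor, step: int) -> float:
    """Alternate forward blending (real -> model) and reverse blending (model -> real)."""
    if step % 2 == 0:                    # forward blending
        fake_model_img = G_f(first_sample)
        cycle = G_r(fake_model_img)      # cycle back for a consistency signal
        loss = nn.functional.l1_loss(cycle, first_sample)
    else:                                # reverse blending
        fake_real_img = G_r(second_sample)
        cycle = G_f(fake_real_img)
        loss = nn.functional.l1_loss(cycle, second_sample)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```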
Based on the above first and second sample image sets, this embodiment provides a training process for the first generator, as shown in steps 1 to 7 below.
Step 1: obtain any first sample image from the first sample image set, and obtain any second sample image from the second sample image set.
Step 2: using the first generator to be trained, generate an avatar sample image corresponding to the first sample image based on the second sample image.
In one embodiment, the first generator to be trained is the generator of a GAN, which also includes a discriminator. As shown in FIG. 4, the first generator extracts the sample content feature of the first sample image and the sample 3D style feature of the second sample image. The sample content feature represents position information of the face in the first sample image, such as the coordinates of key points like the mouth, eyes, and nose, the face angle, and the hairstyle outline; the sample 3D style feature represents high-level semantic features of the face in the second sample image, such as hue, shape, and texture. The sample content feature and the sample 3D style feature are then fused to obtain the avatar sample image, which contains the avatar corresponding to the real face in the first sample image.
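The fusion mechanism is left open by the disclosure; one widely used way to inject style statistics into content features is adaptive instance normalization (AdaIN), sketched here as an assumption rather than as the patented method:

```python
import torch

def adain_fuse(content_feat: torch.Tensor, style_feat: torch.Tensor,
               eps: float = 1e-5) -> torch.Tensor:
    """Fuse the sample content feature with the sample 3D style feature.

    Both inputs are (B, C, H, W) feature maps from the generator's encoders.
    The content map keeps its spatial layout (keypoint positions, face angle,
    hairstyle outline) while its channel statistics are replaced by those of
    the style map (hue, shape, texture).
    """
    b, c = content_feat.shape[:2]
    c_flat = content_feat.reshape(b, c, -1)
    s_flat = style_feat.reshape(b, c, -1)
    c_mean = c_flat.mean(dim=2, keepdim=True)
    c_std = c_flat.std(dim=2, keepdim=True) + eps
    s_mean = s_flat.mean(dim=2, keepdim=True)
    s_std = s_flat.std(dim=2, keepdim=True) + eps
    fused = s_std * (c_flat - c_mean) / c_std + s_mean
    return fused.reshape(content_feat.shape)
```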
It should also be noted that, in practical applications, the first generator can further extract the sample style feature of the first sample image and the sample 3D content feature of the second sample image; and generate a first restored image corresponding to the first sample image based on the sample style feature and the sample content feature, and a second restored image corresponding to the second sample image based on the sample 3D style feature and the sample 3D content feature.
To ensure the correlation between the first generator's output image and its input image, that is, to make the avatar resemble the real face more closely, a target loss value representing the correlation between the avatar sample image and the first sample image can be added during the training of the first generator. The calculation of this target loss value is shown in steps 3 to 6 below.
Step 3: calculate the image loss value between the first sample image and the avatar sample image; the image loss value is used to evaluate the correlation between the first sample image and the avatar sample image.
Step 4: calculate the content loss value between the first sample image and the avatar sample image.
Step 5: calculate the style loss value between the second sample image and the avatar sample image.
Step 6: determine the target loss value based on the image loss value, the content loss value, and the style loss value. Specifically, respective weight coefficients for the image loss value, the content loss value, and the style loss value are determined; the weight coefficients are used to adjust the similarity between the avatar sample image and the first sample image and can be set according to actual needs. For example, when the user needs an avatar closer to the real face, the weight coefficient of the content loss value can be set larger than that of the style loss value; when the user needs an avatar with a more virtual style such as anime, the weight coefficient of the style loss value can be set larger than that of the content loss value. According to the weight coefficients, the weighted sum of the image loss value, the content loss value, and the style loss value is determined as the target loss value.
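In formula form, the target loss is simply L_target = w_img · L_img + w_content · L_content + w_style · L_style. A minimal sketch (the default weights are illustrative only; the disclosure leaves them to actual needs):

```python
def target_loss(image_loss: float, content_loss: float, style_loss: float,
                w_image: float = 1.0, w_content: float = 1.0,
                w_style: float = 1.0) -> float:
    """Weighted sum of the three losses. Raising w_content pulls the avatar
    closer to the real face; raising w_style pushes it toward the anime-like
    virtual style."""
    return w_image * image_loss + w_content * content_loss + w_style * style_loss
```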
The target loss value in this embodiment jointly considers the correlation and content loss between the first sample image and the avatar sample image, and the style loss between the second sample image and the avatar sample image. Training the first generator with this target loss value lets the first generator better produce avatars with a high degree of fit to the target object, improving user experience.
Step 7: train the first generator based on the target loss value. In this embodiment, the parameters of the first generator can be adjusted based on the target loss value and training continued; training ends when the target loss value converges to a preset value, yielding the trained first generator.
Through the above training process, a first generator directly applicable to avatar generation is obtained; after compression, the first generator can be migrated to hardware devices such as mobile phones and tablets for use.
As shown in FIG. 5, before generating the avatar corresponding to the target object with the first generator, this embodiment may first detect the key points of the face in the target image and determine the bounding box containing the key points. Specifically, the key points of the face in the target image can be detected with a face detection algorithm, and the bounding box containing the key points determined.
Then, the specific process of generating the avatar corresponding to the target image with the first generator may include: inputting the bounding box and the key points at preset positions into the first generator, which generates a first image. The preset positions are a set of positions representing facial content, such as the position of the left eye, the positions of the head and tail of the right eyebrow, and the position to the lower left of the nose; of course, other positions that can represent facial content may also be used. The first image generated by the first generator includes the target style feature and the content feature of the target image; the target style feature is the style feature the first generator has learned from the images of the second sample image set, such as an anime style feature.
The avatar corresponding to the target image is determined based on the first image. In one implementation, the first image may be directly determined as the avatar corresponding to the target image.
To make the avatar more realistic and better matched to the subject and environment in the target image, another implementation can also be provided, namely: (1) extract the illumination information and low-frequency information from the target image. In practical applications, the frequency of an image is an indicator of how sharply its grayscale changes; regions with slowly varying grayscale correspond to low-frequency information, so the low-frequency information can represent the rough contour of the face in the target image. (2) Fuse the illumination information and low-frequency information with the same-level information of the first image to obtain a second image. (3) Determine the second image as the avatar corresponding to the target image.
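A sketch of one way steps (1) and (2) could be approximated, assuming a heavy Gaussian blur as the low-pass filter and a swap of low-frequency bands as the "same-level" fusion (both are assumptions; the disclosure does not fix the filter or the fusion operator):

```python
import cv2
import numpy as np

def relight_avatar(target_bgr: np.ndarray, first_image_bgr: np.ndarray) -> np.ndarray:
    """Fuse the target image's illumination/low-frequency information into the
    generated first image to approximate the second image."""
    k = (51, 51)  # large kernel => keeps only the coarse contour and lighting
    target_low = cv2.GaussianBlur(target_bgr.astype(np.float32), k, 0)
    first_low = cv2.GaussianBlur(first_image_bgr.astype(np.float32), k, 0)
    first_high = first_image_bgr.astype(np.float32) - first_low  # stylized detail
    second_image = np.clip(target_low + first_high, 0, 255).astype(np.uint8)
    return second_image
```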
In summary, the method for generating an avatar provided by the embodiments of the present disclosure uses the first generator, which effectively simplifies avatar generation and improves generation efficiency. During the training of the first generator, the first sample image set and the second sample image set generated from the three-dimensional models and the second generator effectively increase the diversity of the training data. In actual use of the trained first generator, obtaining rich and varied avatars only requires changing the target image fed to the first generator; a one-to-one corresponding avatar can then be generated for each target image, making avatars more diverse and thus meeting users' demands for personalized, diverse avatars. At the same time, the first generator is easy to deploy in various production environments, reducing the performance requirements on hardware devices.
FIG. 6 is a schematic structural diagram of an apparatus for generating an avatar according to an embodiment of the present disclosure. The apparatus may be implemented in software and/or hardware, may generally be integrated into an electronic device, and can generate the avatar of a target object by executing the method for generating an avatar. As shown in FIG. 6, the apparatus includes:
an image acquisition model 602, configured to acquire a target image in response to a user input;
an avatar generation model 604, configured to obtain the avatar corresponding to the target image by using the first generator;
wherein the first generator is trained based on the first sample image set and the second sample image set generated from three-dimensional models.
In one embodiment, the generation apparatus further includes a detection module configured to detect the key points of the face in the target image and determine the bounding box containing the key points;
correspondingly, the avatar generation model 604 includes:
an input unit, configured to input the bounding box and the key points at the preset positions into the first generator;
an image generation unit, configured to obtain a first image by using the first generator, where the first image includes the target style feature and the content feature of the target image, and the target style feature is the style feature learned by the first generator from the images of the second sample image set;
an avatar generation unit, configured to determine the avatar corresponding to the target image based on the first image.
In one embodiment, the image generation unit is further configured to:
extract the illumination information and low-frequency information from the target image; fuse the illumination information and low-frequency information with the same-level information of the first image to obtain a second image; and determine the second image as the avatar corresponding to the target image.
In one embodiment, the generation apparatus includes a second sample image acquisition module, which includes:
a modeling unit, configured to build multiple target three-dimensional models with different faces;
a rendering unit, configured to render the target three-dimensional models to obtain multiple second sample images.
In one embodiment, the modeling unit is specifically configured to: build an initial three-dimensional model of a face; and apply mesh deformation to different parts of the initial three-dimensional model to obtain multiple target three-dimensional models with different faces.
In one embodiment, the generation apparatus further includes a sample image generation module, configured to:
generate, by the second generator, sample images containing three-dimensional models of faces and sample images containing real faces;
add the sample images containing three-dimensional models of faces to the second sample image set, and add the sample images containing real faces to the first sample image set.
In one embodiment, the sample image generation module is specifically configured to:
input the first sample images in the first sample image set and the second sample images in the second sample image set into the second generator;
generate, by the second generator, sample images containing three-dimensional face models and sample images containing real faces by alternately applying forward blending and reverse blending, where forward blending generates a sample image containing a three-dimensional face model based on a first sample image, and reverse blending generates a sample image containing a real face based on a second sample image.
In one embodiment, the generation apparatus further includes a training module, configured to:
obtain any first sample image from the first sample image set, and obtain any second sample image from the second sample image set;
generate, by the first generator to be trained, an avatar sample image corresponding to the first sample image based on the second sample image;
calculate the image loss value between the first sample image and the avatar sample image, where the image loss value is used to evaluate the correlation between the first sample image and the avatar sample image;
calculate the content loss value between the first sample image and the avatar sample image;
calculate the style loss value between the second sample image and the avatar sample image;
determine the target loss value based on the image loss value, the content loss value, and the style loss value;
train the first generator based on the target loss value.
In one embodiment, the training module is specifically configured to:
determine the respective weight coefficients of the image loss value, the content loss value, and the style loss value, where the weight coefficients are used to adjust the similarity between the avatar sample image and the first sample image;
determine, according to the weight coefficients, the weighted sum of the image loss value, the content loss value, and the style loss value as the target loss value.
In one embodiment, the image acquisition model 602 includes:
an instruction detection unit, configured to detect the image acquisition instruction input by the user, where the image acquisition instruction includes a selection operation, a photographing operation, an upload operation, a gesture input, or a motion input;
an image acquisition unit, configured to acquire the target image in response to the image acquisition instruction.
The apparatus for generating an avatar provided by the embodiments of the present disclosure can execute the method for generating an avatar provided by any embodiment of the present disclosure, and has the corresponding functional modules and beneficial effects of the executed method.
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in FIG. 7, the electronic device 700 includes: a processor 701; and a memory 702 for storing instructions executable by the processor 701. The processor 701 is configured to read the executable instructions from the memory 702 and execute the instructions to implement the method for generating an avatar in the above embodiments.
The processor 701 may be a central processing unit (CPU) or another form of processing unit with data processing and/or instruction execution capabilities, and may control other components in the electronic device 700 to perform desired functions.
The memory 702 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), hard disks, and flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 701 may run the program instructions to implement the method for generating avatars of the embodiments of the present disclosure described above and/or other desired functions. Various contents such as input signals, signal components, and noise components may also be stored in the computer-readable storage medium.
In one example, the electronic device 700 may further include an input device 703 and an output device 704, interconnected by a bus system and/or another form of connection mechanism (not shown).
In addition, the input device 703 may include, for example, a keyboard and a mouse.
The output device 704 can output various information to the outside, including determined distance information and direction information. The output device 704 may include, for example, displays, speakers, printers, and communication networks with their connected remote output devices.
Of course, for simplicity, FIG. 7 shows only some of the components of the electronic device 700 relevant to the present disclosure, omitting components such as buses and input/output interfaces. Depending on the specific application, the electronic device 700 may also include any other suitable components.
In addition to the above methods and devices, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when run by a processor, cause the processor to perform the method for generating an avatar described in the embodiments of the present disclosure.
The computer program product may write program code for performing the operations of the embodiments of the present disclosure in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
In addition, embodiments of the present disclosure may also be a computer-readable storage medium on which computer program instructions are stored; when run by a processor, the computer program instructions cause the processor to perform the method for generating an avatar provided by the embodiments of the present disclosure.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, electrical, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
Embodiments of the present disclosure also provide a computer program product, including computer programs/instructions, which implement the methods in the embodiments of the present disclosure when executed by a processor.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The above are only specific embodiments of the present disclosure, enabling those skilled in the art to understand or implement the present disclosure. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present disclosure. Therefore, the present disclosure will not be limited to the embodiments described herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (14)

  1. A method for generating an avatar, comprising:
    acquiring a target image in response to a user input;
    obtaining an avatar corresponding to the target image by using a first generator;
    wherein the first generator is trained based on a first sample image set and a second sample image set generated from three-dimensional models.
  2. The method according to claim 1, wherein before the obtaining the avatar corresponding to the target image by using the first generator, the method further comprises:
    detecting key points of a face in the target image, and determining a bounding box containing the key points;
    the step of obtaining the avatar corresponding to the target image by using the first generator comprises:
    inputting the bounding box and the key points at preset positions into the first generator;
    generating a first image by the first generator, wherein the first image comprises a target style feature and a content feature of the target image, and the target style feature is a style feature learned by the first generator from the images in the second sample image set;
    determining the avatar corresponding to the target image based on the first image.
  3. The method according to claim 2, wherein the determining the avatar corresponding to the target image based on the first image further comprises:
    extracting illumination information and low-frequency information from the target image;
    fusing the illumination information and the low-frequency information with same-level information of the first image to obtain a second image;
    determining the second image as the avatar corresponding to the target image.
  4. The method according to claim 1, wherein the second sample images included in the second sample image set are acquired by:
    building a plurality of target three-dimensional models with different faces;
    rendering each of the target three-dimensional models to obtain a plurality of second sample images.
  5. The method according to claim 4, wherein the building a plurality of target three-dimensional models with different faces comprises:
    building an initial three-dimensional model of a face;
    applying mesh deformation to different parts of the initial three-dimensional model to obtain a plurality of target three-dimensional models with different faces.
  6. The method according to claim 4, wherein the second sample image set further comprises sample images containing three-dimensional models of faces, and the first sample image set comprises sample images containing real faces, the method further comprising:
    generating, by a second generator, sample images containing three-dimensional models of faces and sample images containing real faces.
  7. The method according to claim 6, wherein the generating, by the second generator, sample images containing three-dimensional models of faces and sample images containing real faces comprises:
    inputting first sample images in the first sample image set and second sample images in the second sample image set into the second generator;
    generating, by the second generator, sample images containing three-dimensional models of faces and sample images containing real faces by alternately applying forward blending and reverse blending, wherein the forward blending generates a sample image containing a three-dimensional model of a face based on a first sample image, and the reverse blending generates a sample image containing a real face based on a second sample image.
  8. The method according to claim 1, wherein the training process of the first generator comprises:
    obtaining any first sample image from the first sample image set, and obtaining any second sample image from the second sample image set;
    generating, by the first generator to be trained, an avatar sample image corresponding to the first sample image based on the second sample image;
    calculating an image loss value between the first sample image and the avatar sample image, wherein the image loss value is used to measure the correlation between the first sample image and the avatar sample image;
    calculating a content loss value between the first sample image and the avatar sample image;
    calculating a style loss value between the second sample image and the avatar sample image;
    determining a target loss value based on the image loss value, the content loss value, and the style loss value;
    training the first generator based on the target loss value.
  9. The method according to claim 8, wherein the determining a target loss value based on the image loss value, the content loss value, and the style loss value comprises:
    determining respective weight coefficients of the image loss value, the content loss value, and the style loss value, wherein the weight coefficients are used to adjust the similarity between the avatar sample image and the first sample image;
    determining, according to the weight coefficients, the weighted sum of the image loss value, the content loss value, and the style loss value as the target loss value.
  10. The method according to claim 1, wherein the acquiring a target image in response to a user input comprises:
    detecting an image acquisition instruction input by the user, wherein the image acquisition instruction comprises a selection operation, a photographing operation, an upload operation, a gesture input, or a motion input;
    acquiring the target image in response to the image acquisition instruction.
  11. The method according to claim 1, wherein the images in the first sample image set are images containing real faces, and the images in the second sample image set are images generated from three-dimensional models containing faces.
  12. An apparatus for generating an avatar, comprising:
    an image acquisition model, configured to acquire a target image in response to a user input;
    an avatar generation model, configured to obtain an avatar corresponding to the target image by using a first generator;
    wherein the first generator is trained based on a first sample image set and a second sample image set generated from three-dimensional models.
  13. An electronic device, comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method for generating an avatar according to any one of claims 1-11.
  14. A computer-readable storage medium, wherein the storage medium stores a computer program, and the computer program is used to execute the method for generating an avatar according to any one of claims 1-11.
PCT/CN2022/086518 2021-04-20 2022-04-13 Method, apparatus, device, and medium for generating an avatar WO2022222810A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22790918.1A EP4207080A4 (en) 2021-04-20 2022-04-13 AVATAR GENERATING METHOD, APPARATUS AND DEVICE AND MEDIUM
US18/069,024 US12002160B2 (en) 2021-04-20 2022-12-20 Avatar generation method, apparatus and device, and medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110433895.8A CN113112580B (zh) 2021-04-20 2021-04-20 一种虚拟形象的生成方法、装置、设备及介质
CN202110433895.8 2021-04-20

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/069,024 Continuation US12002160B2 (en) 2021-04-20 2022-12-20 Avatar generation method, apparatus and device, and medium

Publications (1)

Publication Number Publication Date
WO2022222810A1 (zh)

Family

ID=76719300

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/086518 WO2022222810A1 (zh) Method, apparatus, device, and medium for generating an avatar

Country Status (3)

Country Link
EP (1) EP4207080A4 (zh)
CN (1) CN113112580B (zh)
WO (1) WO2022222810A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113112580B (zh) * 2021-04-20 2022-03-25 北京字跳网络技术有限公司 Method, apparatus, device, and medium for generating an avatar
CN113592988B (zh) * 2021-08-05 2023-06-30 北京奇艺世纪科技有限公司 Method and apparatus for generating three-dimensional virtual character images
CN113938696B (zh) * 2021-10-13 2024-03-29 广州方硅信息技术有限公司 Live-streaming interaction method, system, and computer device based on customized virtual gifts
CN114120412B (zh) * 2021-11-29 2022-12-09 北京百度网讯科技有限公司 Image processing method and apparatus
CN114119935B (zh) * 2021-11-29 2023-10-03 北京百度网讯科技有限公司 Image processing method and apparatus
CN115359219B (zh) * 2022-08-16 2024-04-19 支付宝(杭州)信息技术有限公司 Avatar processing method and apparatus for virtual worlds
CN115809696B (zh) * 2022-12-01 2024-04-02 支付宝(杭州)信息技术有限公司 Avatar model training method and apparatus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080309675A1 (en) * 2007-06-11 2008-12-18 Darwin Dimensions Inc. Metadata for avatar generation in virtual environments
CN110348330A (zh) * 2019-06-24 2019-10-18 电子科技大学 Face-pose virtual view generation method based on VAE-ACGAN
CN112541963A (zh) * 2020-11-09 2021-03-23 北京百度网讯科技有限公司 Method and apparatus for generating a three-dimensional avatar, electronic device, and storage medium
CN113112580A (zh) * 2021-04-20 2021-07-13 北京字跳网络技术有限公司 Method, apparatus, device, and medium for generating an avatar

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2543893A (en) * 2015-08-14 2017-05-03 Metail Ltd Methods of generating personalized 3D head models or 3D body models
US9813673B2 (en) * 2016-01-20 2017-11-07 Gerard Dirk Smits Holographic video capture and telepresence system
CN109427083B (zh) * 2017-08-17 2022-02-01 腾讯科技(深圳)有限公司 Display method, apparatus, terminal, and storage medium for three-dimensional avatars
CN110390705B (zh) * 2018-04-16 2023-11-10 北京搜狗科技发展有限公司 Method and apparatus for generating an avatar
CN108765550B (zh) * 2018-05-09 2021-03-30 华南理工大学 Three-dimensional face reconstruction method based on a single image
CN109284729B (zh) * 2018-10-08 2020-03-03 北京影谱科技股份有限公司 Method, apparatus, and medium for acquiring face recognition model training data from video
CN110189397A (zh) * 2019-03-29 2019-08-30 北京市商汤科技开发有限公司 Image processing method and apparatus, computer device, and storage medium
CN110033505A (zh) * 2019-04-16 2019-07-19 西安电子科技大学 Deep-learning-based human motion capture and virtual animation generation method
CN110796721A (zh) * 2019-10-31 2020-02-14 北京字节跳动网络技术有限公司 Color rendering method and apparatus for avatars, terminal, and storage medium
CN110782515A (zh) * 2019-10-31 2020-02-11 北京字节跳动网络技术有限公司 Avatar generation method and apparatus, electronic device, and storage medium
CN111265879B (zh) * 2020-01-19 2023-08-08 百度在线网络技术(北京)有限公司 Avatar generation method, apparatus, device, and storage medium
CN111354079B (zh) * 2020-03-11 2023-05-02 腾讯科技(深圳)有限公司 Method and apparatus for training a three-dimensional face reconstruction network and generating virtual face images
CN111598979B (zh) * 2020-04-30 2023-03-31 腾讯科技(深圳)有限公司 Method, apparatus, device, and storage medium for generating facial animation of a virtual character
CN112102468B (zh) * 2020-08-07 2022-03-04 北京汇钧科技有限公司 Model training and virtual character image generation method and apparatus, and storage medium
CN112258592A (zh) * 2020-09-17 2021-01-22 深圳市捷顺科技实业股份有限公司 Method and related apparatus for generating visible-light face images
CN112215927B (zh) * 2020-09-18 2023-06-23 腾讯科技(深圳)有限公司 Face video synthesis method, apparatus, device, and medium
CN112330781A (zh) * 2020-11-24 2021-02-05 北京百度网讯科技有限公司 Method, apparatus, device, and storage medium for generating a model and generating face animation
CN112562045B (zh) * 2020-12-16 2024-04-05 北京百度网讯科技有限公司 Method, apparatus, device, and storage medium for generating a model and generating 3D animation
CN112634282B (zh) * 2020-12-18 2024-02-13 北京百度网讯科技有限公司 Image processing method, apparatus, and electronic device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080309675A1 (en) * 2007-06-11 2008-12-18 Darwin Dimensions Inc. Metadata for avatar generation in virtual environments
CN110348330A (zh) * 2019-06-24 2019-10-18 电子科技大学 Face-pose virtual view generation method based on VAE-ACGAN
CN112541963A (zh) * 2020-11-09 2021-03-23 北京百度网讯科技有限公司 Method and apparatus for generating a three-dimensional avatar, electronic device, and storage medium
CN113112580A (zh) * 2021-04-20 2021-07-13 北京字跳网络技术有限公司 Method, apparatus, device, and medium for generating an avatar

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4207080A4

Also Published As

Publication number Publication date
US20230128505A1 (en) 2023-04-27
CN113112580B (zh) 2022-03-25
EP4207080A4 (en) 2024-05-22
EP4207080A1 (en) 2023-07-05
CN113112580A (zh) 2021-07-13

Similar Documents

Publication Publication Date Title
WO2022222810A1 (zh) Method, apparatus, device, and medium for generating an avatar
US11625878B2 (en) Method, apparatus, and system generating 3D avatar from 2D image
US10860838B1 (en) Universal facial expression translation and character rendering system
KR102491140B1 (ko) Method and apparatus for generating a virtual avatar
KR20210119438A (ko) Systems and methods for face reenactment
JP2022503647A (ja) Cross-domain image translation
WO2020103700A1 (zh) Micro-expression-based image recognition method and apparatus, and related device
WO2023050992A1 (zh) Network training method and apparatus for face reconstruction, device, and storage medium
CN111432267B (zh) Video adjustment method and apparatus, electronic device, and storage medium
CN113362263B (zh) Method, device, medium, and program product for transforming the image of a virtual idol
CN108961369A (zh) Method and apparatus for generating 3D animation
CN113628327B (zh) Head three-dimensional reconstruction method and device
JP7483301B2 (ja) Image processing and image synthesis method, apparatus, and computer program
WO2022252866A1 (zh) Interaction processing method and apparatus, terminal, and medium
CN110688948A (zh) Method and apparatus for transforming face gender in video, electronic device, and storage medium
KR20120005587A (ko) Method and apparatus for generating facial animation in a computer system
WO2022047463A1 (en) Cross-domain neural networks for synthesizing image with fake hair combined with real image
WO2023124391A1 (zh) Makeup transfer method, and training method and apparatus for a makeup transfer network
WO2022257766A1 (zh) Image processing method, apparatus, device, and medium
WO2024088100A1 (zh) Special effect processing method and apparatus, electronic device, and storage medium
CN117132711A (zh) Digital portrait customization method, apparatus, device, and storage medium
KR102229056B1 (ko) Apparatus and method for generating a facial expression recognition model, and computer-readable recording medium storing a computer program for performing the method
CN114779948B (zh) Method, apparatus, and device for real-time interactive control of animated characters based on face recognition
CN116977547A (zh) Three-dimensional face reconstruction method and apparatus, electronic device, and storage medium
US12002160B2 (en) Avatar generation method, apparatus and device, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22790918

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022790918

Country of ref document: EP

Effective date: 20230329

NENP Non-entry into the national phase

Ref country code: DE