WO2020139054A1 - Apparatus and method for generating a virtual avatar - Google Patents
Apparatus and method for generating a virtual avatar
- Publication number
- WO2020139054A1 (PCT application PCT/KR2019/018710)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- target object
- occlusion
- virtual avatar
- image
- occlusion objects
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/255—Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The present disclosure discloses a method and apparatus for generating a virtual avatar, comprising: detecting whether a target object to be virtualized in an image wears preset occlusion objects; when the target object wears at least one of the occlusion objects: removing the detected occlusion objects from the image of the target object by a pre-trained neural network model; generating a corresponding virtual avatar according to the obtained image without the occlusion objects; and selecting, according to preset external features of each removed occlusion object, occlusion object images matching the preset external features from a preset three-dimensional (3D) image library of occlusion objects to load to a corresponding position of the virtual avatar to obtain the virtual avatar of the target object; and when the target object does not wear the occlusion objects: directly generating the corresponding virtual avatar according to the image of the target object. Through the present disclosure, the similarity between the virtual avatar and the real image may be improved.
Description
The disclosure relates to the field of image processing technologies. More particularly, the disclosure relates to an apparatus and method for generating a virtual avatar.
With the increasing popularity of models such as virtual avatars on terminal devices such as mobile phones, generation methods based on facial expressions and actions have become mainstream. Virtual avatars are mostly generated by selecting an existing selfie or by taking a new one.
The related virtual avatar generation scheme performs model matching directly on the self-photograph of the user; that is, for each facial feature, the corresponding facial feature texture is loaded from the model, and the textures are combined to generate the virtual avatar.
In the process of implementing the present disclosure, the inventor found that the related virtual avatar generation scheme may generate an erroneous three-dimensional virtualized avatar in many cases. In particular, when a user takes an image with decorative objects such as glasses or earrings, the virtual avatar generated based on the corresponding image has many errors and a low degree of similarity with the user, so that the virtual avatar cannot accurately reflect the user's appearance characteristics, which affects the recognizability of the virtual avatar.
An aspect of the present disclosure is to provide a method and device for generating a virtual avatar, which may improve the similarity between the virtual avatar and the real image.
In order to achieve the object above, the technical solution proposed by the present disclosure is:
A method for generating a virtual avatar, comprising:
detecting whether a target object to be virtualized in an image wears preset occlusion objects;
when the target object wears at least one of the occlusion objects:
removing the detected occlusion objects from the image of the target object by a pre-trained neural network model;
generating a corresponding virtual avatar according to the obtained image without the occlusion objects; and
selecting, according to preset external features of each removed occlusion object, occlusion object images matching the preset external features from a preset three-dimensional (3D) image library of occlusion objects to load to a corresponding position of the virtual avatar to obtain the virtual avatar of the target object;
when the target object does not wear the occlusion objects:
directly generating the corresponding virtual avatar according to the image of the target object.
Preferably, when the target object is a portrait, the occlusion objects comprise glasses and/or an item or hair blocking facial features.
Preferably, the neural network model comprises a convolutional neural network model.
Preferably, the external features comprise shape and/or color.
A device for generating a virtual avatar, comprising a processor, wherein the processor is configured to:
detect whether a target object to be virtualized in an image wears preset occlusion objects;
when the target object wears at least one of the occlusion objects:
remove the detected occlusion objects from the image of the target object by a pre-trained neural network model;
generate a corresponding virtual avatar according to the obtained image without the occlusion objects; and
select, according to preset external features of each removed occlusion object, occlusion object images matching the preset external features from a preset three-dimensional (3D) image library of occlusion objects to load to a corresponding position of the virtual avatar to obtain the virtual avatar of the target object;
when the target object does not wear the occlusion objects:
directly generate the corresponding virtual avatar according to the image of the target object.
Preferably, when the target object is a portrait, the occlusion objects comprise glasses and/or an item or hair blocking facial features.
Preferably, the neural network model comprises a convolutional neural network model.
Preferably, the external features comprise shape and/or color.
A non-transitory computer readable storage medium, storing instructions, wherein the instructions, when executed by a processor, cause the processor to perform the method for generating a virtual avatar as described above.
An electronic device, comprising: a non-transitory computer readable storage medium, and a processor capable of accessing the non-transitory computer readable storage medium.
In summary, in the method and device for generating a virtual avatar provided by the embodiments of the present disclosure, the generation mode is chosen according to whether the target object wears occlusion objects. When occlusion objects are worn, they are removed from the image of the target object by Artificial Intelligence (AI) technology to restore the image to an ideal input state for generating the virtual avatar. After the virtual avatar is generated, the external features of the removed occlusion objects are matched to corresponding occlusion object images, which are loaded onto the virtual avatar to obtain the final virtual avatar of the target object.
Various embodiments of the present disclosure thereby provide a better 3D display effect of the virtual avatar, effectively avoiding the errors caused by the occlusion objects when generating the virtual avatar in the related technology, and improving the similarity between the virtual avatar and the real image of the target object.
Figure 1 is a flowchart for a method according to various embodiments of the present disclosure.
In order to make the objects, technical solutions and advantages of the present disclosure clearer, the present disclosure will be further described in detail below with reference to drawings and specific embodiments.
Figure 1 is a flowchart for a method according to various embodiments of the present disclosure. As shown in Figure 1, the method for generating a user virtual avatar in an embodiment includes:
Step 101: Whether a target object to be virtualized in an image wears preset occlusion objects is detected.
In this step, it is necessary to detect whether the target object in the image is wearing the preset occlusion objects, so that when an occlusion object is present, it is processed first and the virtual avatar is then generated, which improves the similarity between the virtual avatar and the real target object.
The image may specifically be a self-photograph of a user or another image designated by the user, which is not limited herein.
Preferably, when the target object is a portrait, the occlusion objects may include glasses and/or an item or hair that blocks facial features. For example, the occlusion objects may be decorated glasses, sunglasses, or the like, and may also be earrings or another ornament that blocks the facial features, without being limited thereto.
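As an illustration only, the following is a minimal PyTorch sketch of the detection in step 101, assuming one pre-trained binary classifier per occlusion type; the detector models, the dictionary interface, and the 0.5 decision threshold are assumptions, since the disclosure does not specify the detector.

```python
import torch

def detect_occlusions(image_tensor, detectors):
    """image_tensor: a CxHxW float tensor; detectors: {name: model}, where each
    model maps a 1xCxHxW batch to a single wear/no-wear logit (assumed)."""
    worn = []
    with torch.no_grad():  # inference only, no gradients needed
        for name, model in detectors.items():
            logit = model(image_tensor.unsqueeze(0))
            if torch.sigmoid(logit).item() > 0.5:  # assumed decision threshold
                worn.append(name)
    return worn
```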
Step 102: When the target object wears at least one of the occlusion objects, occlusion objects detected are removed from the image of the target object, by a pre-trained neural network model, and a corresponding virtual avatar is generated according to the obtained image without the occlusion objects; and according to preset external features of each occlusion object removed, occlusion object images matching the preset external features are selected from the preset three Dimension image library of occlusion objects to load to a corresponding position of the virtual avatar, the final virtual avatar of the target object is obtained.
In this step, the pre-trained neural network model is used to remove the occlusion objects one by one from the image of the target object, restoring the image wearing the occlusion objects to an image without them. The corresponding virtual avatar is then generated based on the occlusion-free image, which improves the similarity between the virtual avatar and the real image and avoids the influence of the occlusion objects on the accuracy of the virtual avatar.
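As an illustration only, the following is a minimal sketch of how steps 101 to 103 fit together. All five callables, and the dictionary fields `external_features` and `position`, are hypothetical stand-ins for the detector, the pre-trained removal network, the avatar generator, and the 3D library lookup described in this disclosure.

```python
from typing import Callable

def build_virtual_avatar(image, detect: Callable, remove: Callable,
                         generate: Callable, match: Callable, attach: Callable):
    occlusions = detect(image)              # step 101: detect worn occlusion objects
    if not occlusions:
        return generate(image)              # step 103: generate directly
    clean = image
    for occ in occlusions:                  # step 102: remove occlusions one by one
        clean = remove(clean, occ)          # pre-trained removal network
    avatar = generate(clean)                # avatar from the occlusion-free image
    for occ in occlusions:                  # re-attach matched 3D occlusion images
        item = match(occ["external_features"])
        avatar = attach(avatar, item, occ["position"])
    return avatar
```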
In practical applications, the target object may be determined by a person skilled in the art according to the requirements of the actual virtualized avatar; for example, the target object may be a portrait or an image of another creature.
Assuming that the target object is a portrait, a specific training method of the neural network model may include the following steps:
X1, generating a training data set (the following takes glasses as the example occlusion object; other occlusion objects are handled similarly):
Select people with different skin colors and different genders, in different environments, and shoot one group of images without glasses and another group of images wearing glasses (alternatively, a glasses image of the right size may be loaded onto the group of images without glasses, and the result treated as the group of images wearing glasses). Of the two saved groups, the group of images wearing glasses is used as the deep-learning input, and the group of images without glasses is the ground-truth data. 80% of the image pairs may be randomly selected as a training set, and the remaining 20% used as a test set.
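As an illustration only, a minimal sketch of the pairing and the 80/20 random split follows; the aligned-list interface (the i-th entries of the two lists show the same person with and without glasses) and the fixed seed are assumptions.

```python
import random

def split_pairs(with_glasses, without_glasses, train_ratio=0.8, seed=0):
    """Pair each glasses image with its glasses-free ground truth, then split."""
    pairs = list(zip(with_glasses, without_glasses))  # (input, ground truth)
    random.Random(seed).shuffle(pairs)                # reproducible random split
    cut = int(len(pairs) * train_ratio)
    return pairs[:cut], pairs[cut:]                   # training set, test set
```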
X2, the training of the neural network model:
A codec network model of Context Encoders may be used to repair and reconstruct the input images wearing glasses. In the training process, the input images are first scaled to a preset standard size (e.g., 128×128), and the final reconstructed images are then generated by the codec network model composed of a multi-layer convolutional neural network.
The specific training process includes the following stages:
Coding stage: the original input images are encoded through an encoder network composed of the multi-layer convolutional neural network (such as a 5-layer convolutional neural network) to obtain coding features of a certain dimensions (such as when the encoder network composed of a 5-layer convolutional neural network is used, coding features of 4000 dimensions will be obtained).
Decoding stage: an encoded result obtained in the coding stage is input to a decoder based on a deep convolutional generative adversarial network (DCGAN) structure to generate reconstructed images.
Calculation of a loss value and adjustment of model parameters: an error value corresponding to the generated reconstructed image is calculated according to a loss function, and the model parameters of the neural network model are adjusted so as to minimize the error value.
Here, regarding the loss function used in the training above: in addition to the commonly used Mean Square Error (MSE), i.e., the squared error between the pixels of the real image and those of the generated reconstructed image, an adversarial loss term is added. This term comes from the error between the discriminator's judgment of the reconstructed image as false and the true label, and a better reconstruction effect is thus obtained.
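As an illustration only, a minimal PyTorch sketch of this joint loss follows. `disc` is an assumed discriminator returning a probability in (0, 1), and the 0.999/0.001 weighting follows the original Context Encoders paper; neither is specified in this disclosure.

```python
import torch
import torch.nn.functional as F

def reconstruction_loss(recon, target, disc, mse_weight=0.999):
    mse = F.mse_loss(recon, target)          # squared error between pixels
    p_real = disc(recon)                     # assumed: probability the image is real
    # Adversarial term: penalise the generator when the discriminator
    # judges the reconstruction to be false.
    adv = F.binary_cross_entropy(p_real, torch.ones_like(p_real))
    return mse_weight * mse + (1.0 - mse_weight) * adv
```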
Preferably, the neural network model includes, but is not limited to, a convolutional neural network model and a generative adversarial network model.
Preferably, the external features in this step may be set by those skilled in the art according to actual requirements, and may include features such as shape and/or color, but are not limited thereto; for example, the external features may also include a pattern or the like.
The generation of the virtual avatar in step 102 may be implemented by a related method, and details of the generation of the virtual avatar are not described herein again.
Specifically, in step 102, the matched occlusion object images can be worn on the virtual avatar by three-dimensional image technology; the specific method is known to those skilled in the art, and details are not described herein again.
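As an illustration only, a minimal sketch of matching a removed occlusion object to a library entry by its preset external features follows; the feature encoding (a shape label plus an RGB color) and the library record layout are assumptions, since the disclosure leaves the matching criterion open.

```python
def match_occlusion(features, library):
    """features: {'shape': str, 'color': (r, g, b)}; each library entry is a
    dict with the same keys plus an asset reference (assumed layout)."""
    def color_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))  # squared RGB distance
    same_shape = [e for e in library if e["shape"] == features["shape"]]
    candidates = same_shape or library  # fall back to color-only matching
    return min(candidates, key=lambda e: color_dist(e["color"], features["color"]))
```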
Step 103: When the target object does not wear the occlusion object, the corresponding virtual avatar is generated directly according to the image of the target object.
This step may be implemented by using related methods, and details are not described herein again.
According to the method for generating a virtual avatar in the embodiment of the present disclosure, whether the target object wears occlusion objects is detected before the virtual avatar is generated, and different generation modes are adopted accordingly. When occlusion objects are worn, they are removed from the image of the target object by Artificial Intelligence (AI) technology to restore the image to an ideal input state for generating the virtual avatar. The corresponding virtual avatar is then generated based on the occlusion-free image, and finally the images of the occlusion objects are matched according to their external features and loaded onto the virtual avatar to obtain the final virtual avatar of the target object. In this way, when the target object wears occlusion objects, the virtual avatar is generated from the reconstructed occlusion-free image, thereby ensuring the 3D display effect of the virtual avatar, effectively avoiding the errors caused by the occlusion objects when generating the virtual avatar in the related technology, and improving the similarity between the virtual avatar and the real image of the target object.
A schematic diagram illustrates the structure of a device for generating a virtual avatar corresponding to the method in the embodiment of the present disclosure. The device includes a processor, wherein the processor is configured to:
detect whether a target object to be virtualized in an image wears preset occlusion objects;
when the target object wears at least one of the occlusion objects, remove the detected occlusion objects from the image of the target object by a pre-trained neural network model, and generate a corresponding virtual avatar according to the obtained image without the occlusion objects; and, according to the preset external features of each removed occlusion object, select the occlusion object images matching the external features from a preset 3D image library of occlusion objects and load them onto the virtual avatar to obtain the virtual avatar of the target object;
when the target object does not wear the occlusion objects, generate a corresponding virtual avatar directly according to the image of the target object.
Preferably, when the target object is a portrait, the occlusion objects may include glasses and/or an item or hair that blocks facial features.
Preferably, the neural network model includes, but is not limited to, a convolutional neural network model and a generative adversarial network model.
Preferably, the external features may include shape and/or color.
A non-transitory computer readable storage medium, storing instructions, wherein the instructions, when executed by a processor, cause the processor to perform the method for generating the virtual avatar as described above.
An electronic device, comprising: a non-transitory computer readable storage medium, and a processor capable of accessing the non-transitory computer readable storage medium.
In conclusion, the embodiments above are only the preferred embodiments of the present disclosure and are not intended to limit the scope of the present disclosure. Any modifications, equivalent substitutions, improvements and so on made within the spirit and scope of the present disclosure are intended to be included within the scope of the present disclosure.
Claims (10)
- A method performed by an electronic device for generating a virtual avatar, comprising: detecting whether a target object to be virtualized in an image wears preset occlusion objects; when the target object wears at least one of the occlusion objects: removing the detected occlusion objects from the image of the target object by a pre-trained neural network model; generating a corresponding virtual avatar according to the obtained image without the occlusion objects; and selecting, according to preset external features of each removed occlusion object, occlusion object images matching the preset external features from a preset three-dimensional image library of occlusion objects to load to a corresponding position of the virtual avatar to obtain the virtual avatar of the target object; when the target object does not wear the occlusion objects: directly generating the corresponding virtual avatar according to the image of the target object.
- The method of claim 1, wherein when the target object is a portrait, the occlusion objects comprise glasses and/or an item or hair blocking facial features.
- The method of claim 1, wherein the neural network model comprises a convolutional neural network model.
- The method of claim 1, wherein the external features comprise shape and/or color.
- An apparatus for generating a virtual avatar, comprising: a processor, wherein the processor is configured to: detect whether a target object to be virtualized in an image wears preset occlusion objects; when the target object wears at least one of the occlusion objects: remove the detected occlusion objects from the image of the target object by a pre-trained neural network model; generate a corresponding virtual avatar according to the obtained image without the occlusion objects; and select, according to preset external features of each removed occlusion object, occlusion object images matching the preset external features from a preset three-dimensional image library of occlusion objects to load to a corresponding position of the virtual avatar to obtain the virtual avatar of the target object; when the target object does not wear the occlusion objects: directly generate the corresponding virtual avatar according to the image of the target object.
- The apparatus of claim 5, wherein when the target object is a portrait, the occlusion objects comprise glasses and/or an item or hair blocking facial features.
- The apparatus of claim 5, wherein the neural network model comprises a convolutional neural network model.
- The apparatus of claim 5, wherein the external features comprise shape and/or color.
- A non-transitory computer readable storage medium, storing instructions, wherein the instructions, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 4.
- An electronic device, comprising the non-transitory computer readable storage medium of claim 9, and a processor capable of accessing the non-transitory computer-readable storage medium.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811632024.3A CN109727320A (en) | 2018-12-29 | 2018-12-29 | A kind of generation method and equipment of avatar |
CN201811632024.3 | 2018-12-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020139054A1 true WO2020139054A1 (en) | 2020-07-02 |
Family
ID=66297899
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2019/018710 WO2020139054A1 (en) | 2018-12-29 | 2019-12-30 | Apparatus and method for generating a virtual avatar |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109727320A (en) |
WO (1) | WO2020139054A1 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110008940B (en) * | 2019-06-04 | 2020-02-11 | 深兰人工智能芯片研究院(江苏)有限公司 | Method and device for removing target object in image and electronic equipment |
CN113344776B (en) * | 2021-06-30 | 2023-06-27 | 北京字跳网络技术有限公司 | Image processing method, model training method, device, electronic equipment and medium |
CN115174985B (en) * | 2022-08-05 | 2024-01-30 | 北京字跳网络技术有限公司 | Special effect display method, device, equipment and storage medium |
CN115019401B (en) * | 2022-08-05 | 2022-11-11 | 上海英立视电子有限公司 | Prop generation method and system based on image matching |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20080050336A (en) * | 2006-12-02 | 2008-06-05 | 한국전자통신연구원 | A mobile communication terminal having a function of the creating 3d avata model and the method thereof |
US20150312523A1 (en) * | 2012-04-09 | 2015-10-29 | Wenlong Li | System and method for avatar management and selection |
US20170054945A1 (en) * | 2011-12-29 | 2017-02-23 | Intel Corporation | Communication using avatar |
US20180374251A1 (en) * | 2017-06-23 | 2018-12-27 | Disney Enterprises, Inc. | Single shot capture to animated vr avatar |
US20180374242A1 (en) * | 2016-12-01 | 2018-12-27 | Pinscreen, Inc. | Avatar digitization from a single image for real-time rendering |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105469379B (en) * | 2014-09-04 | 2020-07-28 | 广东中星微电子有限公司 | Video target area shielding method and device |
CN106204423B (en) * | 2016-06-28 | 2019-09-27 | Oppo广东移动通信有限公司 | A kind of picture-adjusting method based on augmented reality, device and terminal |
CN107145867A (en) * | 2017-05-09 | 2017-09-08 | 电子科技大学 | Face and face occluder detection method based on multitask deep learning |
- 2018-12-29 (CN): application CN201811632024.3A filed; patent CN109727320A (en), status: active, Pending
- 2019-12-30 (WO): application PCT/KR2019/018710 filed; publication WO2020139054A1 (en), status: active, Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN109727320A (en) | 2019-05-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19903039; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 19903039; Country of ref document: EP; Kind code of ref document: A1 |