CN115147265A - Virtual image generation method and device, electronic equipment and storage medium - Google Patents
Virtual image generation method and device, electronic equipment and storage medium
- Publication number
- CN115147265A
- CN202210776196.8A
- Authority
- CN
- China
- Prior art keywords
- image
- target
- preset
- style
- mapping information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Processing Or Creating Images (AREA)
Abstract
The present disclosure provides an avatar generation method, which relates to the technical field of artificial intelligence, in particular to augmented reality, virtual reality, computer vision, deep learning and the like, and can be applied to scenarios such as the metaverse and avatar generation. The specific implementation scheme is as follows: converting the preset image according to conversion information between a first coordinate system of the target style image and a second coordinate system of the preset image to obtain a first registration image; aligning a plurality of preset bases of the first registration image with a plurality of target style bases of the target style image to obtain a second registration image; obtaining first mapping information according to the conversion information and the second registration image; and generating a target avatar of the target object in the target image according to the target image and the first mapping information. The present disclosure also provides an avatar generation apparatus, an electronic device, and a storage medium.
Description
Technical Field
The present disclosure relates to the field of artificial intelligence, and more particularly to augmented reality, virtual reality, computer vision, deep learning, and the like, and can be applied to scenarios such as the metaverse and avatar generation. More particularly, the present disclosure provides an avatar generation method, apparatus, electronic device, and storage medium.
Background
With the development of artificial intelligence technology, deep learning models are widely used for image processing or image generation in fields such as virtual reality and augmented reality. In addition, avatars are widely used in scenarios such as social networking, live streaming, and games.
Disclosure of Invention
The present disclosure provides an avatar generation method, apparatus, device, and storage medium.
According to an aspect of the present disclosure, there is provided an avatar generation method, the method including: converting the preset image according to conversion information between a first coordinate system of the target style image and a second coordinate system of the preset image to obtain a first registration image; aligning a plurality of preset bases of the first registration image with a plurality of target style bases of the target style image to obtain a second registration image; obtaining first mapping information according to the conversion information and the second registration image; and generating a target avatar of the target object in the target image according to the target image and the first mapping information.
According to another aspect of the present disclosure, there is provided an avatar generating apparatus, the apparatus including: the conversion module is used for converting the preset image according to conversion information between a first coordinate system of the target style image and a second coordinate system of the preset image to obtain a first registration image; the alignment module is used for aligning the plurality of preset bases of the first registration image with the plurality of target style bases of the target style image to obtain a second registration image; an obtaining module, configured to obtain first mapping information according to the conversion information and the second registration image; and the first generation module is used for generating a target virtual image of the target object in the target image according to the target image and the first mapping information.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method provided in accordance with the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform a method provided according to the present disclosure.
According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements a method provided according to the present disclosure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram of an exemplary system architecture to which the avatar generation method and apparatus may be applied, according to one embodiment of the present disclosure;
FIG. 2 is a flow diagram of an avatar generation method according to one embodiment of the present disclosure;
FIGS. 3A-3C are schematic diagrams of an avatar generation method according to one embodiment of the present disclosure;
FIG. 4 is a flow diagram of an avatar generation method according to another embodiment of the present disclosure;
FIG. 5 is a block diagram of an avatar generation apparatus according to one embodiment of the present disclosure; and
FIG. 6 is a block diagram of an electronic device to which an avatar generation method may be applied according to one embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of embodiments of the present disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The avatar may include various types of images such as a two-dimensional image, a three-dimensional image, a cartoon image, a realistic-style image, a hyper-realistic-style image, and the like. In the process of generating an avatar, the avatar can be designed manually. However, manually designing the avatar requires high labor costs. In addition, when the avatar is generated using related software, the software must respond in real time to the instructions input by the designer, so the resource cost of the software is high.
Fig. 1 is a schematic diagram of an exemplary system architecture to which the avatar generation method and apparatus may be applied, according to one embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1, the system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired and/or wireless communication links, and so forth.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a backend management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The backend management server may analyze and process received data such as a user request, and feed back a processing result (for example, a web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the avatar generation method provided in the embodiment of the present disclosure may be generally executed by the server 105. Accordingly, the avatar generation apparatus provided by the embodiments of the present disclosure may be generally provided in the server 105. The avatar generation method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the avatar generation apparatus provided in the embodiment of the present disclosure may also be disposed in a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
Fig. 2 is a flowchart of an avatar generation method according to one embodiment of the present disclosure.
As shown in fig. 2, the method 200 may include operations S210 to S240.
In operation S210, the preset image is transformed according to transformation information between the first coordinate system of the target-style image and the second coordinate system of the preset image, so as to obtain a first registration image.
For example, the target style image may be a manually designed image. The target style image may be a three-dimensional image.
For example, the preset image may be an image preset in the parameterized model. The preset image may be a three-dimensional image. In one example, the parameterized model may include, for example, a 3DMM (3D Morphable Model) or a Blendshape (blend shape) model.
For example, the coordinate systems of the target style image and the preset image are different. Three-dimensional points in different coordinate systems can be transformed into each other according to the transformation information. The transformation information may be implemented as a transformation matrix. The coordinates of a three-dimensional point may be implemented as a 3 x 1 matrix. Multiplying the transformation matrix by the coordinates of a three-dimensional point in the second coordinate system transforms that point into the first coordinate system.
For example, a first registration image may be obtained by transforming each three-dimensional point in the preset image into the first coordinate system using the transformation matrix.
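The point-wise transformation described above can be sketched as follows. This is a minimal illustration only: it assumes the transformation matrix is a 4 x 4 affine matrix applied in homogeneous coordinates, and the names `transform_point`, `register`, `M` and `preset_points` are hypothetical and do not come from the disclosure.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def transform_point(matrix: Sequence[Sequence[float]], point: Sequence[float]) -> Point:
    """Apply a 4x4 affine matrix to a 3D point using homogeneous coordinates."""
    x, y, z = point
    homogeneous = (x, y, z, 1.0)
    # Only the first three rows are needed; the last row of an affine matrix is (0, 0, 0, 1).
    out = [sum(matrix[r][c] * homogeneous[c] for c in range(4)) for r in range(3)]
    return (out[0], out[1], out[2])


def register(matrix: Sequence[Sequence[float]], points: Sequence[Sequence[float]]) -> List[Point]:
    """Transform every point of the preset image into the first coordinate system."""
    return [transform_point(matrix, p) for p in points]


# Hypothetical transform (assumption): uniform scale by 2, then translation by (1, 0, -1).
M = [
    [2.0, 0.0, 0.0, 1.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, -1.0],
    [0.0, 0.0, 0.0, 1.0],
]
preset_points = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
first_registration = register(M, preset_points)
```

With this hypothetical matrix, the origin maps to (1, 0, -1) and the point (1, 2, 3) maps to (3, 4, 5).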
For example, a first object may be included in the target style image. The first object may be a human, animal, robot, or the like object having a face or a head. For another example, the second object may be included in the preset image. The second object may be a human, an animal, a robot, or the like having a face or a head.
In operation S220, a plurality of preset bases of the first registered image are aligned with a plurality of target style bases of the target style image, resulting in a second registered image.
For example, the second registration image may be obtained by aligning the plurality of preset bases with the plurality of target style bases using a Non-rigid Iterative Closest Point (NICP) algorithm.
For another example, the triangular patches and point normals associated with the preset bases may be adjusted to obtain the second registration image.
In operation S230, first mapping information is obtained according to the conversion information and the second registration image.
For example, the first mapping information may be used to convert the second registration image to the second coordinate system.
In operation S240, a target avatar of a target object in the target image is generated according to the target image and the first mapping information.
For example, a first three-dimensional image of the target object may be obtained by processing the target image with a parameterized model. An avatar may be obtained by processing the first three-dimensional image with the first mapping information.
For another example, the target object may be an object having a face or a head, such as a human, an animal, or a robot.
With the embodiments of the present disclosure, the first mapping information is determined according to the target style image and the preset image, whereby style information consistent with the target style image can be efficiently added to the avatar of the target object.
In some embodiments, the bases of the present disclosure are described in detail below, taking the case in which the parameterized model is a Blendshape model as an example.
A face with an expression can be split into two components: a personality component and an expression component. For example, the personality component is related to the inherent characteristics of an object's face and may be used to distinguish between the faces of different objects. The personality component may remain unchanged over a longer time frame (e.g., 7x24 hours). For another example, the expression component may change within a short time scale (e.g., 1 second). The face of an object may have a variety of expressions.
For example, the Blendshape model includes a plurality of bases for representing expressions, so as to represent the expression component. In one example, combining different bases may result in one expression component. The expression component and the personality component are then superposed to determine a virtual face. It is understood that there may be multiple feature points on the virtual face.
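The combination of bases and the superposition with the personality component can be illustrated with a short sketch. It assumes the common blend shape formulation in which each base stores per-vertex offsets and the expression component is a weighted sum of those offsets; the disclosure does not state the formula, and the names `expression_component`, `superpose`, `base_smile` and `base_jaw` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def expression_component(bases: Sequence[Sequence[Point]], weights: Sequence[float]) -> List[Point]:
    """Weighted sum of per-vertex base offsets (assumed linear blend shape model)."""
    vertex_count = len(bases[0])
    component = [[0.0, 0.0, 0.0] for _ in range(vertex_count)]
    for base, weight in zip(bases, weights):
        for i, offset in enumerate(base):
            for k in range(3):
                component[i][k] += weight * offset[k]
    return [(c[0], c[1], c[2]) for c in component]


def superpose(personality: Sequence[Point], expression: Sequence[Point]) -> List[Point]:
    """Add the expression component to the personality component, vertex by vertex."""
    return [
        (p[0] + e[0], p[1] + e[1], p[2] + e[2])
        for p, e in zip(personality, expression)
    ]


# Hypothetical two-vertex face and two bases (assumptions for illustration).
personality = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
base_smile = [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
base_jaw = [(0.0, 0.0, 2.0), (0.0, 0.0, 0.0)]

expression = expression_component([base_smile, base_jaw], [0.5, 0.25])
virtual_face = superpose(personality, expression)
```

Changing only the weights yields a different expression on the same personality component, which matches the split described above.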
For another example, the base may correspond to the subject's facial contours, eyes, mouth, nose, etc.
In some embodiments, the input image may be preprocessed to obtain the target image. The preprocessing may include image cropping, translation, and the like. Through preprocessing, the target object in the target image is placed at a preset position.
In some embodiments, aligning the plurality of preset bases of the first registration image with the plurality of target style bases of the target style image to obtain the second registration image comprises: determining the target style base corresponding to a preset base according to style semantic information of the target style base and preset semantic information of the preset base; and adjusting the position and the size of the preset base corresponding to the target style base in the first registration image according to the position and the size of the target style base in the target style image to obtain the second registration image. This is described in detail below with reference to FIGS. 3A to 3C.
Fig. 3A to 3C are schematic diagrams of an avatar generation method according to one embodiment of the present disclosure.
As shown in fig. 3A, the target style image 301 may be a three-dimensional image. For example, the target style image 301 may include a plurality of target style bases. A target style base may correspond to a facial feature of the first object. The style semantic information may indicate the organ to which the target style base corresponds. For example, the plurality of target style bases may include: a target style base Base_style_eye corresponding to the eyes of the first object, a target style base Base_style_mouth corresponding to the mouth of the first object, and so on.
As shown in fig. 3B, the preset image 302 may be a preset image in the parametric model. The preset image 302 is converted according to a conversion matrix between the first coordinate system of the target style image 301 and the second coordinate system of the preset image 302, so as to obtain a first registration image 303. The coordinate system of the first registration image 303 may coincide with the coordinate system of the target-style image 301. In one example, the transformation matrix may be an affine transformation matrix.
For example, the preset image 302 may include a plurality of initial bases. The first registration image 303 resulting from the conversion of the preset image 302 may also comprise a plurality of preset bases. Each preset base may correspond to one initial base. For another example, a preset base may correspond to a facial feature of the second object. The preset semantic information may indicate the organ to which the preset base corresponds. The plurality of preset bases may include: a preset base Base_pre_eye corresponding to an eye of the second object, a preset base Base_pre_mouth corresponding to the mouth of the second object, and the like.
Next, the plurality of preset bases of the first registration image 303 may be aligned with the plurality of target style bases of the target style image 301. For example, it may be determined that the target style base Base_style_eye corresponds to the preset base Base_pre_eye according to the style semantic information of Base_style_eye and the preset semantic information of Base_pre_eye. The position of the preset base Base_pre_eye in the first registration image 303 is adjusted according to the position of the target style base Base_style_eye in the target style image 301. The size of the preset base Base_pre_eye is adjusted according to the size of the target style base Base_style_eye, so that the position of the eye corner in the adjusted first registration image 303 is consistent with the position of the eye corner in the target style image 301. In one example, the sizes and point normals of the triangular patches corresponding to the preset base Base_pre_eye may be adjusted to adjust the position and size of Base_pre_eye.
After the adjustment of the plurality of preset bases is completed, a second registration image 304 may be obtained.
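The matching-then-adjusting procedure above can be sketched in simplified form. The sketch below is not the NICP algorithm and does not touch triangular patches or normals; it assumes that semantic information is a plain string label, that "position" is the centroid of a base's points, and that "size" is the largest side of its bounding box. The names `align_base` and `align_all` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Mean position of a set of points (used here as the base's position)."""
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


def extent(points: Sequence[Point]) -> float:
    """Largest bounding-box side (used here as the base's size)."""
    return max(max(p[k] for p in points) - min(p[k] for p in points) for k in range(3))


def align_base(preset: Sequence[Point], style: Sequence[Point]) -> List[Point]:
    """Move and scale a preset base so its position and size match the style base."""
    c_pre, c_style = centroid(preset), centroid(style)
    size_pre = extent(preset)
    scale = extent(style) / size_pre if size_pre > 0 else 1.0
    return [
        tuple(c_style[k] + scale * (p[k] - c_pre[k]) for k in range(3))
        for p in preset
    ]


def align_all(preset_bases: Dict[str, List[Point]], style_bases: Dict[str, List[Point]]) -> Dict[str, List[Point]]:
    """Pair bases by semantic label, then adjust each paired preset base."""
    aligned = {}
    for label, points in preset_bases.items():
        if label in style_bases:
            aligned[label] = align_base(points, style_bases[label])
        else:
            aligned[label] = list(points)  # no counterpart: leave unchanged
    return aligned


# Hypothetical data: one eye base, two points each (e.g. the two eye corners).
preset_bases = {"eye": [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]}
style_bases = {"eye": [(10.0, 0.0, 0.0), (14.0, 0.0, 0.0)]}
second_registration = align_all(preset_bases, style_bases)
```

In this toy case the adjusted eye corners coincide with those of the style base, which is the consistency condition stated above for the eye corner positions.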
From the second registration image 304 and the preset image 302, a mapping relationship between the two can be determined as the first mapping information in various ways.
In one example, the second registration image 304 is converted to the second coordinate system of the preset image 302 using the conversion matrix (or an inverse of the conversion matrix). In the second coordinate system, an initial mapping relationship between the converted second registration image 304 and the preset image 302 is determined. The first mapping information can be obtained according to the initial mapping relationship and the conversion matrix. Through the embodiments of the present disclosure, the second registration image can carry the style information of the target style image. When other images are subsequently processed with the parameterized model according to the first mapping information, the processed images can also carry the style information of the target style image.
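One possible reading of this example is sketched below. The disclosure does not specify the form of the mapping, so the sketch makes two loud assumptions: the conversion is a uniform scale plus a translation (so that its inverse is trivial), and the initial mapping relationship is a per-vertex offset between the preset image and the converted second registration image. All names (`to_second`, `initial_mapping`, `apply_mapping`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def to_second(point: Point, scale: float, translation: Point) -> Point:
    """Inverse of the assumed conversion q = scale * p + translation."""
    return tuple((point[k] - translation[k]) / scale for k in range(3))


def initial_mapping(
    second_registration: Sequence[Point],
    preset_points: Sequence[Point],
    scale: float,
    translation: Point,
) -> List[Point]:
    """Per-vertex offsets, in the second coordinate system, from preset to styled shape."""
    converted = [to_second(q, scale, translation) for q in second_registration]
    return [
        tuple(c[k] - p[k] for k in range(3))
        for c, p in zip(converted, preset_points)
    ]


def apply_mapping(points: Sequence[Point], offsets: Sequence[Point]) -> List[Point]:
    """Apply the offsets to another shape that shares the preset image's vertex order."""
    return [tuple(p[k] + o[k] for k in range(3)) for p, o in zip(points, offsets)]


# Hypothetical data (assumptions): conversion is scale 2 with translation (1, 0, 0).
scale, translation = 2.0, (1.0, 0.0, 0.0)
preset_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
second_registration = [(1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]  # after base alignment

offsets = initial_mapping(second_registration, preset_points, scale, translation)
other_face = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
styled_face = apply_mapping(other_face, offsets)
```

Because the offsets live in the second coordinate system, they can be applied directly to any shape produced by the parameterized model, which is why the sketch converts the second registration image back before comparing.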
As shown in fig. 3C, a first three-dimensional image 306 may be obtained by processing the target image 305 with the parameterized model. By processing the first three-dimensional image 306 with the first mapping information, an avatar 307 may be obtained. It is to be understood that the avatar 307 may be a three-dimensional image or a three-dimensional mesh model.
It is to be appreciated that in the embodiments described above, the avatar may be generated by processing the first three-dimensional image with the first mapping information. In the embodiments of the present disclosure, the first three-dimensional image may be further processed to obtain an avatar with a higher similarity to the target object, as described in detail below.
In some embodiments, obtaining the target avatar of the target object in the target image according to the target image and the first mapping information comprises: determining a first three-dimensional image of the target object according to at least one target feature point of the target object; adjusting the first three-dimensional image to make a first difference between the first three-dimensional image and the target image converge to obtain a second three-dimensional image; and processing the second three-dimensional image using the first mapping information to generate the target avatar.
For example, the first difference may be a projection error that is minimized. It will be appreciated that the target image may be, for example, a two-dimensional image. The target image is processed with the parameterized model according to at least one target feature point of the target object to obtain the first three-dimensional image. Adjusting the first three-dimensional image so as to minimize the projection error may make the first three-dimensional image more similar to the target image, and in particular may make the facial features and facial contours of the object in the two images more similar.
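The idea of adjusting a three-dimensional shape until its projection error converges can be illustrated with a deliberately small sketch. It assumes an orthographic projection (dropping the z coordinate), a shape controlled by a single weight `w` on one offset base, and plain gradient descent; none of these choices is stated in the disclosure, and `fit_weight` and `shape_for` are hypothetical names.

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]


def shape_for(neutral: Sequence[Point3], delta: Sequence[Point3], w: float) -> List[Point3]:
    """Shape controlled by one weight: neutral + w * delta, vertex by vertex."""
    return [tuple(n[k] + w * d[k] for k in range(3)) for n, d in zip(neutral, delta)]


def projection_error(shape: Sequence[Point3], target_2d: Sequence[Point2]) -> float:
    """Sum of squared distances between orthographic projections and 2D feature points."""
    return sum((s[0] - t[0]) ** 2 + (s[1] - t[1]) ** 2 for s, t in zip(shape, target_2d))


def fit_weight(neutral, delta, target_2d, lr=0.1, steps=200, tol=1e-12) -> float:
    """Gradient descent on the projection error with respect to the weight w."""
    w = 0.0
    for _ in range(steps):
        shape = shape_for(neutral, delta, w)
        # d(error)/dw = 2 * sum((proj(shape) - target) . proj(delta))
        grad = sum(
            2.0 * ((s[0] - t[0]) * d[0] + (s[1] - t[1]) * d[1])
            for s, d, t in zip(shape, delta, target_2d)
        )
        if abs(grad) < tol:
            break  # the error has converged
        w -= lr * grad
    return w


# Hypothetical data (assumptions): two vertices, one base that raises both of them.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
delta = [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
target_2d = [(0.0, 0.6), (1.0, 0.6)]

w = fit_weight(neutral, delta, target_2d)
second_image = shape_for(neutral, delta, w)
```

A real fit would optimize many weights and a camera pose at once; the single-weight case only shows the convergence loop.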
For another example, by processing the second three-dimensional image using the first mapping information, an avatar having a higher similarity to the target object can be obtained.
It will be appreciated that after an avatar is derived from the first mapping information and the second three-dimensional image, the avatar may be further processed to obtain a more realistic target avatar, as described in detail below.
In some embodiments, processing the second three-dimensional image using the first mapping information to generate the target avatar comprises: processing the second three-dimensional image by using the first mapping information to obtain an initial virtual image; and weighting the plurality of regions of the initial avatar by using the plurality of preset weights respectively to generate the target avatar.
For example, after the second three-dimensional image is processed according to the first mapping information using the parameterized model, the resulting avatar may be used as the initial avatar. It will be appreciated that in the case where the parameterized model is a Blendshape model, the face contour of the initial avatar may appear distorted. For example, the chin of the initial avatar may be sharp and not smooth enough.
In this case, the facial feature region of the initial avatar may be weighted with a first preset weight (e.g., 0.8), and the face contour region of the initial avatar may be weighted with a second preset weight (e.g., 0.2). The weighted initial avatar may be used as the target avatar. Through the embodiments of the present disclosure, weighting the avatar yields a more realistic target avatar.
It is to be understood that some embodiments of the avatar generation method are described in detail above. After the target avatar is obtained, the target avatar may be driven such that it exhibits different expressions. This is described in detail below with reference to fig. 4.
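The region weighting can be sketched under one possible interpretation. The disclosure does not say what quantity the weights multiply; the sketch assumes each weight controls how far a vertex moves from the second three-dimensional image toward the initial avatar, so that the contour is stylized less strongly than the facial features. The names `weight_regions`, `REGION_WEIGHTS` and the region labels are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]

# Example weights taken from the text: 0.8 for facial features, 0.2 for the face contour.
REGION_WEIGHTS: Dict[str, float] = {"features": 0.8, "contour": 0.2}


def weight_regions(
    second_image: Sequence[Point],
    initial_avatar: Sequence[Point],
    regions: Sequence[str],
    weights: Dict[str, float],
) -> List[Point]:
    """Blend each vertex between the unstyled and styled positions by its region weight.

    Assumption: weight 0 keeps the second three-dimensional image, weight 1 keeps
    the initial avatar.
    """
    result = []
    for source, styled, region in zip(second_image, initial_avatar, regions):
        w = weights[region]
        result.append(tuple(source[k] + w * (styled[k] - source[k]) for k in range(3)))
    return result


# Hypothetical data: one feature vertex and one contour vertex.
second_image = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
initial_avatar = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
regions = ["features", "contour"]

target_avatar = weight_regions(second_image, initial_avatar, regions, REGION_WEIGHTS)
```

Under this reading, a low contour weight damps the mapping where the Blendshape result tends to look distorted, such as a sharp chin.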
Fig. 4 is a flowchart of an avatar generation method according to another embodiment of the present disclosure.
As shown in fig. 4, the method 400 may include operations S450 to S470. It is understood that operation S450 may be performed after operation S240 described above.
In operation S450, driving information for a target avatar is acquired.
In the embodiment of the present disclosure, the first mapping information is associated with a plurality of target feature points.
For example, as described above, the first mapping information is derived from the conversion information and the second registration image. The second object in the preset image may have a plurality of initial feature points. The plurality of initial feature points may include, for example: initial feature points associated with the eyes of the second object, initial feature points associated with the mouth of the second object, and so on. In one example, there may be 6 initial feature points associated with the left eye of the second object.
For another example, a target object in the target image may have a plurality of target feature points. The plurality of target feature points may include, for example: target feature points associated with the eyes of the target object, target feature points associated with the mouth of the target object, and so on. In one example, there may be 6 target feature points associated with the left eye of the target object. According to the semantic information of the target feature points, the target feature points can be associated with the initial feature points, and thereby with the first mapping information.
For example, the drive information is associated with at least one target feature point of the plurality of target feature points. In one example, one driving information is used to drive the target avatar such that the avatar exhibits the expression "smile". The drive information may include first sub drive information and second sub drive information. The first sub driving information and the second sub driving information may be respectively related to the following target feature points: target feature points associated with the mouth of the target object and target feature points associated with the eyes of the target object.
In operation S460, the first mapping information is updated with the driving information, resulting in second mapping information.
For example, the first mapping information may include first sub-mapping information, second sub-mapping information, and the like. The first sub-mapping information and the second sub-mapping information may be respectively related to the following target feature points: target feature points associated with the mouth of the target object and target feature points associated with the eyes of the target object. For example, first sub-update mapping information may be obtained by performing various operations on the first sub-mapping information and the first sub driving information. Similarly, second sub-update mapping information may be obtained by performing various operations on the second sub-mapping information and the second sub driving information. The second mapping information is then obtained according to the first sub-update mapping information and the second sub-update mapping information.
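The update of the sub-mapping information by the sub driving information can be sketched as follows. The text leaves the "various operations" open; the sketch assumes that mapping and driving information are per-point offsets grouped by facial region and that the operation is simple addition. The name `update_mapping` and the data layout are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]
Mapping = Dict[str, List[Point]]


def update_mapping(first_mapping: Mapping, driving_info: Mapping) -> Mapping:
    """Combine each sub-mapping with the sub driving information for the same region.

    Assumption: the combination is element-wise addition of offsets. Regions that
    the driving information does not mention are copied unchanged.
    """
    second_mapping: Mapping = {}
    for region, offsets in first_mapping.items():
        drive = driving_info.get(region)
        if drive is None:
            second_mapping[region] = list(offsets)
        else:
            second_mapping[region] = [
                tuple(o[k] + d[k] for k in range(3)) for o, d in zip(offsets, drive)
            ]
    return second_mapping


# Hypothetical data: one offset per region; "smile" drives the mouth and the eyes.
first_mapping = {
    "mouth": [(0.0, 0.0, 0.0)],
    "eyes": [(0.5, 0.0, 0.0)],
    "nose": [(0.0, 0.0, 1.0)],
}
smile = {
    "mouth": [(0.0, 0.25, 0.0)],   # first sub driving information
    "eyes": [(0.0, -0.25, 0.0)],   # second sub driving information
}
second_mapping = update_mapping(first_mapping, smile)
```

Keying both structures by region mirrors the way the text relates sub-mapping and sub driving information to the same groups of target feature points.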
In operation S470, an updated avatar of the target object is generated according to the target image and the second mapping information.
For example, the first three-dimensional image is processed with the second mapping information to generate an updated avatar such that the updated avatar may exhibit the expression "smile".
Fig. 5 is a block diagram of an avatar generation apparatus according to one embodiment of the present disclosure.
As shown in fig. 5, the apparatus 500 may include a conversion module 510, an alignment module 520, an obtaining module 530, and a first generating module 540.
The conversion module 510 is configured to convert the preset image according to conversion information between the first coordinate system of the target style image and the second coordinate system of the preset image, so as to obtain a first registration image.
The alignment module 520 is configured to align the plurality of preset bases of the first registration image with the plurality of target style bases of the target style image to obtain a second registration image.
The obtaining module 530 is configured to obtain the first mapping information according to the conversion information and the second registration image.
The first generating module 540 is configured to generate a target avatar of the target object in the target image according to the target image and the first mapping information.
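As a non-limiting illustration, the data flow between modules 510 to 540 can be sketched as a chain of callables. The class name `AvatarApparatus`, the method names, and every signature below are hypothetical; the stub lambdas only tag their inputs so that the order of the calls is visible and do not perform any image processing.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class AvatarApparatus:
    """Hypothetical wiring of modules 510-540; each module is an injected callable."""

    convert: Callable[[Any, Any], Any]   # 510: (preset image, conversion info) -> first registration image
    align: Callable[[Any, Any], Any]     # 520: (first registration image, style image) -> second registration image
    obtain: Callable[[Any, Any], Any]    # 530: (conversion info, second registration image) -> first mapping info
    generate: Callable[[Any, Any], Any]  # 540: (target image, first mapping info) -> target avatar

    def build_mapping(self, preset_image: Any, style_image: Any, conversion_info: Any) -> Any:
        first_registered = self.convert(preset_image, conversion_info)
        second_registered = self.align(first_registered, style_image)
        return self.obtain(conversion_info, second_registered)

    def run(self, target_image: Any, preset_image: Any, style_image: Any, conversion_info: Any) -> Any:
        first_mapping = self.build_mapping(preset_image, style_image, conversion_info)
        return self.generate(target_image, first_mapping)


# Stub modules that tag their inputs, for tracing the data flow only.
apparatus = AvatarApparatus(
    convert=lambda preset, info: ("first_registered", preset),
    align=lambda registered, style: ("second_registered", registered, style),
    obtain=lambda info, registered: ("first_mapping", info, registered),
    generate=lambda target, mapping: ("avatar", target, mapping),
)
result = apparatus.run("target", "preset", "style", "T")
```

Because the first mapping information depends only on the preset image, the target style image, and the conversion information, `build_mapping` could be run once per style and its result reused for many target images.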
In some embodiments, the first generating module comprises: a first determining submodule for determining a first three-dimensional image of the target object according to at least one target feature point of the target object; a first adjusting submodule for adjusting the first three-dimensional image so that a first difference between the first three-dimensional image and the target image converges, to obtain a second three-dimensional image; and a processing submodule for processing the second three-dimensional image using the first mapping information to generate the target avatar.
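As a non-limiting illustration, the adjustment performed by the first adjusting submodule can be sketched as an iterative fit. The disclosure does not specify the difference measure or the optimizer, so this sketch assumes an orthographic projection, a mean squared distance to two-dimensional target feature points as the first difference, and plain gradient descent over a uniform scale and a translation. The names `difference` and `fit`, the step size, and the sample points are hypothetical.

```python
from typing import List, Tuple

Point2 = Tuple[float, float]
Point3 = Tuple[float, float, float]


def difference(model: List[Point3], target: List[Point2]) -> float:
    """Mean squared distance between the orthographic projection of the model and the target points."""
    return sum((x - u) ** 2 + (y - v) ** 2 for (x, y, _), (u, v) in zip(model, target)) / len(target)


def fit(first_image: List[Point3], target: List[Point2],
        lr: float = 0.1, tol: float = 1e-12, max_iter: int = 10_000) -> List[Point3]:
    """Adjust scale s and translation (tx, ty) until the difference stops changing."""
    s, tx, ty = 1.0, 0.0, 0.0
    n = len(target)
    prev = float("inf")
    for _ in range(max_iter):
        loss = gs = gtx = gty = 0.0
        for (x, y, _), (u, v) in zip(first_image, target):
            rx, ry = s * x + tx - u, s * y + ty - v  # residual of one feature point
            loss += rx * rx + ry * ry
            gs += 2 * (rx * x + ry * y)
            gtx += 2 * rx
            gty += 2 * ry
        loss /= n
        if abs(prev - loss) < tol:  # the first difference has converged
            break
        prev = loss
        s, tx, ty = s - lr * gs / n, tx - lr * gtx / n, ty - lr * gty / n
    # Apply the fitted parameters; depth is scaled but not translated.
    return [(s * x + tx, s * y + ty, s * z) for x, y, z in first_image]


first_image = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.5)]
target_points = [(1.0, -1.0), (3.0, -1.0), (1.0, 1.0), (3.0, 1.0)]
second_image = fit(first_image, target_points)
```

In this example the target points are the first image scaled by 2 and shifted by (1, -1), so the fitted second three-dimensional image should project onto them almost exactly.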
In some embodiments, the processing submodule comprises: a processing unit for processing the second three-dimensional image using the first mapping information to obtain an initial avatar; and a weighting unit for weighting a plurality of regions of the initial avatar with a plurality of preset weights, respectively, to generate the target avatar.
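As a non-limiting illustration, the weighting unit can be sketched as a per-region blend. The disclosure does not state how a preset weight acts on a region, so this sketch assumes that a weight scales the displacement of each vertex of the initial avatar away from a reference shape (here called `base`), and that a region without a preset weight keeps its full displacement. The function name `weight_regions` and the region keys are hypothetical.

```python
from typing import Dict, List, Tuple

Vertex = Tuple[float, float, float]
Regions = Dict[str, List[Vertex]]


def weight_regions(base: Regions, initial: Regions, weights: Dict[str, float]) -> Regions:
    """Scale each region's displacement from the reference shape by its preset weight."""
    target: Regions = {}
    for region, vertices in initial.items():
        w = weights.get(region, 1.0)  # assumed default: keep the region as is
        target[region] = [
            tuple(b + w * (v - b) for b, v in zip(base_vertex, vertex))
            for base_vertex, vertex in zip(base[region], vertices)
        ]
    return target


base = {"eyes": [(0.0, 0.0, 0.0)], "mouth": [(1.0, 1.0, 1.0)]}
initial_avatar = {"eyes": [(2.0, 4.0, 0.0)], "mouth": [(3.0, 1.0, 1.0)]}
target_avatar = weight_regions(base, initial_avatar, {"eyes": 0.5})
```

With a weight of 0.5 the eye region moves halfway from the reference shape to the initial avatar, while the mouth region, which has no preset weight in this example, is left unchanged.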
In some embodiments, the alignment module comprises: a second determining submodule for determining the target style base corresponding to a preset base according to style semantic information of the target style base and preset semantic information of the preset base; and a second adjusting submodule for adjusting the position and the size of the preset base corresponding to the target style base in the first registration image according to the position and the size of the target style base in the target style image, to obtain a second registration image.
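As a non-limiting illustration, the two alignment steps can be sketched as a lookup followed by an adjustment. This sketch assumes that the semantic information of a base is a plain text label, that two bases correspond when their labels are equal, and that position and size are a two-dimensional center and a width and height. The class `Base`, the function `align_bases`, and the labels are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Base:
    semantic: str                # hypothetical label, e.g. "mouth" or "left_eye"
    center: Tuple[float, float]  # position of the base in its image
    size: Tuple[float, float]    # width and height of the base


def align_bases(preset_bases: List[Base], style_bases: List[Base]) -> List[Base]:
    """Move and resize each preset base to match the style base with the same label."""
    style_by_semantic: Dict[str, Base] = {b.semantic: b for b in style_bases}
    aligned: List[Base] = []
    for preset in preset_bases:
        style = style_by_semantic.get(preset.semantic)
        if style is None:
            aligned.append(preset)  # no corresponding style base: leave unchanged
        else:
            aligned.append(replace(preset, center=style.center, size=style.size))
    return aligned


preset_bases = [Base("mouth", (0.0, -1.0), (2.0, 1.0)), Base("nose", (0.0, 0.0), (1.0, 1.0))]
style_bases = [Base("mouth", (0.0, -1.5), (3.0, 1.5))]
second_registration = align_bases(preset_bases, style_bases)
```

In this example the preset mouth base takes the position and size of the style mouth base, and the nose base, which has no counterpart in the style image, is returned unchanged.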
In some embodiments, the first mapping information is associated with a plurality of target feature points, and the apparatus 500 further comprises: an acquisition module for acquiring driving information for the target avatar, wherein the driving information is related to at least one of the plurality of target feature points; an updating module for updating the first mapping information using the driving information to obtain second mapping information; and a second generating module for generating an updated avatar of the target object according to the target image and the second mapping information.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of users' personal information all comply with relevant laws and regulations and do not violate public order and good customs.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 6 illustrates a schematic block diagram of an example electronic device 600 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 6, the device 600 comprises a computing unit 601, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 602 or loaded from a storage unit 608 into a Random Access Memory (RAM) 603. The RAM 603 may also store various programs and data necessary for the operation of the device 600. The computing unit 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
A number of components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, a mouse, or the like; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 601 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 601 performs the methods and processes described above, such as the avatar generation method. For example, in some embodiments, the avatar generation method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into the RAM 603 and executed by the computing unit 601, one or more steps of the avatar generation method described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured to perform the avatar generation method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel or sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
Claims (13)
1. An avatar generation method, comprising:
converting the preset image according to conversion information between a first coordinate system of the target style image and a second coordinate system of the preset image to obtain a first registration image;
aligning a plurality of preset bases of the first registration image with a plurality of target style bases of the target style image to obtain a second registration image;
obtaining first mapping information according to the conversion information and the second registration image; and
generating a target avatar of a target object in the target image according to the target image and the first mapping information.
2. The method of claim 1, wherein generating a target avatar of a target object in the target image according to the target image and the first mapping information comprises:
determining a first three-dimensional image of the target object according to at least one target feature point of the target object;
adjusting the first three-dimensional image to make a first difference between the first three-dimensional image and the target image converge to obtain a second three-dimensional image; and
processing the second three-dimensional image using the first mapping information to generate the target avatar.
3. The method of claim 2, wherein said processing the second three-dimensional image with the first mapping information to generate the target avatar comprises:
processing the second three-dimensional image by using the first mapping information to obtain an initial avatar; and
weighting a plurality of regions of the initial avatar with a plurality of preset weights, respectively, to generate the target avatar.
4. The method of claim 1, wherein said aligning a plurality of preset bases of the first registered image with a plurality of target-style bases of the target-style image resulting in a second registered image comprises:
determining the target style base corresponding to a preset base according to style semantic information of the target style base and preset semantic information of the preset base; and
adjusting the position and the size of the preset base corresponding to the target style base in the first registration image according to the position and the size of the target style base in the target style image to obtain the second registration image.
5. The method of claim 1, wherein the first mapping information is related to a plurality of target feature points,
and the method further comprises:
acquiring driving information for the target avatar, wherein the driving information is related to at least one target feature point of the plurality of target feature points;
updating the first mapping information by using the driving information to obtain second mapping information; and
generating an updated avatar of the target object according to the target image and the second mapping information.
6. An avatar generation apparatus comprising:
a conversion module, configured to convert the preset image according to conversion information between a first coordinate system of the target style image and a second coordinate system of the preset image to obtain a first registration image;
an alignment module, configured to align a plurality of preset bases of the first registration image with a plurality of target style bases of the target style image to obtain a second registration image;
an obtaining module, configured to obtain first mapping information according to the conversion information and the second registration image; and
a first generating module, configured to generate a target avatar of a target object in the target image according to the target image and the first mapping information.
7. The apparatus of claim 6, wherein the first generating module comprises:
a first determining submodule, configured to determine a first three-dimensional image of the target object according to at least one target feature point of the target object;
a first adjusting submodule, configured to adjust the first three-dimensional image so that a first difference between the first three-dimensional image and the target image converges to obtain a second three-dimensional image; and
a processing submodule, configured to process the second three-dimensional image by using the first mapping information to generate the target avatar.
8. The apparatus of claim 7, wherein the processing submodule comprises:
a processing unit, configured to process the second three-dimensional image by using the first mapping information to obtain an initial avatar; and
a weighting unit, configured to weight a plurality of regions of the initial avatar with a plurality of preset weights, respectively, to generate the target avatar.
9. The apparatus of claim 6, wherein the alignment module comprises:
a second determining submodule, configured to determine the target style base corresponding to a preset base according to style semantic information of the target style base and preset semantic information of the preset base; and
a second adjusting submodule, configured to adjust the position and the size of the preset base corresponding to the target style base in the first registration image according to the position and the size of the target style base in the target style image to obtain the second registration image.
10. The apparatus of claim 6, wherein the first mapping information is related to a plurality of target feature points,
and the apparatus further comprises:
an acquisition module, configured to acquire driving information for the target avatar, wherein the driving information is related to at least one of the plurality of target feature points;
an updating module, configured to update the first mapping information by using the driving information to obtain second mapping information; and
a second generating module, configured to generate an updated avatar of the target object according to the target image and the second mapping information.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 5.
12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1 to 5.
13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210776196.8A CN115147265B (en) | 2022-06-30 | 2022-06-30 | Avatar generation method, apparatus, electronic device, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210776196.8A CN115147265B (en) | 2022-06-30 | 2022-06-30 | Avatar generation method, apparatus, electronic device, and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115147265A (en) | 2022-10-04 |
CN115147265B (en) | 2023-05-30 |
Family
ID=83410795
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210776196.8A Active CN115147265B (en) | 2022-06-30 | 2022-06-30 | Avatar generation method, apparatus, electronic device, and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115147265B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115409933A (en) * | 2022-10-28 | 2022-11-29 | 北京百度网讯科技有限公司 | Multi-style texture mapping generation method and device |
CN115578431A (en) * | 2022-10-17 | 2023-01-06 | 北京百度网讯科技有限公司 | Image depth processing method and device, electronic equipment and medium |
CN116051694A (en) * | 2022-12-20 | 2023-05-02 | 百度时代网络技术(北京)有限公司 | Avatar generation method, apparatus, electronic device, and storage medium |
CN117333601A (en) * | 2023-11-16 | 2024-01-02 | 虚拟现实(深圳)智能科技有限公司 | Digital virtual clothing generation method and device based on artificial intelligence |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6276882B1 (en) * | 2017-05-19 | 2018-02-07 | 株式会社コロプラ | Information processing method, apparatus, and program for causing computer to execute information processing method |
CN113643412A (en) * | 2021-07-14 | 2021-11-12 | 北京百度网讯科技有限公司 | Virtual image generation method and device, electronic equipment and storage medium |
CN113706678A (en) * | 2021-03-23 | 2021-11-26 | 腾讯科技(深圳)有限公司 | Method, device and equipment for acquiring virtual image and computer readable storage medium |
CN114549710A (en) * | 2022-03-02 | 2022-05-27 | 北京百度网讯科技有限公司 | Virtual image generation method and device, electronic equipment and storage medium |
CN114581586A (en) * | 2022-03-09 | 2022-06-03 | 北京百度网讯科技有限公司 | Method and device for generating model substrate, electronic equipment and storage medium |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115578431A (en) * | 2022-10-17 | 2023-01-06 | 北京百度网讯科技有限公司 | Image depth processing method and device, electronic equipment and medium |
CN115578431B (en) * | 2022-10-17 | 2024-02-06 | 北京百度网讯科技有限公司 | Image depth processing method and device, electronic equipment and medium |
CN115409933A (en) * | 2022-10-28 | 2022-11-29 | 北京百度网讯科技有限公司 | Multi-style texture mapping generation method and device |
CN116051694A (en) * | 2022-12-20 | 2023-05-02 | 百度时代网络技术(北京)有限公司 | Avatar generation method, apparatus, electronic device, and storage medium |
CN116051694B (en) * | 2022-12-20 | 2023-10-03 | 百度时代网络技术(北京)有限公司 | Avatar generation method, apparatus, electronic device, and storage medium |
CN117333601A (en) * | 2023-11-16 | 2024-01-02 | 虚拟现实(深圳)智能科技有限公司 | Digital virtual clothing generation method and device based on artificial intelligence |
CN117333601B (en) * | 2023-11-16 | 2024-01-26 | 虚拟现实(深圳)智能科技有限公司 | Digital virtual clothing generation method and device based on artificial intelligence |
Also Published As
Publication number | Publication date |
---|---|
CN115147265B (en) | 2023-05-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115147265B (en) | Avatar generation method, apparatus, electronic device, and storage medium | |
CN114792355B (en) | Virtual image generation method and device, electronic equipment and storage medium | |
CN115049799A (en) | Method and device for generating 3D model and virtual image | |
CN114549710A (en) | Virtual image generation method and device, electronic equipment and storage medium | |
CN114612600B (en) | Virtual image generation method and device, electronic equipment and storage medium | |
US20220292795A1 (en) | Face image processing method, electronic device, and storage medium | |
JP7418370B2 (en) | Methods, apparatus, devices and storage media for transforming hairstyles | |
US20220351495A1 (en) | Method for matching image feature point, electronic device and storage medium | |
CN114723888A (en) | Three-dimensional hair model generation method, device, equipment, storage medium and product | |
CN114549728A (en) | Training method of image processing model, image processing method, device and medium | |
CN114708374A (en) | Virtual image generation method and device, electronic equipment and storage medium | |
CN115359171B (en) | Virtual image processing method and device, electronic equipment and storage medium | |
CN114078184B (en) | Data processing method, device, electronic equipment and medium | |
CN115082298A (en) | Image generation method, image generation device, electronic device, and storage medium | |
CN115775300A (en) | Reconstruction method of human body model, training method and device of human body reconstruction model | |
CN113781653B (en) | Object model generation method and device, electronic equipment and storage medium | |
CN113327311B (en) | Virtual character-based display method, device, equipment and storage medium | |
CN115147306A (en) | Image processing method, image processing device, electronic equipment and storage medium | |
CN114549785A (en) | Method and device for generating model substrate, electronic equipment and storage medium | |
CN113903071A (en) | Face recognition method and device, electronic equipment and storage medium | |
CN114820908B (en) | Virtual image generation method and device, electronic equipment and storage medium | |
CN116030150B (en) | Avatar generation method, device, electronic equipment and medium | |
CN116385829B (en) | Gesture description information generation method, model training method and device | |
CN114037814B (en) | Data processing method, device, electronic equipment and medium | |
CN116363331B (en) | Image generation method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||