CN114092611A - Virtual expression driving method and device, electronic equipment and storage medium - Google Patents

Virtual expression driving method and device, electronic equipment and storage medium

Info

Publication number
CN114092611A
CN114092611A
Authority
CN
China
Prior art keywords
expression
vertex
face model
basic face
map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111318946.9A
Other languages
Chinese (zh)
Inventor
田一泓
宋征
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Netease Hangzhou Network Co Ltd
Original Assignee
Netease Hangzhou Network Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Netease Hangzhou Network Co Ltd
Priority to CN202111318946.9A
Publication of CN114092611A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings

Abstract

The application provides a virtual expression driving method and apparatus, an electronic device and a computer-readable storage medium. The method includes: acquiring a basic face model and target animation data corresponding to the basic face model; obtaining expression offset extreme values corresponding to the basic face model in a plurality of expression dimensions; writing the expression offset extreme values into a map to be filled to obtain a deformation map; acquiring, from the target animation data, the weights of the basic face model in the plurality of expression dimensions for each frame of expression image of the target animation, to obtain a weight combination; and rendering the expression animation of the basic face model according to the deformation map and the weight combination. The scheme solves the problem that an expression cannot be rendered normally when there are too many expression dimensions, because the browser limits the number of attribute variables.

Description

Virtual expression driving method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a virtual expression driving method and apparatus, an electronic device, and a computer-readable storage medium.
Background
To enrich the facial expressions of a 3D virtual human, deformation animations of multiple dimensions can be generated for the virtual human, with different dimensions corresponding to different parts of the face. A deformation animation is called a blend shape or morph target in engines such as Maya, 3ds Max and Unity, and the deformation animation of each dimension represents the extreme state of one facial part under deformation. An expression animation generation algorithm produces a weight between 0 and 1 for the deformation animation of each dimension. The expression state of the virtual human in the current image can then be determined from the deformation animation of each dimension and its corresponding weight.
In the related art, the deformation animation of each dimension is uploaded to the vertex shader as a WebGL attribute variable, the deformation animations of all dimensions are weighted and summed in the vertex shader to determine the coordinates of each vertex of the virtual human face, and the expression animation is then rendered. However, because the browser limits the number of attribute variables, at most eight dimensions of deformation animation can be processed; when more dimensions exist, they cannot be processed normally.
Disclosure of Invention
An object of the embodiments of the present application is to provide a method and an apparatus for driving a virtual expression, an electronic device, and a computer-readable storage medium, which are used to solve a problem that a browser cannot normally drive a high-dimensional animation.
In one aspect, the present application provides a virtual expression driving method, including:
acquiring a basic face model and target animation data corresponding to the basic face model;
obtaining expression offset extreme values corresponding to the basic face model in a plurality of expression dimensions;
writing the expression offset extreme values into a map to be filled to obtain a deformation map;
acquiring, from the target animation data, weights of the basic face model in the plurality of expression dimensions for each frame of expression image of the target animation, to obtain a weight combination;
and rendering the expression animation of the basic face model according to the deformation map and the weight combination.
In an embodiment, before writing the expression offset extreme values into the map to be filled to obtain the deformation map, the method further includes:
determining the total number of expression offset extreme values according to the number of vertices of the basic face model and the number of expression dimensions;
and determining the width and height of the map to be filled according to the total number of expression offset extreme values, and generating the map to be filled according to the width and height.
In an embodiment, determining the width and height of the map to be filled according to the total number of expression offset extreme values includes:
determining the smallest parameter in a parameter set that is larger than the total number of expression offset extreme values, and taking it as a target parameter, wherein the parameters in the parameter set are all powers of two;
determining a first exponent corresponding to the height and a second exponent corresponding to the width based on the target parameter;
and taking two raised to the first exponent as the height, and two raised to the second exponent as the width.
In an embodiment, writing the expression offset extreme values into the map to be filled to obtain the deformation map includes:
writing the expression offset extreme values of a plurality of vertices of the basic face model in the plurality of expression dimensions into the map to be filled in sequence, to obtain the deformation map.
In an embodiment, the method further comprises:
acquiring vertex attribute data of the basic face model; wherein the vertex attribute data comprises basic coordinates of each vertex of the basic face model;
and rendering the expression animation of the basic face model according to the deformation map and the weight combination includes:
determining target coordinates of each vertex of the basic face model in each frame of expression image according to the expression offset extreme values of each vertex in the deformation map, the weight combination and the basic coordinates of each vertex in the vertex attribute data;
and rendering the expression animation of the basic face model according to the target coordinates of each vertex of the basic face model in each frame of expression image.
In an embodiment, the vertex attribute data includes a vertex order corresponding to each vertex of the basic face model;
Before determining the target coordinates of each vertex of the basic face model in each frame of expression image according to the expression offset extreme values of each vertex in the deformation map, the weight combination and the basic coordinates of each vertex in the vertex attribute data, the method further includes:
determining the texture coordinates of each vertex in the deformation map according to the vertex order corresponding to each vertex of the basic face model;
and acquiring the expression offset extreme values of each vertex in the expression dimensions from the deformation map according to the texture coordinates of each vertex in the deformation map.
In an embodiment, determining the texture coordinates of each vertex in the deformation map according to the vertex order corresponding to each vertex of the basic face model includes:
determining the rectangular coordinates of each vertex in the deformation map for each expression dimension according to the vertex order corresponding to each vertex of the basic face model and the width and height of the deformation map;
and determining the texture coordinates of each vertex for each expression dimension according to the rectangular coordinates of each vertex for each expression dimension.
On the other hand, the present application further provides a virtual expression driving apparatus, including:
a first acquisition module, configured to acquire a basic face model and target animation data corresponding to the basic face model;
a second acquisition module, configured to acquire expression offset extreme values corresponding to the basic face model in a plurality of expression dimensions;
a writing module, configured to write the expression offset extreme values into a map to be filled to obtain a deformation map;
a determining module, configured to acquire, from the target animation data, weights of the basic face model in the plurality of expression dimensions for each frame of expression image of the target animation, to obtain a weight combination;
and a rendering module, configured to render the expression animation of the basic face model according to the deformation map and the weight combination.
Further, the present application also provides an electronic device, including:
a processor; the processor comprises a central processing unit and a graphics processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute the above virtual expression driving method.
In addition, the present application also provides a computer-readable storage medium, which stores a computer program executable by a processor to perform the above virtual expression driving method.
With the above scheme, after the expression offset extreme values of the basic face model in a plurality of expression dimensions are obtained, a deformation map can be generated from them; the weights of the basic face model in the plurality of expression dimensions for each frame of expression image are acquired from the target animation data to obtain weight combinations; and the expression animation of the basic face model is rendered according to the deformation map and the weight combinations. Because the deformation map is transmitted as a single uniform variable after it is constructed, the expression offset extreme values of many expression dimensions do not need to be transmitted separately, which solves the problem that an expression cannot be rendered normally because the browser limits the number of attribute variables.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required to be used in the embodiments of the present application will be briefly described below.
Fig. 1 is a schematic view of an application scenario of a virtual expression driving method according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 3 is a schematic flowchart of a virtual expression driving method according to an embodiment of the present application;
Fig. 4 is a schematic flowchart of a method for determining the size of the map to be filled according to an embodiment of the present application;
Fig. 5 is a schematic diagram of a deformation map according to an embodiment of the present application;
Fig. 6 is a schematic diagram of expression animation driving according to an embodiment of the present application;
Fig. 7 is a schematic diagram of an expression animation driving architecture according to an embodiment of the present application;
Fig. 8 is a block diagram of a virtual expression driving apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
Like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Fig. 1 is a schematic view of an application scenario of a virtual expression driving method according to an embodiment of the present application. As shown in fig. 1, the application scenario includes a user terminal 20 and a server 30; the server 30 may be a server, a server cluster or a cloud computing center, and is configured to generate a weight combination corresponding to the frame-by-frame expression image based on an expression synthesis algorithm, and send the weight combination to the user terminal 20; the user terminal 20 may be an electronic device such as a mobile phone and a tablet computer, and is configured to obtain a weight combination corresponding to each frame of expression image from the server 30, and render the expression animation according to the weight combination and the expression offset extremum of the local multiple dimensions.
As shown in fig. 2, this embodiment provides an electronic device 1 including a central processing unit (CPU) 11, a graphics processing unit (GPU) 12, and a memory 13. The CPU 11, the GPU 12 and the memory 13 are connected via a bus 10. The memory 13 stores instructions executable by the CPU 11 and the GPU 12, and the instructions are executed by the CPU 11 and the GPU 12 so that the electronic device 1 can perform all or part of the processes of the methods in the embodiments described below. In an embodiment, the electronic device 1 may be the user terminal 20 described above, configured to execute the virtual expression driving method.
The Memory 13 may be implemented by any type of volatile or non-volatile Memory device or combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic Memory, flash Memory, magnetic disk or optical disk.
The present application also provides a computer-readable storage medium storing a computer program executable by the central processing unit 11 and the graphics processing unit 12 to perform the virtual expression driving method provided by the present application.
Referring to fig. 3, a flowchart of a virtual expression driving method according to an embodiment of the present application is shown, and as shown in fig. 3, the method may include the following steps 310 to 350.
Step 310: acquiring a basic face model and target animation data corresponding to the basic face model.
Step 320: obtaining expression offset extreme values corresponding to the basic face model in a plurality of expression dimensions.
The basic face model is the face model of a 3D virtual human: a basic image object set consisting of a series of triangles, combined with textures and internal animation. The basic face model may be in glTF (GL Transmission Format) format, FBX (Filmbox) format, a compressed glTF model format supporting the KHR_mesh_quantization extension, or other formats.
The expression animation of the basic face model can be determined by the expression offset extreme values of a plurality of expression dimensions, and the expression offset extreme value of each expression dimension indicates the maximum deformation state of a certain part of the face. The expression offset extreme value of each expression dimension can be written into the basic face model in the form of a blend shape or morph target. Illustratively, for a dimension corresponding to the eyelid, the morph target of that dimension may include the vertices of the triangular meshes corresponding to the eyelid in the basic face model and the vertex coordinate offsets when the eyelid is maximally deformed; these offsets may be referred to as the expression offset extremum.
The target animation data may be data that controls the basic face model to exhibit a virtual expression. For example, the target animation data may include a weight combination corresponding to each frame of expression image, and the weight combination includes a weight corresponding to each expression dimension.
The user terminal can read and load the model data from the memory to obtain the basic face model.
The expression animation of the virtual human is formed by continuous multi-frame expression images. The user terminal may obtain the weight combinations sent by the server frame by frame, or may generate the weight combination of each frame of expression image through a local expression synthesis algorithm (e.g., the NetEase Fuxi expression synthesis algorithm). The weight combination includes the weight of the deformation animation of each dimension. Illustratively, if there are 50 expression dimensions in total, the weight combination contains 50 weights. After obtaining the weight combinations, the user terminal has obtained the target animation data.
From the blend shapes or morph targets of the basic face model, the user terminal can obtain each vertex of the basic face model and the expression offset extreme values of those vertices in the plurality of expression dimensions.
For the same vertex, the expression offset extreme values may differ across expression dimensions. For example, for a vertex of the lip in the face model, the expression offset extremum may be zero in the blend shape or morph target corresponding to the eyelid, whereas in the blend shape or morph target corresponding to the lips it is not zero.
Step 330: writing the expression offset extreme values into the map to be filled to obtain the deformation map.
The map to be filled is a map whose pixel values are empty. The user terminal writes the expression offset extreme values into the map to be filled as the pixel values of its pixels to obtain the deformation map. The deformation map then contains the expression offset extreme values of each vertex of the basic face model in the plurality of expression dimensions.
Each expression offset extreme value may include the maximum offsets of a vertex in the x, y and z directions of three-dimensional space, and these three components can be used as the three channels of a pixel of the deformation map. Therefore, after the expression offset extreme values are written into the map to be filled, a three-channel deformation map is obtained.
Step 340: acquiring, from the target animation data, the weights of the basic face model in the plurality of expression dimensions for each frame of expression image of the target animation, to obtain a weight combination.
The user terminal can determine the weight combination corresponding to each frame of expression image from the target animation data, where the weight combination includes the weights of the basic face model in the plurality of expression dimensions. For example, if the target animation data includes weight combinations corresponding to 200 frames of expression images and the basic face model has 50 expression dimensions, the user terminal may obtain 200 weight combinations from the target animation data, each containing 50 weights.
Step 350: rendering the expression animation of the basic face model according to the deformation map and the weight combination.
After the deformation map and the weight combination corresponding to each frame of expression image are obtained, they can be uploaded to the WebGL (Web Graphics Library) rendering pipeline as uniform variables, and the expression animation corresponding to the basic face model is then rendered.
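As an illustrative sketch only, the per-frame upload described above could look as follows in TypeScript for a WebGL context; the uniform names (uMorphMap, uWeights), the use of non-indexed drawing, and the helper name drawFrame are assumptions for illustration, not part of the described scheme:

// Illustrative sketch: bind the deformation map once as a texture uniform and
// upload the weight combination of the current frame as a single uniform float array.
function drawFrame(
  gl: WebGLRenderingContext,
  program: WebGLProgram,
  deformationMap: WebGLTexture,
  weights: Float32Array, // one weight combination, e.g. 50 values
  vertexCount: number
): void {
  gl.useProgram(program);
  gl.activeTexture(gl.TEXTURE0);
  gl.bindTexture(gl.TEXTURE_2D, deformationMap);
  gl.uniform1i(gl.getUniformLocation(program, 'uMorphMap'), 0);
  gl.uniform1fv(gl.getUniformLocation(program, 'uWeights'), weights);
  gl.drawArrays(gl.TRIANGLES, 0, vertexCount);
}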
With these measures, after the deformation map is constructed from the expression offset extreme values of the vertices of the basic face model in the expression dimensions, the deformation map is transmitted as a single uniform variable. Even when there are expression offset extreme values for many expression dimensions, this avoids the problem that the expression cannot be rendered normally because the browser limits the number of attribute variables.
In an embodiment, the user terminal may construct the map to be filled before performing step 330.
The user terminal can determine the total number of expression offset extreme values according to the number of vertices of the basic face model and the number of expression dimensions. Each vertex has one expression offset extreme value in each expression dimension, so the total number of expression offset extreme values is the number of vertices multiplied by the number of expression dimensions. For example, if the basic face model has 1000 vertices and 50 expression dimensions, there are 50000 expression offset extreme values.
After determining the total number of expression offset extreme values, the user terminal may determine, according to that total, a width and height of the map to be filled that satisfy the WebGL requirement that the width and height of a texture be powers of two.
After determining the width and height, the map to be filled may be generated, in which the pixel value of every pixel is empty.
In an embodiment, the user terminal determines the width and height of the map to be filled according to the total number of expression offset extreme values as follows. Referring to fig. 4, a flowchart of a method for determining the size of the map to be filled according to an embodiment of the present application, the method may include steps 410 to 430.
Step 410: determining the smallest parameter in a parameter set that is larger than the total number of expression offset extreme values and taking it as the target parameter, wherein the parameters in the parameter set are all powers of two.
The parameter set may include the first power of two, the second power of two, the third power of two, ..., and the nth power of two. Starting from the first power of two, the user terminal may compare each power of two with the total number of expression offset extreme values step by step. While the current power of two is smaller, the user terminal increases the exponent by one and compares again. This process is repeated until the first power of two greater than the total number is found; this is the smallest parameter in the parameter set that is larger than the total number of expression offset extreme values, and it determines the target parameter and its exponent (the target exponent).
Step 420: determining a first exponent corresponding to the height and a second exponent corresponding to the width based on the target parameter.
Step 430: taking two raised to the first exponent as the height, and two raised to the second exponent as the width.
The first exponent and the second exponent are exponents of two; the first exponent is used to calculate the height, and the second exponent is used to calculate the width.
The user terminal may determine whether the target exponent is odd or even. If the target exponent is even, dividing it by two gives both the first exponent corresponding to the height and the second exponent corresponding to the width, which are equal. If the target exponent is odd, dividing it by two and rounding down gives the first exponent corresponding to the height, and adding one to the first exponent gives the second exponent corresponding to the width.
The user terminal then takes two raised to the first exponent as the height and two raised to the second exponent as the width.
Illustratively, if the target parameter is 2 to the 36th power, the target exponent is 36; in this case the first exponent is 18, the second exponent is 18, and both the height and the width are 2 to the 18th power. If the target parameter is 2 to the 35th power, the target exponent is 35; in this case the first exponent is 17, the second exponent is 18, the height is 2 to the 17th power, and the width is 2 to the 18th power.
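As an illustrative TypeScript sketch of this sizing rule (the helper name computeMapSize and its signature are assumed for illustration):

// Illustrative sketch: find the smallest power of two that can hold all
// expression offset extrema, then split its exponent into a height exponent
// and a width exponent (the height gets the smaller half when the exponent is odd).
function computeMapSize(totalExtrema: number): { width: number; height: number } {
  let targetExponent = 0;
  while (2 ** targetExponent < totalExtrema) {
    targetExponent++; // step through 2^1, 2^2, ... until the total fits
  }
  const heightExponent = Math.floor(targetExponent / 2); // first exponent
  const widthExponent = targetExponent - heightExponent; // second exponent
  return { width: 2 ** widthExponent, height: 2 ** heightExponent };
}

// Example: 1000 vertices * 50 expression dimensions = 50000 extrema;
// 2^16 = 65536 is the smallest power of two above it, so width = height = 2^8 = 256.
console.log(computeMapSize(1000 * 50)); // { width: 256, height: 256 }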
In an embodiment, after obtaining the map to be filled, the user terminal may write the expression offset extreme values of the vertices of the basic face model in the plurality of expression dimensions into the map to be filled in sequence, so as to obtain the deformation map.
Referring to fig. 5, a schematic diagram of a deformation map provided in an embodiment of the present application, b1_v1 represents the expression offset extremum of the 1st vertex in the 1st-dimension deformation animation; b1_vn represents the expression offset extremum of the nth vertex in the 1st-dimension deformation animation; b50_vn represents the expression offset extremum of the nth vertex in the 50th-dimension deformation animation, and so on. After the width and height of the deformation map (the map to be filled) are determined, the number of pixels per row and per column of the deformation map is determined. The user terminal may fill the expression offset extreme values of the vertices in the 1st expression dimension into the pixels as pixel values, from left to right and from top to bottom. After the expression offset extreme values of the 1st expression dimension are filled in, the user terminal fills the expression offset extreme values of the vertices in the 2nd expression dimension into the following pixels. This process is repeated until the expression offset extreme values of all vertices in all expression dimensions have been filled in.
When the total number of expression offset extreme values differs from the total number of pixels in the deformation map, the CPU fills in all the expression offset extreme values and then sets the pixel values of the remaining pixels to zero.
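An illustrative TypeScript sketch of this filling order, assuming the extrema are held in a dimension-major array and each texel stores three float channels (the helper name and data layout are assumptions):

// Illustrative sketch: write the extrema dimension by dimension, and within
// each dimension vertex by vertex, from left to right and top to bottom.
// Each texel holds the (x, y, z) offset extremum as three float channels;
// texels after the last extremum remain zero.
function buildDeformationMapData(
  offsets: [number, number, number][][], // offsets[dimension][vertex] = [dx, dy, dz]
  width: number,
  height: number
): Float32Array {
  const data = new Float32Array(width * height * 3); // zero-initialized
  let texel = 0;
  for (const dimension of offsets) {
    for (const [dx, dy, dz] of dimension) {
      data[texel * 3 + 0] = dx;
      data[texel * 3 + 1] = dy;
      data[texel * 3 + 2] = dz;
      texel++;
    }
  }
  return data;
}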
In one embodiment, the user terminal may obtain the vertex attribute data of the basic face model before rendering the expression animation of the basic face model. The vertex attribute data may include, among other things, the basic coordinates of the vertices of the basic face model (the vertex coordinates when not offset), the vertex order array, vertex texture coordinates, colors, surface normal vectors (typically pointing outward from the surface), and so on. Each vertex thus has a set of vertex attribute data. The user terminal can extract the vertex attribute data of each vertex from the basic face model.
In this embodiment, when rendering the expression animation according to the deformation map and the weight combination, the user terminal may determine the target coordinates of each vertex of the basic face model in each frame of expression image according to the expression offset extreme values of each vertex in the deformation map, the weight combination, and the basic coordinates of each vertex in the vertex attribute data. Here, the target coordinates are the actual coordinates of the vertices in one frame of expression image. When the expression image is rendered through the WebGL rendering pipeline, the processing is performed in the vertex shader. For the vertices of the basic face model, the user terminal can determine the expression offset of each vertex according to the expression offset extreme values of the expression dimensions corresponding to that vertex in the deformation map and the weight of each expression dimension in the weight combination.
Taking fig. 5 as an example, for the 1st vertex of the basic face model, the expression offset extrema of the vertex in the 50 expression dimensions may be written as b1_v1, b2_v1, b3_v1, …, b49_v1, b50_v1. The weight combination corresponding to one frame of expression image is w1, w2, w3, …, w50. The user terminal may determine the expression offset of the vertex in that frame of expression image as diff1 = b1_v1*w1 + b2_v1*w2 + … + b50_v1*w50. After this calculation is performed for every vertex, the expression offset of each vertex in the frame of expression image is determined.
By processing the weight combinations of the continuous multi-frame expression images in the same way, the user terminal can determine the expression offsets of the vertices in each of those frames.
Illustratively, the basic face model has n vertices and a total of 50 expression dimensions. After the weight combination corresponding to any frame of expression image is obtained, the expression offsets of the n vertices in that frame of expression image can be determined by matrix multiplication according to the following formula (1):
(w1, w2, …, w50) ×
| b1_v1   b1_v2   …   b1_vn  |
| b2_v1   b2_v2   …   b2_vn  |
|   …       …     …     …    |
| b50_v1  b50_v2  …   b50_vn |
= (diff1, diff2, …, diffn)    (1)
wherein w1, w2, …, w50 are the weights of the 50 expression dimensions in one frame of expression image; bm_vn represents the expression offset extremum of the nth vertex in the mth expression dimension; and diffn represents the expression offset of the nth vertex in that frame of expression image.
After obtaining the expression offsets of the vertices in one frame of expression image, the user terminal may determine the target coordinates of the vertices in that frame according to the expression offsets and the basic coordinates of the vertices.
Illustratively, the face model has n vertices and a total of 50 expression dimensions. After determining the expression offsets of the n vertices in one frame of expression image, the user terminal may determine the target coordinates of the vertices in that frame according to the following formula (2):
(v1,v2,…,vn)+(diff1,diff2,…,diffn)=(t1,t2,…,tn) (2)
wherein v1, v2, …, vn are the basic coordinates of the n vertices of the face model; diffn represents the expression offset of the nth vertex in that frame of expression image; and t1, t2, …, tn are the final target coordinates of the n vertices in that frame of expression image.
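For clarity, the computation of formulas (1) and (2) can be restated as the following TypeScript sketch; in the described scheme this computation actually runs per vertex in the vertex shader rather than on the CPU, and the helper name and data layout are assumed:

// Illustrative restatement of formulas (1) and (2): for each vertex, sum the
// weighted offset extrema over all expression dimensions, then add the basic
// coordinates to obtain the target coordinates of that vertex.
function targetCoordinates(
  basicCoords: [number, number, number][],   // v1 ... vn
  offsets: [number, number, number][][],     // offsets[m][n] = bm_vn
  weights: number[]                          // w1 ... w50 for one frame
): [number, number, number][] {
  return basicCoords.map((v, n) => {
    const diff: [number, number, number] = [0, 0, 0];
    weights.forEach((w, m) => {
      diff[0] += offsets[m][n][0] * w;
      diff[1] += offsets[m][n][1] * w;
      diff[2] += offsets[m][n][2] * w;
    });
    return [v[0] + diff[0], v[1] + diff[1], v[2] + diff[2]];
  });
}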
The user terminal can render the expression animation of the basic face model according to the target coordinates of each vertex of the basic face model in each frame of expression image. For any frame of expression image, once the target coordinates of each vertex of the basic face model are determined, that frame can be rendered, and the expression animation is obtained by rendering continuous multi-frame expression images.
In one embodiment, the vertex attribute data includes the vertex order corresponding to each vertex of the basic face model. Before determining the target coordinates of each vertex of the basic face model in each frame of expression image, the user terminal may first obtain the expression offset extreme values corresponding to each vertex of the basic face model.
The user terminal can determine the texture coordinates of each vertex in the deformation map according to the vertex order corresponding to each vertex of the basic face model. The texture coordinates indicate the pixel of the deformation map that corresponds to the vertex.
After determining the texture coordinates of the vertices, the user terminal may obtain, from the deformation map, the expression offset extreme values of the vertices in the plurality of expression dimensions according to the texture coordinates of the vertices in the deformation map. For any vertex, after obtaining the texture coordinate corresponding to an expression dimension, the user terminal may locate the corresponding pixel in the deformation map and obtain the expression offset extremum of the vertex in that expression dimension by sampling. After sampling every expression dimension for every vertex, the rendering pipeline of the user terminal obtains the expression offset extrema of each vertex in the plurality of expression dimensions.
In an embodiment, when determining the texture coordinates of each vertex in the deformation map according to the vertex order, the user terminal may determine the rectangular coordinates of each vertex in the deformation map for each expression dimension according to the vertex order corresponding to each vertex of the basic face model and the width and height of the deformation map. Here, the rectangular coordinates may be xy coordinates.
The user terminal can determine the abscissa in the rectangular coordinates by the following formula (3):
x0=(n*b+verticeIndex)%width (3)
the user terminal can determine the ordinate in the rectangular coordinate by the following formula (4):
y0=floor((n*b+verticeIndex)/width) (4)
wherein x0 represents the abscissa; y0 represents the ordinate; n represents the total number of vertices in the basic face model; b is the serial number of the current expression dimension minus one (b starts from 0); verticeIndex represents the ordering order of the vertex among all the vertices, starting from 0; and width represents the width of the deformation map.
Illustratively, if there are 50 dimensions of deformation animation in total, then for any vertex, 50 rectangular coordinates can be determined, one for each of the 50 expression dimensions.
The user terminal can then determine the texture coordinates of each vertex for each expression dimension according to the rectangular coordinates of each vertex for each expression dimension.
The deformation map is a two-dimensional picture composed of individual pixels, with each grid cell representing one pixel. The rectangular coordinates for any expression dimension give the position, on the deformation map, of the pixel corresponding to the vertex for that dimension. The abscissa of the rectangular coordinates ranges from 0 to width, and the ordinate ranges from 0 to height. The user terminal samples the texture with the texture2D() function of the WebGL shader language, which requires the position of the pixel in texture space. Texture space is the rectangular region between (0,0) and (1,1); that is, normalizing the rectangular coordinates yields the texture coordinates in texture space.
The abscissa of the texture coordinate can be determined by the following equation (5):
u0=x0/width (5)
the ordinate of the texture coordinate can be determined by the following equation (6):
v0=y0/height (6)
wherein u0 represents the abscissa of the texture coordinates; v0 represents the ordinate of the texture coordinates; x0 represents the abscissa of the rectangular coordinates; y0 represents the ordinate of the rectangular coordinates; width represents the width of the deformation map; and height represents the height of the deformation map.
In this way, for any vertex, the rectangular coordinates of the vertex in each dimension of deformation animation can be converted into texture coordinates.
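An illustrative TypeScript sketch of formulas (3) to (6) follows; the helper name morphTexCoord is assumed, and the normalization is the plain division of formulas (5) and (6) above (in practice a half-texel offset is sometimes added before dividing, but that is not part of the formulas as stated):

// Illustrative sketch of formulas (3) to (6): map a (vertex, dimension) pair
// to its texel in the deformation map, then normalize to texture space.
function morphTexCoord(
  verticeIndex: number, // order of the vertex among all vertices, from 0
  b: number,            // expression dimension index, from 0
  n: number,            // total number of vertices in the basic face model
  width: number,
  height: number
): { u: number; v: number } {
  const linear = n * b + verticeIndex;       // texel index in fill order
  const x0 = linear % width;                 // formula (3)
  const y0 = Math.floor(linear / width);     // formula (4)
  return { u: x0 / width, v: y0 / height };  // formulas (5) and (6)
}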
In an embodiment, both the CPU and the GPU of the user terminal may participate in the virtual expression driving process. Referring to fig. 6, a schematic diagram of expression animation driving provided in an embodiment of the present application, the CPU extracts the vertex attribute data from the vertices of the basic face model and transmits it to the GPU as attribute variables. The CPU transmits the deformation map and the weight combination of each frame of expression image to the GPU as uniform variables. The GPU processes the attribute variables and uniform variables in the vertex shader, determines the expression offset of each vertex in a frame of expression image, and thereby determines the target coordinates of each vertex in that frame. After the target coordinates are determined, the GPU performs primitive assembly, rasterization, fragment shading, per-fragment operations, frame buffering and the like to obtain one frame of expression image, and the expression animation is rendered from continuous multi-frame expression images.
With these measures, the CPU constructs the deformation map from the expression offset extreme values of the plurality of expression dimensions and uploads it to the GPU, and the GPU performs the high-frequency per-frame computation of the expression images, so that the time complexity on the CPU is reduced to O(n) and the driving efficiency of the expression animation is greatly improved.
In an embodiment, when transmitting the deformation map to the GPU, the CPU may set a first specified attribute and a second specified attribute for the deformation map, and transmit the deformation map with these attributes set to the GPU.
The first specified attribute is gl.CLAMP_TO_EDGE, which corresponds to the texture coordinates and instructs the GPU to sample the deformation map according to the texture coordinates (uv coordinates) to obtain the offset extremum of a vertex. The second specified attribute is gl.NEAREST, which indicates that the offset extremum is read from the deformation map by nearest-neighbor filtering; in other words, when the deformation map is enlarged or reduced, the pixel value of the pixel closest to the newly added pixel is selected.
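As an illustrative sketch, the deformation map could be uploaded with these two attributes as follows, assuming a WebGL 1 context with the OES_texture_float extension available; the helper name and the RGB float texture format are assumptions:

// Illustrative sketch: create the deformation map texture with CLAMP_TO_EDGE
// wrapping and NEAREST filtering, and upload the float data built above.
function uploadDeformationMap(
  gl: WebGLRenderingContext,
  data: Float32Array, // width * height * 3 floats (RGB)
  width: number,
  height: number
): WebGLTexture {
  gl.getExtension('OES_texture_float'); // float texel data under WebGL 1
  const tex = gl.createTexture()!;
  gl.bindTexture(gl.TEXTURE_2D, tex);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGB, width, height, 0, gl.RGB, gl.FLOAT, data);
  return tex;
}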
Referring to fig. 7, a schematic diagram of an expression animation driving architecture provided in an embodiment of the present application, the CPU handles a model data preprocessing stage and an animation driving stage. In the model data preprocessing stage, the CPU loads the model and obtains the attribute variables through vertex extraction; the CPU also builds the deformation map from the expression offset extreme values of each vertex in the expression dimensions, thereby obtaining a uniform variable. The CPU transmits the attribute variables and the uniform variable to the GPU. In the animation driving stage, the CPU obtains the weight combination of each frame of expression image through an expression animation generation algorithm and transmits the weight combinations to the GPU frame by frame.
In the GPU calculation stage, the deformation map is sampled through the WebGL rendering pipeline, the expression offset of each vertex in one frame of expression image is determined by combining the weight combination, and the expression offset is added to the basic coordinates of the vertex to obtain the final target coordinates, thereby displaying the expression image.
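An illustrative GLSL ES 1.00 vertex-shader sketch (embedded as a TypeScript string) of the per-vertex computation described above follows, assuming the device supports vertex texture fetch; the attribute and uniform names and the fixed dimension count of 50 are assumptions for illustration:

const morphVertexShaderSource = `
  attribute vec3 aBasicPosition;   // basic coordinates of the vertex
  attribute float aVerticeIndex;   // order of the vertex among all vertices
  uniform sampler2D uMorphMap;     // deformation map (one texel per vertex/dimension)
  uniform float uWeights[50];      // weight combination of the current frame
  uniform float uVertexCount;      // n: total number of vertices
  uniform vec2 uMapSize;           // (width, height) of the deformation map
  uniform mat4 uMvpMatrix;

  void main() {
    vec3 offset = vec3(0.0);
    for (int b = 0; b < 50; b++) {
      float linear = uVertexCount * float(b) + aVerticeIndex;
      float x0 = mod(linear, uMapSize.x);                    // formula (3)
      float y0 = floor(linear / uMapSize.x);                 // formula (4)
      vec2 uv = vec2(x0, y0) / uMapSize;                     // formulas (5) and (6)
      offset += texture2D(uMorphMap, uv).xyz * uWeights[b];  // weighted extremum
    }
    gl_Position = uMvpMatrix * vec4(aBasicPosition + offset, 1.0);
  }
`;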
Fig. 8 is a block diagram of a virtual expression driving apparatus according to an embodiment of the present application; as shown in fig. 8, the apparatus may include:
a first obtaining module 810, configured to obtain a basic face model and target animation data corresponding to the basic face model;
a second obtaining module 820, configured to obtain expression offset extreme values corresponding to the basic face model in a plurality of expression dimensions;
a writing module 830, configured to write the expression offset extreme values into a map to be filled to obtain a deformation map;
a determining module 840, configured to obtain, from the target animation data, weights of the basic face model in the plurality of expression dimensions for each frame of expression image of the target animation, to obtain a weight combination;
and a rendering module 850, configured to render the expression animation of the basic face model according to the deformation map and the weight combination.
The implementation process of the functions and actions of each module in the device is specifically detailed in the implementation process of the corresponding step in the virtual expression driving method, and is not described herein again.
In the embodiments provided in the present application, the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

Claims (10)

1. A virtual expression driving method, comprising:
acquiring a basic face model and target animation data corresponding to the basic face model;
obtaining expression offset extreme values corresponding to the basic face model in a plurality of expression dimensions;
writing the expression offset extreme values into a map to be filled to obtain a deformation map;
acquiring, from the target animation data, weights of the basic face model in the plurality of expression dimensions for each frame of expression image of the target animation, to obtain a weight combination;
and rendering the expression animation of the basic face model according to the deformation map and the weight combination.
2. The method according to claim 1, wherein before writing the expression offset extreme values into the map to be filled to obtain the deformation map, the method further comprises:
determining the total number of expression offset extreme values according to the number of vertices of the basic face model and the number of expression dimensions;
and determining the width and height of the map to be filled according to the total number of expression offset extreme values, and generating the map to be filled according to the width and height.
3. The method of claim 2, wherein determining the width and height of the map to be filled according to the total number of expression offset extreme values comprises:
determining the smallest parameter in a parameter set that is larger than the total number of expression offset extreme values, and taking it as a target parameter, wherein the parameters in the parameter set are all powers of two;
determining a first exponent corresponding to the height and a second exponent corresponding to the width based on the target parameter;
and taking two raised to the first exponent as the height, and two raised to the second exponent as the width.
4. The method of claim 2, wherein writing the expression offset extreme values into the map to be filled to obtain the deformation map comprises:
writing the expression offset extreme values of a plurality of vertices of the basic face model in the plurality of expression dimensions into the map to be filled in sequence, to obtain the deformation map.
5. The method of claim 1, further comprising:
acquiring vertex attribute data of the basic face model; wherein the vertex attribute data comprises basic coordinates of each vertex of the basic face model;
and wherein rendering the expression animation of the basic face model according to the deformation map and the weight combination comprises:
determining target coordinates of each vertex of the basic face model in each frame of expression image according to the expression offset extreme values of each vertex in the deformation map, the weight combination and the basic coordinates of each vertex in the vertex attribute data;
and rendering the expression animation of the basic face model according to the target coordinates of each vertex of the basic face model in each frame of expression image.
6. The method of claim 5, wherein the vertex attribute data comprises a vertex order corresponding to each vertex of the base face model;
before determining the target coordinates of each vertex of the basic face model in each frame of expression image according to the expression offset extremum of each vertex in the deformation map, the weight combination and the basic coordinates of each vertex in the vertex attribute data, the method comprises the following steps:
determining the texture coordinates of each vertex in the deformation map according to the vertex order corresponding to each vertex of the basic face model;
and acquiring the expression offset extreme values of each vertex in the expression dimensions from the deformation map according to the texture coordinates of each vertex in the deformation map.
7. The method according to claim 6, wherein determining the texture coordinates of each vertex in the deformation map according to the vertex order corresponding to each vertex of the basic face model comprises:
determining the rectangular coordinates of each vertex in the deformation map for each expression dimension according to the vertex order corresponding to each vertex of the basic face model and the width and height of the deformation map;
and determining the texture coordinates of each vertex for each expression dimension according to the rectangular coordinates of each vertex for each expression dimension.
8. A virtual expression driving apparatus, comprising:
a first acquisition module, configured to acquire a basic face model and target animation data corresponding to the basic face model;
a second acquisition module, configured to acquire expression offset extreme values corresponding to the basic face model in a plurality of expression dimensions;
a writing module, configured to write the expression offset extreme values into a map to be filled to obtain a deformation map;
a determining module, configured to acquire, from the target animation data, weights of the basic face model in the plurality of expression dimensions for each frame of expression image of the target animation, to obtain a weight combination;
and a rendering module, configured to render the expression animation of the basic face model according to the deformation map and the weight combination.
9. An electronic device, characterized in that the electronic device comprises:
a processor; the processor comprises a central processing unit and a graphics processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the virtual expression driving method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the storage medium stores a computer program executable by a processor to perform the virtual expression driving method according to any one of claims 1 to 7.
CN202111318946.9A 2021-11-09 2021-11-09 Virtual expression driving method and device, electronic equipment and storage medium Pending CN114092611A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111318946.9A CN114092611A (en) 2021-11-09 2021-11-09 Virtual expression driving method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111318946.9A CN114092611A (en) 2021-11-09 2021-11-09 Virtual expression driving method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114092611A 2022-02-25

Family

ID=80299442

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111318946.9A Pending CN114092611A (en) 2021-11-09 2021-11-09 Virtual expression driving method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114092611A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115170708A (en) * 2022-07-11 2022-10-11 上海哔哩哔哩科技有限公司 3D image implementation method and system
CN116091676A (en) * 2023-04-13 2023-05-09 腾讯科技(深圳)有限公司 Face rendering method of virtual object and training method of point cloud feature extraction model



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination