CN104346824A

CN104346824A - Method and device for automatically synthesizing three-dimensional expression based on single facial image

Info

Publication number: CN104346824A
Application number: CN201310347091.1A
Authority: CN
Inventors: 黄磊; 舒之昕
Original assignee: Hanwang Technology Co Ltd
Current assignee: Hanwang Technology Co Ltd
Priority date: 2013-08-09
Filing date: 2013-08-09
Publication date: 2015-02-11

Abstract

The invention discloses a method for automatically synthesizing three-dimensional expressions based on a single facial image. The method comprises the following steps: 1, locating the facial shape in a single input image by using an ASM (active shape model); 2, completing facial shape modeling by scattered-point interpolation of a three-dimensional facial reference model according to the located facial shape, and performing texture mapping to obtain a three-dimensional facial model of the target face in the image; 3, calculating, by means of an expression set, the facial expression motion matrices of the three-dimensional facial reference model and of the three-dimensional facial model of the target face; 4, calculating a linear motion model of each expression in the expression set according to the facial expression motion matrices of the three-dimensional facial model of the target face obtained in step 3; 5, obtaining a division of the face into regions by means of a clustering method; 6, performing facial expression synthesis. The method helps a system synthesize more flexible and richer facial expressions.

Description

Method and device for automatically synthesizing three-dimensional expressions based on a single facial image

Technical field

The invention belongs to the fields of computer vision and computer graphics, and in particular relates to a method and device for automatically synthesizing three-dimensional expressions based on a single facial image.

Background art

Three-dimensional facial expression synthesis is a computer graphics technique applied in many fields, including film and games, human-computer interaction, and face recognition. Synthesizing three-dimensional facial expressions from a single image combines computer vision and computer graphics techniques. Its purpose is to extract facial information from an image containing a face and then synthesize various three-dimensional expressions of the face in the image.

Most existing facial expression synthesis methods are based on facial parameters and standards, using the MPEG-4 facial animation standard (see Raouzaiou, Amaryllis and Tsapatsoulis, Nicolas and Karpouzis, Kostas and Kollias, Stefanos. Parameterized facial expression synthesis based on MPEG-4. EURASIP Journal on Applied Signal Processing, vol. 2002, 1, 1021-1038, 2002.) or the Facial Action Coding System (FACS) (see Roesch, Etienne B and Tamarit, Lucas and Reveret, Lionel and Grandjean, Didier and Sander, David and Scherer, Klaus R. FACSGen: A tool to synthesize emotional facial expressions through systematic manipulation of facial action units. Journal of Nonverbal Behavior, vol. 35, 1, pp. 1-16, 2011.). These methods define facial features and motion units on the basis of these standards and adjust the face model according to the parameters to synthesize three-dimensional expressions.

Most existing expression synthesis techniques based on a single image are two-dimensional; the synthesized expressions lack realism and cannot be viewed in three dimensions from multiple poses. Three-dimensional expression synthesis techniques, in turn, often require multiple input images of a face (see Frédéric Pighin, Jamie Hecker, Dani Lischinski, Richard Szeliski, and David H. Salesin. Synthesizing realistic facial expressions from photographs. In ACM SIGGRAPH 2006 Courses, page 19. ACM, 2006.), or require the facial features in the input image to be selected manually (Frédéric Pighin, Jamie Hecker, Dani Lischinski, Richard Szeliski, and David H. Salesin. Synthesizing realistic facial expressions from photographs. In ACM SIGGRAPH 2006 Courses, page 19. ACM, 2006.). Most existing methods for dividing the face into regions are also carried out by hand (see Qingshan Zhang, Zicheng Liu, Baining Guo, Demetri Terzopoulos, and Heung-Yeung Shum. Geometry-driven photorealistic facial expression synthesis. Visualization and Computer Graphics, IEEE Transactions on, 12(1): 48-60, 2006.).

Summary of the invention

To overcome the above defects of the prior art, the present invention proposes a method and device for automatically synthesizing three-dimensional facial expressions from a single image, in which the facial feature points in the image need not be marked manually and the face is divided into regions automatically, which helps a system synthesize richer facial expressions.

According to one aspect of the present invention, a method for automatically synthesizing three-dimensional expressions based on a single facial image is proposed, the method comprising: step 1, for a single input image, using an ASM to locate the face shape; step 2, according to the located face shape, using a three-dimensional face reference model and scattered-point interpolation to complete the shape modeling of the face, and performing texture mapping on the basis of the shape modeling to obtain the three-dimensional face model of the target face in the image; step 3, using an expression set containing a limited number of expression models, calculating the motion matrices of the three-dimensional face reference model and of the three-dimensional face model of the target face, where the vertices of each expression model in the expression set are in one-to-one correspondence with the vertices of the three-dimensional face reference model and of the three-dimensional face model of the target face; step 4, according to the motion matrices of the three-dimensional face model of the target face obtained in step 3, calculating the linear motion model of each expression in the expression set; step 5, on the basis of the three-dimensional face reference model and the expression set, using a clustering method to obtain the facial region division; step 6, using the three-dimensional face model of the target face, the motion models and the facial region division to perform facial expression synthesis.

According to another aspect of the present invention, a device for automatically synthesizing three-dimensional expressions based on a single facial image is proposed, the device comprising: a face shape positioning unit which, for a single input image, uses an ASM to locate the face shape; a face shape modeling unit which, according to the located face shape, uses a three-dimensional face reference model and scattered-point interpolation to complete the shape modeling of the face, and performs texture mapping on the basis of the shape modeling to obtain the three-dimensional face model of the target face in the image; a motion matrix computing unit which uses an expression set containing a limited number of expression models to calculate the motion matrices of the three-dimensional face reference model and of the three-dimensional face model of the target face; a linear motion model computing unit which, according to the obtained motion matrices of the three-dimensional face model of the target face, calculates the linear motion model of each expression in the expression set; a facial region division unit which, on the basis of the three-dimensional face reference model and the expression set, uses a clustering method to obtain an automatic facial region division; and a facial expression synthesis unit which uses the three-dimensional face model of the target face, the motion models and the facial region division to perform facial expression synthesis.

The method and device of the present invention for automatically synthesizing three-dimensional expressions based on a single facial image have the following advantages: 1) the invention uses an active shape model to locate facial features and divides the face into regions by clustering, so that facial expression synthesis is automatic; 2) the invention uses a local space transformation to convert the expression motion matrices according to the three-dimensional modeling result of the input face, so that the synthesized expressions better fit the shape of the input face; 3) the invention uses the facial region division to enrich the synthesis results by combining regions, so that the kinds of expression that can be synthesized are not limited to the finite expressions in the input expression set; 4) the invention uses linear expression motion models, which can synthesize expressions of multiple intensities and thus enrich the synthesis results; the linear models are fast to compute, so expressions can be synthesized in real time.

Brief description of the drawings

Fig. 1 shows the principle diagram of the method of the present invention for automatically synthesizing three-dimensional expressions based on a single facial image;

Fig. 2 shows the result of locating the face in an image containing a frontal face using an active shape model (ASM);

Fig. 3 is a schematic diagram of the three-dimensional ASM key points on the three-dimensional face reference model, in which the light-colored points represent the three-dimensional ASM points;

Figs. 4a, 4b and 4c are schematic diagrams of texture mapping, in which Fig. 4a is the input image, Fig. 4b is the result of shape reconstruction, and Fig. 4c is the result after texture mapping;

Fig. 5 is a schematic diagram of the definition of the local coordinates of a vertex;

Fig. 6 is a schematic diagram of the local space transformation;

Fig. 7 is a schematic diagram of the motion activity of the regions of the face model;

Fig. 8 is the facial region division obtained by clustering;

Fig. 9 shows some of the face reconstruction results, in which the leftmost column shows the input images and the ASM localization results, and the right side shows the synthesized three-dimensional facial expressions;

Fig. 10 shows the structural block diagram of the device of the present invention for automatically synthesizing three-dimensional facial expressions from a single frontal face image.

Detailed description

To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in more detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

Existing expression synthesis techniques based on a single image require manual involvement in marking features and in dividing the face into regions. To address these problems, the present invention proposes a fully automatic synthesis method.

The method of the present invention automatically synthesizes three-dimensional expressions from a single frontal face image. First, the selection of the feature points describing the facial features in the image is reduced from manual selection on multiple images, as in the past, to automatic localization on a single image: an active shape model (Active Shape Model, ASM) is used to obtain the face shape in the single image. Then, on the basis of an expression set containing a limited number of expression models, linear expression models are generated and a clustering method is used to divide the face into regions automatically.

For the input image, the present invention uses an ASM to locate the face shape. On the basis of the automatically obtained shape, a three-dimensional face reference model is deformed by scattered-point interpolation to complete the shape modeling of the face. (Scattered-point interpolation is a technique that, given the transformations of a set of known spatial points, finds an interpolating function satisfying those transformations and uses this function to transform spatial points whose transformations are unknown.) Texture mapping on the basis of the shape modeling then completes the three-dimensional modeling of the face. Next, an expression set containing a limited number of expression models is used to calculate the facial expression motion matrices, a local coordinate transformation is applied to the matrices, and linear expression motion models that can synthesize expressions of different intensities in real time are calculated. Meanwhile, the data in the expression set are clustered to obtain an automatic facial region division. Finally, the three-dimensional face model, the expression motion models and the division result are used together to synthesize a variety of realistic three-dimensional facial expressions in real time.

Fig. 1 shows the principle diagram of the method of the present invention for automatically synthesizing three-dimensional facial expressions from a single frontal face image. The method is described in detail below with reference to Fig. 1. As shown in Fig. 1, the method comprises the following steps:

Step 1: obtain the face shape from the input image.

In this step, an active shape model (ASM) (see P. Xiong, L. Huang, and C. Liu. Initialization and pose alignment in active shape model. In 2010 International Conference on Pattern Recognition, pages 3971-3974. IEEE, 2010.) is applied to the input image to obtain the face shape. This shape is formed by connecting a set of feature points and is denoted $S$:

$$S = (p_1, p_2, \ldots, p_k, \ldots, p_N) \tag{1}$$

where $p_k = (x_k, y_k)$, $k = 1, 2, \ldots, N$, are the $N$ feature points forming $S$, as shown in Fig. 2.

Step 2: obtain the three-dimensional face model of the target face in the image according to the determined face shape.

In this step, a three-dimensional face reference model is used. This reference model carries three-dimensional key points corresponding to the face shape used in step 1. Scattered-point interpolation is applied to the reference model to obtain its shape transformation, and direct texture mapping is then applied to the transformed model to obtain the three-dimensional face model of the target face in the image. As shown in Fig. 3, the three-dimensional key points are determined manually offline according to the two-dimensional face shape obtained in step 1, and are denoted $S_{ref}$:

$$S_{ref} = (q_1, q_2, \ldots, q_k, \ldots, q_N) \tag{2}$$

where $q_k$, $k = 1, 2, \ldots, N$, are the three-dimensional point coordinates in the three-dimensional face reference model.

For $S_{ref}$, the present invention uses orthographic projection to obtain its two-dimensional projection on the image plane, $S'_{ref}$:

$$S'_{ref} = (q'_1, q'_2, \ldots, q'_k, \ldots, q'_N) \tag{3}$$

where $q'_k$ is the two-dimensional projection of $q_k$ on the image plane after scale and translation transformations. Once the point correspondences are determined, a radial basis function (RBF) is used to perform scattered-point interpolation on the three-dimensional face reference model. The interpolating function is:

$$f(p) = \sum_i \lambda_i \psi(|p - p_i|) + Ap + t \tag{4}$$

where $p$ and $p_i$ are, respectively, the target position of a known spatial point after the transformation and its original position before the transformation. The range of $i$ depends on the number of points whose transformation is known; for example, the present invention uses 81 known ASM points, so $i$ runs from 0 to 80 (or from 1 to 81). $\lambda_i$ are the basis function coefficients, and $A$ and $t$ are the affine components. The "scattered-point interpolation" method used by the present invention solves for the position change of the other "points whose position change is unknown" in the same space from the position changes of a series of "points whose position change is known". The "known spatial points" are exactly these "points whose position change is known"; $p$ denotes their coordinates after the position change and $p_i$ their coordinates before it.

The function $\psi(r)$ is a radially symmetric basis function. There are many radially symmetric basis functions; the Gaussian kernel is one of them and is widely used. The present invention uses the Gaussian kernel:

$$\psi(r) = \exp\left(-\frac{r^2}{2\sigma^2}\right) \tag{5}$$

The coefficients $\lambda_i$, $A$ and $t$ are obtained by solving the following equations:

$$\delta_i = f(p_i) \tag{6}$$

$$\sum_i \lambda_i = 0 \tag{7}$$

$$\sum_i \lambda_i p_i = 0 \tag{8}$$

where $\delta_i$ is the deformation constraint, namely

$$\delta_i = p_i - q'_i \tag{9}$$

After the coefficients $\lambda_i$, $A$ and $t$ are obtained, the transformation of formula (4) is applied to every point of the three-dimensional face reference model. This yields the shape transformation of the reference model and completes the shape modeling of the three-dimensional face. Direct texture mapping is then applied to the transformed model to obtain the complete three-dimensional face model (shape and texture) of the target face in the image, as shown in Fig. 4.
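The interpolation of formulas (4) to (9) can be sketched in NumPy as below. This is a minimal illustration under stated assumptions, not the patented implementation: the RBF centres are placed at the projected reference key points $q'_i$, the interpolated quantity is the displacement $\delta_i$, and the depth (z) of the model vertices is left unchanged because the source does not say how depth is handled. The function names `fit_rbf`, `apply_rbf` and `deform_reference` and the parameter `sigma` are hypothetical.

```python
import numpy as np

def gaussian_kernel(r, sigma):
    """Gaussian radial basis function, formula (5)."""
    return np.exp(-(r ** 2) / (2.0 * sigma ** 2))

def fit_rbf(src, dst, sigma):
    """Solve the linear system given by formulas (6)-(8).

    src: (N, d) key points before the transform (assumed: projected q').
    dst: (N, d) key points after the transform (assumed: ASM points p).
    Returns lambda with shape (N, d) and the stacked affine part (d + 1, d).
    """
    n, d = src.shape
    delta = dst - src                                    # formula (9)
    dist = np.linalg.norm(src[:, None, :] - src[None, :, :], axis=-1)
    poly = np.hstack([src, np.ones((n, 1))])             # affine terms [p, 1]
    lhs = np.zeros((n + d + 1, n + d + 1))
    lhs[:n, :n] = gaussian_kernel(dist, sigma)           # interpolation rows (6)
    lhs[:n, n:] = poly
    lhs[n:, :n] = poly.T                                 # side conditions (7), (8)
    rhs = np.zeros((n + d + 1, d))
    rhs[:n] = delta
    sol = np.linalg.solve(lhs, rhs)
    return sol[:n], sol[n:]

def apply_rbf(points, src, lam, affine, sigma):
    """Move arbitrary points by the interpolated displacement, formula (4)."""
    dist = np.linalg.norm(points[:, None, :] - src[None, :, :], axis=-1)
    poly = np.hstack([points, np.ones((len(points), 1))])
    return points + gaussian_kernel(dist, sigma) @ lam + poly @ affine

def deform_reference(vertices, src, dst, sigma):
    """Warp x, y of (M, 3) model vertices; z is kept as is (an assumption)."""
    lam, affine = fit_rbf(src, dst, sigma)
    warped = np.array(vertices, dtype=float)
    warped[:, :2] = apply_rbf(warped[:, :2], src, lam, affine, sigma)
    return warped
```

The side conditions (7) and (8) make the RBF part carry no affine component, so a purely affine change of the key points is reproduced by $A$ and $t$ alone with all $\lambda_i$ equal to zero.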

Step 3: calculate the motion matrices of the three-dimensional face reference model and of the three-dimensional face model of the target face.

In this step, an expression set containing multiple expression models is introduced. The expression set contains different types of expression, and each type of expression has several expression models corresponding to it. This expression set is used to calculate the motion matrices of the three-dimensional face reference model and of the three-dimensional face model of the target face.

The three-dimensional face reference model is denoted $M_{ref}$ and the expression set is denoted $M_{exp}$. $M_{ref}$ has good correspondence with all the models of the expression set mentioned below (good correspondence means that the number of vertices, the vertex labels and the vertex topology are identical across the models); each of them contains $m$ vertices. $M_{ref}$ is a neutral-expression model, that is, an expressionless model. The three-dimensional face model of the target face obtained in step 2 is denoted $M_{neu}$, and it is also a neutral-expression model. The present invention uses a limited expression set, which is a set of $n$ expression models corresponding to $s$ different types of expression in total. Each of the $n$ expression models has good correspondence with the three-dimensional face reference model and also with the three-dimensional model of the target face. This limited expression set is denoted $M_{exp}$:

$$M_{exp} = (M_1^e, M_2^e, \ldots, M_j^e, \ldots, M_s^e) \tag{10}$$

where $M_j^e$ represents the expression models corresponding to the $j$-th expression, assumed here to contain $t$ expression models. Because each expression is modeled independently, different expressions $j$ may have different $t$, but this does not affect the modeling of an expression. $M_j^e$ is:

$$M_j^e = (M_{j_1}, \ldots, M_{j_k}, \ldots, M_{j_t}) \tag{11}$$

where $M_{j_k}$ represents the model of the $k$-th intensity of the $j$-th expression. In addition:

$$\Delta M_{j_k} = M_{j_k} - M_{ref} \tag{12}$$

$\Delta M_{j_k}$ is the motion matrix of the $k$-th intensity of the $j$-th expression. This motion matrix belongs to the three-dimensional face reference model $M_{ref}$ and is called the $j_k$-th motion matrix of $M_{ref}$.
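Formula (12) is a per-vertex subtraction, which only makes sense when the models are in the "good correspondence" described above. A minimal sketch, assuming each model is stored as an (m, 3) NumPy array (the transpose of the 3 × m layout mentioned later in the text) and the expression set as a dictionary from a hypothetical expression name to its list of intensity levels:

```python
import numpy as np

def motion_matrices(reference, expression_set):
    """Compute Delta M_{j_k} = M_{j_k} - M_ref for every model, formula (12).

    reference: (m, 3) vertex coordinates of the neutral model M_ref.
    expression_set: dict {expression name: [(m, 3) array per intensity level]}.
    """
    result = {}
    for name, levels in expression_set.items():
        for model in levels:
            # Vertex-wise subtraction requires identical vertex count and order.
            if model.shape != reference.shape:
                raise ValueError(f"model for '{name}' does not match the reference")
        result[name] = [model - reference for model in levels]
    return result
```

The shape check is only a cheap stand-in for the correspondence requirement: it cannot detect models that have the same number of vertices but a different vertex order or topology.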

Next, the present invention uses a local transformation method to solve for the $j_k$-th motion matrix of the three-dimensional face model $M_{neu}$ of the target face, denoted $\Delta N_{j_k}$. The basic principle of the local transformation method is as follows. For any vertex of the neutral-expression model in the expression set, construct its source local space in the three-dimensional face reference model and its target local space in the three-dimensional face model of the target face. Using the correspondence between the source local space and the target local space of the vertex, take the motion vector of the vertex relative to the vertex coordinates of the three-dimensional face reference model and apply the local space transformation to obtain the variant of this motion vector that fits the three-dimensional face model of the target face. From these variants and the motion matrix of the three-dimensional face reference model, obtain the motion matrix of the three-dimensional face model of the target face. In the present invention, the "neutral-expression model" and the "three-dimensional face reference model" are one and the same model, which is distinct from the "three-dimensional face model of the target face". The local space of a vertex is constructed in the same way regardless of which model the vertex is in. "Source local space" and "target local space" only distinguish which model the vertex is in: the "local space" of a vertex in the "three-dimensional face reference model" is the "source local space", and the "local space" of the vertex in the "three-dimensional face model of the target face" is the "target local space".

Specifically, for the local transformation method mentioned above, first take any vertex of the neutral-expression model in the expression model set (the set contains only one neutral-expression model, namely $M_{ref}$; that is, the neutral-expression model is the three-dimensional face reference model), and construct its source local space $L_{V_R}$ in the three-dimensional face reference model $M_{ref}$ and its target local space $L_{V_N}$ in the three-dimensional face model $M_{neu}$ of the target face. The vertices of the three-dimensional face reference model $M_{ref}$, which serves as the neutral-expression model, and of the three-dimensional model $M_{neu}$ of the target face are in one-to-one correspondence, and the two models have the same topology: every vertex of either model has a corresponding vertex in the other model, and only the coordinates differ. The two vertices are considered here to have the same status in their respective models and are treated as one vertex, because the vertex in $M_{neu}$ is computed by interpolation from the vertex in $M_{ref}$. In fact, the "neutral-expression model" and the "three-dimensional face reference model" mentioned above are the same model $M_{ref}$. Calling it the "neutral-expression model" refers to its role in the "expression set" as the representative of the neutral expression, and calling it the "three-dimensional face reference model" refers to its role as the reference model in the "three-dimensional face modeling process".

As shown in Fig. 5, for a vertex $V$ of the neutral-expression model in the expression model set, its surface unit normal vector $\vec{n_V}$ (defined as the mean of the outward normals of all surface patches containing this vertex) defines the tangent plane at this point. $O$ is the world coordinate origin (defined as the model center), $\vec{v_V}$ is the vector in the direction $OV$, and $\vec{p_V}$ is the projection of $\vec{v_V}$ onto the tangent plane:

$$\vec{o_V} = \vec{v_V} \times \vec{n_V} \tag{13}$$

$$\vec{p_V} = \vec{n_V} \times \vec{o_V} \tag{14}$$

The local space $L_V$ is then formed by $\vec{o_V}$, $\vec{p_V}$ and $\vec{n_V}$.

The source local space and the target local space are therefore, respectively,

$$L_{V_R} = (o_{V_R}; p_{V_R}; n_{V_R}) \tag{15}$$

$$L_{V_N} = (o_{V_N}; p_{V_N}; n_{V_N}) \tag{16}$$
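The construction of formulas (13) and (14) can be sketched per vertex as follows. Several details are assumptions that the source does not state: the axes are normalized to unit length so that each frame is a pure rotation, the axes are stored as the columns of a 3 × 3 matrix, and the "model center" is taken to be the vertex centroid when no center is supplied. The function name `local_frames` is hypothetical.

```python
import numpy as np

def local_frames(vertices, normals, center=None):
    """Build one local frame (o, p, n) per vertex, formulas (13)-(14).

    vertices: (m, 3) vertex positions.
    normals:  (m, 3) unit vertex normals (mean of the adjacent face normals).
    Returns an (m, 3, 3) array whose columns are o, p and n.
    """
    if center is None:
        center = vertices.mean(axis=0)           # assumed model center O
    v = vertices - center                        # direction O -> V
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    o = np.cross(v, normals)                     # formula (13)
    o = o / np.linalg.norm(o, axis=1, keepdims=True)
    p = np.cross(normals, o)                     # formula (14), lies in the tangent plane
    return np.stack([o, p, normals], axis=-1)
```

The frame is undefined where the direction $OV$ is parallel to the normal, because the cross product in formula (13) is then zero; the sketch does not guard against that case or against a vertex located exactly at the center.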

Any motion matrix $\Delta M_{j_k}$ of the three-dimensional face reference model $M_{ref}$ contains, for every vertex of one expression model in the expression set, the motion of that vertex relative to the corresponding vertex of the three-dimensional face reference model, that is, the vector of the motion of the vertex from $M_{ref}$ to the corresponding vertex of the expression model. For a vertex of any expression model in the expression set, its motion vector relative to the vertex coordinates of the three-dimensional face reference model is denoted $\Delta \vec{V_{V_R}}$. Applying the local space transformation gives the variant of this motion vector that fits the three-dimensional model $M_{neu}$ of the target face (as shown in Fig. 6):

$$\Delta \vec{V_{V_N}} = T(\Delta \vec{V_{V_R}}) = L_{V_N} L_{V_R}^{-1} \Delta \vec{V_{V_R}} \tag{17}$$

Applying this transformation to every motion vector $\Delta \vec{V_{V_R}}$ in a motion matrix $\Delta M_{j_k}$ yields the corresponding $\Delta N_{j_k}$. In this way, all the motion matrices $\Delta N_{j_k}$ of the $k$-th intensity of the $j$-th expression for the three-dimensional model $M_{neu}$ of the target face can be obtained.
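Formula (17) can be applied to all vertices of a motion matrix at once. The sketch below assumes the local frames are supplied as (m, 3, 3) arrays whose columns are the frame axes, and that a motion matrix is stored as an (m, 3) array of per-vertex motion vectors; the name `transfer_motion` is hypothetical. It solves a small linear system per vertex instead of forming the inverse explicitly.

```python
import numpy as np

def transfer_motion(frames_ref, frames_neu, delta_ref):
    """Map motion vectors from M_ref to M_neu, formula (17).

    frames_ref: (m, 3, 3) source local frames L_{V_R}.
    frames_neu: (m, 3, 3) target local frames L_{V_N}.
    delta_ref:  (m, 3) motion vectors taken from one motion matrix of M_ref.
    """
    # Coordinates of each motion vector in its source frame: L_R^{-1} dV.
    local = np.linalg.solve(frames_ref, delta_ref[..., None])
    # Re-express the same local coordinates in the target frame: L_N (...).
    return (frames_neu @ local)[..., 0]
```

When the two frames of a vertex coincide, the motion vector is returned unchanged; when the target frame is a rotated copy of the source frame, the motion vector is rotated by the same amount.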

Step 4: calculate a linear expression motion model for each expression in the expression set.

On the basis of $\Delta N_{j_k}$, for each expression $j$ in the expression model set, the linear expression motion model $A_j$ of this expression is calculated:

$$A_j = \arg\min_{A_j} \sum_{k=1}^{t} \left\| \Delta N_{j_k} - \iota_k \cdot A_j \right\|^2 \tag{18}$$

where $\iota_k$ is the quantized measure of the $k$-th intensity of the $j$-th expression. This formula is solved independently for each expression $j$, and it can be solved by the least squares method.
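Because $A_j$ enters formula (18) linearly with a single scalar weight per intensity level, the least-squares problem has a closed-form solution: setting the gradient to zero gives $A_j = \sum_k \iota_k \Delta N_{j_k} / \sum_k \iota_k^2$. A minimal sketch, assuming the $t$ motion matrices of one expression are stacked into a (t, m, 3) array; the function name is hypothetical:

```python
import numpy as np

def fit_linear_motion_model(delta_n, intensities):
    """Closed-form minimizer of formula (18) for one expression.

    delta_n:     (t, m, 3) motion matrices Delta N_{j_k} of the target model.
    intensities: (t,) quantized intensities iota_k.
    """
    iota = np.asarray(intensities, dtype=float)
    stacked = np.asarray(delta_n, dtype=float)
    # Weighted sum over the intensity axis, divided by the sum of squares.
    return np.tensordot(iota, stacked, axes=1) / np.dot(iota, iota)
```

An expression of arbitrary intensity $\alpha$ is then approximated by $\alpha A_j$, which is what makes real-time synthesis of intermediate intensities cheap.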

Step 5: divide the face into regions.

In this step, the vertices contained in each three-dimensional model of the expression set are used to construct a matrix representing the motion-related features of the vertices across the expression set. The motion-related feature and the spatial position feature of each vertex are combined into a feature vector for the vertex, and K-means clustering is performed on these feature vectors. The clustering result is the facial division result.

Specifically, the face is divided into regions on the basis of the three-dimensional face reference model $M_{ref}$ and the expression set $M_{exp}$. Since each three-dimensional model in the expression set contains $m$ vertices, a matrix $L$ is constructed to represent the motion-related features of the vertices across the expression set:

$$L = (\vec{l}_1, \ldots, \vec{l}_k, \ldots, \vec{l}_m) \tag{19}$$

where

$$\vec{l}_k = \sum_{i=1}^{n} \left| \Delta \vec{V_{i_k}} \right|, \quad (k = 1, \ldots, m) \tag{20}$$

Here $n$ is the number of three-dimensional face models in the expression set $M_{exp}$, and $\vec{l}_k$ is the sum, over the motion matrices of all the expression models, of the magnitudes of the motion vectors of the $k$-th vertex. This value represents the motion feature of the vertex, as shown in Fig. 7.

The vertex also has a spatial position feature $\vec{V}_{ref_k}$, which is the spatial coordinate of the $k$-th vertex in the three-dimensional face reference model $M_{ref}$. Combining the motion feature and the position feature of the vertex, its feature vector is constructed as:

$$\vec{f}_k = (\lambda_1 \vec{V}_{ref_k}; \mu \lambda_2 \vec{l}_k) = \left(\lambda_1 \vec{V}_{ref_k}; \mu \lambda_2 \sum_{i=1}^{n} \left| \Delta \vec{V_{i_k}} \right|\right), \quad (k = 1, \ldots, m) \tag{21}$$

where $\lambda_1$ is the normalization coefficient of the position feature, $\mu$ is a coefficient that controls the influence of the motion information in the whole feature, and $\lambda_2$ is the normalization coefficient of the motion feature.
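The per-vertex features of formulas (20) and (21) can be sketched as below. The source does not define the normalization coefficients, so the sketch assumes $\lambda_1$ is the reciprocal of the largest absolute coordinate and $\lambda_2$ the reciprocal of the largest motion sum; the array layout and the function name `vertex_features` are likewise assumptions.

```python
import numpy as np

def vertex_features(reference, motion_matrices, mu=1.0):
    """Build the (m, 4) feature matrix of formula (21).

    reference:       (m, 3) vertex coordinates of M_ref.
    motion_matrices: (n, m, 3) motion matrices of all n expression models.
    mu:              weight of the motion feature relative to position.
    """
    # Formula (20): sum of motion magnitudes per vertex over all models.
    motion = np.linalg.norm(motion_matrices, axis=-1).sum(axis=0)
    lam1 = 1.0 / np.abs(reference).max()                    # assumed normalization
    lam2 = 1.0 / motion.max() if motion.max() > 0 else 1.0  # assumed normalization
    return np.column_stack([lam1 * reference, mu * lam2 * motion])
```

A larger `mu` makes vertices that move by similar amounts cluster together even when they are far apart, while a smaller `mu` favours spatially compact regions.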

K-means clustering is performed on this feature. The clustering is completed offline, and the clustering result is the facial division result, as shown in Fig. 8. In the clustering process, the $m$ vertices of the three-dimensional face model of the target face are divided into $r$ classes by K-means clustering; the number of vertices in each class is not fixed. Vertices in the same class are considered to represent the same region of the face, and vertices in different classes represent different regions. The clustering result is expressed as $R_{ij}$, which is explained together with formula (22). Clustering yields the labels of all the vertices in each class, and these labels together with $A_i$ are used to build the $r$ matrices $R_{ij}$. "Completed offline" means that once the expression set $M_{exp}$ and $M_{ref}$ are available, the clustering method can be used to divide the face, and the division need not be repeated during real-time facial expression synthesis. The division obtained offline is used, because the facial division of the model is independent of the input image.
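The clustering and the construction of the region matrices can be sketched with a plain Lloyd-style K-means written in NumPy. The initialization (random distinct vertices with a fixed seed), the iteration limit and the function names are assumptions; the source only states that K-means is used and that $R_{ij}$ is built from the vertex labels and $A_i$.

```python
import numpy as np

def kmeans(features, r, iterations=100, seed=0):
    """Assign each of the m feature vectors to one of r clusters."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=r, replace=False)]
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iterations):
        dist = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=-1)
        labels = dist.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster became empty.
        new_centers = np.array([
            features[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(r)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def region_models(motion_model, labels, r):
    """Build R_ij for one expression i: A_i with rows outside region j zeroed.

    motion_model: (m, 3) linear motion model A_i.
    labels:       (m,) region label per vertex.
    Returns an (r, m, 3) array, one masked copy of A_i per region.
    """
    masks = labels[None, :] == np.arange(r)[:, None]        # (r, m)
    return masks[:, :, None] * motion_model[None, :, :]
```

Since every vertex belongs to exactly one region, the $r$ masked copies sum back to $A_i$, so setting all regional weights of an expression to the same value reproduces the whole-face motion of that expression.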

Step 6: synthesize the three-dimensional facial expression.

On the basis of the three-dimensional face modeling result obtained in step 2, the expression motion models $A_j$ obtained in step 4 and the facial division result obtained in step 5, facial expression synthesis is performed by the following formula:

$$E(\vec{\alpha}, \Gamma) = M_{neu} + \sum_{i,j} (\alpha_i A_i + \gamma_{ij} R_{ij}) \tag{22}$$

$$\vec{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_s)$$

where $E$ is the system output, that is, the synthesized expression; $A_i$ is the motion model of the $i$-th expression; $\vec{\alpha}$ is the $s$-dimensional expression motion control parameter, in which $\alpha_i$ represents the intensity of the $i$-th expression; and $R_{ij}$ is the $j$-th region of the $i$-th expression. $R_{ij}$, $A_i$ and the three-dimensional model $M_{neu}$ of the target face are all matrices of the same dimensions, for example $(3 \times m)$ matrices, except that the values in $R_{ij}$ corresponding to vertices that do not belong to the $j$-th region are 0. $\Gamma$ is the $(s \times r)$ region control matrix, where $s$ is the number of expression types contained in the expression set and $r$ is the number of effective regions after the division; the element $\gamma_{ij}$ of the matrix corresponds to the intensity of the $i$-th expression in the $j$-th region. $m$, $r$, $i$ and $j$ are all integers greater than or equal to 1. Some of the expression synthesis results are shown in Fig. 9.
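The synthesis formula can be sketched as two tensor contractions. One point is an interpretation rather than something the source states: read literally, the double sum adds $\alpha_i A_i$ once per region $j$, whereas the sketch adds the whole-face term once per expression and the regional terms once per expression and region. The array layouts, with vertices stored as (m, 3) rows, and the function name `synthesize` are assumptions.

```python
import numpy as np

def synthesize(neutral, motion_models, regions, alpha, gamma):
    """Synthesize an expression from the neutral target model.

    neutral:       (m, 3) target face model M_neu.
    motion_models: (s, m, 3) linear motion models A_i.
    regions:       (s, r, m, 3) region matrices R_ij.
    alpha:         (s,) whole-face intensity of each expression.
    gamma:         (s, r) intensity of expression i in region j.
    """
    # Whole-face contribution: sum_i alpha_i * A_i (assumed to count once per i).
    global_part = np.tensordot(alpha, motion_models, axes=1)
    # Regional contribution: sum_{i,j} gamma_ij * R_ij.
    regional_part = np.tensordot(gamma, regions, axes=([0, 1], [0, 1]))
    return neutral + global_part + regional_part
```

With all entries of `alpha` and `gamma` set to zero the function returns the neutral model; non-zero entries in different rows of `gamma` combine regions of different expressions, which is how expressions outside the original expression set can be produced.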

The invention allows for a kind of device automatically synthesizing three-dimensional face expression based on individual front face image, in order to perform the above-mentioned method of automatically synthesizing three-dimensional face expression based on individual front face image.Figure 10 is the structured flowchart of this device, and with reference to Figure 10, this device comprises: face shape positioning unit 101, for the single image of input, uses ASM to carry out face shape location; Face shape modeling unit 102, according to the face shape of location, utilizes three-dimensional face reference model to adopt loose point interpolation to complete the shape modeling of face, texture is carried out on the basis of shape modeling, obtains the three-dimensional face model of target face in image; Kinematic matrix computing unit 103, utilizes the expression collection comprising limited expression model, calculates the kinematic matrix of the three-dimensional face model of three-dimensional face reference model and target face; Linear movement model computing unit 104, according to the kinematic matrix of the three-dimensional face model of the target face obtained, calculates the linear motion model that expression concentrates each expression; Facial zone division unit 105, on the basis that three-dimensional face reference model and expression collect, uses clustering method to obtain automatic face facial zone division result; Human face expression synthesis unit 106, utilizes the three-dimensional face model of above-mentioned target face, motion model and face facial zone division result, carries out human face expression synthesis.

Wherein, face shape modeling unit 102 uses a three-dimensional face reference model, this three-dimensional face reference model have the three-dimensional key point corresponding with the face shape that face shape positioning unit is determined, loose point interpolation is carried out to this three-dimensional face reference model, obtain the shape conversion of three-dimensional face reference model, then direct texture is used to the model after shape conversion, obtain the three-dimensional face model of target face in image.

The motion matrix computing unit 103 introduces an expression set containing multiple expression models. The set includes different types of expressions, and each type of expression has several corresponding expression models. Using this expression set, the unit calculates the motion matrices of the three-dimensional face reference model and of the three-dimensional face model of the target face. The unit obtains the motion matrix of the three-dimensional face model of the target face as follows: for any vertex of the neutral expression model in the expression set, construct its source local space in the three-dimensional face reference model and its target local space in the three-dimensional face model of the target face; using the correspondence between the source local space and the target local space of the vertex, obtain the motion vector of the vertex relative to the vertex coordinates of the three-dimensional face reference model; apply the local space transformation to obtain the variant of this motion vector that suits the three-dimensional face model of the target face; and use this variant together with the motion matrix of the three-dimensional face reference model to obtain the motion matrix of the three-dimensional face model of the target face.
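The transfer of a per-vertex motion vector from the reference model to the target face can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented construction: the local space at a vertex is assumed to be an orthonormal frame built from the vertex normal and one incident edge, so only the orientation of the motion changes. The names `local_frame` and `transfer_motion` are hypothetical.

```python
import numpy as np

def local_frame(vertex, neighbor, normal):
    """Orthonormal frame (as matrix columns) at a vertex.

    Built from the vertex normal and the edge to one neighboring vertex;
    this particular construction is an assumption for illustration.
    """
    n = normal / np.linalg.norm(normal)
    edge = neighbor - vertex
    tangent = edge - np.dot(edge, n) * n      # project the edge into the tangent plane
    tangent = tangent / np.linalg.norm(tangent)
    bitangent = np.cross(n, tangent)
    return np.column_stack([tangent, bitangent, n])

def transfer_motion(motion, source_frame, target_frame):
    """Express a motion vector in the source frame, then rebuild it in the target frame."""
    local_coords = source_frame.T @ motion    # inverse of an orthonormal frame is its transpose
    return target_frame @ local_coords
```

For example, a motion of length 2 along the source normal becomes a motion of length 2 along the target normal, whatever direction that normal has on the target face.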

The facial region division unit 105 uses the vertices contained in each three-dimensional model of the expression set to construct a matrix representing the motion-pattern features of each vertex across the expression set. From the motion-pattern features and the spatial position features of each vertex it constructs a feature vector for that vertex, and it performs K-means clustering on these feature vectors; the clustering result is the facial region division.
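The clustering step of unit 105 can be sketched as below. The exact feature construction is not given in this passage, so the sketch assumes that the motion feature of a vertex is its displacement from the neutral model in every expression, concatenated with its weighted neutral position. The names `vertex_features`, `farthest_point_init`, `kmeans_labels`, and `position_weight` are hypothetical, and the deterministic farthest-point initialization is a choice made here for reproducibility.

```python
import numpy as np

def vertex_features(neutral, expressions, position_weight=1.0):
    """Per-vertex feature: displacements in all expressions plus position.

    neutral: (m, 3) neutral vertices; expressions: (n, m, 3) expression models.
    """
    motion = expressions - neutral[None, :, :]               # (n, m, 3)
    motion = motion.transpose(1, 0, 2).reshape(len(neutral), -1)
    return np.hstack([motion, position_weight * neutral])

def farthest_point_init(features, r):
    """Pick r starting centers: the first vertex, then repeatedly the farthest one."""
    centers = [features[0]]
    for _ in range(1, r):
        dists = np.min(
            [np.linalg.norm(features - c, axis=1) for c in centers], axis=0)
        centers.append(features[dists.argmax()])
    return np.array(centers, dtype=float)

def kmeans_labels(features, r, iterations=50):
    """Plain Lloyd iterations; returns one region label per vertex."""
    centers = farthest_point_init(features, r)
    for _ in range(iterations):
        dists = np.linalg.norm(
            features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(r):
            if np.any(labels == k):                          # skip empty clusters
                centers[k] = features[labels == k].mean(axis=0)
    return labels
```

Vertices that lie close together and move together across the expression set end up with the same label, which is the sense in which each cluster stands for one facial region.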

The advantages and effects of the present invention are: 1) the invention uses an active shape model to locate facial features and divides the facial regions by clustering, so that facial expression synthesis is fully automatic; 2) the invention uses a local space transformation to convert the expression motion matrix according to the three-dimensional modeling result of the input face, so that the synthesized expressions better fit the shape of the input face; 3) the invention uses the facial region division result to enrich the synthesis results through region combination, so that the synthesized expression types are not limited to the finite expressions in the input expression set; 4) the invention uses a linear expression motion model, which can synthesize expressions of multiple intensities and thus further enriches the synthesis results; because the linear model is fast to compute, expressions can be synthesized in real time.
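Region combination and intensity control can be illustrated with a small sketch. It assumes that the linear model of expression i is a per-vertex displacement field A_i added to the neutral target face with a scalar intensity, and that the region R_ij is A_i restricted to the vertices labeled j; the precise formulas are not stated in this passage, and the names `region_motion`, `synthesize`, and `choices` are hypothetical.

```python
import numpy as np

def region_motion(A_i, labels, j):
    """R_ij: the motion of expression i kept on region j and zeroed elsewhere."""
    mask = (labels == j)[:, None]
    return A_i * mask

def synthesize(neutral, motions, labels, choices):
    """Combine regions taken from different expressions at chosen intensities.

    neutral: (m, 3) vertices of the target face.
    motions: list of (m, 3) displacement fields, one per expression.
    labels:  (m,) region label per vertex.
    choices: dict mapping region j -> (expression index i, intensity alpha).
    """
    result = neutral.astype(float)            # astype returns a copy
    for j, (i, alpha) in choices.items():
        result += alpha * region_motion(motions[i], labels, j)
    return result
```

Each synthesized face costs only a few scaled additions per vertex, which is consistent with the real-time claim, and mixing regions from different expressions yields combinations that are not present in the expression set.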

The specific embodiments described above further explain the objects, technical solutions, and beneficial effects of the present invention. It should be understood that the foregoing are merely specific embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A method for automatically synthesizing three-dimensional expressions based on a single face image, the method comprising the steps of:

Step 1: for the single input image, use an ASM to locate the face shape;

Step 2: according to the located face shape, use a three-dimensional face reference model and scattered-point interpolation to complete the shape modeling of the face, and perform texture mapping on the basis of the shape model to obtain the three-dimensional face model of the target face in the image;

Step 3: using an expression set comprising a limited number of expression models, calculate the facial expression motion matrices of the expression set relative to the three-dimensional face reference model and relative to the three-dimensional face model of the target face, wherein the vertices of each expression model in the expression set correspond one-to-one with the vertices of the three-dimensional face reference model and of the three-dimensional face model of the target face;

Step 4: according to the facial expression motion matrix of the three-dimensional face model of the target face obtained in step 3, calculate the linear motion model of each expression in the expression set;

Step 5: on the basis of the three-dimensional face reference model and the expression set, use a clustering method to obtain the facial region division result;

Step 6: use the three-dimensional face model of the target face, the linear motion model of each expression in the expression set, and the facial region division result to perform three-dimensional facial expression synthesis.

2. The method according to claim 1, characterized in that step 2 further comprises: using a three-dimensional face reference model that has three-dimensional key points corresponding to the face shape used in step 1; applying scattered-point interpolation to this three-dimensional face reference model to obtain its shape transformation; and then applying direct texture mapping to the shape-transformed model to obtain the three-dimensional face model of the target face in the image.

3. The method according to claim 2, characterized in that step 3 further comprises obtaining the facial expression motion matrix of the three-dimensional face model of the target face as follows: for any vertex of the neutral expression model in the expression set, constructing its source local space in the three-dimensional face reference model and its target local space in the three-dimensional face model of the target face; using the correspondence between the source local space and the target local space of the vertex, obtaining the motion vector of the vertex relative to the vertex coordinates of the three-dimensional face reference model; applying the local space transformation to obtain the variant of this motion vector that suits the three-dimensional face model of the target face; and using this variant together with the motion matrix of the three-dimensional face reference model to obtain the motion matrix of the three-dimensional face model of the target face.

4. The method according to claim 3, characterized in that step 5 further comprises: using the vertices contained in each three-dimensional model of the expression set to construct a matrix representing the motion-pattern features of each vertex across the expression set; constructing a feature vector for each vertex from its motion-pattern features and its spatial position features; and performing K-means clustering on these feature vectors, the clustering result being the facial region division.

5. The method according to claim 4, characterized in that, in the clustering process, the m vertices of the three-dimensional face model of the target face are divided into r classes using K-means clustering; the number of vertices in each class is not fixed; vertices of the same class are regarded as representing the same region of the face, and vertices of different classes as representing different regions; the clustering result is R_ij, where R_ij is the j-th region of the i-th expression; clustering yields the labels of all vertices in each class, and these labels together with A_i are used to build the r regions R_ij, where A_i is the motion model of the i-th expression, and m, r, i, and j are all integers greater than or equal to 1.

6. A device for automatically synthesizing three-dimensional expressions based on a single face image, the device comprising:

a face shape positioning unit, which uses an ASM to locate the face shape in the single input image;

a face shape modeling unit, which, according to the located face shape, uses a three-dimensional face reference model and scattered-point interpolation to complete the shape modeling of the face, and performs texture mapping on the basis of the shape model to obtain the three-dimensional face model of the target face in the image;

a motion matrix computing unit, which uses an expression set comprising a limited number of expression models to calculate the motion matrices of the three-dimensional face reference model and of the three-dimensional face model of the target face;

a linear motion model computing unit, which calculates the linear motion model of each expression in the expression set from the obtained motion matrix of the three-dimensional face model of the target face;

a facial region division unit, which uses a clustering method on the basis of the three-dimensional face reference model and the expression set to obtain the facial region division result automatically;

a facial expression synthesis unit, which uses the three-dimensional face model of the target face, the motion models, and the facial region division result to synthesize facial expressions.

7. The device according to claim 6, characterized in that the face shape modeling unit uses a three-dimensional face reference model that has three-dimensional key points corresponding to the face shape determined by the face shape positioning unit, applies scattered-point interpolation to this three-dimensional face reference model to obtain its shape transformation, and then applies direct texture mapping to the shape-transformed model to obtain the three-dimensional face model of the target face in the image.

8. The device according to claim 7, characterized in that the motion matrix computing unit obtains the motion matrix of the three-dimensional face model of the target face as follows: for any vertex of the neutral expression model in the expression set, constructing its source local space in the three-dimensional face reference model and its target local space in the three-dimensional face model of the target face; using the correspondence between the source local space and the target local space of the vertex, obtaining the motion vector of the vertex relative to the vertex coordinates of the three-dimensional face reference model; applying the local space transformation to obtain the variant of this motion vector that suits the three-dimensional face model of the target face; and using this variant together with the motion matrix of the three-dimensional face reference model to obtain the motion matrix of the three-dimensional face model of the target face.

9. The device according to claim 8, characterized in that the facial region division unit uses the vertices contained in each three-dimensional model of the expression set to construct a matrix representing the motion-pattern features of each vertex across the expression set, constructs a feature vector for each vertex from its motion-pattern features and its spatial position features, and performs K-means clustering on these feature vectors, the clustering result being the facial region division.

10. The device according to claim 9, characterized in that, in the clustering process, the m vertices of the three-dimensional face model of the target face are divided into r classes using K-means clustering; the number of vertices in each class is not fixed; vertices of the same class are regarded as representing the same region of the face, and vertices of different classes as representing different regions; the clustering result is R_ij, where R_ij is the j-th region of the i-th expression; clustering yields the labels of all vertices in each class, and these labels together with A_i can be used to build the r regions R_ij, where A_i is the motion model of the i-th expression, and m, r, i, and j are all integers greater than or equal to 1.