CN109766866A - Real-time facial feature point detection method and detection system based on three-dimensional reconstruction - Google Patents

Real-time facial feature point detection method and detection system based on three-dimensional reconstruction

Info

Publication number
CN109766866A
CN109766866A CN201910057766.6A
Authority
CN
China
Prior art keywords
face
point
dimensional
network
plainfpn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910057766.6A
Other languages
Chinese (zh)
Other versions
CN109766866B (en)
Inventor
汪令野
沈江洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Meidai Technology Co Ltd
Original Assignee
Hangzhou Meidai Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Meidai Technology Co Ltd filed Critical Hangzhou Meidai Technology Co Ltd
Priority to CN201910057766.6A priority Critical patent/CN109766866B/en
Publication of CN109766866A publication Critical patent/CN109766866A/en
Application granted granted Critical
Publication of CN109766866B publication Critical patent/CN109766866B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a real-time facial feature point detection method and detection system based on three-dimensional reconstruction. The detection method comprises: (1) acquiring facial image frames and reconstructing a geometric face model in real time with a geometric reconstruction algorithm; (2) preprocessing the geometric face model and extracting face point cloud data; (3) feeding the face point cloud data into FacePointNet, which consists of a coarse PlainFPN network and a block-wise cascade network CascadeFPN, to detect the three-dimensional facial feature points. The coarse PlainFPN network detects the cheek contour feature points and the coarse-precision interior feature points of the face; CascadeFPN partitions the face point cloud data into several blocks according to the coarse interior feature points and detects the fine-precision interior feature points of each block separately. The method of the invention improves both the accuracy and the real-time performance of three-dimensional facial feature point detection.

Description

Real-time facial feature point detection method and detection system based on three-dimensional reconstruction
Technical field
The invention belongs to the technical field of computer vision, and more particularly relates to a real-time facial feature point detection method and detection system based on three-dimensional reconstruction.
Background technique
With the spread of RGB-D cameras on mobile terminals, consumer-grade RGB-D cameras mounted on mobile devices reach end users directly and are bound to push three-dimensional vision forward. Three-dimensional modeling of the human face underpins fields such as mobile security, mobile AR and mobile VR, opening up rich possibilities for mobile applications. Among the three-dimensional information of a face, the most expressive and practically valuable elements are the geometric mesh, the appearance texture and the three-dimensional feature points. However, the computing power and storage capacity of mobile devices are limited, so a friendly interactive system and efficient algorithms must be designed to acquire three-dimensional face information on a mobile device.
In research on three-dimensional geometric mesh reconstruction, a significant portion of algorithms balance performance and quality; benefiting from the steady increase in hardware performance, they can reconstruct from RGB-D frames in real time with good results. Appearance texture also carries rich information about a three-dimensional object, but because of camera pose errors, geometric mesh errors, complex illumination and other factors, recovering a faithful appearance texture is very challenging. Current mainstream methods mostly target general scenes, building complex global nonlinear optimization models that are solved iteratively; they are very time-consuming and, since they do not take prior knowledge of the face into account, do not adapt well to faces.
The three-dimensional feature points of a face likewise encode its key information and are a prerequisite for higher-level applications such as face recognition, facial animation and 3D printing. Existing research on facial feature point localization concentrates mainly on two-dimensional images. Some scholars have of course proposed effective algorithms for feature point localization on three-dimensional face models, but part of the existing models are overly complex and computationally inefficient, while the accuracy of the remaining models does not meet requirements. Three-dimensional feature point localization differs from its two-dimensional counterpart in the format of the input data. Two-dimensional input has benefited greatly from the development of image convolutional neural networks and made large strides. Many scholars have therefore begun to explore deep learning methods suited to three-dimensional data structures, a popular research field known as 3D deep learning. As 3D deep learning in general domains continues to advance, three-dimensional facial feature point detection, as a readily extensible subproblem, will be solved better and better.
Summary of the invention
The object of the present invention is to provide a real-time facial feature point detection method and detection system based on three-dimensional reconstruction that improve the accuracy and real-time performance of three-dimensional facial feature point detection.
To achieve the above object, the present invention provides the following technical solutions:
A real-time facial feature point detection method based on three-dimensional reconstruction, comprising the following steps:
(1) acquiring facial image frames and reconstructing a geometric face model in real time with a geometric reconstruction algorithm;
(2) preprocessing the geometric face model and extracting face point cloud data;
(3) feeding the face point cloud data into FacePointNet, composed of a coarse PlainFPN network and a block-wise cascade network CascadeFPN, to detect the three-dimensional facial feature points.
The coarse PlainFPN network detects the cheek contour feature points and the coarse-precision interior feature points of the face; CascadeFPN partitions the face point cloud data into several blocks according to the coarse interior feature points and detects the fine-precision interior feature points of each block separately.
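To make the coarse-to-fine data flow concrete, the following sketch mimics step (3) with NumPy stand-ins. The function names, the landmark count and the landmark-to-region index ranges are illustrative assumptions, not values from the patent; the real PlainFPN and fine networks are learned X-Conv models, which this sketch replaces with simple placeholders.

```python
import numpy as np

# Assumed landmark-to-region assignment (indices are illustrative).
REGIONS = {"brows": range(0, 10), "eyes": range(10, 22),
           "nose": range(22, 31), "mouth": range(31, 51)}

def coarse_detect(points, n_landmarks=51):
    """Stand-in for the coarse PlainFPN regressor: maps an (n, 3) cloud to
    (n_landmarks, 3) coordinates. Here we simply sample cloud points so
    the pipeline is runnable without a trained model."""
    idx = np.linspace(0, len(points) - 1, n_landmarks).astype(int)
    return points[idx]

def partition_cloud(points, landmarks, regions):
    """Assign every point to the region of its nearest coarse landmark,
    mirroring CascadeFPN's block segmentation step."""
    names = list(regions)
    lm_region = np.empty(len(landmarks), dtype=int)
    for r, name in enumerate(names):
        lm_region[list(regions[name])] = r
    d = np.linalg.norm(points[:, None, :] - landmarks[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    return {name: points[lm_region[nearest] == r] for r, name in enumerate(names)}

def detect_features(points):
    coarse = coarse_detect(points)
    blocks = partition_cloud(points, coarse, REGIONS)
    # Each block would then be refined by its region-specific fine
    # PlainFPN; this sketch stops at the coarse result.
    return coarse, blocks
```

The segmentation rule (nearest coarse landmark) is one plausible realisation; the patent only states that the coarse interior feature points drive the partition.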
The present invention proposes FacePointNet for three-dimensional facial feature point detection. Following the block-wise cascade idea, CascadeFPN detects the interior feature points of the face from coarse to fine. Compared with current mainstream three-dimensional facial feature point detection methods, it is more accurate and offers better real-time performance.
Preferably, in step (1), the acquired facial image frames are consecutive frames with the face pose aligned. When acquiring facial image frames, a two-dimensional face alignment algorithm is run on the input image to detect the face pose and two-dimensional feature points in real time. Whether the face is aligned is judged from the current pose; when the face is not initially aligned, the system prompts the user to adjust the pose.
In step (1), the geometric reconstruction algorithm uses the KinectFusion algorithm. The reconstruction process is:
(1-1) performing a space-region partition on the facial image: detecting the two-dimensional feature points of the image and, combined with the depth frame, delimiting the bounding-box cube of the face;
(1-2) running the KinectFusion algorithm inside the resulting bounding-box cube to reconstruct the three-dimensional face geometry in real time.
In step (2), the preprocessing comprises alignment and cropping, as follows:
The geometric face model to be detected is first coarsely aligned to the coordinate space of a standard face model; then, taking the nose tip as the centre, the face is cropped with a sphere of radius 80~100 mm, yielding the face point cloud data.
The coarse alignment prevents rotation of the input point cloud model from affecting the detection result, and the cropping extracts the valid face region so that irrelevant geometry does not disturb detection. These preprocessing operations improve the robustness of the whole detection method.
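As a minimal illustration of the cropping step, the sketch below keeps only the points inside a sphere centred on the nose tip. The function name is ours, and the 90 mm default is one value chosen from the 80~100 mm range stated above.

```python
import numpy as np

def crop_face(points, nose_tip, radius_mm=90.0):
    """Keep the points within radius_mm of the nose tip; geometry outside
    the sphere (neck, shoulders, background) is discarded."""
    dist = np.linalg.norm(points - nose_tip, axis=1)
    return points[dist <= radius_mm]
```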
In step (3), the PlainFPN network is divided into two modules: one detects the cheek contour feature points of the face, the other the coarse-precision interior feature points.
Each module comprises five x-conv layers with average sampling point counts of 1024, 384, 128, 128 and 128 respectively. The x-conv layers are followed by an average pooling layer that extracts the average feature of the sampled points; this feature is then fed into a multi-layer perceptron made of three fully connected layers, which finally outputs the three-dimensional coordinates of the facial feature points.
Each x-conv layer consists of one multi-layer perceptron and two ordinary convolutional layers: the input point cloud data first passes through the multi-layer perceptron, which lifts its dimension; the result is fed into one convolutional layer to obtain a transformation matrix; the transformation matrix is then multiplied with the lifted point cloud data, fed into the other convolutional layer, and the result is output.
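The x-conv step just described (lift the points with an MLP, predict a transformation matrix, apply it to the lifted features, then project) can be sketched with random, untrained weights as follows. All layer widths and the neighbourhood size are illustrative assumptions; a real implementation, e.g. PointCNN's X-Conv, learns these weights.

```python
import numpy as np

rng = np.random.default_rng(0)  # stands in for trained parameters

def mlp_lift(points, d_out=16):
    """Lift each 3-D point into a higher-dimensional feature
    (stand-in for the multi-layer perceptron)."""
    w = rng.normal(size=(3, d_out))
    return np.tanh(points @ w)

def x_conv(neighborhood):
    """Simplified x-conv step: lift the K neighbour points, predict a
    K x K transformation matrix from them, apply it to the lifted
    features, then reduce with a final projection."""
    k = len(neighborhood)
    f = mlp_lift(neighborhood)                  # (K, d) lifted features
    w_t = rng.normal(size=(f.size, k * k))
    x = (f.reshape(-1) @ w_t).reshape(k, k)     # learned "X" transform
    mixed = x @ f                               # mix the lifted features
    w_out = rng.normal(size=(f.shape[1], 8))
    return (mixed @ w_out).mean(axis=0)         # one feature per point
```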
The block-wise cascade network CascadeFPN comprises a point cloud segmentation module for partitioning by the facial interior feature points and several fine PlainFPN networks, one per block.
CascadeFPN takes the coarse interior feature points as input, uses them to segment the face point cloud into four blocks (nose, eyes, eyebrows and mouth), and feeds the point cloud data of each block into its corresponding fine PlainFPN network. Each fine PlainFPN network comprises four x-conv layers whose sampling counts differ per layer; otherwise its structure is identical to that of the coarse PlainFPN. Finally, each fine PlainFPN network outputs the three-dimensional coordinates of its block's feature points.
Four sampling strategies are designed according to the size and point cloud scale of each face region. The per-layer average sampling counts of the fine PlainFPN networks are: nose 738, 384, 128, 128; eyes and eyebrows identically 400, 256, 128, 128; mouth 512, 256, 128, 128.
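The per-block schedules above can be captured in a small table. The sketch below applies such a schedule by random subsampling purely for illustration; inside the real x-conv layers the patent uses farthest-point sampling, and the helper name is ours.

```python
import numpy as np

# Per-layer average sampling counts of the fine PlainFPN networks (from the text).
SAMPLING = {
    "nose":  (738, 384, 128, 128),
    "eyes":  (400, 256, 128, 128),
    "brows": (400, 256, 128, 128),
    "mouth": (512, 256, 128, 128),
}

def downsample_schedule(points, schedule, seed=0):
    """Apply the per-layer subsampling schedule and return each layer's
    successively smaller point cloud."""
    rng = np.random.default_rng(seed)
    layers = []
    for n in schedule:
        idx = rng.choice(len(points), size=min(n, len(points)), replace=False)
        points = points[idx]
        layers.append(points)
    return layers
```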
In step (3), the three-dimensional facial feature points are obtained as follows:
(3-1) the point cloud data is fed into the coarse PlainFPN network to obtain the coarsely aligned feature points, comprising the cheek contour feature points and the interior feature points;
(3-2) the coarsely aligned interior feature points are fed into CascadeFPN, which crops the point cloud data into four blocks: eyebrows, eyes, nose and mouth;
(3-3) the point cloud data of each block, together with its corresponding interior feature points, is fed into that block's fine PlainFPN network inside CascadeFPN to obtain the finely aligned interior feature points.
The coarse PlainFPN network designed by the invention already outputs fairly accurate three-dimensional feature points, including the cheek contour feature points and the interior feature points. Because the local features of the cheek contour points are weak and noisy, their precision is difficult to improve further. For the interior feature points, however, the block-wise cascade network CascadeFPN of the invention partitions the input point cloud so that local geometric detail is extracted better, and refines the feature point precision from coarse to fine, ultimately yielding a more accurate result.
The invention also discloses a real-time facial feature point detection system based on three-dimensional reconstruction, comprising a mobile terminal with a camera, the mobile terminal further comprising:
a geometric reconstruction module, which builds a three-dimensional geometric face model in real time from the facial image frames acquired by the camera;
a preprocessing module, which aligns and crops the three-dimensional geometric face model and outputs the face point cloud data;
a three-dimensional feature point detection module, which detects the three-dimensional feature points in the face point cloud data and outputs the three-dimensional facial feature points.
The three-dimensional feature point detection module comprises FacePointNet, composed of the coarse PlainFPN network and the block-wise cascade network CascadeFPN. The coarse PlainFPN network detects the cheek contour feature points and the coarse-precision interior feature points of the face; CascadeFPN comprises a point cloud segmentation module for partitioning by the interior feature points and several fine PlainFPN networks, one per block.
The invention proposes FacePointNet for three-dimensional facial feature point detection. FacePointNet consists of the coarse PlainFPN and the block-wise cascade network CascadeFPN, which detect the cheek contour feature points and the interior feature points of the face respectively. Following the block-wise cascade idea, CascadeFPN uses a purpose-designed network configuration to detect the interior feature points from coarse to fine. Compared with current mainstream three-dimensional facial feature point detection methods, it achieves higher accuracy and better real-time performance.
Detailed description of the invention
Fig. 1 is a flow diagram of the real-time facial feature point detection method based on three-dimensional reconstruction of the invention;
Fig. 2 is a flow diagram of reconstructing the geometric face model in real time with the geometric reconstruction algorithm;
Fig. 3 illustrates the face space-region partition, where (a) shows the two-dimensional feature points, (b) the three-dimensional feature points, and (c) the bounding-box cube;
Fig. 4 shows the structure of the coarse PlainFPN network in FacePointNet of the invention;
Fig. 5 shows the structure of the CascadeFPN network in FacePointNet of the invention;
Fig. 6 shows the structures of the four block-specific fine PlainFPN networks in CascadeFPN, where (a) is the fine PlainFPN network of the nose block, (b) that of the eyes block, (c) that of the eyebrows block, and (d) that of the mouth block;
Fig. 7 compares the coarse and fine alignment of the facial interior feature points, where (a) is the coarse alignment result of the coarse PlainFPN and (b) the fine alignment result of CascadeFPN;
Fig. 8 shows the labelling of the feature points on the face model.
Specific embodiment
The invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be noted that the embodiments described below are intended to facilitate understanding of the invention and impose no limitation on it.
Fig. 1 is a flow diagram of the real-time facial feature point detection method based on three-dimensional reconstruction of the invention, comprising:
Step (1): acquiring facial image frames and reconstructing the geometric face model in real time with the geometric reconstruction algorithm.
As shown in Fig. 2, before reconstructing the geometric face model a two-dimensional face alignment algorithm must first be run on the image to detect the face pose and two-dimensional feature points in real time. Whether the face is aligned is judged from the current pose; when it is not initially aligned, the system prompts the user to adjust the pose. Once the face pose is aligned, a space-region partition is performed: using the detected two-dimensional feature points together with the depth frame, the bounding-box cube of the face is delimited to initialize the real-time geometric reconstruction. The real-time geometric reconstruction is then executed and the geometric face model data is output. The detailed process is as follows:
(1-1) Face initial alignment
This embodiment uses the existing Displaced Dynamic Expression (DDE) two-dimensional face alignment algorithm for the initial alignment of the face. Given a continuous RGB video, the DDE algorithm computes in real time the position, size, pose and feature points of the face in the image, as shown in Fig. 3(a). From the resulting face position, size and pose, the system prompts the user in real time to adjust the pose accordingly, completing the face initial alignment. On the mobile terminal, the DDE algorithm is provided as a commercial SDK by the company FaceUnity and is called directly in the implementation.
(1-2) Space-region partition
After the face initial alignment, the first incoming RGB-D frame is taken as the reference frame and used to delimit the bounding-box region to be reconstructed, thereby initializing the three-dimensional reconstruction. Denote the RGB frame by I_0 and the depth frame by D_0. The DDE algorithm yields the set of two-dimensional facial feature points of this frame, P_2d ⊂ R², each point p_2d of which is an image coordinate on I_0, as shown in Fig. 3(a). Since P_2d gives only the coordinates of the feature points on the two-dimensional image I_0, the depth frame D_0 aligned with I_0 and the camera intrinsic matrix are used to compute the corresponding three-dimensional feature point set P_3d ⊂ R³, which determines the face space region, as shown in Fig. 3(b). Given a two-dimensional feature point p_2d ∈ P_2d, the corresponding p_3d ∈ P_3d is computed as
p_3d = D_0(p_2d) · K^(-1) [p_2d^T, 1]^T    (1)
where K is the 3 × 3 camera intrinsic matrix.
In most cases, however, the depth frame of a consumer-grade camera is missing depth values at edge contours; in Fig. 3(b), the grey region of the depth frame has missing depth. The whole of P_2d therefore cannot be used, and only the feature points in the interior of the face can determine the face space region. First, the convention is fixed: the x-axis points horizontally to the right, the y-axis vertically up, and the z-axis outward from the face. Via formula (1), the three-dimensional feature point set P_3d of the interior of the face is obtained. Next, the bounding cuboid of P_3d, (x_min, x_max, y_min, y_max, z_min, z_max), is computed, as shown in Fig. 3(b). Finally, using a prior on facial geometric proportions, the bounding-box cube (c_x, c_y, c_z, c_w) containing the face is delimited, as shown in Fig. 3(c), where (c_x, c_y, c_z) is the centre of the cube and c_w its side length, computed as follows:
c_w = 2.2 (y_max − y_min)    (2)
c_x = 0.5 x_min + 0.5 x_max    (3)
c_y = 0.2 y_min + 0.8 y_max    (4)
c_z = y_max − 0.5 c_w + Δ    (5)
Here Δ is a small fixed offset that prevents part of the nose from being cut off due to error; our implementation takes Δ = 15 mm. Performing the space-region partition with the above computation adaptively adjusts the reconstruction precision of the geometric model for users with different face shapes.
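Formulas (1) through (5) translate directly into code; the sketch below implements them exactly as written (including the y_max term in formula (5)), with Δ = 15 mm as stated. The array layout, intrinsics and function names are our own assumptions.

```python
import numpy as np

def backproject(p2d, depth, K):
    """Formula (1): lift pixel (u, v) to 3-D using its depth value and
    the 3x3 camera intrinsic matrix K."""
    u, v = p2d
    d = depth[int(v), int(u)]
    return d * (np.linalg.inv(K) @ np.array([u, v, 1.0]))

def face_cube(p3d, delta=15.0):
    """Formulas (2)-(5): the face bounding cube (cx, cy, cz, cw) from the
    interior 3-D landmarks p3d, an (m, 3) array in mm."""
    xmin, ymin = p3d[:, 0].min(), p3d[:, 1].min()
    xmax, ymax = p3d[:, 0].max(), p3d[:, 1].max()
    cw = 2.2 * (ymax - ymin)
    cx = 0.5 * xmin + 0.5 * xmax
    cy = 0.2 * ymin + 0.8 * ymax
    cz = ymax - 0.5 * cw + delta        # as given in formula (5)
    return cx, cy, cz, cw
```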
(1-3) Real-time geometric reconstruction
After the bounding-box cube region (c_x, c_y, c_z, c_w) containing the face has been delimited via formulas (2)~(5), real-time geometric reconstruction is carried out with the KinectFusion algorithm. The bounding-box cube is first divided into a uniform voxel grid to initialize the TSDF voxel grid of KinectFusion; the TSDF values are then continually updated from the incoming depth frames, yielding a continually updated three-dimensional geometric face model.
Step (2): preprocessing the geometric face model and extracting the face point cloud data.
First, the face model to be detected must be coarsely aligned to the coordinate space of a standard face model. The two-dimensional interior feature points of the face can first be roughly estimated with the two-dimensional face alignment algorithm and mapped to rough three-dimensional feature points. Taking the first frame of the geometric reconstruction as the reference frame gives the two-dimensional facial feature point coordinates of that frame; the depth map is rendered from the reconstructed face model and the camera pose of the first frame, after which the transform used in the space-region partition of step (1-2) yields the rough interior three-dimensional feature points. Then a standard face model with correspondingly labelled feature points is taken, and the three-dimensional spatial transform between the two feature point sets is computed, coarsely aligning the face model to be detected to the coordinate space of the standard face model. Finally, the face is cropped with a sphere of radius 80~100 mm centred on the nose tip. The cropped data serves as the input of the FacePointNet network proposed by the invention, which detects the three-dimensional facial feature points from coarse to fine.
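The coarse alignment, i.e. computing a three-dimensional transform between the rough landmarks of the input model and the labelled landmarks of the standard model, is commonly realised as an orthogonal Procrustes (Kabsch) fit. The patent does not specify the solver, so the sketch below is one standard choice under that assumption.

```python
import numpy as np

def rigid_align(src, dst):
    """Estimate the rigid transform (R, t) mapping the source landmark
    set onto the destination set via the Kabsch algorithm."""
    sc, dc = src.mean(axis=0), dst.mean(axis=0)
    h = (src - sc).T @ (dst - dc)           # cross-covariance
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))  # guard against reflections
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = dc - r @ sc
    return r, t
```

Applying the recovered (R, t) to the full point cloud brings the input model into the standard model's coordinate space before cropping.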
Step (3): feeding the face point cloud data into the FacePointNet network for three-dimensional facial feature point detection, obtaining the three-dimensional facial feature points.
An intuitive way to obtain three-dimensional feature points is to map two-dimensional facial feature points into three dimensions. The drawbacks of such image-based methods are obvious, however: they cannot fully exploit the three-dimensional geometric information and are sensitive to illumination. To make full use of the three-dimensional geometry of the face, we therefore take the point cloud data of the face model as input and, building on PointCNN, design FacePointNet to detect the three-dimensional feature points from coarse to fine. FacePointNet consists of two networks, the coarse PlainFPN and the block-wise cascade network CascadeFPN, which detect the cheek contour feature points and the interior feature points of the face respectively.
First, the invention designs a coarse PlainFPN network using X-Conv convolutions, whose structure is shown in Fig. 4. Its input is the face point cloud data V ⊂ R³ obtained after preprocessing; if the input contains n vertices, the network input is an n × 3 vector. If the feature point set to be detected is V_m ⊂ R³ with m feature points, the network output is the 3m-dimensional vector Y = {y_1, y_2, …, y_3m}. With y_i denoting the predicted value and y_i* the ground-truth value, the network loss function penalizes the discrepancy between them.
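The exact loss formula does not survive in this text; a sum of squared errors over the 3m output coordinates is the standard choice for such coordinate regression and is sketched below as an assumption.

```python
import numpy as np

def landmark_loss(y_pred, y_true):
    """Assumed regression loss: sum over i of (y_i - y_i*)^2, where y is
    the flattened 3m-vector of landmark coordinates."""
    return float(np.sum((np.asarray(y_pred) - np.asarray(y_true)) ** 2))
```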
In fact, the coarse PlainFPN network alone can detect both the cheek contour feature points and the interior feature points of the face, and the interior feature points it detects are already better than the results of some mainstream methods. For the interior feature points, however, we want a more accurate result. Following the idea of cascaded regression, the output of the coarse PlainFPN network can serve as the coarse alignment points: the finely aligned feature points certainly lie in a neighbourhood of the coarse points and can be obtained by a search concentrated in that neighbourhood of the original input points. Moreover, partitioning the input point cloud into blocks extracts local geometric detail better. Based on these observations, the invention designs the CascadeFPN network, which refines feature point precision from coarse to fine. Because the local features of the cheek contour feature points are weak and noisy, experiments show their precision is hard to improve further; CascadeFPN therefore targets the detection precision of the interior feature points of the face.
As shown in Fig. 5, the CascadeFPN network designed by the invention lies inside the dashed box. The face point cloud data is first fed into the coarse PlainFPN network to obtain the coarsely aligned feature points. Using these points, the input point cloud data is segmented into four blocks (eyebrows, eyes, nose, mouth); the real purpose of this segmentation is to extract the local geometric detail of the face better. The point cloud of each block, combined with its corresponding coarse feature points, forms the input of the next stage. A fine PlainFPN network with its own parameters is designed for each of the four blocks to perform the fine alignment.
Fig. 6 shows the fine PlainFPN network designed for each block: (a) is the fine PlainFPN network of the nose block, (b) that of the eyes block, (c) that of the eyebrows block, and (d) that of the mouth block. Separate networks are used because the geometric characteristics of the blocks differ and different networks are needed to extract the corresponding features. The logical structure of the four networks in Fig. 6 is identical; only the per-layer parameters differ.
Fig. 7 compares the coarse alignment of PlainFPN with the fine alignment of CascadeFPN: (a) is the coarse alignment result of PlainFPN and (b) the fine alignment result of CascadeFPN. The comparison shows that CascadeFPN's fine alignment clearly improves the feature point result of every block. For regions whose surface geometric detail is not very pronounced, such as the eyebrows and eyes, the topological connections formed by the finely aligned feature points are more natural than the coarse ones. For regions with very pronounced surface geometric detail, such as the mouth and nose, the finely aligned feature points converge toward the edges of the geometric discontinuities, giving a more precise result; the improvement on the lower edges of the nose and mouth is especially apparent.
To verify the effectiveness of the proposed method, the FacePointNet of the invention is compared on the public three-dimensional face model dataset BU3DFE with two current mainstream methods, and the execution efficiency of the algorithm is analysed.
The BU3DFE dataset comprises 2500 scan models from 100 subjects. Each subject was captured with 1 neutral expression and 6 prescribed expressions, each prescribed expression at 4 intensity levels, i.e. 25 scan models per subject, each containing about 35000 vertices. The dataset also carries 85 manually labelled feature points that serve as ground truth, so the feature point detection algorithm of the invention can be tested on it and compared with existing methods. In the experiments we found that 2 of the 85 labelled feature points of BU3DFE are labelled incorrectly, so our experiments use only the remaining 83 correct feature points. The data of 80 subjects (2000 models) serves as the training set and that of the remaining 20 subjects (500 models) as the test set.
FacePointNet was tested on a PC with a quad-core Xeon 3.40 GHz CPU, 16 GB of memory and an Nvidia GTX 1080 GPU.
First, data augmentation is applied to the training set: the rotation angle follows a Gaussian distribution, the scale follows the Gaussian distribution N(1, 0.1²), and coordinate jitter is additionally used. In the PlainFPN network structures of Fig. 4 and Fig. 6, each X-Conv convolution layer resamples the input data of the previous layer; the sampling method is farthest point sampling, because the facial feature point detection task of the invention demands evenly distributed points. The model is then trained with the Adam algorithm; the training parameters are listed in Table 1.
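Farthest-point sampling and the described augmentation can be sketched as below. The rotation-angle standard deviation is an assumption (the value is garbled in this extraction), as is the jitter magnitude; the scale distribution N(1, 0.1²) is as stated in the text.

```python
import numpy as np

def farthest_point_sampling(points, k, seed=0):
    """Greedy farthest-point sampling: repeatedly pick the point farthest
    from the already-chosen set, keeping samples evenly spread."""
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(points)))]
    d = np.linalg.norm(points - points[chosen[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(d.argmax())
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(points - points[nxt], axis=1))
    return points[np.array(chosen)]

def augment(points, rng):
    """Rotation / scale / jitter augmentation in the spirit of the text."""
    th = rng.normal(0.0, np.pi / 18)    # assumed rotation std (~10 degrees)
    R = np.array([[np.cos(th), -np.sin(th), 0.0],
                  [np.sin(th),  np.cos(th), 0.0],
                  [0.0, 0.0, 1.0]])
    s = rng.normal(1.0, 0.1)            # scale ~ N(1, 0.1^2) as stated
    return s * (points @ R.T) + rng.normal(0.0, 0.002, points.shape)
```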
Table 1
The labelling of the feature points on the face model is shown in Fig. 8, and Table 2 below compares the existing methods with the FacePointNet method of the invention. Method 1 is the method proposed in the paper "Shape-based automatic detection of a large number of 3D facial landmarks"; method 2 is the method proposed in the paper "Fully automated and highly accurate dense correspondence for facial surfaces". In the table, Mean is the mean error and SD the standard deviation, both in mm; smaller values indicate higher detection precision.
Table 2
Comparison the results show that thick being aligned as a result, ratio method 1 of obtaining of the thick PlainFPN network that the present invention designs It is more acurrate, and the effect of mean of access 2.And CascadeFPN network proposed by the present invention as a result, essence alignment after three-dimensional Characteristic point mean error only has 1.77mm, and ratio method 2 and thick PlainFPN are improved very much, averagely improved respectively 38.54% and 50.56%.
Moreover, Method 2 also uses texture data when solving for the feature points, whereas FacePointNet of the invention uses only three-dimensional geometric data and still achieves better results. The experiments demonstrate that the block-cascade CascadeFPN designed by the invention significantly improves the fine alignment of feature points, and that the proposed FacePointNet is a state-of-the-art three-dimensional facial feature point detection algorithm.
Under the above experimental setup, a single feature point detection with FacePointNet takes about 3.61 seconds on the PC, giving fairly good real-time performance. Of this, the coarse PlainFPN of the coarse alignment stage takes about 1.28 seconds and the CascadeFPN of the fine alignment stage about 2.33 seconds, so the method is reasonably efficient.
The embodiments described above explain the technical solution and beneficial effects of the invention in detail. It should be understood that the above are only specific embodiments of the invention and are not intended to limit it; any modification, supplement or equivalent replacement made within the spirit of the invention shall fall within its protection scope.

Claims (9)

1. A real-time facial feature point detection method based on three-dimensional reconstruction, characterized by comprising:
(1) acquiring facial image frames and reconstructing a geometric face model in real time using a geometric reconstruction algorithm;
(2) preprocessing the geometric face model and extracting face point cloud data;
(3) inputting the face point cloud data into FacePointNet, which is composed of a coarse PlainFPN network and a block-cascade network CascadeFPN, to carry out three-dimensional facial feature point detection and obtain the three-dimensional facial feature points;
wherein the coarse PlainFPN network is used to detect the cheek contour feature points of the face and coarse-precision interior feature points, and the CascadeFPN divides the face point cloud data into several blocks according to the coarse-precision interior feature points and separately detects the fine-precision interior feature points of each block.
2. The real-time facial feature point detection method based on three-dimensional reconstruction according to claim 1, wherein in step (1) the acquired facial image frames are consecutive image frames with the face pose aligned.
3. The real-time facial feature point detection method based on three-dimensional reconstruction according to claim 1, wherein in step (1) the geometric reconstruction algorithm is the KinectFusion algorithm, and the reconstruction process is:
(1-1) performing a spatial segmentation operation on the facial image, detecting the two-dimensional feature points of the facial image and, in combination with the depth frame, delimiting the bounding box cube of the face;
(1-2) running the KinectFusion algorithm within the obtained bounding box cube to carry out real-time three-dimensional geometric reconstruction of the face.
4. The real-time facial feature point detection method based on three-dimensional reconstruction according to claim 1, wherein in step (2) the preprocessing is as follows:
the geometric face model to be detected is first coarsely aligned to the coordinate space of a standard face model; then the face is cropped by a sphere centered at the nose tip with a radius of 80~100 mm, and the face point cloud data is obtained after cropping.
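The sphere-cropping step of claim 4 amounts to a radius filter around the nose tip; a minimal sketch, where the 90 mm radius is an arbitrary choice within the claimed 80~100 mm range:

```python
import numpy as np

def crop_face(points, nose_tip, radius=90.0):
    """Keep only the points inside a sphere centred at the nose tip
    (radius in mm, chosen from the claimed 80-100 mm range)."""
    keep = np.linalg.norm(points - nose_tip, axis=1) <= radius
    return points[keep]

# toy cloud: points at 0, 50 and 120 mm from the nose tip
cloud = np.array([[0.0, 0.0, 0.0], [50.0, 0.0, 0.0], [120.0, 0.0, 0.0]])
nose = np.array([0.0, 0.0, 0.0])
kept = crop_face(cloud, nose)
print(kept)  # the 120 mm point is cropped away
```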
5. The real-time facial feature point detection method based on three-dimensional reconstruction according to claim 1, wherein in step (3) the coarse PlainFPN is divided into two modules, one for detecting the cheek contour feature points of the face and the other for detecting the coarse-precision interior feature points of the face;
each module contains 5 x-conv layers whose average sampling point counts are 1024, 384, 128, 128 and 128 respectively; the x-conv layers are followed by an average pooling layer that extracts the average feature of each sampled point, this feature is then input to a multi-layer perceptron composed of three fully connected layers, and the three-dimensional coordinates of the facial feature points are finally output.
6. The real-time facial feature point detection method based on three-dimensional reconstruction according to claim 1, wherein in step (3) the block-cascade network CascadeFPN comprises: a point cloud segmentation module for partitioning the interior facial feature points, and several fine PlainFPN networks corresponding one-to-one to the blocks obtained after segmentation;
each fine PlainFPN network contains 4 x-conv layers, whose sampling strategies are designed separately according to the size and point cloud scale of each block; the x-conv layers are followed in sequence by an average pooling layer and a multi-layer perceptron composed of three fully connected layers, and each fine PlainFPN network finally outputs the three-dimensional coordinates of the feature points of its corresponding block.
7. The real-time facial feature point detection method based on three-dimensional reconstruction according to claim 6, wherein the point cloud segmentation module divides the interior facial feature points into four blocks: nose, eyes, eyebrows and mouth; the sampling strategies of the fine PlainFPN networks corresponding to the blocks are as follows:
for the nose block, the average sampling point counts of the x-conv layers are 738, 384, 128 and 128 respectively; for the eye and eyebrow blocks, the average sampling point counts of the x-conv layers are identical, namely 400, 256, 128 and 128; for the mouth block, the average sampling point counts of the x-conv layers are 512, 256, 128 and 128.
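For reference, the per-block x-conv sampling schedules of claims 5 and 7 can be collected into one table; the dictionary keys below are illustrative names, not identifiers from the patent:

```python
# Points sampled per x-conv layer, as listed in claims 5 and 7.
SAMPLING = {
    "coarse":   [1024, 384, 128, 128, 128],  # 5-layer coarse PlainFPN modules
    "nose":     [738, 384, 128, 128],        # 4-layer fine PlainFPN networks
    "eyes":     [400, 256, 128, 128],
    "eyebrows": [400, 256, 128, 128],        # same schedule as the eye block
    "mouth":    [512, 256, 128, 128],
}
for block, schedule in SAMPLING.items():
    print(block, "->", schedule)
```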
8. The real-time facial feature point detection method based on three-dimensional reconstruction according to claim 1, wherein in step (3) the specific process of obtaining the three-dimensional facial feature points is:
(3-1) the point cloud data is input to the coarse PlainFPN network to obtain the coarsely aligned feature points, including the cheek contour feature points and the interior feature points;
(3-2) the coarsely aligned interior feature points are input to CascadeFPN, which performs a cropping operation on the point cloud data to obtain the four blocks of eyebrows, eyes, nose and mouth;
(3-3) the point cloud data of each block, together with the corresponding interior feature points, is input to the fine PlainFPN network corresponding to that block in the CascadeFPN network, and the finely aligned interior feature points are obtained.
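The coarse-to-fine flow of claim 8 can be sketched as the following control structure; `coarse_net`, `segment` and `fine_nets` are hypothetical stand-ins for the trained networks and the point cloud segmentation module, not names from the patent:

```python
def detect_feature_points(cloud, coarse_net, segment, fine_nets):
    """Coarse-to-fine detection following steps (3-1) to (3-3)."""
    # (3-1) coarse alignment: cheek contour + coarse interior points
    contour_pts, interior_pts = coarse_net(cloud)
    # (3-2) crop the cloud into eyebrow/eye/nose/mouth blocks
    blocks = segment(cloud, interior_pts)
    # (3-3) refine each block with its dedicated fine PlainFPN
    refined = {name: fine_nets[name](block_cloud, block_pts)
               for name, (block_cloud, block_pts) in blocks.items()}
    return contour_pts, refined

# stub demonstration of the control flow only (no real networks)
coarse_net = lambda c: ("contour", "interior")
segment = lambda c, pts: {"nose": (c, pts)}
fine_nets = {"nose": lambda bc, bp: "refined-nose"}
contour, refined = detect_feature_points("cloud", coarse_net, segment, fine_nets)
print(contour, refined["nose"])  # contour refined-nose
```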
9. A real-time facial feature point detection system based on three-dimensional reconstruction, comprising a mobile terminal with a camera, characterized in that the mobile terminal further comprises:
a geometric reconstruction module, which constructs a three-dimensional geometric face model in real time from the facial image frames acquired by the camera;
a preprocessing module, which aligns and crops the three-dimensional geometric face model and outputs the face point cloud data;
a three-dimensional feature point detection module, which performs three-dimensional feature point detection on the face point cloud data to obtain the three-dimensional facial feature points;
wherein the three-dimensional feature point detection module comprises FacePointNet, composed of a coarse PlainFPN network and a block-cascade network CascadeFPN; the coarse PlainFPN network is used to detect the cheek contour feature points of the face and the coarse-precision interior feature points; and the CascadeFPN comprises: a point cloud segmentation module for partitioning the interior facial feature points, and several fine PlainFPN networks corresponding one-to-one to the blocks obtained after segmentation.
CN201910057766.6A 2019-01-22 2019-01-22 Face characteristic point real-time detection method and detection system based on three-dimensional reconstruction Active CN109766866B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910057766.6A CN109766866B (en) 2019-01-22 2019-01-22 Face characteristic point real-time detection method and detection system based on three-dimensional reconstruction

Publications (2)

Publication Number Publication Date
CN109766866A true CN109766866A (en) 2019-05-17
CN109766866B CN109766866B (en) 2020-09-18

Family

ID=66455081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910057766.6A Active CN109766866B (en) 2019-01-22 2019-01-22 Face characteristic point real-time detection method and detection system based on three-dimensional reconstruction

Country Status (1)

Country Link
CN (1) CN109766866B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110533105A (en) * 2019-08-30 2019-12-03 北京市商汤科技开发有限公司 A kind of object detection method and device, electronic equipment and storage medium
CN110633640A (en) * 2019-08-13 2019-12-31 杭州电子科技大学 Method for identifying complex scene by optimizing PointNet
CN111008927A (en) * 2019-08-07 2020-04-14 深圳华侨城文化旅游科技集团有限公司 Face replacement method, storage medium and terminal equipment
CN112069923A (en) * 2020-08-18 2020-12-11 东莞正扬电子机械有限公司 3D face point cloud reconstruction method and system
CN112861579A (en) * 2019-11-27 2021-05-28 四川大学 Automatic detection method for three-dimensional facial markers

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103456010A (en) * 2013-09-02 2013-12-18 电子科技大学 Human face cartoon generation method based on feature point localization
CN103577815A (en) * 2013-11-29 2014-02-12 中国科学院计算技术研究所 Face alignment method and system
CN104978550A (en) * 2014-04-08 2015-10-14 上海骏聿数码科技有限公司 Face recognition method and system based on large-scale face database
US20160148411A1 (en) * 2014-08-25 2016-05-26 Right Foot Llc Method of making a personalized animatable mesh
US20160253807A1 (en) * 2015-02-26 2016-09-01 Mitsubishi Electric Research Laboratories, Inc. Method and System for Determining 3D Object Poses and Landmark Points using Surface Patches
CN106951840A (en) * 2017-03-09 2017-07-14 北京工业大学 A kind of facial feature points detection method
CN108564619A (en) * 2018-04-25 2018-09-21 厦门大学 A kind of sense of reality three-dimensional facial reconstruction method based on two photos

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111008927A (en) * 2019-08-07 2020-04-14 深圳华侨城文化旅游科技集团有限公司 Face replacement method, storage medium and terminal equipment
CN111008927B (en) * 2019-08-07 2023-10-31 深圳华侨城文化旅游科技集团有限公司 Face replacement method, storage medium and terminal equipment
CN110633640A (en) * 2019-08-13 2019-12-31 杭州电子科技大学 Method for identifying complex scene by optimizing PointNet
CN110533105A (en) * 2019-08-30 2019-12-03 北京市商汤科技开发有限公司 A kind of object detection method and device, electronic equipment and storage medium
CN110533105B (en) * 2019-08-30 2022-04-05 北京市商汤科技开发有限公司 Target detection method and device, electronic equipment and storage medium
CN112861579A (en) * 2019-11-27 2021-05-28 四川大学 Automatic detection method for three-dimensional facial markers
CN112861579B (en) * 2019-11-27 2022-10-18 四川大学 Automatic detection method for three-dimensional facial markers
CN112069923A (en) * 2020-08-18 2020-12-11 东莞正扬电子机械有限公司 3D face point cloud reconstruction method and system

Also Published As

Publication number Publication date
CN109766866B (en) 2020-09-18

Similar Documents

Publication Publication Date Title
CN109766866A (en) A kind of human face characteristic point real-time detection method and detection system based on three-dimensional reconstruction
CN110363858A (en) A kind of three-dimensional facial reconstruction method and system
Campbell et al. A survey of free-form object representation and recognition techniques
CN102592136B (en) Three-dimensional human face recognition method based on intermediate frequency information in geometry image
Lian et al. Rectilinearity of 3D meshes
CN107767453B (en) Building LIDAR point cloud reconstruction optimization method based on rule constraint
US11645815B2 (en) Method, device, and storage medium for segmenting three-dimensional object
US20050128197A1 (en) Probable reconstruction of surfaces in occluded regions by computed symmetry
CN112950775A (en) Three-dimensional face model reconstruction method and system based on self-supervision learning
CN108537887A (en) Sketch based on 3D printing and model library 3-D view matching process
Tabib et al. Learning-based hole detection in 3D point cloud towards hole filling
Miao et al. Perceptual-saliency extremum lines for 3D shape illustration
Yu et al. Saliency computation and simplification of point cloud data
Cheng et al. Tree skeleton extraction from a single range image
CN114187404A (en) Three-dimensional reconstruction method and system for high resolution of offshore area
CN114494576A (en) Rapid high-precision multi-view face three-dimensional reconstruction method based on implicit function
CN113706399A (en) Face image beautifying method and device, electronic equipment and storage medium
Zeng et al. 3D plants reconstruction based on point cloud
Zhou et al. Progress and review of 3D image feature reconstruction
Xun et al. A Statistical Two-step Method for 3D Face Reconstruction from a Single Image
Chi Design and implementation of image relief based on computer 3D modeling
Zhao et al. Slicing-guided Skeleton Extraction Method for 3D Point Clouds of Human Body
Prantl Shape characteristics of geometrical objects and their use in computer graphics
Wang et al. Digital Human Body
Rosato Applying conformal mapping to the vertex correspondence problem for three-dimensional face models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant