CN110276289A

CN110276289A - Generate the method and human face characteristic point method for tracing of Matching Model

Info

Publication number: CN110276289A
Application number: CN201910523891.1A
Authority: CN
Inventors: 王喆; 许清泉; 张伟; 洪炜冬; 曾志勇
Original assignee: Xiamen Meitu Technology Co Ltd
Current assignee: Xiamen Meitu Technology Co Ltd
Priority date: 2019-06-17
Filing date: 2019-06-17
Publication date: 2019-09-24
Anticipated expiration: 2039-06-17
Also published as: CN110276289B

Abstract

The invention discloses the method and human face characteristic point method for tracing that generate Matching Model, Matching Model is suitable for carrying out Feature Points Matching to the facial image for having marked characteristic point.Wherein, the method for generating Matching Model includes: to generate the image block centered on characteristic point based on the facial image for having marked characteristic point；Manual tag figure is generated based on described image block；Manual tag figure is inputted in the convolutional neural networks of pre-training, to export prediction label figure；And based on the penalty values between manual tag figure and prediction label figure, the training convolutional neural networks, using the network after training as the Matching Model generated.The program can be improved the accuracy of Feature Points Matching, to improve the stability and accuracy of face characteristic point tracking.

Description

Generate the method and human face characteristic point method for tracing of Matching Model

Technical field

The present invention relates to depth learning technology fields, the especially method of generation Matching Model and face characteristic point tracking side Method.

Background technique

Facial modeling is also known as face alignment, and target, which is oriented, can describe face feature in face picture Point, such as canthus, nose, the corners of the mouth, chin.Currently, facial modeling has had the method for many maturations in static images, Such as supervise gradient descent method, local binarization method.But the tracking for human face characteristic point in dynamic picture such as video flowing Positioning, due to not accounting for the timing of before and after frames in video flowing, it will usually serious characteristic point shake occur.

With the real-time filter of U.S. face camera, makeups function and augmented reality application, the real-time tracing of human face characteristic point Technology is particularly important.Face tracking algorithm or accurate but unstable at present, unstable rule leads to the shake of Real-time Special Effect；? Stablize but inaccurate, position that is inaccurate then leading to special efficacy is deviateed.Limited partial model is a kind of facial modeling algorithm, It is made the characteristic point on each average face scan for matching on its neighborhood position and has been come by the position of initialization average face It is detected at face point.Wherein, local detectors scanned picture is used first, generates a response for everyone face characteristic point Figure is then based on response diagram and optimizes a Global Face shape, to obtain the primary update of human face characteristic point.Limited office Portion's model is made of local expert and global shape model, wherein local expert instructs usually using feature and support vector machines Practice.Since feature and learning method are weaker, part unconspicuous for feature, such as the key point on lip and facial contour, Generally occur within trace error.

In consideration of it, the method for needing a kind of pair of characteristic point to position in real time, can be not only stable but also accurately carry out characteristic point and chase after Track.

Summary of the invention

For this purpose, the present invention provides generate Matching Model method and human face characteristic point method for tracing, with try hard to solve or Person at least alleviates at least one existing problem above.

According to an aspect of the invention, there is provided it is a kind of generate Matching Model method, wherein Matching Model be suitable for pair The facial image for having marked characteristic point carries out Feature Points Matching.This method can execute in calculating equipment, be primarily based on and marked The facial image of characteristic point is infused, the image block centered on characteristic point is generated.It is then based on image block, generates manual tag figure. Then, manual tag figure is inputted in the convolutional neural networks of pre-training, to export prediction label figure.Finally, based on artificial The penalty values of loss function between label figure and prediction label figure, training convolutional neural networks, using the network after training as The Matching Model of generation

Optionally, in the above-mentioned methods, manual tag figure includes positive sample region, negative sample region and inactive area.

Optionally, in the above-mentioned methods, convolutional neural networks may include convolutional neural networks include process of convolution layer and Warp lamination, wherein process of convolution layer can carry out convolution, activation, pondization processing to the image block of input, special with output prediction Levy the probability of point.Warp lamination can classify to the probability of predicted characteristics point, to export prediction label figure.

It optionally, in the above-mentioned methods, can be based on the power of area pixel point each in manual tag figure and prediction label figure Weight, calculates the penalty values of loss function.Penalty values based on loss function adjust convolutional neural networks using back-propagation algorithm Parameter, until penalty values be less than predetermined threshold.

Optionally, in the above-mentioned methods, loss function is binary cross entropy loss function, calculates loss based on following formula The penalty values of function:

Wherein, i is the index of pixel, predict_iIndicate prediction label figure, lable_iIndicate the picture in manual tag figure Vegetarian refreshments, w_iIndicate the weight of pixel.

Optionally, in the above-mentioned methods, the weight of positive sample area pixel point and be 1, the power of negative sample area pixel point Weight and be 1, the weight of inactive area is 0.

According to another aspect of the present invention, a kind of human face characteristic point method for tracing is provided, can be held in calculating equipment Row, in the method, can be primarily based on shape and initialize to facial image to be detected, obtain face characteristic point diagram Picture.Then, multiple image blocks comprising characteristic point are obtained from face characteristic point image.It then, will be after image block input training Matching Model, with export prediction matching characteristic point.Wherein Matching Model is generated using the method for above-mentioned generation Matching Model.

Optionally, in the method, shape can be constructed based on the facial video image frame for having marked characteristic point.

It optionally, in the method, can be based on the characteristic point vector sum present frame facial image of previous frame facial image Characteristic point vector, obtain change in shape matrix.Then, principal component analysis is carried out to change in shape matrix, obtains change in shape Weight vectors.Finally, the weight vectors based on change in shape matrix and change in shape, complete the building of shape.

According to another aspect of the invention, a kind of calculating equipment is provided, comprising: one or more processors；And storage Device；One or more programs, wherein one or more programs store in memory and are configured as being handled by one or more Device executes, and one or more programs include the instruction for either executing in method as described above method.

In accordance with a further aspect of the present invention, a kind of computer-readable storage medium for storing one or more programs is provided Matter, one or more programs include instruction, and instruction is when calculating equipment execution, so that calculating equipment executes method as described above In either method.

According to the solution of the present invention, by being fused to local restricted model for convolutional neural networks as local expert model Frame in, can be improved the accuracy of Feature Points Matching.

Detailed description of the invention

To the accomplishment of the foregoing and related purposes, certain illustrative sides are described herein in conjunction with following description and drawings Face, these aspects indicate the various modes that can practice principles disclosed herein, and all aspects and its equivalent aspect It is intended to fall in the range of theme claimed.Read following detailed description in conjunction with the accompanying drawings, the disclosure it is above-mentioned And other purposes, feature and advantage will be apparent.Throughout the disclosure, identical appended drawing reference generally refers to identical Component or element.

Fig. 1 shows the schematic diagram of the limited partial model of tradition；

Fig. 2 shows the organigrams according to an embodiment of the invention for calculating equipment 100；

Fig. 3 shows the flow diagram of the method 300 according to an embodiment of the invention for generating Matching Model；

Fig. 4 shows the schematic diagram of positive sample according to an embodiment of the invention and negative sample；

Fig. 5 shows the schematic diagram of manual tag figure according to an embodiment of the invention；

Fig. 6 shows the schematic diagram of prediction label figure according to an embodiment of the invention；

Fig. 7 shows the schematic flow chart of human face characteristic point method for tracing 700 according to an embodiment of the invention.

Specific embodiment

Exemplary embodiments of the present disclosure are described in more detail below with reference to accompanying drawings.Although showing the disclosure in attached drawing Exemplary embodiment, it being understood, however, that may be realized in various forms the disclosure without should be by embodiments set forth here It is limited.On the contrary, these embodiments are provided to facilitate a more thoroughly understanding of the present invention, and can be by the scope of the present disclosure It is fully disclosed to those skilled in the art.

Traditional limited partial model is made of local expert and global shape model.Whole process is divided into model construction rank Section and point fitting stage.The model construction stage is divided into global shape model construction and Partial matched model building again.Shape Building is to generate a point spread function using active shape model ASM, will indicate several key feature points of body form Coordinate is connected into a vector, this object is indicated with the vector.And Partial matched model is to adjacent around each characteristic point Domain is modeled, and a Feature Points Matching criterion is established, to determine the best match of characteristic point.Fig. 1 shows the limited office of tradition The building schematic diagram of portion's model.As shown in Figure 1, being based on points distribution models, a people can be initialized on the face detected Then face shape model allows each point to find optimal match point in its contiguous range.For example, can along edge direction into Various rudimentary matchings, the matching error rates such as row block matching, point matching are higher.

This programme improves the human face characteristic point method for tracing based on limited partial model, and convolutional neural networks are made It is fused in limited partial model for local expert, replaces pixel characteristic and support vector machines usually used in local expert, Weak characteristic point is accurately tracked.

Fig. 2 is the block diagram of Example Computing Device 100.In basic configuration 102, calculating equipment 100, which typically comprises, is System memory 106 and one or more processor 104.Memory bus 108 can be used for storing in processor 104 and system Communication between device 106.

Depending on desired configuration, processor 104 can be any kind of processor, including but not limited to: micro process Device (μ P), microcontroller (μ C), digital information processor (DSP) or any combination of them.Processor 104 may include all Cache, processor core such as one or more rank of on-chip cache 110 and second level cache 112 etc 114 and register 116.Exemplary processor core 114 may include arithmetic and logical unit (ALU), floating-point unit (FPU), Digital signal processing core (DSP core) or any combination of them.Exemplary Memory Controller 118 can be with processor 104 are used together, or in some implementations, and Memory Controller 118 can be an interior section of processor 104.

Depending on desired configuration, system storage 106 can be any type of memory, including but not limited to: easily The property lost memory (RAM), nonvolatile memory (ROM, flash memory etc.) or any combination of them.System storage Device 106 may include operating system 120, one or more is using 122 and program data 124.In some embodiments, It may be arranged to be operated using program data 124 on an operating system using 122.In some embodiments, equipment is calculated 100 are configured as executing the method 300 or human face characteristic point method for tracing 700 for generating Matching Model, just wrap in program data 124 The instruction for executing above-mentioned each method is contained.

Calculating equipment 100 can also include facilitating from various interface equipments (for example, output equipment 142, Peripheral Interface 144 and communication equipment 146) to basic configuration 102 via the communication of bus/interface controller 130 interface bus 140.Example Output equipment 142 include graphics processing unit 148 and audio treatment unit 150.They can be configured as facilitate via One or more port A/V 152 is communicated with the various external equipments of such as display or loudspeaker etc.Outside example If interface 144 may include serial interface controller 154 and parallel interface controller 156, they, which can be configured as, facilitates Via one or more port I/O 158 and such as input equipment (for example, keyboard, mouse, pen, voice-input device, image Input equipment) or the external equipment of other peripheral hardwares (such as printer, scanner etc.) etc communicated.Exemplary communication is set Standby 146 may include network controller 160, can be arranged to convenient for via one or more communication port 164 and one A or multiple other calculate communication of the equipment 162 by network communication link.

Network communication link can be an example of communication media.Communication media can be usually presented as in such as carrier wave Or computer readable instructions, data structure, program module in the modulated data signal of other transmission mechanisms etc, and can To include any information delivery media." modulated data signal " can be such signal, one in its data set or Multiple or its change can be carried out in a manner of encoded information in the signal.As unrestricted example, communication media It may include the wired medium of such as cable network or private line network etc, and such as sound, radio frequency (RF), microwave, red Various wireless mediums including (IR) or other wireless mediums outside.Term computer-readable medium used herein may include Both storage medium and communication media.In some embodiments, one or more programs are stored in computer-readable medium, this It include the instruction for executing certain methods in a or multiple programs.

Calculating equipment 100 can be implemented as a part of portable (or mobile) electronic equipment of small size, these electronics are set It is standby to can be such as cellular phone, digital camera, personal digital assistant (PDA), personal media player device, wireless network Browsing apparatus, personal helmet, application specific equipment or may include any of the above function mixing apparatus.Certainly, it counts Calculate equipment 100 and also can be implemented as include desktop computer and notebook computer configuration personal computer, or have The server of above-mentioned configuration.Embodiments of the present invention to this with no restriction.

Fig. 3 shows the flow diagram of the method 300 according to an embodiment of the invention for generating Matching Model.Matching Model can be used for carrying out Feature Points Matching to the facial image for having marked characteristic point.Since Feature Points Matching is selected from image It takes certain characteristic points and partial analysis, and non-viewing whole image is carried out to image.Therefore facial image can be located in advance Reason, obtains the image block centered on each characteristic point.Then the image block used is as the training image of model.Below The process of the method 300 according to an embodiment of the present invention for generating Matching Model will be elaborated in conjunction with Fig. 3.

As shown in figure 3, method 300 starts from step S310.In step s310, based on the face figure for having marked characteristic point Picture generates the image block centered on characteristic point.

Training set for training convolutional neural networks includes positive sample and negative sample, such as in recognition of face, positive sample Originally it can be the picture of face, the selection of negative sample is just related to problem scenes.According to one embodiment of present invention, it can adopt Characteristic point is used to be in the image block at center as positive sample, the image block using characteristic point far from central point is as negative sample.Fig. 4 Show the schematic diagram of positive sample according to an embodiment of the invention and negative sample.As shown in figure 4, wherein A) be positive sample This, B) it is negative sample.But it since adjacent sample has substantial portion of overlapping region, can be made in training convolutional neural networks At repeating for overlapping region, operational efficiency is damaged.According to one embodiment of present invention, it can be used during characteristic point is in The larger sized image block of the heart is as training set.

Then in step s 320, it is based on image block, generates manual tag figure.

Manual tag figure includes positive sample region, negative sample region and inactive area.Fig. 5 shows according to the present invention one The schematic diagram of the manual tag figure of a embodiment.As shown in figure 5, black portions indicate that negative sample, white portion indicate positive sample, Several points at the center Fig. 5 are all white, and indicating that these pixels can be treated as is the characteristic point to be predicted.This is equivalent to do The data enhancing of positive sample.A circle grey is filled out among white area and black region, indicating that these pixels are not known is not It is the characteristic point to be predicted, these samples can be ignored, be not involved in training.

Then in step S330, by the convolutional neural networks of the manual tag figure generated in step S320 input pre-training In handled, to export prediction label figure.

According to one embodiment of present invention, convolutional neural networks may include process of convolution layer and warp lamination.Convolution Process layer can carry out convolution, activation, pondization processing to the image of input, to obtain predicting the probability of each characteristic point.Instead Convolutional layer can classify to the probability of predicted characteristics point, to export prediction label figure.

Table 1 shows the subnetwork structural parameters of Matching Model according to an embodiment of the invention.

Subnetwork structural parameters in 1 Matching Model of table

As shown in table 1, Matching Model includes process of convolution layer and warp lamination.Process of convolution layer include convolution, activation and Chi Hua.The activation primitives such as ReLU, tanh, sigmoid can be used in active coating, the continuous real value of input can be transformed to 0 and 1 Between output.Network parameter can be obtained by debugging, be not filled when process of convolution wherein, so Matching Model Size can determine the relationship output and input, so that the center of the corresponding output prediction label figure in the center of input picture.For example, The image of 20x20x3 can be input in the Matching Model of pre-training, the prediction label figure output of 14x14x1 will be obtained. Warp lamination can up-sample the probability of the predicted characteristics point of output, to carry out on the characteristic pattern of up-sampling by picture Element classification, final output prediction label figure.The Matching Model that this programme provides can receive the image block input of arbitrary dimension, lead to It crosses and up-sampling is carried out to the probability value of output and is classified pixel-by-pixel, to export prediction label figure.

Finally in step S340, based on the penalty values of loss function between manual tag figure and prediction label figure, training Convolutional neural networks, using the network after training as the Matching Model generated.

It according to one embodiment of present invention, can be based on area pixel point each in manual tag figure and prediction label figure Weight calculates the penalty values of loss function.Then, penalty values are based on, back-propagation algorithm adjustment convolutional neural networks are used Parameter, until penalty values are less than predetermined threshold.

The process of training convolutional neural networks is divided into two stages.First stage is from low level to high-level propagation Stage, i.e. propagated forward stage.Another stage is, when the result that propagated forward obtains with it is expected be not consistent when, by error from The high-level stage for propagate to bottom training, i.e. back-propagation phase.The image block array of input passes through process of convolution Output prediction label figure is obtained after layer, the processing of pond layer, warp lamination；Then it calculates between prediction label figure and manual tag figure The penalty values of the loss function of building；When penalty values are greater than desired value, full articulamentum, down-sampling are adjusted based on gradient descent method The parameter of layer, convolutional layer.Until terminating training when penalty values are equal to or less than desired value.This is to those skilled in the art For belong to contents known, it will not go into details herein.

Matconvnet or pytorch frame can be used to convolutional Neural net in a kind of implementation according to the present invention Network is trained, and wherein Matconvnet is the convolutional neural networks realized with Matlab.The model of pre-training is loaded first, so The loss function in frame is improved afterwards:

Wherein, i is the index of pixel, predict_iIndicate prediction label figure, lable_iIndicate the picture in manual tag figure Vegetarian refreshments, w_iIndicate the weight of pixel, binary cross entropy loss function can be used in loss function loss, indicates negative sample using -1 This, 0 indicates to ignore, and 1 indicates positive sample.If indicating negative sample using the cross entropy loss function of pytorch, 0,1 is indicated Positive sample, -100 indicate to ignore.Cross entropy is also smaller, and the output of model and desired output are closer.

In order to guarantee the data balancing of positive negative sample, increase weight coefficient in loss function.Fig. 6 is shown according to this hair The schematic diagram of the prediction label figure of bright one embodiment.As shown in fig. 6, entire characteristic pattern is 13x13, gray area is 7x7, White is 5 pixels.The number of so negative sample is 13x13-7x7=120, and the number of positive sample is 5.If do not weighted, meeting There is the Loss of Loss and 120 negative sample of 5 positive samples.In this way training when, network more can pay attention to allowing negative sample not Want misclassification.As a result the prediction for being exactly network is essentially all black (being negative sample entirely).Therefore, in addition weight seeks to positive sample Total Loss size and negative sample total Loss size it is almost the same.

It can make the weight of positive sample area pixel point and for 1, the weight of negative sample area pixel point and be 1.With positive sample For this, 0.2*Loss_ positive sample 2+0.2*Loss_ positive sample 3+0.2*Loss_ positive sample 4+0.2*Loss_ positive sample 5= The positive sample of 0.2* (Loss_ positive sample 2+Loss_ positive sample 3+Loss_ positive sample 4+Loss_ positive sample 5)=0.2*5 The Loss of Loss==1 positive sample.The weight of negative sample is 1/120==0.0083.If the training frame used is supported Indicate to ignore the sample using 0 (such as matconvnet) or -100 (such as pytorch) in label figure, then weight The weight of figure grey area is filled out any value and is ok.If it does not, the region of grey must fill out 0 in weight map, to ignore It calculates.

An implementation according to the present invention, using minimum convolutional neural networks, such as input 17x17x3- > 13x13x8- > 11x11x4- > 9x9x1 output, and the optimization by carrying out single-instruction multiple-data stream (SIMD) to matconvnet, effectively Ground improves operational efficiency.

So far, the process of method 300 terminates.300 feature is carried out using the method for deep learning according to the method for the present invention Point matching, the method compared to tradition based on feature and support vector machines can be improved the accuracy and efficiency of Feature Points Matching. And based on more mature convolutional neural networks training framework, the accuracy of Feature Points Matching can be further increased.

It should be noted that the Matching Model that generates above and be limited to human face characteristic point matching, be also equally applicable to cat face, The application scenarios of the Feature Points Matchings such as dog face, it is not limited here.

A kind of method that human face characteristic point is tracked using above-mentioned matching result introduced below.Fig. 7 shows basis The schematic flow chart of the human face characteristic point method for tracing 700 of one embodiment of the invention.

With method 300, the finger for executing method 700 is stored in the program data of calculating equipment 100 as shown in Figure 2 It enables, so that calculating equipment 100 executes this method 700.As shown in fig. 7, method 700 starts from step S710, it is based on shape pair Facial image initialization to be detected, obtains face characteristic point image.

According to one embodiment of present invention, shape can be constructed based on the facial video image frame for having marked characteristic point Model.Wherein, firstly, the characteristic point vector of the characteristic point vector sum present frame facial image based on previous frame facial image, is obtained Obtain change in shape matrix.Then, principal component analysis is carried out to change in shape matrix, obtains the weight vectors of change in shape.Finally, Weight vectors based on change in shape matrix and change in shape, complete the building of shape.

Assuming that there is M picture, every picture has N number of characteristic point, and the coordinate of each characteristic point is assumed to be (x_i, y_i), a figure Vector x=[the x of the coordinate composition of the N number of characteristic point of on piece₁, y₁, x₂, y₂... x_n, y_n] indicate shape can use it is as follows Formula one indicates:

X=x '+PB

Wherein, x indicates that present frame characteristic point vector, x ' expression previous frame characteristic point vector, P are change in shape matrix.With The characteristic point vector of each frame image subtracts the characteristic point vector of previous frame image, just obtains a change in shape matrix X.Then To XX ' carry out principal component analysis, the decisive ingredient of change in shape is obtained, i.e. feature vector Pj and corresponding characteristic value enter J, K feature vector formed shape transformation matrices P in a manner of arranging discharge before selecting.These feature vectors are exactly all samples in fact The base of this transformation can state any variation in sample.There are change in shape matrix P, the B in above-mentioned formula one that can lead to Following formula two are crossed to obtain:

B=P^T(x-x′)。

B is the weight vectors of change in shape, determines which characteristic point plays key effect.So far shape is just completed Building, when giving a weight B a, so that it may face shape is initialized with above-mentioned formula two, so as to the face to acquisition Feature point image carries out local feature region matching.This belongs to contents known to one skilled in the art, refuses herein It repeats.

Then in step S720, multiple image blocks comprising characteristic point are obtained from face characteristic point image.Such as it can To choose the input of the image block using centered on each characteristic point as Matching Model.

Finally in step S730, by the Matching Model after the image block input training of selection, to export the matching of prediction Characteristic point.Wherein Matching Model is generated based on method 300.

So far, the process of method 700 terminates.According to the method for the present invention 700, by using the Matching Model after training as Local expert is fused in limited partial model, replaces pixel characteristic and support vector machines usually used in local expert, makes Obtaining weak characteristic point can accurately be tracked.Particularly with the characteristic point real-time tracing in video flowing, positioning feature point can be improved Anti-jitter ability, improve the stability and accuracy of tracing characteristic points.

To sum up, according to the solution of the present invention, initialization face characteristic point image is obtained by building shape first, so Matching Model is trained afterwards and obtains preferable matching result, so that the Matching Model based on generation carries out human face characteristic point Real-time location tracking.The Stability and veracity of positioning feature point can be improved in the program, improves the efficiency of tracing characteristic points.It is right In the application scenarios of some Real-time Special Effects, it is able to ascend user experience.

A4, the method as described in A3, wherein the step of training convolutional neural networks includes:

Based on the weight of area pixel point each in the manual tag figure and prediction label figure, the loss of loss function is calculated Value；

Based on the penalty values, the parameter of the convolutional neural networks is adjusted using back-propagation algorithm, until penalty values Less than predetermined threshold.

It should be appreciated that in order to simplify the disclosure and help to understand one or more of the various inventive aspects, it is right above In the description of exemplary embodiment of the present invention, each feature of the invention be grouped together into sometimes single embodiment, figure or In person's descriptions thereof.However, the disclosed method should not be interpreted as reflecting the following intention: i.e. claimed hair Bright requirement is than feature more features expressly recited in each claim.More precisely, as the following claims As book reflects, inventive aspect is all features less than single embodiment disclosed above.Therefore, it then follows specific real Thus the claims for applying mode are expressly incorporated in the specific embodiment, wherein each claim itself is used as this hair Bright separate embodiments.

Those skilled in the art should understand that the module of the equipment in example disclosed herein or unit or groups Part can be arranged in equipment as depicted in this embodiment, or alternatively can be positioned at and the equipment in the example In different one or more equipment.Module in aforementioned exemplary can be combined into a module or furthermore be segmented into multiple Submodule.

Those skilled in the art will understand that can be carried out adaptively to the module in the equipment in embodiment Change and they are arranged in one or more devices different from this embodiment.It can be the module or list in embodiment Member or component are combined into a module or unit or component, and furthermore they can be divided into multiple submodule or subelement or Sub-component.Other than such feature and/or at least some of process or unit exclude each other, it can use any Combination is to all features disclosed in this specification (including adjoint claim, abstract and attached drawing) and so disclosed All process or units of what method or apparatus are combined.Unless expressly stated otherwise, this specification is (including adjoint power Benefit require, abstract and attached drawing) disclosed in each feature can carry out generation with an alternative feature that provides the same, equivalent, or similar purpose It replaces.

In addition, it will be appreciated by those of skill in the art that although some embodiments described herein include other embodiments In included certain features rather than other feature, but the combination of the feature of different embodiments mean it is of the invention Within the scope of and form different embodiments.For example, in the following claims, embodiment claimed is appointed Meaning one of can in any combination mode come using.

Various technologies described herein are realized together in combination with hardware or software or their combination.To the present invention Method and apparatus or the process and apparatus of the present invention some aspects or part can take insertion tangible media, such as it is soft The form of program code (instructing) in disk, CD-ROM, hard disk drive or other any machine readable storage mediums, Wherein when program is loaded into the machine of such as computer etc, and is executed by the machine, the machine becomes to practice this hair Bright equipment.

In the case where program code executes on programmable computers, calculates equipment and generally comprise processor, processor Readable storage medium (including volatile and non-volatile memory and or memory element), at least one input unit, and extremely A few output device.Wherein, memory is configured for storage program code；Processor is configured for according to the memory Instruction in the said program code of middle storage executes method of the present invention.

By way of example and not limitation, computer-readable medium includes computer storage media and communication media.It calculates Machine readable medium includes computer storage media and communication media.Computer storage medium storage such as computer-readable instruction, The information such as data structure, program module or other data.Communication media is generally modulated with carrier wave or other transmission mechanisms etc. Data-signal processed passes to embody computer readable instructions, data structure, program module or other data including any information Pass medium.Above any combination is also included within the scope of computer-readable medium.

In addition, be described as herein can be by the processor of computer system or by executing by some in the embodiment The combination of method or method element that other devices of the function are implemented.Therefore, have for implementing the method or method The processor of the necessary instruction of element forms the device for implementing this method or method element.In addition, Installation practice Element described in this is the example of following device: the device be used for implement as in order to implement the purpose of the invention element performed by Function.

As used in this, unless specifically stated, come using ordinal number " first ", " second ", " third " etc. Description plain objects, which are merely representative of, is related to the different instances of similar object, and is not intended to imply that the object being described in this way must Must have the time it is upper, spatially, sequence aspect or given sequence in any other manner.

Although the embodiment according to limited quantity describes the present invention, above description, the art are benefited from It is interior it is clear for the skilled person that in the scope of the present invention thus described, it can be envisaged that other embodiments.Additionally, it should be noted that Language used in this specification primarily to readable and introduction purpose and select, rather than in order to explain or limit Determine subject of the present invention and selects.Therefore, without departing from the scope and spirit of the appended claims, for this Many modifications and changes are obvious for the those of ordinary skill of technical field.For the scope of the present invention, to this Invent done disclosure be it is illustrative and not restrictive, it is intended that the scope of the present invention be defined by the claims appended hereto.

Claims

1. a kind of method for generating Matching Model, the Matching Model is suitable for carrying out feature to the facial image for having marked characteristic point Point matching, the method are suitable for executing in calculating equipment, comprising steps of

Based on the facial image for having marked characteristic point, the image block centered on characteristic point is generated；

Based on described image block, manual tag figure is generated；

It will be handled in the convolutional neural networks of manual tag figure input pre-training, to export prediction label figure；With And

Based on the penalty values of loss function between the manual tag figure and prediction label figure, the convolutional neural networks are trained, Using the Matching Model by the network after training as generation.

2. the method for claim 1, wherein the convolutional neural networks include process of convolution layer and warp lamination,

The process of convolution layer is suitable for carrying out the image block of input convolution, activation, pondization processing, to export predicted characteristics point Probability；

The warp lamination is suitable for classifying to the probability of the predicted characteristics point, to export prediction label figure.

3. the method for claim 1, wherein the manual tag figure includes positive sample region, negative sample region and nothing Imitate region.

4. method as claimed in claim 3, wherein the loss function is binary cross entropy loss function, is based on following public affairs Formula calculates the penalty values of the loss function:

Wherein, i is the index of pixel, predict_iIndicate prediction label figure, lable_iIndicate the pixel in manual tag figure Point, w_iIndicate the weight of pixel.

5. method as claimed in claim 4, wherein the weight of the positive sample area pixel point and be 1, the negative sample area The weight of domain pixel and be 1, the weight of the inactive area is 0.

6. a kind of human face characteristic point method for tracing, suitable for being executed in calculating equipment, wherein the described method includes:

Facial image to be detected is initialized based on shape, obtains face characteristic point image；

Multiple image blocks comprising characteristic point are obtained from the face characteristic point image；

By the Matching Model after the input training of described image block, to export the matching characteristic point of prediction.

Wherein the Matching Model is generated using method according to any one of claims 1 to 5.

7. method as claimed in claim 6, wherein the described method includes:

Based on the facial video image frame for having marked characteristic point, the shape is constructed.

8. the method for claim 7, wherein the step of building shape includes:

The characteristic point vector of characteristic point vector sum present frame facial image based on previous frame facial image obtains change in shape square Battle array；

Principal component analysis is carried out to the change in shape matrix, obtains the weight vectors of change in shape；And

Based on the weight vectors of the change in shape matrix and change in shape, the building of the shape is completed.

9. a kind of calculating equipment, comprising:

One or more processors；With

Memory；

One or more programs, wherein one or more of programs are stored in the memory and are configured as by described one A or multiple processors execute, and one or more of programs include for executing in -8 the methods according to claim 1 The instruction of either method.

10. a kind of computer readable storage medium for storing one or more programs, one or more of programs include instruction, Described instruction is when calculating equipment execution, so that the equipment that calculates executes appointing in method described in -8 according to claim 1 The instruction of one method.