CN100389388C

CN100389388C - Screen protection method and apparatus based on human face identification

Info

Publication number: CN100389388C
Application number: CNB2006100122822A
Authority: CN
Inventors: 刘凯; 黄英
Original assignee: Vimicro Corp
Current assignee: Mid Star Technology Ltd By Share Ltd
Priority date: 2006-06-15
Filing date: 2006-06-15
Publication date: 2008-05-21
Anticipated expiration: 2026-06-15
Also published as: CN1862487A

Abstract

The present invention discloses a screen protecting method and a device based on human face certification. The method comprises the procedures: 1. the playing of background video files is used as the background of current screen protecting images; 2. video images captured by a video capture device are disassembled into multiplex video streams; 3. the multiplex video streams and the background video are synthesized and are displayed; 4. face recognition is carried out in the video images captured by the video capture device, comparison certification is carried out to a human face which is recognized and a base model stored in a data base, and screen protection is unlocked when certification accords with conditions. The present invention enriches the use of camera devices for users, users can experience pleasure and security brought by technologies of face recognition, background video, etc. in the process of the use of screen protection, and amusement lives of people are enriched greatly.

Description

Screen protection method and device based on face authentication

Technical field

The present invention relates to a kind of screen protection method and device, relate in particular to a kind of screen protection method and device based on face authentication.

Background technology

For screen protection method; often the method for using is to come view screen protection image by programmed control; present screen protection program is generally made at video, and the application of catching video for synthetic video capture device (as camera) seldom.

And along with the Internet video broad application, increasing people begins to use the first-class video capturing device of shooting and carries out activities such as online chat, and the application program that is directed to video capturing device simultaneously is also increasing.

On the other hand, as the needs of online chat function, people's face of the image person of being to use that camera captures usually, and face recognition technology also is a present emphasis in computer realm research.

Therefore, face recognition technology and screen protection method can be combined, become problem demanding prompt solution in the industry.

Summary of the invention

Technical matters to be solved by this invention is to provide a kind of screen protection method and device based on face authentication; image and background image that camera is captured synthesize; thereby screen protection is provided, simultaneously the facial image that captures is carried out face authentication, thereby unblank screen protection.

The invention provides a kind of screen protection method, comprise the steps: based on face authentication

(1) sets the condition that the video image that captures from video capturing device is started face authentication;

(2) play the background of background video file as current screen protection image;

(3) video image that video capture device is captured splits into multi-path video stream;

(4) described multi-path video stream and described background video are synthesized and show;

(5) whether the entry condition that detects step (1) setting satisfies; if satisfy; then in the described video image that captures by video capture device, carry out recognition of face, with the basic model of storing in the people's face that identifies and the database authentication of comparing, authentication by the time screen protection of unblanking.

The present invention and then a kind of screen protective device based on face authentication is provided comprises:

The condition enactment module is used to set the condition that the video image that captures from video capturing device is started face authentication;

The background playing module is used to play the background of background video file as current screen protection image;

Video splits module, is used for the video image that video capture device captures is split into multi-path video stream;

The video synthesis module is used for described multi-path video stream and described background video being synthesized and showing;

The face authentication module; whether the entry condition that is used to detect described condition enactment module settings satisfies; if satisfy; then in the described video image that captures by video capture device, carry out recognition of face; with the basic model of storing in the people's face that identifies and the database authentication of comparing, authentication by the time screen protection of unblanking.

The present invention has enriched the use of user to camera equipment, allows the user in the middle of the process of using screen protection, experiences enjoyment and safety guarantee that technology such as recognition of face, background video are brought to people, has greatly enriched people's entertainment life.

Description of drawings

Fig. 1 is according to the described screen protection method process flow diagram based on face authentication of the embodiment of the invention;

Fig. 2 is according to the described face authentication process flow diagram of the embodiment of the invention;

Fig. 3 is according to the described sample training schematic flow sheet of the embodiment of the invention;

Fig. 4 is the demarcation and collection synoptic diagram according to the described people's face of embodiment of the invention sample;

Fig. 5 is according to the described facial image non-linear correction of embodiment of the invention synoptic diagram;

Fig. 6 is according to the described non-linear correction principle schematic of the embodiment of the invention;

Fig. 7 is according to the described facial image illumination of embodiment of the invention result synoptic diagram;

Fig. 8 is according to the described feature extraction method flow diagram based on AdaBoost of the embodiment of the invention;

Fig. 9 is according to the described support vector machine optimal classification of embodiment of the invention face synoptic diagram;

Figure 10 A, Figure 10 B are according to the described face authentication output of embodiment of the invention result schematic diagram.

Embodiment

As shown in Figure 1, for according to the described screen protection method process flow diagram of the embodiment of the invention based on face authentication:

On the one hand, normally carry out the existing screen protection program of storage by computing machine, as:

Step 107: play existing background video file.

Step 108: video file is carried out audio decoder and video decode.

Decoded audio frequency can be exported by audio frequency apparatus.The image that video can be used as with camera collection synthesizes.

On the other hand:

Step 101: at first by video capturing device (for example camera) captured video image.

Step 102: in conjunction with the faceform storehouse, the image basis enterprising pedestrian's face track identification and the authentication of catching.

Step 103: convert the video image that captures to the RGB24 form.

The raw video image that video capture device captures generally is the form of RGB or YUV2.If YUV2 or RGB32 form need to carry out format conversion so, it is converted to the RGB24 form.

Step 104: split video flowing.

The fractionation of video image and synthetic technology provide in the prior art, and for example, the multimedia application of the DirectX9.0 technological development of windows platform both can be realized.Embodiments of the invention are undertaken multichannel to the image of the RGB24 form after changing to raw video image by Inifinite Pin Tee Filter and split, exactly one road video flowing is carried out multichannel output according to certain algorithm, the video flowing after the output is input to video synthesizer.

Step 105: synthetic video stream.

The background video file is input to video synthesizer through behind the video decode.

Step 106: output on the video display apparatus after the processing such as the multi-channel video of video synthesizer handle input superposes, sharpening.

In addition; in the embodiment of the invention; can also further set entry condition to face authentication; for example; can be normally open by it manually is set, just face authentication is all carried out in unconditionally authentication during whole screen protection; can also be set to start when occupying the certain proportion of whole two field picture area for example 30% when the people's face in the video image is long-pending.Because people's face is long-pending too small, illustrate that user's distance calculation is far away excessively, the wish of not unblanking screen protection.By the face authentication unlocking condition is set, can use the present invention more flexibly.

Accordingly, from design factors, the embodiment of the invention also provides a kind of screen protective device based on face authentication, comprising:

The code conversion module, the video image that is used for video capture device is captured carries out code conversion.

Video splits module, be used for video capture device is captured, become the video image of RGB24 form to split into multi-path video stream via described code conversion module converts;

Equally, described condition enactment module, the entry condition of setting can comprise: the unconditional startup, or when the people's face in the video image amasss the certain proportion that occupies whole two field picture area, start.

Realize screen protective device of the present invention and method, the video that focuses on the one hand splits and synthetic technology, just is human face identification technology on the other hand.Prior art has provided the method for a lot of recognition of face authentications, and the embodiment of the invention provides a kind of face authentication mode based on Gabor feature and support vector machine, is equally applicable to the present invention.We can say that also the present invention is the practice of this face authentication method.

As shown in Figure 2, for according to the described face authentication process flow diagram of the embodiment of the invention,, train the support vector machine faceform's (step 201) who obtains an one-to-many for each user that need authenticate at first by people's face sample image; The video image of acquisition camera input, the front face in search and the detected image, and it is continued to follow the trail of and verify, guarantee that people's face of following the trail of is front face (step 202); Automatically demarcate the organ characteristic's point in the front face, and in view of the above detected people's face is carried out pre-service (step 203); Calculate Gabor feature then, from higher-dimension Gabor feature, pick out the strongest part Gabor feature of classification capacity and form low dimensional feature vector (step 204) through pretreated facial image; Low dimensional feature vector after selecting is input among the described faceform, carries out recognition of face, return similarity data (step 205) with each faceform; In conjunction with the similarity numerical value of consecutive numbers two field picture, export final face authentication result (step 206).

The described training process of step 201 among Fig. 2, can adopt multiple mode to train, as shown in Figure 3, be the sample training schematic flow sheet of embodiment of the invention employing, at first, gather positive sample and anti-sample facial image (step 301) at each user that need authenticate; All sample people faces are demarcated, determined the human face characteristic point position (step 302) in people's face sample; According to calibration result, all sample images are carried out pre-service (step 303); Calculate Gabor feature, from higher-dimension Gabor feature, pick out the strongest part Gabor feature of classification capacity and form low dimensional feature vector (step 304) through pretreated sample image; Utilize described low dimensional feature vector, adopt the one-to-many support vector machine that the different authentication user is trained, obtain support vector machine faceform's (step 305) of an one-to-many for each user.

The described face tracking of step 202 among Fig. 2, also can pass through accomplished in many ways, the for example real-time detection of people's face and the method and system that continue to follow the trail of in a kind of video sequence that provides in the Chinese patent application 200510135668.8, this application adopts the layering detection model of AdaBoost training algorithm training front face, and divide the video image of yardstick search input, determine the position of a plurality of people's faces in the image; Then system verifies at follow-up several frames the people's face that detects, and verifies again in conjunction with the lasting tracking that realizes people's face based on the face tracking algorithm of Mean shift and color histogram; In tracing process, can carry out the checking of people's face, determine the accuracy of tracking results.This system tests a plurality of people's faces in many scenes, test result shows that people's face detection and tracking algorithm of this paper can quick and precisely detect a plurality of front faces under different expressions, the different colour of skin, the different illumination conditions, can detect people's face of-20 ° to 20 ° degree of depth rotations ,-20 ° to 20 ° plane rotations, but any attitude people's face of real-time follow-up comprises side, rotation people face etc.

The described human face positioning feature point of step 203 also has a variety of methods to realize among Fig. 2, for example the localization method that provides in the Chinese patent application 200610011673.2.According to embodiments of the invention, can determine the eyes position by following steps: (1) adopts statistical to determine left eye region of search and right eye region of search on the basis of obtaining people's face positional information, and definite left eye primary election position and right eye primary election position; (2) in described left eye and right eye region of search, adopt left eye local feature detecting device and right eye local feature detecting device respectively, all left eye primary election positions and right eye primary election position are differentiated, and determined a simple eye similarity numerical value for each primary election position; (3) from all left eye primary election positions and right eye primary election position, select the preceding N of similarity numerical value maximum respectively ₁Individual position is as left-eye candidate positions and right eye position candidate, and it is right that all left eyes and right eye position candidate are made into the eyes candidate, with each candidate to being that benchmark is determined the eyes zone; (4) adopt the eyes area detector as global restriction, described each eyes zone differentiated, for each eyes candidate wherein to determining an eyes similarity numerical value; (5) the preceding M of selection eyes similarity numerical value maximum ₁Individual eyes candidate is right, to all left-eye candidate positions and all right eye position candidate difference calculating mean value wherein, as left eye characteristic point position and right eye characteristic point position.

And can determine the face position by following steps: (1) adopts statistical to determine face location finding zone on the basis of obtaining eye position information, and definite face primary election position; (2) in face location finding zone, adopt face local feature detecting device that each face primary election position is differentiated, and determine a face local similar number of degrees value for it; (3) the preceding N of selection face local similar number of degrees value maximum ₂Individual primary election position for each position candidate, is a benchmark with left eye characteristic point position, right eye characteristic point position, face position candidate as the face position candidate, determines face area; (4) adopt the face area detecting device as global restriction, each described definite face area is differentiated, for each face position candidate is wherein determined a face overall situation similarity numerical value; (5) the preceding M of selection face overall situation similarity numerical value maximum ₂Individual position candidate is calculated the mean value of these position candidate, as the face characteristic point position.

The described pre-treatment step of step 203 among Fig. 2, and the described pre-treatment step of step 303 among Fig. 3, process all is similar, below main be that example further specifies with pre-service to sample image.

Before carrying out recognition of face, must carry out pre-service to size, position and the gray scale of input facial image, the size of different facial images, gray scale are consistent.In addition, the position of people's face should be consistent in the different images, and the method that this can pass through eyes, face location makes the position basic fixed of people's face eyes, face in the input picture, again entire image is carried out affined transformation or non-linear correction.Have only through after these pre-service, a plurality of input people faces of same individual just can have certain similarity on some feature, and different people's faces also just can have certain difference, and just can adopt the statistical model recognizer to carry out the training and the identification of model this moment.

With reference to the step 301 among the figure 3, because the present invention adopts the mode of support vector machine (SVM) to realize that face authentication is (for the arthmetic statement of support vector machine, can be referring to " pattern-recognitions " that the people showed such as Bian Zhaoqi, Zhang Xuegong, publishing house of Tsing-Hua University, 2000), therefore need to collect a large amount of anti-sample people faces, to improve the accuracy of face authentication.These anti-samples preferably should cover people's face of different expressions, the different colour of skin, all ages and classes as far as possible, comprise people's face of-20 ° to 20 ° degree of depth rotations, comprise the people's face of wearing and do not wear glasses.

The positive sample people face of face authentication is meant people's face sample of user to be certified, and these class data need be gathered user's sample by Automatic Program when practical application, and automatically user's sample is carried out pre-service and feature calculation.

With reference to the step 302 among the figure 3, to all anti-samples, the present invention can be by manually demarcating the key feature point of all anti-sample people faces, for each positive sample people face has been demarcated three points: two centers, face center.And can take the method for demarcation automatically for user's to be certified positive sample, obtain the coordinate of three points, as shown in Figure 4.

Then, the present invention can carry out geometrical normalization to each individual face according to these calibration points, the major organs aligning that is about to facial image is to normal place, reduce yardstick, translation and plane rotation difference between sample, cut out out human face region according to organ site then and become to be people's face sample, make people's face sample introduce background interference less, and the organ site of different people face sample have consistance as far as possible.

The present invention has introduced a width of cloth standard faces image each individual face sample has been carried out the cutting of geometrical normalization and human face region.Yardstick wd * the ht that at first determines people's face window to be identified is 44 * 48, and promptly wide is 44, and height is 48.Then obtain the front face image of a width of cloth standard, two y coordinate unanimity in the standard picture, people's face is also symmetrical fully, shown in the 4A among Fig. 4, demarcates three key feature points of this image.The position of the square human face region of determining cutting according to the distance and the position of eyes in this image.If two distance is r, the central point of two lines is (x _Center, y _Center), the wide 2r that is made as of collection rectangle, i.e. twice binocular interval, the then coordinate (x in clipping rectangle zone _Left, y _Top, x _Right, y _Bottom) be:

\{\begin{matrix} x_{lefe} \\ y_{top} \\ x_{right} \\ y_{bottom} \end{matrix}\} = \{\begin{matrix} x_{center} - r \\ y_{top} - 0.5 r \\ x_{center} + r \\ y_{top} + ht \end{matrix}\}

The human face region of cutting is normalized to 44 * 48 size,, and obtain the coordinate [x of three calibration points after the normalization as the 4B among Fig. 4 _Stad(i), y _Stad(i)], i=0,1,2, preceding two is the eye center point, last is the lip central point.

Three unique point [x of any given primitive man's face sample and demarcation _Label(i), y _Label(i)], i=0,1,2, as the 4C among Fig. 4, more direct method of cutting out is the affined transformation coefficient between three point coordinate after these three points of calculating and the standard picture normalization.In addition, can not add the stretching conversion of people's face all directions in the affined transformation formula, we only consider rotation and two conversion of whole convergent-divergent.Then can calculate in this cutting image arbitrarily some corresponding point coordinate in the original sample, and then obtain the pixel value of being had a few in the cutting people face, shown in the 4D among Fig. 4 by the affined transformation coefficient.

But the algorithm based on affined transformation exists apparent in view defective.At first when people's face sample band expression or the non-front of input people face, adopt the eyes lip central point deviation of the eyes lip central point of the cutting people face that this method obtains and standard picture can be bigger, particularly with the lip central point after the cutting of attitude sample also not on the image vertical center axis, eye position is also variant, as shown in Figure 5,5A is original image and calibration point, and 5B is the cutting image.Therefore for people's face of the different attitude expressions of same people, people's face eyes lip position difference is bigger in its cutting image, and this certain degree can reduce the anti-expression of recognizer, attitude interference capability.

Embodiments of the invention have adopted a kind of non-linear correction method, promptly adopt non-linear method will import the position that three central points of people's face are remedied to 3 of standard faces fully.At first only consider two central points of eyes, adopt the affined transformation algorithm between the calibration point of input people's face and standard faces, to calculate the affined transformation coefficient, equally only consider rotation and two conversion of whole convergent-divergent this moment.That is:

\{\begin{matrix} x_{label} (i) \\ y_{label} (i) \end{matrix}\} = \{\begin{matrix} s \cos θ & - s \sin θ \\ s \sin θ & s \cos θ \end{matrix}\} \{\begin{matrix} x_{stad} (i) \\ y_{stad} (i) \end{matrix}\} + \{\begin{matrix} a \\ b \end{matrix}\} = \{\begin{matrix} c & - d & a \\ d & c & b \end{matrix}\} \{\begin{matrix} x_{stad} (i) \\ y_{stad} (i) \\ 1 \end{matrix}\}, i = 0,1

Four unknown numbers are arranged in the following formula, and four equatioies have only unique one to separate, be made as (a, b, c, d), the 5C among Fig. 5 is the cutting result who only adopts these four coefficients to obtain.Can calculate the corresponding point of three unique points in cutting people face in the input sample by this affined transformation coefficient, be made as [x _Trans(i), y _Trans(i)], i=0,1,2.The position that preceding two coordinate transformings are eyes and the eye position of standard face are in full accord, but are subjected to interference such as attitude, expression, and face position difference may be bigger.We need be with the face aligning to normal place for this reason.

As shown in Figure 6, A, B point is the central point in the standard picture among the figure, and the D point is the central point of A, B, and Cstad is standard lip central point, and C is the lip point after the conversion.The non-linear correction process is carried out in two steps, at first corrects in the y direction, makes the y coordinate of correcting back lip point consistent with Cstad, as the C ' point among Fig. 6.And then carry out the rectification of x direction, and we couple together D and C ', and are half of about DC ' line is divided into people's face, consider certain bar straight line of horizontal direction, and establishing its y coordinate is y1, and the intersecting point coordinate E of itself and DC ' straight line is (x ₁, y ₁).Because we need (x ₁, y ₁) move to (x _D, y ₁), x wherein _DBe the x coordinate of D, therefore need be to (x _D, y ₁) point of the right and left carries out linear transformation respectively, E is moved on the DCstad of axis.Consider certain point (x, y ₁), to the some x＜x on the left side ₁, its coordinate of correcting the back point is (xx _D/ x ₁, y ₁), to the some x 〉=x on the right ₁, its coordinate of correcting the back point is [2x _D-x _D(2x _D-x)/(2x _D-x ₁), y ₁].As can be seen,, people from right side face is stretched, can all be remedied on people's face vertical centering control axis DCstad having a few on DC ' straight line like this if C ' on the right side of Cstad, then needs people's face in left side is compressed.

After obtaining the non-linear correction coefficient, then obtain people's face after the rectification in conjunction with original image, if the facial image after the cutting is I, this picture size is 44 * 48, wherein certain point coordinate be (x, y), the coordinate before obtaining it and correct according to the non-linear correction coefficient (x ', y '), obtain the coordinate (x of this point in the original image again by the affined transformation coefficient _Ori, y _Ori):

\{\begin{matrix} x_{ori} \\ y_{ori} \end{matrix}\} = \{\begin{matrix} c & - d & a \\ d & c & b \end{matrix}\} \{\begin{matrix} x^{'} \\ y^{'} \\ 1 \end{matrix}\}

For eliminating The noise, (x, pixel value y) is made as corresponding point (x in the images cut _Orj, y _Ori) average of all some pixel values in the neighborhood scope, shown in the 5D among Fig. 5.

In addition, disturbed by factors such as ambient light photograph, imaging device, facial image brightness or contrast can occur unusually, strong shadow or situation such as reflective appear, also there is this difference in addition between the colour of skin of different ethnic groups, therefore need carry out the gray balance processing to the people's face sample after geometrical normalization and the rectification, improve its intensity profile, the consistance between enhancement mode.But, the illumination problem in the recognition of face is comparison difficulty always but also is unusual important problem.Very many photo-irradiation treatment algorithms have been arranged for many years, but performance compare all generally, resist the ability of various ambient light interference all poor.Owing to need gather the positive sample of people's face and train based on the face recognition algorithms of statistical method, but the illumination of the positive sample of people's face is general more single, even to the positive sample of adding different light, training data also can only cover a few illumination patterns.And illumination is very complicated in the actual scene, and same people's face but illumination difference when big gray scale also can have this evident difference, also can there be this difference in the characteristics of image that calculates.In addition, if the uneven illumination of input people face is even, subregion illumination is strong, subregion illumination is weak, even carry out the normalization, histogram equalization etc. of full figure this moment to image, all be difficult to obtain light application ratio people's face data uniformly, this will reduce the precision of recognition of face greatly.

The photo-irradiation treatment algorithm that the embodiment of the invention adopts can carry out in two steps, at first image is carried out whole gray scale normalization, and then the combined standard image carries out local gray scale normalization.

Whole normalization is fairly simple, a given width of cloth standard faces image, and as the 4B among Fig. 4, the average P of basis of calculation people's face gray scale _sAnd variances sigma _s, then calculate average P and the variances sigma of importing the sample gray scale, its arbitrary pixel value I (x, y) pixel value after the normalization is:

I′(x，y)＝[I(x，y)-P]·σ _s/σ+P _s

The accurate image point of bidding (x, pixel value y) be S (x, y), behind the input people face gray scale normalization this point value be I ' (x, y).Because the position of eyes, face correspondence fully in this two width of cloth image, so the organ site difference of the position of each organ and standard faces is also not too large in the sample.That is to say that each local gray scale of two width of cloth images should be similar to unanimity, if gray scale is inconsistent, the uneven illumination that then can think to import people's face is even, need carry out the rectification of gray scale, thus can be with the gray scale of the gray scale deemphasis positive input people face of standard faces.

Based on this consideration, the embodiment of the invention is handled respectively each picture element, considers wherein certain point (x, y), extract all picture elements in its neighborhood scope, the neighborhood length and width are W, we add up, and (x, y) average of W * W some gray scale in the neighborhood is made as A in the input sample _I(x, y), (x, y) average of W * W some gray scale in the neighborhood is made as A in the statistical standard sample again _S(x, y).A _I(x, the y) size of brightness in Fan Ying the current neighborhood, A _S(x, y) reflection is the intensity of standard faces local light photograph, if both differ greatly, then uneven illumination is even near the current point of expression input people's face, need correct the gray scale of this point, again A _S(x, y) and A _I(x, ratio y) can the approximate reverse illumination according to the ratio of intensity, therefore can be directly the gray-scale value of this point be multiply by this ratio, as correction result, i.e. (x, y) the afterwards new gray-scale value I of processing _r(x y) is:

I _r(x，y)＝I′(x，y)·A _S(x，y)/A _I(x，y)

The selection of W is relatively more crucial, and W can not be too big, otherwise the gray scale rectification does not have effect, and W can not be too little, otherwise facial image and standard faces after correcting are more approaching, and this paper is made as 15 with W, obtains optimum.As shown in Figure 7, for before the photo-irradiation treatment and photo-irradiation treatment after the result contrast synoptic diagram, wherein, 7A is the facial image after the overall intensity normalization; 7B is for carrying out the facial image after gray scale is corrected according to embodiments of the invention.

The described feature extraction step of step 304 among step 204 and Fig. 3 among Fig. 2 is unusual the key link in recognition of face.Feature commonly used has gray feature, edge feature, wavelet character, Gabor feature etc.Wherein Gabor shows outstanding time-frequency aggregation for facial image provides multiple dimensioned, multidirectional fine description, possesses the very strong portrayal details and the ability of partial structurtes.It has the character of bandpass filtering, can partly resist the influence that becomes illumination slowly, also can some high frequency noises of elimination.Simultaneously, the simple type cell is closely similar to the response of picture signal in the impulse response function of two-dimensional Gabor filter and the mammiferous visual cortex, possesses solid foundation in theory.Therefore, embodiments of the invention are selected the Gabor feature when realizing recognition of face.

The impulse response of two-dimensional Gabor filter is expressed as:

ψ_{j} (x) = \frac{| | k_{j} | |}{σ^{2}} \exp (- \frac{| | k_{j} | | \cdot | | x | |}{{2 σ}^{2}}) [\exp ({ik}_{j} x) - \exp (- \frac{σ^{2}}{2})]

σ=2 π wherein, we have considered 5 frequency v=0 ..., 4,8 direction μ=0 ..., 7, then have:

k_{v} = 2^{- \frac{v + 2}{2}} π,

Each point at facial image can all calculate 5 frequencies, and 8 directions, totally 40 dimension Gabor features, account form are that the impulse response with input picture and each frequency all directions carries out convolution, that is:

G _j(x)＝∫I _r(x′)ψ _j(x-x′)dx′

In order to improve the counting yield of Gabor feature, can adopt fft algorithm that this convolution process is quickened, earlier to I _r(x ') and ψ _j(x ') carries out the FFT conversion respectively, the result after the conversion multiplied each other carry out anti-FFT conversion again, just can obtain in the image have a few for the Gabor feature of certain certain direction of frequency.Total Gabor characteristic number is 5 * 8 * 44 * 48=84480, this data volume is very large, it is very difficult directly adopting sorting algorithm the feature of higher-dimension like this to be trained and discern, and therefore also needs to carry out selecting of feature, reduces the dimension of feature significantly.

The dimension of every width of cloth people face Gabor feature is up to 84480, and total number of training has more than 10,000, and what adopt when sorter is trained is SVM algorithm more than 1 pair, therefore can adopt based on the feature of AdaBoost and select algorithm, picking out the strongest thousands of dimensional features of classification capacity in conjunction with the mode classification more than 1 pair and positive and negative sample data from these features comes out as 2000 dimensional features, the feature of picking out is formed new low-dimensional Gabor proper vector, adopt support vector machine (SVM) algorithm of one-to-many that different user is trained after feature selecting finishes again.The faceform's of the calculated amount of training algorithm and storage data volume all reduces greatly like this.In verification process, algorithm only need calculate the Gabor feature of people's face, selects the result in conjunction with existing feature and picks out low dimensional feature, lower dimensional feature vector is discerned.

The feature extraction method of simply introducing the embodiment of the invention below and being adopted based on AdaBoost, as shown in Figure 8:

Step 801: given two class samples, sample number is L, and positive sample number is Lp, and anti-sample number is Ln.

Step 802: initialization, weight is set, positive sample is 1/2Lp, anti-sample is 1/2Ln.

At first, set weights for positive and negative image pattern collection, in a specific embodiment, can the shared weight of inverse video sample set be set to 1/2, the shared weight of all positive image pattern collection is set to 1/2.Certainly, in other embodiments, also can the shared weight of all inverse video sample sets be set to 2/5 fully, the shared weight of all inverse video sample sets is set to 3/5.That is to say, can be that positive and negative image pattern collection is set weight as required.Afterwards, set weight for each positive and negative image pattern, in a specific embodiment, the weight that can set each positive sample is the 1/Lp of positive sample set weight, and the weight of setting each anti-sample is the 1/Ln of anti-sample lump weight.Certainly, also important positive and negative image pattern can be set higher weight.

Step 803: set iteration round t=1,2 ..., T.

Step 804: consider the feature that all are never selected, utilize single features training Weak Classifier, obtain optimum threshold parameter according to the weights of training sample set, make the weighting error rate minimum of all samples, can obtain an error rate for each Weak Classifier and characteristic of correspondence thereof like this.

Adopt j Weak Classifier h _j(x) according to j feature G of preset threshold and each image pattern _j(x) go to judge that each sample image is positive sample or anti-sample, can count the weighting error rate of this Weak Classifier thus.

Each Weak Classifier is all only handled a corresponding feature, and it can be expressed as:

h_{j} (x) = \{\begin{matrix} 1, if & g_{j} (x) > low_θ_{j} and g_{j} (x) < high_θ_{j} \\ 0, & otherwise \end{matrix}

Wherein, low_ θ _jBe Weak Classifier h _j(x) low threshold value, high_ θ _jBe Weak Classifier h _j(x) if high threshold is j feature G of present image sample _j(x) numerical value is greater than low threshold value and when being lower than high threshold, described Weak Classifier h _j(x) be output as 1, its expression present image sample is judged as positive sample; Otherwise, described Weak Classifier h _j(x) be output as 0, its expression present image sample is judged as anti-sample.Wherein, Weak Classifier h _j(x) low threshold value and high threshold are the weight settings according to image pattern.

About the classification of Weak Classifier to image pattern, specifically be exactly, at first, j Weak Classifier h _j(x) according to j feature G of the 1st image pattern _j(x) judge that the 1st image pattern is positive sample or anti-sample, next, according to j feature G of the 2nd image pattern _j(x) judge that the 2nd image pattern is positive sample or anti-sample ..., up to, j Weak Classifier h _j(x) according to j feature G of L image pattern _j(x) judge that L image pattern is positive sample or anti-sample.

Step 805: count each Weak Classifier h _j(x) error rate is selected a predetermined number Weak Classifier of error rate minimum and its characteristic of correspondence is selected the result as the feature when front-wheel.

Each Weak Classifier h _j(x) to be that positive sample or anti-sample judge wherein the sample that misdeems to be arranged all to L image pattern, in other words, Weak Classifier h _j(x) positive sample anti-sample may be regarded as, also positive sample anti-sample may be regarded as.The weight of the image pattern of this Weak Classifier mistake of statistics is tried to achieve, just can obtain this Weak Classifier h _j(x) weighting error rate.Afterwards, a predetermined number Weak Classifier characteristic of correspondence of error rate minimum is selected the result as the feature when front-wheel.In one embodiment, described predetermined number is 1, also can be, 2 or 3 or the like, the operator can set this number according to actual conditions.

Step 806: the weight of the judicious image pattern of Weak Classifier that reduces to select, increase the weight of the selected wrongheaded image pattern of Weak Classifier, and the weight of the image pattern after upgrading carried out normalization, make the weight sum of all samples equal 1, return 103, enter the next round iteration,, pick out the feature of predetermined number until finishing the setting round.

Above selection method at be two class problems.For the multiclass problem, realization architecture design selection method that can the binding pattern sorting algorithm.If what pattern classification algorithm adopted is the framework of one-to-many, we select process with feature and are decomposed into a plurality of two class problems, in each two class problem wherein a class be certain class sample, another kind of then corresponding other samples.If what pattern recognition problem adopted is man-to-man framework, be about to the multiclass pattern recognition problem and be decomposed into a plurality of two classes problem one to one, the class in each two class problem is arbitrary class input sample, second class is another kind of input sample.When selecting, feature need consider the AdaBoost module flow process of a plurality of similar Fig. 8 like this, we realize each AdaBoost module flow process synchronously, being about to the error rate that the t wheel Weak Classifier of all AdaBoost modules returns adds up, the feature of total error rate minimum is returned, selected the result as this feature of taking turns.Each is taken turns feature and selects and upgrade weight according to the error rate of each current AdaBoost module again after finishing, and selects next stack features.

Among Fig. 2 among step 201 and Fig. 3 the described support vector machine of step 305 (SVM) be that Statistical Learning Theory develops a kind of mode identification method of.This algorithm is that the optimal classification face under the linear separability situation proposes.Consider two class linear separability situations shown in Fig. 9, establish sample set (x _i, y _i), i=1 ..., n, x ∈ R ^d, y ∈+1 ,-1}, wherein y _iBe the category label of pattern xi, H:wx+b=0 is the classification interface, and H1, H2 are respectively and are parallel to H and are that with the H distance two planes of 1/||w||, the distance between them are called class interval (margin).The basic thought of support vector machine wishes to find an optimum linearity classifying face exactly, makes the class interval big as far as possible, promptly || and w|| is as far as possible little, and classification error is few as far as possible on training set.The problem of finding the solution of optimal classification face is the quadratic function extreme-value problem under inequality constrain in fact, and its optimum solution is:

w = Σ_{i = 1}^{n} α_{i} y_{i} x_{i}

α wherein _iBe weight.To most sample α _iBe zero, the α that minority is non-vanishing _iCorresponding is exactly support vector, promptly is positioned at the sample on H1 and H2 two planes.The optimal classification function then is

f (x) = sgn [(w \cdot x) + b] = sgn [Σ_{i = 1}^{n} α_{i} y_{i} (x_{i} \cdot x) + b]

Sgn () is a sign function.F (x) is that 1 expression is identified as first kind sample, i.e. y=1, otherwise think and be identified as the second class sample.Change the click computing of proper vector in the following formula into inner product, and inner product satisfies the Mercer condition, just linear SVM can be expanded to the non-linear SVM of broad sense, that is:

f (x) = sgn [(w \cdot x) + b] = sgn [Σ_{i = 1}^{n} α_{i} y_{i} K (x_{i}, x) + b]

Adopt different inner product functions will cause different algorithm of support vector machine, as polynomial expression inner product, sigmoid function, radial kernel function (RBF) etc., compare with linear SVM, non-linear SVM expands to the optimal classification face nonlinear, can realize the classification of a lot of linear inseparable situations, so classification accuracy is improved also.We have adopted the SVM algorithm based on RBF, that is: when realizing recognition of face

K (x, x_{i}) = \exp {- \frac{| | x - x_{i} | |}{σ^{2}}}

SVM has one to one and two kinds of ways of realization of one-to-many when being used for the identification of multiclass people face.Man-to-man SVM is to be wantonly two class sample training svm classifier devices at algorithm, if N class sample is arranged like this, then needs to train the sorter of N * (N-1)/2.During identification sample is input in each svm classifier device successively, each judgement all will be eliminated a class sample.If two sample standard deviations of certain sorter correspondence are superseded certainly, then skip this sorter, that remaining after all judgements are finished classification is exactly a recognition result.The subject matter of sorter is that training the time has only been considered all kinds of training samples one to one, and a large amount of anti-sample datas have all slatterned, and this sorter can't realize anti-sample refuse know, therefore can't be applied to the face authentication algorithm.

The SVM algorithm of one-to-many only need be trained a sorter respectively for each classification, and positive sample is exactly such other training data when training at every turn, and anti-sample has then comprised other kinds data and all anti-sample datas.Because this method has been considered numerous anti-sample datas, the optimum interphase that has obtained after having trained can come current classification sample and other classification sample separation more exactly, therefore when realizing the automated validation of a plurality of people's faces, the SVM algorithm of one-to-many has extraordinary using value.

The verification process of one-to-many SVM is also fairly simple, the feature of importing after sample is selected is input in N the svm classifier device, if all sorters can refuse to know input feature vector, think that then all categories is all dissimilar in input people face and the training storehouse, the result that algorithm output is refused to know; Otherwise, if input feature vector has only passed through a sorter, and refused to know, then this sorter corresponding class result that is exactly recognition of face by other all sorters; Another kind of special circumstances are exactly that input feature vector has passed through a more than svm classifier device, algorithm thinks that it is similar to a plurality of classifications, from our experimental result, this situation is very rare, because we are all kinds of samples anti-sample of other classifications each other when sorter is trained, but when others face of inhomogeneity was more similar, this situation also can occur.This moment, we taked a kind of short-cut method to address this problem, because each one-to-many SVM algorithm all can be a judgement of each sample output numerical value:

J (x) = Σ_{i = 1}^{n} α_{i} y_{i} K (x_{i}, x) + b,

This numerical value has also reflected the degree of closeness of input sample with corresponding classification to a certain extent, reaches the gap size with corresponding anti-sample.This numerical value is big more, and then expression input sample is similar more to current classification, and is big more with other classification differences.Therefore we handle this special circumstances according to size of this judgement numerical value, are about to output and refuse to know that result's svm classifier device returns:

J (x) = Σ_{i = 1}^{n} α_{i} y_{i} K (x_{i}, x) + b

Sort, with the result of maximum number corresponding class as recognition of face.Although this is an approximate result, from actual result, the effect of this method is still very good.

Below in conjunction with one-to-many SVM, step 205,206 described calculation of similarity degree among Fig. 2 are described.

Face authentication algorithm based on one-to-many SVM can be imported a judgement of people's face output data J for each _n(x), n=0,1 ..., N-1 has reflected when the similarity degree of forefathers' face with corresponding classification.We are with the judgement data combination that each classification shape of face is counted frame, and L is counted in the new judgement that obtains a reflection multiframe identifying information _n(x), n=0,1 ..., N-1, we are with the identification similarity of this number as each classification.Be located at front k-1 frame we to try to achieve a similarity be L _n ^K-1(x), n=0,1 ..., N-1, the SVM of k frame return number and are J _n ^k(x), nn=0,1 ..., N-1, then the similarity calculating formula of k frame is:

L_{n}^{k} (x) = \min (\max (L_{n}^{k - 1} (x) + J_{n}^{k}, \min_data), \max_data)

The similarity that is people's face adds up, but cumulative number need limit minimum and maximum value.The J that works as certain classification of consecutive numbers frame like this _n(x) all greater than zero the time, then total similarity L _n(x) will increase gradually; Otherwise then reduce gradually.If L at k frame all categories ^k _n(x) all less than zero, then refuse to know this people's face; If one or more L is arranged ^k _n(x) greater than zero, then with the maximum L of numerical value ^k _n(x) corresponding class is as the result of face authentication.Shown in Figure 10 A, be frame input people face, Figure 10 B is the output result, each one L among the figure ^k _n(x) show that with histogram middle black line is the similarity decision threshold, the present invention directly is made as zero with this threshold value.

In sum, the present invention has made full use of the synthetic and face recognition technology of many videos, and just video and the background video that the seizure video equipment is captured carries out multi-faceted synthesizing, and formed final screen protection program.In this process, increased face recognition technology, can utilize people's face data of having gathered, provide people's face checking defencive function to screen protection program.

Claims

1. the screen protection method based on face authentication is characterized in that, comprises the steps:

2. the method for claim 1 is characterized in that, the entry condition in the described step (1) comprises: the unconditional startup, or when the people's face in the video image amasss the certain proportion that occupies whole two field picture area, start.

3. the method for claim 1 is characterized in that, in the described step (3), comprises that further the video image that video capture device is captured carries out code conversion.

4. method as claimed in claim 3 is characterized in that, in the described step (3), converts the video image that captures to the RGB24 form.

5. the method for claim 1 is characterized in that, described step (5) comprising:

(51), train the support vector machine faceform who obtains an one-to-many for each user that need authenticate by people's face sample image;

(52) gather the video image of video capture device input, the front face in search and the detected image, and it is continued to follow the trail of and verify, guarantee that people's face of following the trail of is a front face;

(53) demarcate organ characteristic's point in the front face automatically, and in view of the above detected people's face is carried out pre-service;

(54) calculating is picked out the strongest part Gabor feature of classification capacity and is formed low dimensional feature vector through the Gabor feature of pretreated facial image from higher-dimension Gabor feature;

(55) the low dimensional feature vector after will selecting is input among the described faceform, carries out recognition of face, returns the similarity data with each faceform;

(56) according to the described similarity data of returning, export final face authentication result.

6. method as claimed in claim 5 is characterized in that, described step (56) is the similarity numerical value in conjunction with the consecutive numbers two field picture, exports final face authentication result.

7. method as claimed in claim 5 is characterized in that, described step (51) comprising:

(1A) at each user that need authenticate, gather positive sample and anti-sample facial image;

(1B) all sample people faces are demarcated, determined the human face characteristic point position in people's face sample;

(1C) according to calibration result, all sample images are carried out pre-service;

(1D) calculating is picked out the strongest part Gabor feature of classification capacity and is formed low dimensional feature vector through the Gabor feature of pretreated sample image from higher-dimension Gabor feature;

(1E) utilize described low dimensional feature vector, adopt support vector machine that the different authentication user is trained, obtain the support vector machine faceform of an one-to-many for each user.

8. the screen protective device based on face authentication is characterized in that, comprising:

9. device as claimed in claim 8 is characterized in that, described condition enactment module, and the entry condition of setting comprises: the unconditional startup, or when the people's face in the video image amasss the certain proportion that occupies whole two field picture area, start.

10. device as claimed in claim 8 is characterized in that, further comprises: the code conversion module, the video image that is used for video capture device is captured carries out code conversion.

11. device as claimed in claim 10 is characterized in that, described code conversion module converts the video image that captures to the RGB24 form.