[ summary of the invention ]
This section is for the purpose of summarizing some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. In this section, as well as in the abstract and the title of the invention of this application, simplifications or omissions may be made to avoid obscuring the purpose of the section, the abstract and the title, and such simplifications or omissions are not intended to limit the scope of the invention.
The invention aims to provide a face authentication method that gives the face model self-maintenance and continued training capability by means of incremental learning.
The invention also aims to provide a face authentication device that achieves self-maintenance of the system by means of the same incremental learning method.
In order to achieve the object of the present invention, according to one aspect of the present invention, there is provided a face authentication method, the method including: extracting face features from a plurality of frame images; determining whether the face features in the plurality of frame images match a preset face model, and, when the number of images matching the preset face model exceeds a first threshold, deeming the authentication of the user corresponding to the face features in the plurality of frame images successful; when the number of images matching the preset face model exceeds a second threshold, extracting all face features in the plurality of frame images as sample features; and performing incremental training of the preset face model with the sample features, wherein the second threshold is not less than the first threshold.
Further, the method further comprises: collecting continuous images of the same user; performing face detection and tracking on the continuous images; and selecting, from the continuous images in which a face region is detected, a number of frame images whose face rotation angles do not exceed a predetermined error range.
Further, selecting, from the continuous images in which a face region is detected, the frame images whose face rotation angles do not exceed a predetermined error range includes:
extracting feature points of the eyes and the mouth from the face region of the continuous images;
calculating the face rotation angle θ from the feature points according to the formula
\theta = \arctan\left[\frac{(b-a)\sin\alpha}{(b+a)(1-\cos\alpha)}\right]
wherein a is the horizontal distance from the right-eye feature point to the mouth feature point, b is the horizontal distance from the left-eye feature point to the mouth feature point, and α is a fixed value between 20° and 30°; and
selecting the frame images whose face rotation angle θ does not exceed the predetermined error range.
Further, extracting all the face features in the plurality of frame images as sample features comprises: extracting the face features that match the preset face model from the plurality of frame images as positive samples, and extracting the face features that do not match the preset face model from the plurality of frame images as negative samples.
Further, a weak classifier library is constructed with the positive samples and the negative samples and incremental training is performed as follows:
initializing all selectors, each selector comprising a selected weak classifier and a weak classifier weight;
initializing, for every selector, the correctly classified sample weight sum λ^c_{n,m} and the wrongly classified sample weight sum λ^w_{n,m} of each weak classifier;
for the current sample, the sample label is l: if l = 1 the sample is a positive sample, and if l = −1 it is a negative sample; the sample weight w is set to 1;
updating the M weak classifiers constructed from the online weak features;
for the N selectors, updating the weak classifier serial number j and the weak classifier weight α_n of each selector.
Further, the weak features are the front U-dimensional Gabor features extracted from the positive samples and the negative samples, and for the front U-dimensional Gabor features of different scales and different positions a weak classifier is constructed in a nearest-neighbor manner: the weak classifier gives a positive decision when D(f_j(x), f_j^+) < D(f_j(x), f_j^-) and a negative decision otherwise, wherein f_j^+ is the jth feature center of the positive samples, f_j^- is the jth feature center of the negative samples, and f_j(x) is the current feature.
Further, the step of updating the M weak classifiers constructed from the online weak features is to update the feature centers f_j^+ and f_j^- (the means of the weak features) online in a Kalman filtering manner.
Further, for the N selectors, updating the weak classifier serial number j and the weak classifier weight α_n of each selector comprises:
obtaining the authentication result flags Hyp(m) of the M weak classifiers for the sample, the flag being 1 for a correct authentication and 0 otherwise;
setting a usage flag bUsed_m for each weak classifier, indicating whether the weak classifier has already been selected by some selector: 1 if it has been used and 0 if not;
for all N selectors, performing the following update process:
for all M weak classifiers, according to the authentication result on the sample, if Hyp(m) is 1 the correctly classified sample weight sum λ^c_{n,m} is updated, otherwise the wrongly classified sample weight sum λ^w_{n,m} is updated;
if the current weak classifier has already been used (bUsed_m = 1) it is skipped; the authentication error rate e_{n,m} is calculated for all unused weak classifiers and the weak classifier with the smallest error rate e_n is selected for the current selector, i.e. j = argmin_m(e_{n,m}), while the weak classifier weight α_n is computed; the sample weight w is then updated;
and the T weak classifiers with the worst authentication performance are replaced.
Further, the authentication error rate e_{n,m} satisfies:
e_{n,m} = \frac{\lambda^{w}_{n,m}}{\lambda^{w}_{n,m}+\lambda^{c}_{n,m}}
The weak classifier weight α_n is computed from the error rate e_n of the selected weak classifier, and the sample weight w is updated in one way if Hyp(j) is 1 and in another way otherwise.
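A common choice for these two updates, taken from standard online AdaBoost and given here only as an illustrative assumption rather than as the invention's own definition, is:
\alpha_n = \frac{1}{2}\ln\frac{1-e_n}{e_n}, \qquad e_n = e_{n,j},\; j = \arg\min_m e_{n,m}
w \leftarrow \frac{w}{2(1-e_n)} \;\text{ if } \mathrm{Hyp}(j)=1, \qquad w \leftarrow \frac{w}{2\,e_n} \;\text{ otherwise.}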
According to another aspect of the present invention, there is provided a face authentication system, the system comprising: a feature extraction module for extracting face features from a plurality of frame images; a face authentication module for determining whether the face features in the plurality of frame images match a preset face model and, when the number of images matching the preset face model exceeds a first threshold, deeming the authentication of the user corresponding to the face features in the plurality of frame images successful; a feature adding module for extracting all face features in the plurality of frame images as sample features when the number of images matching the preset face model exceeds a second threshold; and an incremental learning module for performing incremental training of the preset face model with the sample features, wherein the second threshold is not less than the first threshold.
Further, the system further comprises: an image acquisition module for collecting continuous images of the same user; a face tracking and positioning module for performing face detection, tracking and positioning on the continuous images; and an image selection module for selecting, from the continuous images in which a face region is detected, a number of frame images whose face rotation angles do not exceed a predetermined error range.
Further, the image selection module comprises a feature point extraction unit, a rotation angle calculation unit and an image selection unit, wherein the feature point extraction unit extracts feature points of the eyes and the mouth from the face region of the continuous images; the rotation angle calculation unit calculates the face rotation angle θ from the feature points according to the formula
\theta = \arctan\left[\frac{(b-a)\sin\alpha}{(b+a)(1-\cos\alpha)}\right]
wherein a is the horizontal distance between the right-eye feature point and the mouth feature point, b is the horizontal distance between the left-eye feature point and the mouth feature point, and α is a fixed value between 20° and 30°; and the image selection unit selects the frame images whose face rotation angle θ does not exceed the predetermined error range.
Compared with the prior art, the invention first obtains the face model from initial training images and then, while the face model is being used for user authentication, uses the images whose authentication results have a high confidence as samples for incremental learning and training of the face model.
[ detailed description of the embodiments ]
The detailed description of the invention is largely presented in terms of procedures, steps, logic blocks, processes, or other symbolic representations that directly or indirectly resemble the operation of the invention. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention; the invention may, however, be practiced without these specific details. Those skilled in the art use the descriptions and representations herein to convey the substance of their work effectively to others skilled in the art. In other instances, well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the present invention.
Reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic may be included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, the order of blocks in a method, flowchart, or functional block diagram representing one or more embodiments is not necessarily fixed to refer to any particular order, nor is it intended to be limiting.
The face authentication method and apparatus of the present invention can be implemented as a module, a system or a part of a system by software, hardware or a combination of the two. The face authentication method and apparatus continuously select samples during the user's normal use to keep training the face model, and thereby achieve good self-maintenance and authentication accuracy.
Referring to fig. 1, a flow chart of a method 100 of face authentication in an embodiment of the invention is shown. The face authentication method 100 includes:
Step 101, collecting continuous images of the same user;
in this step, successive images of the same user are typically acquired with a camera, such as with a high-definition camera with a resolution of 1280 × 960 at a rate of 30 frames per second.
Step 102, performing face detection and tracking on the continuous images;
The face detection and tracking of the continuous images acquired in step 101 are performed by the method described in the present inventor's Chinese patent application No. 200510135668.8, "A method and system for real-time detection and continuous tracking of a face in a video sequence".
Step 103, extracting face features from a plurality of frame images;
In this embodiment, the face region may be segmented according to a standard face model and the feature point positions in the current face region, where the feature points are two or more of the eye feature points, mouth feature points, nose feature points, and chin feature points of the face region. Many mature techniques exist in the prior art for extracting the feature points; for example, the method described in the inventor's Chinese patent application No. 200710177541.1, "A method and apparatus for locating feature points of an image", can be used. Once the feature points are obtained, the face region in the image can be obtained by using the "three-stop five-eye" criterion of facial organ distribution. The corresponding Gabor features in the face region may then be extracted; to increase the speed of the authentication process, the AdaBoost algorithm may be adopted to select Gabor features of different scales and different directions, and the front M-dimensional Gabor features most effective for authentication are then selected from them.
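Purely as an illustration of this feature extraction step, the following Python sketch computes multi-scale, multi-orientation Gabor responses over a cropped face region and keeps a previously selected subset of dimensions. The filter parameters, the use of mean response magnitudes as the per-filter feature, and the selected_dims indices are assumptions made for the sketch, not the exact pipeline of the invention (which selects dimensions by AdaBoost).

```python
import cv2
import numpy as np

def gabor_features(face_region, scales=(7, 11, 15), orientations=8):
    """Concatenate the mean response magnitudes of a small Gabor filter bank."""
    feats = []
    for ksize in scales:
        for k in range(orientations):
            theta = k * np.pi / orientations
            # (ksize, sigma, theta, lambda, gamma) -- illustrative parameter choices
            kernel = cv2.getGaborKernel((ksize, ksize), ksize / 3.0, theta,
                                        ksize / 2.0, 0.5)
            response = cv2.filter2D(face_region.astype(np.float32), cv2.CV_32F, kernel)
            feats.append(np.abs(response).mean())
    return np.array(feats)

# selected_dims would come from an offline AdaBoost feature-selection pass;
# here it is a placeholder for the indices of the M most discriminative dimensions.
selected_dims = np.arange(16)
face = np.random.randint(0, 255, (64, 64), dtype=np.uint8)  # stand-in for a cropped face
feature_vector = gabor_features(face)[selected_dims]
```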
Step 104, determining whether the face features in the plurality of frame images match the preset face model, and, when the number of images matching the preset face model exceeds a first threshold, deeming the authentication of the user corresponding to those face features successful;
Whether the front M-dimensional Gabor features match the trained preset face model is determined. Because several frames of images of the user have been acquired, the final result is output according to the authentication of the plurality of frames, i.e. the per-frame authentication results vote for the final output. Suppose the face features in N frames of face images are authenticated, with output results O_n, n = 1, 2, ..., N. When O_n = 1, the frame matches the preset face model; when O_n = 0, the frame does not match the preset face model. It is then determined whether the number of matching images among the N frames reaches the first threshold; if so, the user is deemed to have passed the authentication, and if not, the user is deemed not to have passed.
Step 105, when the number of images matching the preset face model exceeds a second threshold, extracting all face features in the plurality of frame images as sample features;
After the user passes the authentication, it is further determined whether the number of images matching the preset face model exceeds the second threshold. If so, the face features in the N frames of images are taken as sample features, the face features with O_n = 1 being positive samples and the face features with O_n = 0 being negative samples; if not, the face features in the N frames of images are not used as sample features. Since it is desirable that only face features authenticated with high confidence be used as sample features, the second threshold is usually larger than the first threshold, although it may also be equal to it.
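The decision logic of steps 104 and 105 can be summarised in the following sketch; the function and variable names are illustrative, and treating "exceeds" as a strict comparison is an assumption.

```python
def vote_and_collect(frame_features, frame_results, first_threshold, second_threshold):
    """frame_results holds the per-frame outputs O_n (1 = matches the preset face model).

    Returns (authenticated, samples), where samples is a list of
    (feature, label) pairs harvested only from confident authentications.
    """
    matches = sum(frame_results)
    authenticated = matches > first_threshold
    samples = []
    if authenticated and matches > second_threshold:  # second_threshold >= first_threshold
        # O_n = 1 frames become positive samples, O_n = 0 frames negative samples
        samples = [(feat, 1 if o == 1 else -1)
                   for feat, o in zip(frame_features, frame_results)]
    return authenticated, samples
```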
Step 106, performing incremental training on the preset face model by using the sample features.
In this step, an Adaptive Boosting (AdaBoost) algorithm is mainly used to train the classifier, i.e. the face model; for example, an improvement of the on-line boosting method proposed in the paper "Real-Time Tracking via On-line Boosting", Grabner Helmut, Grabner Michael, Bischof Horst, Proceedings of the British Machine Vision Conference (BMVC'06), vol. 1, pages 47-56, 2006, may be used. Specifically, the incremental learning training method using the AdaBoost algorithm provided by the invention is as follows:
First, the front U-dimensional Gabor features (U < M) of the positive samples and the negative samples from step 105 are extracted as weak features, and an online weak classifier library is constructed.
Next, a strong classifier comprising a plurality of selectors that share the weak classifier library is trained as follows:
(1) initializing all selectors, each selector comprising a selected weak classifier and a weak classifier weight;
(2) initializing, for every selector, the correctly classified sample weight sum λ^c_{n,m} and the wrongly classified sample weight sum λ^w_{n,m} of each weak classifier;
(3) for the current sample, the sample label is l: if l = 1 the sample is a positive sample, and if l = −1 it is a negative sample; setting the sample weight w to 1;
(4) updating the M weak classifiers constructed from the online weak features, the weak classifier update algorithm being detailed below;
(5) for the N selectors, updating the weak classifier serial number j and the weak classifier weight α_n of each selector; the specific steps are as follows:
obtaining the authentication result flags Hyp(m) of the M weak classifiers for the sample, the flag being 1 for a correct authentication and 0 otherwise;
setting a usage flag bUsed_m for each weak classifier, indicating whether the weak classifier has already been selected by some selector: 1 if it has been used and 0 if not;
for all N selectors, performing the following update process:
for all M weak classifiers, according to the authentication result on the sample, if Hyp(m) is 1, updating the correctly classified sample weight sum λ^c_{n,m}, and otherwise updating the wrongly classified sample weight sum λ^w_{n,m};
if the current weak classifier has already been used (bUsed_m = 1), skipping it, and performing the following on all unused weak classifiers:
calculating the authentication error rate e_{n,m}, selecting the weak classifier with the smallest error rate e_n for the current selector, i.e. taking j = argmin_m(e_{n,m}), and at the same time computing the weak classifier weight α_n;
updating the sample weight w, with one update if Hyp(j) is 1 and another otherwise;
replacing the T weak classifiers with the worst authentication performance.
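A minimal Python sketch of this selector update, in the spirit of Grabner-style online boosting, is given below. The weak-classifier interface, the AdaBoost-style formulas for α_n and for re-weighting the sample, the initialisation constants and the per-sample reset of the bUsed flags are assumptions where the text above does not spell them out.

```python
import math

class Selector:
    def __init__(self, num_weak):
        self.lam_correct = [1e-3] * num_weak   # lambda^c_{n,m}, small init to avoid /0
        self.lam_wrong = [1e-3] * num_weak     # lambda^w_{n,m}
        self.j = 0                             # serial number of the chosen weak classifier
        self.alpha = 0.0                       # weak classifier weight alpha_n

def update_strong_classifier(selectors, weak_classifiers, x, label):
    """One online update with a single labelled sample (label in {+1, -1})."""
    # (4) update every weak classifier with the new sample
    for wc in weak_classifiers:
        wc.update(x, label)
    hyp = [1 if wc.classify(x) == label else 0 for wc in weak_classifiers]  # Hyp(m)

    w = 1.0                                    # importance weight of this sample
    used = [False] * len(weak_classifiers)     # bUsed_m flags (reset per sample)
    for sel in selectors:
        # accumulate correct / wrong sample-weight sums
        for m, correct in enumerate(hyp):
            if correct:
                sel.lam_correct[m] += w
            else:
                sel.lam_wrong[m] += w
        # error rates of the unused weak classifiers only
        errors = {m: sel.lam_wrong[m] / (sel.lam_wrong[m] + sel.lam_correct[m])
                  for m in range(len(weak_classifiers)) if not used[m]}
        sel.j = min(errors, key=errors.get)    # j = argmin_m e_{n,m}
        e_n = min(max(errors[sel.j], 1e-4), 1.0 - 1e-4)
        sel.alpha = 0.5 * math.log((1.0 - e_n) / e_n)   # assumed AdaBoost-style weight
        used[sel.j] = True
        # re-weight the sample for the next selector (assumed AdaBoost-style update)
        w *= 1.0 / (2.0 * (1.0 - e_n)) if hyp[sel.j] else 1.0 / (2.0 * e_n)
    # the replacement of the T worst weak classifiers is not shown in this sketch

def strong_classify(selectors, weak_classifiers, x):
    score = sum(sel.alpha * weak_classifiers[sel.j].classify(x) for sel in selectors)
    return 1 if score >= 0 else -1
```

The sketch assumes that the number of selectors N does not exceed the number of weak classifiers M, so that an unused weak classifier is always available to each selector.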
Second, the weak classifier construction and update algorithm may be as follows: for the front U-dimensional Gabor features of different scales and different positions, a weak classifier is constructed in a nearest-neighbor manner; the weak classifier gives a positive decision when D(f_j(x), f_j^+) < D(f_j(x), f_j^-) and a negative decision otherwise, where f_j^+ is the jth feature center of the positive samples, f_j^- is the jth feature center of the negative samples, and f_j(x) is the current feature. A feasible online update algorithm for the weak classifier is to update the feature centers (means) f_j^+ and f_j^- of the weak features online in a Kalman filtering manner, thereby realizing online updating of the weak classifiers. D(f_1, f_2) denotes the absolute value of the difference between features f_1 and f_2.
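The nearest-neighbor weak classifier and its online update can be sketched as follows; replacing the full Kalman filter by a running-mean update (a constant-state Kalman filter with gain 1/n) and the zero initial centers are assumptions made to keep the sketch short.

```python
class NNWeakClassifier:
    """Nearest-neighbor weak classifier over a single Gabor feature dimension j."""
    def __init__(self, dim_index):
        self.j = dim_index      # which feature dimension f_j this classifier looks at
        self.mu_pos = 0.0       # jth feature center of the positive samples
        self.mu_neg = 0.0       # jth feature center of the negative samples
        self.n_pos = 0
        self.n_neg = 0

    def update(self, x, label):
        f = x[self.j]
        if label == 1:
            self.n_pos += 1
            gain = 1.0 / self.n_pos            # Kalman-like gain for a constant state
            self.mu_pos += gain * (f - self.mu_pos)
        else:
            self.n_neg += 1
            gain = 1.0 / self.n_neg
            self.mu_neg += gain * (f - self.mu_neg)

    def classify(self, x):
        f = x[self.j]
        # D(f1, f2) is the absolute difference between the two feature values
        return 1 if abs(f - self.mu_pos) < abs(f - self.mu_neg) else -1
```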
In addition, in order that the resulting preset face model can itself be incrementally trained, the preset face model also needs to be obtained with the above incremental learning training method. That is, the face images collected at the user's initial enrollment are used as positive samples and the face images of other users and of illegal users are used as negative samples; the samples are sent one by one to the incremental learning module in an interleaved order of positive and negative samples, and the preset face authentication model is obtained by this training.
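This enrollment-time training order can be sketched as follows, reusing the illustrative Selector, NNWeakClassifier and update_strong_classifier from the sketches above; the strict alternation of one positive and one negative sample is an assumption.

```python
def enroll(positive_feats, negative_feats, selectors, weak_classifiers):
    """Feed enrollment samples to the incremental learner in an interleaved order."""
    for pos, neg in zip(positive_feats, negative_feats):
        update_strong_classifier(selectors, weak_classifiers, pos, +1)
        update_strong_classifier(selectors, weak_classifiers, neg, -1)
```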
In a preferred embodiment, between step 102 and step 103, a number of frame images whose face rotation angles do not exceed a predetermined error range may also be selected, from the continuous images in which the face region has been detected, for feature extraction. That is, not all of the continuous images from step 102 are processed in step 103; instead, only those frames whose face pose meets a predetermined condition are selected for processing in step 103. The reason is that, during image acquisition, the user may inadvertently or unconsciously rotate the head, so that the faces in the acquired continuous images are not all facing the acquisition plane of the camera; the face regions in the continuous images are therefore not all "ideal" face regions, and in this case the frames whose face rotation angle does not exceed the predetermined error range can be selected to enter the processing of step 103. The face rotation angle is calculated as follows:
As shown in fig. 2, assume that the human head is a cylinder and that the left and right eyes and the mouth lie on the surface of the same cylinder of radius r. According to the "three-stop five-eye" rule of facial organ distribution — the width of the face is divided into five equal parts, the distance between the two eyes being one eye width and the distance from the vertical line of each outer canthus to the vertical line of the outer ear hole being one eye width — the radial included angle α between an eye feature point and the mouth feature point can be estimated.
From the labelled positions of the left-eye, right-eye and mouth feature points, r and θ can then be obtained; they satisfy the following relations:
r\sin(\alpha+\theta) - r\sin\theta = a
r\sin(\alpha-\theta) + r\sin\theta = b
Solving these two equations gives:
\theta = \arctan\left[\frac{(b-a)\sin\alpha}{(b+a)(1-\cos\alpha)}\right]
r = \frac{1}{2}\sqrt{\frac{(a+b)^{2}}{\sin^{2}\alpha}+\frac{(a-b)^{2}}{(1-\cos\alpha)^{2}}}
thereby estimating the face rotation angle θ and the head radius r.
If the face rotation angle θ does not fall within the predetermined error range θ_min ≤ θ ≤ θ_max, for example (−60°, 60°), the face rotation in that frame is considered too large and the frame is not selected for the processing of step 103.
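The pose estimate and the frame filter of this preferred embodiment can be sketched as follows; the default α of 25° and the (−60°, 60°) range are taken from the examples above, and the function names are illustrative.

```python
import math

def head_pose(a, b, alpha_deg=25.0):
    """Estimate the face rotation angle theta (degrees) and head radius r from the
    horizontal distances a (right eye to mouth) and b (left eye to mouth)."""
    alpha = math.radians(alpha_deg)
    theta = math.atan(((b - a) * math.sin(alpha)) /
                      ((b + a) * (1.0 - math.cos(alpha))))
    r = 0.5 * math.sqrt((a + b) ** 2 / math.sin(alpha) ** 2 +
                        (a - b) ** 2 / (1.0 - math.cos(alpha)) ** 2)
    return math.degrees(theta), r

def frame_ok(a, b, theta_min=-60.0, theta_max=60.0):
    """Keep a frame only if its estimated rotation angle lies in the allowed range."""
    theta, _ = head_pose(a, b)
    return theta_min <= theta <= theta_max
```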
In summary, the training and authentication of the face model in the face authentication method of the present invention proceed incrementally, and through continuous training and updating the face model attains better authentication accuracy and self-maintenance capability. Meanwhile, selecting for feature extraction only the frames whose face rotation angle does not exceed the predetermined error range further improves the authentication accuracy.
Referring to fig. 3, a block diagram of a face authentication apparatus 300 according to an embodiment of the present invention is shown. The face authentication apparatus 300 includes: an image acquisition module 301, a face tracking and positioning module 302, an image selection module 303, a feature extraction module 304, a face authentication module 305, a feature addition module 306, an incremental learning module 307, and a face model library 308.
The image capturing module 301 may be a camera for capturing continuous images of a user, for example a high-definition camera with a resolution of 1280 × 960 acquiring images of the user at a rate of 30 frames per second.
The face tracking and positioning module 302 detects a face region from the continuous images, and performs tracking and positioning after the face region is detected, wherein the positioning can be performed through eye feature points and mouth feature points in the face region.
The image selection module 303 calculates the face rotation angle from the eye feature points and the mouth feature points in the face region, and selects the frames of the continuous images whose face rotation angle falls within a predetermined error range. In the embodiment shown in fig. 4, the image selection module comprises a feature point extraction unit 402, a rotation angle calculation unit 404 and an image selection unit 406, wherein the feature point extraction unit 402 extracts feature points of the eyes and the mouth from the face region of the continuous images; the rotation angle calculation unit 404 calculates the face rotation angle θ from the feature points according to the formula:
\theta = \arctan\left[\frac{(b-a)\sin\alpha}{(b+a)(1-\cos\alpha)}\right]
wherein a is the horizontal distance between the right-eye feature point and the mouth feature point, b is the horizontal distance between the left-eye feature point and the mouth feature point, and α is a fixed value between 20° and 30°; the image selection unit 406 selects the frame images whose face rotation angle θ does not exceed the predetermined error range.
The feature extraction module 304 extracts face features from the selected frame images. Specifically, the feature extraction module 304 may select Gabor features of different scales and different directions using the AdaBoost algorithm, and then select the front M-dimensional Gabor features most effective for authentication.
The face authentication module 305 determines, using the face model library 308, whether the face features in the frame images match a preset face model; when the number of images matching the preset face model exceeds the first threshold, the authentication of the user corresponding to those face features is deemed successful.
The feature adding module 306 determines whether the number of images matching the preset face model exceeds the second threshold and, if so, extracts all the face features in the plurality of frame images as sample features.
The incremental learning module 307 performs incremental training of the preset face model with the sample features; the specific incremental learning training method may be the one described above. The second threshold should be not less than the first threshold, so that only images with a high authentication confidence are used as samples.
The foregoing description has fully disclosed preferred embodiments of the present invention. It should be noted that those skilled in the art can make modifications to the embodiments of the present invention without departing from the scope of the appended claims. Accordingly, the scope of the claims of the present invention should not be limited to the particular embodiments described.