WO2018218839A1 - Living body recognition method and system - Google Patents

Living body recognition method and system

Info

Publication number
WO2018218839A1
Authority
WO
WIPO (PCT)
Prior art keywords
living body
motion
face
movement
score
Prior art date
Application number
PCT/CN2017/104612
Other languages
French (fr)
Chinese (zh)
Inventor
陈全 (Chen Quan)
Original Assignee
广州视源电子科技股份有限公司 (Guangzhou Shiyuan Electronics Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州视源电子科技股份有限公司 (Guangzhou Shiyuan Electronics Co., Ltd.)
Publication of WO2018218839A1 publication Critical patent/WO2018218839A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40: Spoof detection, e.g. liveness detection
    • G06V40/45: Detection of the body part being alive
    • G06V20/00: Scenes; scene-specific elements
    • G06V20/60: Type of objects
    • G06V20/64: Three-dimensional objects
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; face representation
    • G06V40/172: Classification, e.g. identification

Definitions

  • The present invention relates to the field of face recognition, and in particular to a living body recognition method and system.
  • Living body detection verifies that the person currently undergoing face recognition is a live face rather than a face in a photo or video, thereby ensuring the security of the face recognition system.
  • In one prior-art solution, an infrared camera is used to obtain the face temperature and thereby perform living body detection.
  • The drawback of this type of solution is that it has higher hardware requirements.
  • An object of the embodiments of the present invention is to provide a living body identification method and system, which have low hardware requirements and high security.
  • an embodiment of the present invention provides a living body identification method, and the living body identification method includes the following steps:
  • the face to be tested whose living body recognition score is not less than a preset threshold is determined to be a living body.
  • The living body identification method disclosed in the embodiments of the present invention obtains a motion score for at least two parts of the face to be tested, weights each part motion score, and sums the weighted scores as a living body recognition score, which is then used to determine whether the face to be tested is a living body. Detecting the motion of at least two parts solves the prior-art problems of a single algorithm and low security, and is highly extensible; detection based on face part motion can be realized with two-dimensional images alone, so the hardware requirements are low.
  • Because the scores of different parts are weighted before score fusion, the accuracy of living body recognition is high; the method therefore offers a high recognition rate, low hardware requirements, and high security.
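As a minimal sketch of the weighted score fusion described above (the part names, weights, and scores below are illustrative assumptions, not values specified by the patent):

```python
# Hypothetical weights; the patent only requires that per-part motion
# scores be weighted and summed into one living body recognition score.
WEIGHTS = {"mouth": 0.5, "eye": 0.3, "head": 0.2}  # mouth > eye > head

def liveness_score(part_scores, weights=WEIGHTS):
    """Weighted sum of per-part motion scores (1 = motion, 0 = none)."""
    return sum(weights[part] * score for part, score in part_scores.items())

score = liveness_score({"mouth": 1, "eye": 1, "head": 0})
print(score)  # 0.8
```

The face is then judged live when this fused score reaches a preset threshold, as the surrounding text describes.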
  • The at least two part motions include at least two of eye movement, mouth movement, head movement, eyebrow movement, forehead movement, and facial movement.
  • The part motions to be detected may be any several of multiple parts of the face, so the liveness check is widely selectable, largely resists malicious attacks, and greatly increases security.
  • Detecting the movement of at least two parts of the face to be tested includes the following steps:
  • The motion of a part is determined from the degree of change in the positions of its key points across the extracted video frames.
  • Because part motion is determined by detecting, in each extracted video frame, the degree of change in the positions of the key points corresponding to that part, the detection can be implemented with two-dimensional images alone; the algorithm is simple, the device requirements are low, and the recognition efficiency is high.
  • The weight corresponding to each part motion is set according to the conspicuousness of each part motion; alternatively, it is set according to the detection accuracy of each part motion in the current application scenario.
  • Determining that the living body recognition score is not less than a preset threshold comprises the steps of:
  • calculating a living body recognition confidence of the face to be tested; when the living body recognition confidence is not less than a preset value, determining that the living body recognition score is not less than the preset threshold.
  • The living body recognition score can thus be normalized into a living body confidence for the liveness judgment, and the confidence can also be used for liveness grading, so the recognition result is richer than in the prior art.
  • the embodiment of the present invention further provides a living body identification system for identifying whether the face to be tested is a living body, and the living body identification system includes:
  • each of the part motion detecting units is configured to detect a corresponding part motion of the face to be tested and obtain a corresponding motion score;
  • a living body recognition score calculation unit configured to calculate a weighted sum of motion scores corresponding to each of the part motions, and use the calculated sum as a living body recognition score; wherein the living body recognition score calculation unit The weight corresponding to each of the part movements has been preset;
  • the living body judging unit is configured to determine that the human face to be tested whose living body recognition score is not less than a preset threshold is a living body.
  • The living body identification system disclosed in the embodiments of the present invention acquires the motion scores of at least two parts of the face to be tested through at least two part motion detecting units;
  • the living body recognition score calculation unit weights and sums the part motion scores as a living body recognition score;
  • and the living body judging unit uses the living body recognition score as the criterion for determining whether the face to be tested is a living body.
  • The at least two part motion detecting units include detecting units for at least two of eye movement, mouth movement, head movement, eyebrow movement, forehead movement, and facial movement.
  • each of the part motion detecting units includes:
  • a part detecting module configured to detect a key point position of the part corresponding to the part motion for each video frame extracted by the face video of the face to be tested;
  • the part motion condition obtaining module is configured to determine a motion of the part by using a degree of change of a position of a key point of each of the extracted video frames, and obtain a corresponding motion score according to the motion of the part.
  • the weight corresponding to each part of the motion in the living body recognition score calculation unit is set according to the visibility of each part of the motion; or, the living body recognition score calculation unit is The weight corresponding to each part of the motion is set according to the accuracy of each part of the motion in the current application scenario.
  • the living body determining unit includes:
  • a living body recognition confidence calculation module configured to calculate the living body recognition confidence of the face to be tested as the ratio of the living body recognition score to the total living body recognition score;
  • a living body judging module configured to determine, when the living body recognition confidence is not less than a preset value, that the living body recognition score is not less than the preset threshold, and that the face to be tested whose score is not less than the preset threshold is a living body.
  • FIG. 1 is a schematic flowchart of Embodiment 1 of the living body identification method provided by the present invention;
  • FIG. 2 is a schematic flowchart of step S1 of Embodiment 1 of the living body identification method provided by the present invention;
  • FIG. 3 is a schematic diagram of the 68-point model of the face to be tested;
  • FIG. 4 is a schematic flowchart of step S4 of Embodiment 1 of the living body identification method provided by the present invention;
  • FIG. 5 is a schematic structural diagram of an embodiment of the living body recognition system provided by the present invention.
  • FIG. 1 is a schematic flowchart of Embodiment 1 of a living body identification method according to the present invention, including the steps:
  • the face to be tested whose living body recognition score is not less than a preset threshold is determined to be a living body.
  • Detecting at least two part motions of the face to be tested in step S1 of this embodiment comprises detecting eye movement, mouth movement, and head movement; in general, the eye, mouth, and head movements of a face are visually obvious, which facilitates detection, and the computation is simple and efficient.
  • FIG. 2 is a schematic flowchart of step S1 of the first embodiment, where step S1 includes:
  • for each video frame extracted from the face video of the face to be tested at a preset frame interval, the key point positions of the part corresponding to each part motion are detected;
  • FIG. 3 shows the 68-point model of the face to be tested. Specifically, on continuous or skipped frames of the face video, the dlib library performs face detection and facial key point detection of the face to be tested; the dlib library is a cross-platform general-purpose library written in C++. Sixty-eight key points are obtained for each video frame, and the positions of the key points corresponding to the desired part motion can be taken from these 68 key points.
  • In step S3 of the first embodiment, a preferred way to set the weight corresponding to each part motion is according to the conspicuousness of each part motion.
  • Under the general strategy, mouth movement is relatively obvious, so its weight is the largest, while head motion is the easiest to simulate and its detection accuracy is the lowest, so its weight is the smallest.
  • The weighting strategy for part motion in the first embodiment is therefore: mouth movement > eye movement > head movement.
  • Another preferred way to set the weight corresponding to each part motion in step S3 is to adjust the part motion weights automatically according to the application scenario. In a specific scenario, normal input videos of each part motion of faces to be tested are collected as positive samples and attack videos as negative samples, and (number of positive samples passed + number of negative samples rejected) / (total positive samples + total negative samples) is taken as the accuracy of that part motion.
  • The accuracies of the part motions are then sorted in descending order, and the weights of the part motions are assigned in the same order from large to small, thereby readjusting the weight of each part motion.
  • The readjusted weights are used to calculate the living body recognition score, so the recognition result adapts to the accuracy of part motion detection in different scenarios, increasing the accuracy of the living body recognition result of this embodiment.
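The accuracy-driven reweighting above can be sketched as follows; the sample counts and the weight pool are made-up values for illustration, not numbers from the patent:

```python
def part_accuracy(pos_pass, neg_reject, pos_total, neg_total):
    # (positive samples passed + negative samples rejected) / all samples
    return (pos_pass + neg_reject) / (pos_total + neg_total)

def readjust_weights(accuracies, weight_pool):
    # The most accurate part motion gets the largest weight, and so on down.
    ranked = sorted(accuracies, key=accuracies.get, reverse=True)
    return dict(zip(ranked, sorted(weight_pool, reverse=True)))

acc = {
    "mouth": part_accuracy(95, 92, 100, 100),  # 0.935
    "eye":   part_accuracy(90, 88, 100, 100),  # 0.89
    "head":  part_accuracy(82, 78, 100, 100),  # 0.80
}
weights = readjust_weights(acc, [0.5, 0.3, 0.2])
print(weights)  # {'mouth': 0.5, 'eye': 0.3, 'head': 0.2}
```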
  • FIG. 4 is a schematic flowchart of step S4, including steps:
  • the face to be tested whose living body recognition score is not less than a preset threshold is determined to be a living body.
  • The total living body recognition score is the maximum value obtainable after the face to be tested is identified in this embodiment, and the living body recognition confidence of the face to be tested is calculated by the formula f = s / s_max, where:
  • s represents the living body recognition score and s_max represents the total living body recognition score;
  • f represents the living body recognition confidence.
  • When f ≥ e, that is, when the living body recognition confidence is not less than the preset value e, it is determined that the living body recognition score is not less than the preset threshold and the face to be tested is a living body; when f < e, that is, when the confidence is less than the preset value, it is determined that the score is less than the preset threshold, and the face to be tested whose score is less than the preset threshold is not a living body.
  • The living body recognition confidence obtained from the living body recognition score can be further exploited; in this embodiment it is used to establish a grading system for liveness judgment and liveness classification, yielding richer living body recognition results.
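A sketch of the confidence normalization and the grading idea just described; the grade bands and the preset value e are assumptions, since the patent only fixes the ratio f = s / s_max and the comparison with e:

```python
def liveness_confidence(score, total_score):
    # f = s / s_max, the ratio of recognition score to total score
    return score / total_score

def liveness_grade(confidence):
    # Hypothetical grading bands built on top of the confidence.
    if confidence >= 0.8:
        return "high"
    if confidence >= 0.5:
        return "medium"
    return "low"

E = 0.5  # assumed preset value e
f = liveness_confidence(0.8, 1.0)
print(f >= E, liveness_grade(f))  # True high
```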
  • A specific process of determining part motion from the degree of change of the key point positions acquired in step S12 is as follows.
  • Detection of mouth movement: the 8 key points numbered 61-68 in the obtained 68-point face model represent the mouth of the face to be tested.
  • The maximum x coordinate of these 8 key points minus the minimum x coordinate is the mouth length, and likewise the maximum y coordinate minus the minimum y coordinate is the mouth width.
  • The mouth length divided by the mouth width gives the mouth value, and thresholds a1 and a2 are set with a1 < a2; when the mouth value is less than a1 the mouth is open, and when the mouth value is greater than a2 the mouth is closed.
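The mouth-value computation can be sketched as below; the key point coordinates and the thresholds a1 < a2 are synthetic stand-ins for real landmark data:

```python
A1, A2 = 2.0, 4.0  # assumed thresholds a1 < a2

def mouth_value(points):
    """points: (x, y) positions of the 8 mouth key points 61-68."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    length = max(xs) - min(xs)   # horizontal extent
    width = max(ys) - min(ys)    # vertical extent
    return length / width

def mouth_state(points):
    v = mouth_value(points)
    if v < A1:
        return "open"            # wide vertical opening -> small ratio
    if v > A2:
        return "closed"
    return "intermediate"

open_pts = [(0, 0), (30, 0), (15, -10), (15, 10), (5, 5), (25, 5), (10, -5), (20, -5)]
closed_pts = [(0, 0), (30, 0), (15, -3), (15, 3), (5, 1), (25, 1), (10, -1), (20, -1)]
print(mouth_state(open_pts), mouth_state(closed_pts))  # open closed
```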
  • Detection of eye movement: the key points numbered 37-48 in the obtained 68-point face model represent the eyes of the face to be tested, where key points 37-42 represent the right eye
  • and the six key points 43-48 represent the left eye.
  • The maximum x coordinate of the six right-eye key points minus the minimum x coordinate is the right eye length,
  • and the maximum y coordinate of the six right-eye key points minus the minimum y coordinate is the right eye width; the right eye length divided by the right eye width gives the right eye value, and the left eye value is obtained in the same way. Preferably, the average of the left eye value and the right eye value is defined as the eye value, and thresholds b1 and b2 are set with b1 < b2; when the eye value is less than b1 the eye is open, and when the eye value is greater than b2 the eye is closed.
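A sketch of the eye-value computation; the six points per eye and the thresholds b1 < b2 are synthetic stand-ins:

```python
B1, B2 = 2.0, 4.0  # assumed thresholds b1 < b2

def eye_value(points):
    """points: the six (x, y) key points of one eye; length / width ratio."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) / (max(ys) - min(ys))

def combined_eye_value(left, right):
    # Preferred variant: the average of the left and right eye values.
    return (eye_value(left) + eye_value(right)) / 2.0

open_eye = [(0, 0), (12, 0), (3, 4), (9, 4), (3, -4), (9, -4)]    # value 1.5
closed_eye = [(0, 0), (12, 0), (3, 1), (9, 1), (3, -1), (9, -1)]  # value 6.0
print(combined_eye_value(open_eye, open_eye) < B1)   # True -> eyes open
print(combined_eye_value(closed_eye, closed_eye) > B2)  # True -> eyes closed
```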
  • When the eye state determined from some frames is eye-open and the eye state determined from other frames is eye-closed, it is determined that the eye has motion.
  • Although the average of the left eye value and the right eye value is preferably defined as the eye value used to determine the motion condition,
  • the right eye value and/or the left eye value may instead be used directly to determine right eye motion and/or left eye motion; that is, the eye action may be varied into left-then-right, right-then-left, left-only, and right-only eye motions. As the variety of motion processes increases, the overall liveness check becomes more variable, which increases the security of living body detection.
  • Detection of head movement: the six key points representing the left eye, the six key points representing the right eye, and key points 34, 49, and 55 in the obtained 68-point face model are used to detect head movement. The average x coordinate of the six left-eye key points is defined as the x coordinate of point A, and the average y coordinate of the six left-eye key points as the y coordinate of point A; point B is defined for the right eye in the same way,
  • and key points 34, 49, and 55 of the 68-point model are defined as points C, D, and E, respectively.
  • Points A to E obtained above form a five-point model of facial feature points.
  • The angle values of the face in three-dimensional space, the yaw angle (yaw) and the pitch angle (pitch), are obtained from this five-point model of facial feature points.
  • Thresholds c1 and c2 are set, where c1 < c2; when yaw < c1 the head turns left, and when yaw > c2 the head turns right.
  • Thresholds d1 and d2 are set, where d1 < d2; when pitch < d1 the head tilts down, and when pitch > d2 the head is raised. When yaw is between c1 and c2 and d1 ≤ pitch ≤ d2, the head faces forward.
  • When pitch > d2, the head of the face to be tested has a head-up motion, and it is determined that the head has motion; similarly, detecting a head-down, left-turn, or right-turn motion of the head of the face to be tested also determines that the head has motion.
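The yaw/pitch classification above can be sketched as follows; the threshold values (in degrees) are assumptions, since the patent only requires c1 < c2 and d1 < d2:

```python
C1, C2 = -15.0, 15.0   # assumed yaw thresholds c1 < c2
D1, D2 = -10.0, 10.0   # assumed pitch thresholds d1 < d2

def head_state(yaw, pitch):
    if yaw < C1:
        return "left"
    if yaw > C2:
        return "right"
    if pitch < D1:
        return "down"
    if pitch > D2:
        return "up"
    return "forward"     # c1 <= yaw <= c2 and d1 <= pitch <= d2

print(head_state(-20, 0), head_state(0, 15.5), head_state(0, 0))  # left up forward
```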
  • the step S2 acquires the corresponding motion score according to the situation of the part motion determined by the part motion detection process, which specifically includes:
  • Obtaining the corresponding motion score from the mouth movement condition includes: if the mouth has motion, the motion score of the mouth movement is 1 point; if the mouth has no motion, the motion score of the mouth movement is 0 points.
  • Obtaining the corresponding motion score from the eye movement condition includes: if the eye has motion, the motion score of the eye movement is 1 point; if the eye has no motion, the motion score of the eye movement is 0 points.
  • Obtaining the corresponding motion score from the head movement condition includes: if the head of the face to be tested has any one of a head-up, head-down, left-turn, or right-turn motion, the head is determined to have motion and the motion score of the head movement is 1 point; if the head has none of these motions, the head has no motion and the motion score of the head movement is 0 points.
  • For each video frame extracted from the face video at the preset frame interval, the 68 facial key points are first acquired, from which the eye, mouth, and head key point positions corresponding to the eye movement, mouth movement, and head movement to be detected are obtained,
  • so as to determine the eye, mouth, and head states of that video frame; the eye movement, mouth movement, and head movement conditions are then determined from the eye, mouth, and head states across the several extracted video frames.
  • The corresponding motion score is obtained from each part motion condition: specifically, if the part has motion the motion score is 1 point, otherwise 0 points. The weighted sum of the part motion scores is then calculated as the living body recognition score. Finally, the living body recognition confidence is calculated as the ratio of the living body recognition score to the total living body recognition score; when the confidence is not less than the preset value, it is determined that the score is not less than the preset threshold and the face to be tested is a living body.
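The per-part motion rule used throughout this flow reduces to a simple check: a part "has motion" when its state differs across the sampled frames. A minimal sketch (state labels are illustrative):

```python
def has_motion(frame_states):
    """A part is judged moving when more than one state appears."""
    return len(set(frame_states)) > 1

print(has_motion(["closed", "closed", "open", "closed"]))  # True  (mouth moved)
print(has_motion(["forward", "forward", "forward"]))       # False (head still)
```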
  • This embodiment can be applied to a variety of device terminals.
  • the implementation scenario of the mobile phone terminal is taken as an example.
  • A sequence of liveness action requests is randomly generated; for example, the face to be tested is asked to perform an open-mouth action, a blink, and a head-left turn in sequence.
  • If the score of the open mouth is 1 point,
  • the score of the blink is 1 point,
  • and the score of the head-left turn is 0 points,
  • the living body recognition score is the weighted sum of the part scores;
  • the part motion scores above are used to calculate the living body recognition score.
  • This embodiment solves the prior-art problems of a single algorithm and low security, and is highly extensible; detection of the part motion of the face to be tested can be realized with two-dimensional images, so the hardware requirements of the device are low. In this embodiment, detection of eye movement, mouth movement, and head movement is used for living body recognition; the motion of these parts is obvious, so motion judgment is highly accurate. Score fusion with per-part weights gives high liveness recognition accuracy, and detecting multiple part motions improves security.
  • A second embodiment of the living body identification method is provided by the present invention.
  • For the main flow of the second embodiment, refer to steps S1 to S4 of the first embodiment.
  • For step S1 of the second embodiment, refer to FIG. 2 of the first embodiment; it likewise includes steps S11-S12:
  • for each video frame extracted from the face video of the face to be tested at a preset frame interval, the key point positions of the part corresponding to each part motion are detected;
  • FIG. 3 shows the 68-point model of the face to be tested. Specifically, on continuous or skipped frames of the face video, the dlib library performs face detection and facial key point detection of the face to be tested; the dlib library is a cross-platform general-purpose library written in C++. Sixty-eight key points are obtained for each video frame, and the positions of the key points corresponding to the desired part motion can be taken from these 68 key points.
  • Detection of mouth movement: the key points numbered 61-68 in the obtained 68-point face model represent the mouth of the face to be tested, and a mouth
  • state classification model trained in advance with an SVM classifier predicts the mouth state of each extracted frame of the face video. The pre-training process of the SVM mouth state classification model is as follows: the 8 key points numbered 61-68 in the 68-point face model
  • represent the mouth features of the face to be tested; a number of face photos with open mouths are manually selected and their mouth states labeled 1, and a number of face photos
  • with closed mouths are manually selected and their mouth states labeled 0; the SVM classifier is then trained to obtain the mouth state classification model. If the mouth states of the extracted video frames include both 0 and 1, it is determined that the mouth has motion; otherwise the mouth has no motion.
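A sketch of the SVM-based mouth-state classifier, using scikit-learn's `SVC` as a stand-in for "the SVM classifier". The synthetic feature vectors below replace the real labeled face photos described above (open mouths are given a larger vertical spread), so this only illustrates the training/prediction/motion-rule shape:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in features: second component mimics mouth opening height.
open_mouths = rng.normal(loc=[0, 10], scale=1.0, size=(50, 2))
closed_mouths = rng.normal(loc=[0, 2], scale=1.0, size=(50, 2))
X = np.vstack([open_mouths, closed_mouths])
y = np.array([1] * 50 + [0] * 50)      # 1 = open, 0 = closed

clf = SVC(kernel="linear").fit(X, y)
frame_states = clf.predict([[0, 10], [0, 2], [0, 9.5]])

# Motion rule from the text: moving if both states 0 and 1 appear.
mouth_moving = set(frame_states) == {0, 1}
print(mouth_moving)  # True
```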
  • Alternatively, the 8 key points numbered 61-68 in the obtained 68-point face model represent the mouth of the face to be tested, and a mouth state classification model trained in advance with a soft-max regression classifier
  • predicts the mouth state score of each extracted frame of the face video. The pre-training process of the soft-max mouth state classification model is as follows: a number of face photos are labeled according to the degree of mouth opening, that is, each mouth is given a state score according to how far it is open. The score can be set to 10 levels with values between 0 and 1: a closed mouth scores 0,
  • a maximally open mouth scores 1, and a half-open mouth scores 0.5.
  • With this model, the mouth state scores of the several video frames extracted from the face video can be obtained; when the difference between the maximum and minimum mouth state scores is greater than a preset threshold, the mouth is considered to have motion, otherwise the mouth has no motion.
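The score-difference decision of this soft-max variant can be sketched directly; the threshold 0.4 is an assumed preset value:

```python
def mouth_has_motion(state_scores, threshold=0.4):
    """Moving when the per-frame state scores span more than `threshold`."""
    return max(state_scores) - min(state_scores) > threshold

print(mouth_has_motion([0.0, 0.5, 1.0, 0.2]))  # True  (range 1.0)
print(mouth_has_motion([0.1, 0.2, 0.15]))      # False (range 0.1)
```

The eye and head soft-max variants described later apply the same max-minus-min rule to their own state scores.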
  • Detection of eye movement: the key points numbered 37-48 in the obtained 68-point face model represent the eyes of the face to be tested, where key points 37-42 represent the right eye
  • and the six key points 43-48 represent the left eye.
  • An eye state classification model trained in advance with an SVM classifier predicts the eye state of each extracted frame of the face video. The pre-training process of the SVM eye state classification model is as follows: the 12 key points numbered 37-48 in the 68-point face model represent the eye features of the face to be tested; a number of face photos with open eyes are manually selected and their
  • eye states labeled 1, and a number of face photos with closed eyes are manually selected and their eye states labeled 0; the SVM classifier is then trained to obtain the eye state classification model. If the eye states of the extracted video frames include both 0 and 1, it is determined that the eye has motion; otherwise the eye has no motion.
  • Alternatively, the 12 key points numbered 37-48 in the obtained 68-point face model represent the eyes of the face to be tested, and an eye state classification model trained in advance with a soft-max regression classifier
  • predicts the eye state score of each extracted frame of the face video. The pre-training process of the soft-max eye state classification model is as follows: a number of face photos are labeled according to the degree of eye opening, that is, each eye is given a state score according to how far it is open. The score can be set to 10 levels with values between 0 and 1: a closed eye scores 0, a maximally open eye scores 1, and a half-open eye scores 0.5.
  • With this model, the eye state scores of the several video frames extracted from the face video can be obtained; when the difference between the maximum and minimum eye state scores is greater than a preset threshold, the eye is considered to have motion, otherwise the eye has no motion.
  • Although the average of the left eye value and the right eye value is preferably defined as the eye value used to determine the motion condition,
  • the right eye value and/or the left eye value may instead be used directly to determine right eye motion and/or left eye motion; that is, the eye action may be varied into left-then-right, right-then-left, left-only, and right-only eye motions.
  • As the variety of motion processes increases, the overall liveness check becomes more variable, which increases the security of living body detection.
  • Head movement comprises four kinds of motion: head-left turn, head-right turn, head-up, and head-down.
  • The head-up motion is taken as an example to illustrate the detection process of head movement:
  • A head state classification model trained in advance with an SVM classifier predicts the head state of each extracted frame of the face video. The pre-training process of the SVM head state classification model is as follows:
  • the six key points representing the left eye, the six key points representing the right eye, and key points 34, 49, and 55 in the 68-point face model represent the head features of the face to be tested; a number of face photos with the head raised are manually selected and their head states labeled 1, and a number of face photos with the head in the normal forward state are manually selected and their head
  • states labeled 0; the SVM classifier is then trained to obtain the head state classification model. If the head states of the extracted video frames include both 0 and 1, it is determined that the head has motion; otherwise the head has no motion.
  • Alternatively, the six key points representing the left eye, the six key points representing the right eye, and key points 34, 49, and 55 in the obtained 68-point model represent the head of the face
  • to be tested, and a head state classification model trained in advance with a soft-max regression classifier predicts the head state score of each extracted frame of the face video. The pre-training process of the soft-max
  • head state classification model is as follows: a number of face photos are labeled according to the degree of head raising, that is, each head is given a state score according to how far it is raised. The score can be set to 10 levels with values between 0 and 1: a head facing normally forward scores 0, a maximally raised head scores 1, and a half-raised head scores 0.5.
  • With this model, the head state scores of the several video frames extracted from the face video can be obtained; when the difference between the maximum and minimum head state scores is greater than a preset threshold, the head is considered to have motion, otherwise the head has no motion.
  • The detection processes of the other three head movements, the head-left turn, the head-right turn, and the head-down motion, are similar to the head-up detection process described above and are not repeated here.
  • the step S2 acquires the corresponding motion score according to the motion of the part determined by the part motion detection process, which specifically includes:
  • the motion of the mouth movement obtains the corresponding motion score: it is determined that the mouth has motion, and the obtained motion score of the mouth movement is 1 point; if the mouth has no motion, the obtained motion score of the mouth movement is 0 .
  • the motion of the eye movement obtains the corresponding motion score: it is determined that the eye has motion, and the obtained motion score of the eye movement is 1 point; if the eye has no motion, the obtained motion score of the eye movement is 0. .
  • the motion of the head movement obtains the corresponding motion score: it is determined that the head has motion, and the obtained motion score of the head motion is 1 point. If it is determined that the head has no motion, the obtained motion score of the head motion is 0. Minute.
  • Alternatively, the degree of motion of each part can also be obtained in step S1, and correspondingly, in step S2 a motion score between 0 and 1 is obtained based on the degree of motion, instead of only 1 or 0.
  • In this alternative embodiment, the motion score indicates not only whether there is motion but also the degree of motion.
  • In the second embodiment, for each video frame extracted at the preset frame interval from the face video of the face to be tested, 68 key points of the face are acquired, from which the key-point positions of the eyes, mouth, and head are respectively obtained to determine the state of the eyes, mouth, and head in that video frame. The motions of the eyes, mouth, and head are then determined from their states across the extracted video frames, and the corresponding motion score is obtained for each part motion. Next, the weighted sum of the part motion scores is calculated; this sum is the living body recognition score.
  • Finally, the living body recognition confidence is calculated as the ratio of the living body recognition score to the total living body recognition score. When the confidence is not less than a preset value, the living body recognition score is determined to be not less than the preset threshold, and the face to be tested is determined to be a living body; otherwise, the face to be tested is determined not to be a living body.
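The score fusion and confidence computation just described can be sketched as follows; the part names, weight values, and confidence threshold are illustrative assumptions, not values fixed by the patent.

```python
def living_body_confidence(motion_scores, weights):
    """Weighted fusion of per-part motion scores (each in [0, 1]) into a
    living body recognition confidence: the weighted score divided by the
    maximum attainable total (every part scoring 1)."""
    score = sum(weights[part] * motion_scores[part] for part in motion_scores)
    total = sum(weights[part] for part in motion_scores)
    return score / total

def is_living_body(motion_scores, weights, confidence_threshold=0.5):
    """The face is judged live when the confidence reaches the preset value."""
    return living_body_confidence(motion_scores, weights) >= confidence_threshold
```

For example, with hypothetical weights mouth 3, eye 2, head 1 and motion scores 1, 1, 0, the confidence is 5/6.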
  • The second embodiment can be applied to multiple device terminals. Taking the implementation scenario of a mobile phone terminal as an example: a sequence of living body action requests is randomly generated, for example requesting the face to be tested to open the mouth, blink, and turn the head left.
  • If, for instance, the motion score of the mouth opening is 1, the motion score of the blink is 1, and the motion score of the head left turn is 0, the living body recognition score is the weighted sum of these part motion scores, from which the living body recognition confidence is calculated.
  • The second embodiment solves the problem in the prior art that the algorithm is single and the security is not high, and has strong scalability; the detection of the part motions of the face to be tested can be realized with two-dimensional images, so the hardware requirements of the device are not high.
  • The detection of eye movement, mouth movement, and head movement is used to perform living body recognition; the motion of these parts is obvious, so the accuracy of the motion judgment is high. Score fusion is performed with different weights for different parts, so the accuracy of the living body recognition is high, and the detection of multiple part motions is beneficial to security.
  • The present invention further provides a third embodiment.
  • The main process of the third embodiment may refer to steps S1 to S4 of the first embodiment of the present invention.
  • The parts shared with the first embodiment may refer to the first embodiment and are not described again here.
  • Because the degree of eye movement, mouth movement, and head movement of a human face is obvious, these movements are advantageous for detection, and the calculation is simple and efficient; therefore, detecting the part motions of the face to be tested in step S1 includes detecting the eye movement, the mouth movement, and the head movement. In the third embodiment, step S1 further includes detecting at least one of three additional part motions: facial movement, eyebrow movement, and forehead movement.
  • Detecting the motions of at least two parts of the face to be tested in step S1 includes detecting, for each video frame extracted at the preset frame interval from the face video of the face to be tested, the key-point positions corresponding to each part motion; see Figure 3, which shows the 68-point model of the face to be tested. Specifically, face detection and face key-point detection are performed with the dlib library on continuous or skipped frames of the face video, yielding the 68 key points of each extracted video frame, from which the key points of the required parts can be obtained.
  • Step S1 further includes performing face detection on the face to be tested in each video frame, thereby acquiring a face rectangle; see the face rectangle HIJK in Figure 3.
  • In a preferred embodiment, the weight corresponding to each part motion in step S3 is set according to the visibility of that part motion. As a general strategy, the weights of the part motions satisfy: mouth movement > eye movement > head movement; the weight of any of the facial movement, eyebrow movement, and forehead movement is smaller than the weights of the mouth, eye, and head movements.
  • In another preferred embodiment, the weights corresponding to the part motions in step S3 are adjusted automatically according to the application scenario. In a specific scenario, normal input videos of the various part motions of faces to be tested are collected as positive samples and attack videos as negative samples, and (number of positive samples passed + number of negative samples rejected) / (total positive samples + total negative samples) is taken as the accuracy rate of each part motion.
  • The accuracy rates of the part motions are then sorted in descending order, and the weights of the part motions are assigned in the same descending order, thereby re-adjusting the weight of each part motion.
  • The re-adjusted weights are used to calculate the living body recognition score, so the recognition result adapts to the accuracy of part motion detection in different scenarios, increasing the accuracy of the living body recognition result of this embodiment.
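The scenario-based weight re-adjustment can be sketched as follows; the statistics format and the pool of weight values are illustrative assumptions.

```python
def readjust_weights(sample_stats, weight_pool):
    """Reassign a fixed pool of weight values so that the part motion with
    the highest detection accuracy receives the largest weight.

    sample_stats maps part name -> (positives_passed, negatives_rejected,
    total_positives, total_negatives); the accuracy of a part motion is
    (positives_passed + negatives_rejected) / (total_positives + total_negatives).
    """
    def accuracy(stats):
        passed, rejected, pos_total, neg_total = stats
        return (passed + rejected) / (pos_total + neg_total)

    ranked = sorted(sample_stats, key=lambda part: accuracy(sample_stats[part]),
                    reverse=True)
    return dict(zip(ranked, sorted(weight_pool, reverse=True)))
```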
  • For the methods of detecting the mouth movement, eye movement, and head movement of the face to be tested in step S1, and for obtaining the motion scores corresponding to these part motions in step S2, reference may be made to the specific processes described in the first and second embodiments of the living body recognition method of the present invention.
  • In the third embodiment, the detection of the mouth movement and the eye movement may also adopt the following alternative embodiments:
  • In an alternative embodiment of mouth movement detection, the mouth position of the face to be tested is detected in each video frame extracted at the preset frame interval from the face video, and the mean gray value of the mouth position is calculated. It is then judged whether this mean gray value is smaller than a preset mouth gray-value threshold: if so, the mouth is in the closed state; if not, the mouth is in the open state.
  • This alternative embodiment uses the fact that an open mouth exposes the teeth, which are mainly white and have relatively large gray values, so the mean gray value of the mouth is large when open and small when closed; the state of the mouth is thus recognized from its mean gray value, and the mouth motion is determined accordingly.
  • If the mouth movement determined in some frames is the open state and the mouth movement determined in other frames is the closed state, it is determined that the mouth has motion.
  • In this alternative embodiment, obtaining the corresponding motion score of the mouth movement includes: if the mouth is determined to have motion, the acquired motion score of the mouth movement is 1; otherwise, the mouth is determined to have no motion and the acquired motion score is 0.
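A minimal sketch of this gray-value scheme, assuming 8-bit grayscale patches already cropped to the mouth region; the threshold of 120 is an illustrative assumption.

```python
import numpy as np

def mouth_is_open(mouth_patch, gray_threshold=120):
    """Open mouths expose the (bright) teeth, so the mean gray value of the
    mouth region is high when open and low when closed."""
    return float(np.mean(mouth_patch)) >= gray_threshold

def mouth_has_motion(mouth_patches, gray_threshold=120):
    """Motion is present when both the open and the closed state occur
    among the sampled frames."""
    states = {mouth_is_open(p, gray_threshold) for p in mouth_patches}
    return len(states) == 2
```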
  • Besides opening and closing, the mouth movement may include movement of the mouth corners; for example, when the face smiles, the two mouth corners expand toward the cheeks.
  • In the obtained 68-point face model, key point 55 represents the left mouth corner and key point 49 the right mouth corner. Based on the left and right corner points in the first frame of the face video of the face to be tested, the distance moved by the left corner point and the distance moved by the right corner point are calculated for each subsequently extracted video frame, and it is judged whether both distances are greater than a preset threshold: if so, the state of the mouth movement is determined to be a smile; if not, the state is determined to be normal.
  • If the mouth movement determined in some frames is the smile state and the mouth movement determined in other frames is the normal state, it is determined that the mouth has motion.
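The corner-displacement test can be sketched as follows, assuming key points are supplied as a dict of (x, y) pixel tuples keyed by the dlib keypoint number (55 = left corner, 49 = right corner, per the patent); the pixel threshold is an illustrative assumption.

```python
import math

def mouth_is_smiling(first_frame_points, frame_points, move_threshold=5.0):
    """A frame is in the smile state when both mouth corners (keypoints 55
    and 49 of the 68-point model) have moved farther than the threshold
    from their first-frame positions."""
    def moved(index):
        x0, y0 = first_frame_points[index]
        x1, y1 = frame_points[index]
        return math.hypot(x1 - x0, y1 - y0)
    return moved(55) > move_threshold and moved(49) > move_threshold
```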
  • In an alternative embodiment of eye movement detection where the identified subject is Asian (the eyeball is black and the eyelid yellowish): the eye position of the face to be tested is detected in each video frame extracted at the preset frame interval from the face video, the eyeball position is determined from the eye position, and the mean gray value of the eyeball position is calculated. It is then judged whether this mean gray value is smaller than a preset eyeball gray-value threshold: if so, the eye is in the open state; if not, the eye is in the closed state.
  • This alternative embodiment uses the difference in the mean gray value detected at the eyeball position between the open and closed states: when the eye is open, the mean gray value of the eyeball position is relatively small, and when the eye is closed, it is large.
  • If the eye movement determined in some frames is the open state and the eye movement determined in other frames is the closed state, it is determined that the eye has motion.
  • In this alternative embodiment, obtaining the corresponding motion score of the eye movement includes: if the eye is determined to have motion, the acquired motion score of the eye movement is 1; if the eye is determined to have no motion, the acquired motion score is 0.
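A sketch of the blink test under the same assumptions as the mouth example (8-bit grayscale patches cropped to the eyeball region); the threshold of 80 is illustrative.

```python
import numpy as np

def eye_is_open(eyeball_patch, gray_threshold=80):
    """For dark irises, the eyeball region is dark when the eye is open and
    brighter (eyelid skin) when closed, so a low mean gray value means open."""
    return float(np.mean(eyeball_patch)) < gray_threshold

def eye_has_motion(eyeball_patches, gray_threshold=80):
    """A blink is detected when both the open and the closed state occur
    among the sampled frames."""
    states = {eye_is_open(p, gray_threshold) for p in eyeball_patches}
    return len(states) == 2
```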
  • In another alternative embodiment, the center position of the eyeball of the face to be tested is detected in each video frame extracted at the preset frame interval from the face video, and the relative position of the eyeball center within the eye is calculated. It is then judged whether the distance between this relative position and the normal position of the eyeball center within the eye is greater than a preset value: if so, the eyeball is not in the normal position; if not, it is in the normal position.
  • If the eye movement determined in some frames is that the eyeball is not in the normal position, while the eye movement determined in other frames is that the eyeball is in the normal position, the eye movement of the face to be tested is eyeball rotation, that is, the eye is determined to have motion; otherwise, the eye is determined to have no motion.
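The relative-position test can be sketched as follows, assuming the eye is given as a bounding box, the "normal" position is the box center, and the offset threshold is illustrative.

```python
def eyeball_in_normal_position(eye_box, eyeball_center, max_offset=0.15):
    """eye_box is (x0, y0, x1, y1); the eyeball centre is converted to a
    relative position inside the box and compared with the centred
    'normal' position (0.5, 0.5)."""
    x0, y0, x1, y1 = eye_box
    rel_x = (eyeball_center[0] - x0) / (x1 - x0)
    rel_y = (eyeball_center[1] - y0) / (y1 - y0)
    offset = ((rel_x - 0.5) ** 2 + (rel_y - 0.5) ** 2) ** 0.5
    return offset <= max_offset

def eye_rotates(eye_boxes, eyeball_centers, max_offset=0.15):
    """Eyeball rotation is detected when both the normal and an off-centre
    position occur among the sampled frames."""
    states = {eyeball_in_normal_position(b, c, max_offset)
              for b, c in zip(eye_boxes, eyeball_centers)}
    return len(states) == 2
```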
  • Detecting the part motions of the face to be tested in step S1 of the third embodiment further includes detecting at least one of facial movement, eyebrow movement, and forehead movement. The processes of detecting the facial movement, eyebrow movement, and forehead movement of the face to be tested include:
  • The facial movement detection process determines the eye, mouth, and face regions of the face to be tested and calculates the ratio of the sum of the eye area and the mouth area to the area of the face region. It is then judged whether this ratio is within a preset range: if so, the face state is normal; if not, the face state is the grimace ("ghost face") state.
  • The facial movement here includes grimace movements: in the grimace state, the ratio of the sum of the eye area and the mouth area to the face area exceeds the preset range; otherwise, the state is normal. When both the grimace state and the normal state are detected, it is determined that the face has grimace movement, that is, the face has motion.
  • An example of calculating the eye area, mouth area, and face area: the eye area is obtained by multiplying the eye length by the eye width, the mouth area by multiplying the mouth length by the mouth width, and the face area from the area of the face rectangle HIJK.
  • Obtaining the motion score of the facial movement includes: if the face is determined to have motion, the acquired motion score of the facial movement is 1; otherwise, the face is determined to have no motion and the acquired motion score of the facial movement is 0.
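The area-ratio test for the grimace state can be sketched as follows; the bounds of the normal range are illustrative assumptions, and areas are taken as already computed (length times width, face rectangle area).

```python
def face_is_grimacing(eye_area, mouth_area, face_area, normal_range=(0.03, 0.12)):
    """The face is in the grimace ('ghost face') state when the ratio of
    (eye area + mouth area) to the face-rectangle area leaves its normal
    range, e.g. when eyes and mouth are stretched wide open."""
    ratio = (eye_area + mouth_area) / face_area
    low, high = normal_range
    return not (low <= ratio <= high)
```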
  • In the eyebrow movement detection process, the five key points 18-22 of the obtained 68-point face model represent the right eyebrow and the five key points 23-27 represent the left eyebrow.
  • A curve is fitted to each eyebrow; the curvature at key point 20 is taken as the feature value of the right eyebrow and the curvature at key point 25 as the feature value of the left eyebrow, and the average of the two feature values is the eyebrow feature value. It is then judged whether the eyebrow feature value is greater than a preset threshold: if so, the eyebrow state is raised; if not, the eyebrow state is normal.
  • Among the video frames extracted from the face video of the face to be tested, if some frames determine that the eyebrow state is raised and other frames determine that it is normal, it is determined that the eyebrows have motion; otherwise, the eyebrows are determined to have no motion.
  • Obtaining the motion score of the eyebrow movement includes: if the eyebrows are determined to have motion, the acquired motion score of the eyebrow movement is 1; if the eyebrows are determined to have no motion, the acquired motion score is 0.
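The curve fitting and curvature evaluation can be sketched as follows; the choice of a quadratic fit and the threshold value are illustrative assumptions (the patent does not specify the curve model).

```python
import numpy as np

def eyebrow_feature(points):
    """Fit a quadratic y = a*x**2 + b*x + c to the five eyebrow keypoints
    and return the curvature at the middle point,
    kappa = |2a| / (1 + (2a*x + b)**2) ** 1.5."""
    xs = np.array([p[0] for p in points], dtype=float)
    ys = np.array([p[1] for p in points], dtype=float)
    a, b, _ = np.polyfit(xs, ys, 2)
    x_mid = xs[len(xs) // 2]
    return abs(2 * a) / (1 + (2 * a * x_mid + b) ** 2) ** 1.5

def eyebrows_raised(left_points, right_points, threshold=0.05):
    """The eyebrow feature value is the average curvature of the two brows."""
    feature = (eyebrow_feature(left_points) + eyebrow_feature(right_points)) / 2
    return feature > threshold
```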
  • In the forehead movement detection process, the forehead position is determined from the obtained 68-point face model; the Sobel values of the forehead area are then calculated with the Sobel operator, and the variance of the Sobel values of the forehead area is taken as the forehead wrinkle value.
  • The Sobel value here is the result of convolving the Sobel kernel with the pixel neighborhood of the same size centered at the current pixel. Among the video frames extracted from the face video of the face to be tested, if the forehead wrinkle value of some frames is greater than a first preset threshold and the forehead wrinkle value of other frames is smaller than a second preset threshold, it is determined that the forehead has motion; otherwise, the forehead is determined to have no motion.
  • An example of determining the position of the forehead area: the forehead area usually refers to the area above the eyebrows in the face. Based on this definition, the positions of the eyebrow key points are obtained first, and the forehead area is then determined from the face rectangle and the eyebrow key points, as shown by the rectangle HOPK in Figure 3.
  • Obtaining the motion score of the forehead movement includes: if the forehead is determined to have motion, the acquired motion score of the forehead movement is 1; if the forehead is determined to have no motion, the acquired motion score is 0.
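The Sobel-variance wrinkle value can be sketched as follows; the choice of the horizontal 3x3 Sobel kernel is an illustrative assumption (the patent only names the Sobel operator).

```python
import numpy as np

def forehead_wrinkle_value(forehead):
    """Convolve the forehead patch with a 3x3 Sobel kernel (valid region
    only) and return the variance of the responses; a wrinkled forehead
    produces strong, varied edge responses."""
    kernel = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    h, w = forehead.shape
    responses = np.empty((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            responses[i, j] = np.sum(forehead[i:i + 3, j:j + 3] * kernel)
    return float(np.var(responses))
```

The patent then compares this value against two thresholds across frames: a frame set containing both a high and a low wrinkle value indicates forehead motion.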
  • In the third embodiment, besides the above scheme of assigning a motion score of 1 or 0 according to whether each part has motion, a motion score between 0 and 1 may also be obtained according to the degree of motion of each part.
  • This alternative embodiment indicates not only whether there is motion but also the degree of motion.
  • The third embodiment implemented with this alternative scheme is also within the scope of the present invention.
  • In summary, in the third embodiment, for each video frame extracted at the preset frame interval from the face video of the face to be tested, the face key points are acquired, from which the key-point positions of each moving part are obtained, and thus the features of the corresponding parts. The motion of each part is determined from the feature states across the extracted video frames, and the corresponding motion score is obtained. The weighted sum of the part motion scores is then calculated; this sum is the living body recognition score.
  • Finally, the living body recognition confidence is calculated as the ratio of the living body recognition score to the total living body recognition score: when the confidence is not less than the preset value, the living body recognition score is determined to be not less than the preset threshold, and the face to be tested is determined to be a living body; otherwise, the face to be tested is determined not to be a living body.
  • The third embodiment solves the problem in the prior art that the algorithm is single and the security is not high, and has strong scalability; the detection of the part motions of the face to be tested can be realized with two-dimensional images, so the hardware requirements of the device are not high.
  • The detection of eye movement, mouth movement, and head movement is used to perform living body recognition; the motion of these parts is obvious, so the accuracy of the motion judgment is high. Extending the detection to facial, eyebrow, and forehead movement improves the accuracy of the recognition result; score fusion with different weights for different parts makes the accuracy of the living body recognition high; and the detection of multiple part motions is beneficial to security.
  • FIG. 5 is a schematic structural diagram of an embodiment of the living body recognition system of the present invention. The embodiment includes:
  • at least two part motion detecting units 1, each used for detecting the motion of the corresponding part of the face to be tested; in FIG. 5, the part motion detecting units 1a and 1b indicate that two different part motions are detected;
  • a part motion score unit 2, configured to obtain the motion score corresponding to each part motion of the face to be tested based on the motion of that part;
  • a living body recognition score calculation unit 3, configured to calculate the weighted sum of the motion scores corresponding to the part motions and use the calculated sum as the living body recognition score; the weight corresponding to each part motion is preset in the living body recognition score calculation unit 3; and
  • a living body judging unit 4, configured to determine that a face to be tested whose living body recognition score is not less than a preset threshold is a living body.
  • The at least two part motions detected by the at least two part motion detecting units 1 include at least two of eye movement, mouth movement, head movement, eyebrow movement, forehead movement, and facial movement.
  • Each part motion detecting unit 1 comprises:
  • a part detecting module 11, configured to detect the key-point positions of the part corresponding to the part motion in each video frame extracted from the face video of the face to be tested; and
  • a part motion condition obtaining module 12, configured to determine the motion of the part from the degree of change of the key-point positions across the extracted video frames.
  • The weight corresponding to each part motion in the living body recognition score calculation unit 3 is set according to the visibility of that part motion; or, the weight corresponding to each part motion in the living body recognition score calculation unit 3 is set according to the accuracy of that part motion in the current application scenario.
  • The living body judging unit 4 includes:
  • a living body recognition confidence calculation module 41, configured to calculate the living body recognition confidence of the face to be tested as the ratio of the living body recognition score to the total living body recognition score; and
  • a living body judging module 42, configured to determine, when the living body recognition confidence is not less than the preset value, that the living body recognition score is not less than the preset threshold, and to determine that the face to be tested whose living body recognition score is not less than the preset threshold is a living body.
  • In operation, the part detecting module 11 of each part motion detecting unit 1 detects the key-point positions of the corresponding part in each extracted video frame, and the part motion condition obtaining module 12 determines the motion of the part; the part motion score unit 2 then obtains the motion score of the part motion based on that motion; next, the living body recognition score calculation unit 3 weights and sums the motion scores of the part motions as the living body recognition score.
  • Finally, the living body recognition confidence calculation module 41 of the living body judging unit 4 calculates the living body recognition confidence of the face to be tested as the ratio of the living body recognition score to the total living body recognition score, and the living body judging module 42 determines that the face to be tested is a living body when the calculated confidence is not less than the preset value.
  • The detection by the at least two part motion detecting units solves the problem in the prior art that the algorithm is single and the security is not high, and has strong scalability; the detection of face part motions can be realized with two-dimensional images, so the hardware requirements are not high.
  • The living body recognition score calculation unit weights the motions of the different parts and then performs score fusion; the accuracy of living body recognition is high, and the beneficial effects of high recognition accuracy, low hardware requirements, and high security are obtained.


Abstract

A living body recognition method, comprising the steps: detecting the movement of at least two parts of a face to be detected (S1); on the basis of the movement of each part, acquiring a movement score corresponding to the movement of each part of the face to be detected (S2); calculating the weighted sum of the movement scores corresponding to the movement of each part, and using the calculated sum as a living body recognition score (S3), wherein the movement of each part already has a preset corresponding weighting; and determining that a face to be detected having a living body recognition score not less than a preset threshold is a living body (S4). A corresponding living body recognition system, comprising at least two part movement detection units, a part movement score acquisition unit, a living body recognition score calculation unit, and a living body determining unit. The present method and system have low device hardware requirements, can assure effective recognition of a living body, have strong scalability and high security, and are not vulnerable to attack.

Description

Living body recognition method and system

Technical field

The present invention relates to the field of face recognition, and in particular to a living body recognition method and system.

Background art

With the development of face recognition technology, more and more scenarios require face detection to quickly identify a person's identity. However, lawbreakers may use pictures or videos instead of real people for face recognition, so the security of the whole face recognition system cannot be guaranteed. Face liveness detection can verify that the person currently undergoing face recognition is a living face rather than a face in a photo or video, thereby ensuring the security of the face recognition system.
The following are several existing living body recognition technical solutions and their shortcomings:

Solution 1: an infrared camera is used to obtain the face temperature for face liveness detection. The drawback of this type of solution is its high hardware requirements.

Solution 2: only one kind of three-dimensional face pose detection is performed to judge whether the face is live. This type of solution uses a single algorithm, and its security is not high.
Summary of the invention

An object of the embodiments of the present invention is to provide a living body recognition method and system with low device hardware requirements and high security.

To achieve the above object, an embodiment of the present invention provides a living body recognition method, which includes the steps of:

detecting the motions of at least two parts of a face to be tested;

obtaining, based on the motion of each part, the motion score corresponding to each part motion of the face to be tested; calculating the weighted sum of the motion scores corresponding to the part motions and using the calculated sum as a living body recognition score, wherein a corresponding weight is preset for each part motion; and

determining that the face to be tested whose living body recognition score is not less than a preset threshold is a living body.
Compared with the prior art, the living body recognition method disclosed in the embodiments of the present invention obtains the motion scores of at least two parts of the face to be tested, weights and sums the part motion scores into a living body recognition score, and uses the living body recognition score as the criterion for judging whether the face to be tested is a living body. Detecting at least two part motions solves the problem in the prior art that the algorithm is single and the security is not high, and has strong scalability; the detection based on face part motions can be realized with two-dimensional images, so the hardware requirements are not high; in addition, score fusion with different weights for different part motions makes the accuracy of living body recognition high. The method thus has high accuracy, low hardware requirements, and high security.

Further, the at least two part motions include at least two of eye movement, mouth movement, head movement, eyebrow movement, forehead movement, and facial movement.

As a further solution, the detected part motions may be any several of multiple parts of the face, so that the living body detection has wide selectivity, can largely resist malicious attacks, and greatly increases security.
Further, detecting the motions of at least two parts of the face to be tested includes the steps of:

detecting, for each video frame extracted at a preset frame interval from the face video of the face to be tested, the key-point positions corresponding to each part motion; and

determining the motion of each part from the degree of change of its key-point positions across the extracted video frames.

As a further solution, the motion of each part is determined by detecting the degree of change of the key-point positions corresponding to the part motion in each extracted video frame. This detection method only requires two-dimensional images, the algorithm is simple, the requirements on the device are not high, and the recognition efficiency is high.
进一步地,每一所述部位运动相对应的权值为根据所述每一部位运动的明显度设定;或,每一所述部位运动相对应的权值为根据在当前应用场景下每一所述部位运动的准确率设定。Further, the weight corresponding to each part of the motion is set according to the visibility of each part of the motion; or, the weight corresponding to each part of the motion is according to each of the current application scenarios. The accuracy of the movement of the part is set.
进一步地,确定所述活体识别分值不小于预设阈值包括步骤:Further, determining that the living body identification score is not less than a preset threshold comprises the steps of:
通过所述活体识别分值占活体识别总分的比值计算所述待测人脸的活体识别置信度;Calculating, by the ratio of the living body recognition score to the total score of the living body recognition, the living body recognition confidence of the face to be tested;
当所述活体识别置信度不小于预设值时,确定所述活体识别分值不小于预设阈值。When the living body recognition confidence is not less than a preset value, determining that the living body recognition score is not less than a preset threshold.
As a further aspect, the liveness score can be normalized into a liveness confidence for making the liveness decision; this confidence can also be used for liveness grading, yielding richer recognition results than the prior art.
Correspondingly, an embodiment of the present invention further provides a liveness recognition system for identifying whether a face under test is a live body. The liveness recognition system includes:
at least two part-motion detection units, each configured to detect the motion of a corresponding part of the face under test and obtain a corresponding motion score;
a liveness score calculation unit, configured to compute the weighted sum of the motion scores of the part motions and take the computed sum as the liveness score, the unit being preset with a weight corresponding to each part motion;
a liveness judgment unit, configured to judge the face under test to be a live body when its liveness score is not less than a preset threshold.
Compared with the prior art, the liveness recognition system disclosed in this embodiment obtains motion scores for at least two parts of the face under test through at least two part-motion detection units; the liveness score calculation unit weights and sums the part motion scores to obtain a liveness score; and the liveness judgment unit uses this score as the criterion for deciding whether the face under test is a live body. Detecting at least two part motions overcomes the single-algorithm, low-security limitation of the prior art and offers strong extensibility; detection based on facial part motion can be realized from two-dimensional images, so the hardware requirements are modest; and weighting the different part motions before fusing the scores yields high liveness recognition accuracy. The system thus achieves high recognition accuracy, low hardware requirements, and high security.
Further, the at least two part motions detected by the at least two part-motion detection units include at least two of eye motion, mouth motion, head motion, eyebrow motion, forehead motion, and facial motion.
Further, each part-motion detection unit includes:
a part detection module, configured to detect, in each video frame extracted at a preset frame interval from the face video of the face under test, the key-point positions corresponding to the part motion;
a part-motion acquisition module, configured to determine the part motion from the degree of change of those key-point positions across the extracted video frames, and to obtain the corresponding motion score from the determined motion.
Further, in the liveness score calculation unit, the weight corresponding to each part motion is set according to how visually pronounced that motion is; or, it is set according to the detection accuracy of that motion in the current application scenario.
Further, the liveness judgment unit includes:
a liveness confidence calculation module, configured to compute a liveness confidence for the face under test as the ratio of the liveness score to the maximum liveness score;
a liveness judgment module, configured to determine, when the liveness confidence is not less than a preset value, that the liveness score is not less than the preset threshold, and to judge the face under test to be a live body.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flow chart of Embodiment 1 of the liveness recognition method of the present invention;
FIG. 2 is a flow chart of step S1 in Embodiment 1 of the liveness recognition method of the present invention;
FIG. 3 is a schematic diagram of the 68-point model of a face under test;
FIG. 4 is a flow chart of step S4 in Embodiment 1 of the liveness recognition method of the present invention;
FIG. 5 is a schematic structural diagram of an embodiment of the liveness recognition system of the present invention.
DETAILED DESCRIPTION
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment 1 of the liveness recognition method of the present invention is described with reference to FIG. 1, which is a flow chart of Embodiment 1 comprising the steps:
S1: detecting the motion of at least two parts of the face under test;
S2: obtaining, from the detected motion of each part, a motion score for each part motion of the face under test;
S3: computing the weighted sum of the motion scores of the part motions and taking the computed sum as the liveness score, each part motion having a preset weight;
S4: judging the face under test to be a live body when its liveness score is not less than a preset threshold.
Preferably, detecting the motion of at least two parts of the face under test in step S1 of this embodiment includes detecting eye motion, mouth motion, and head motion. These motions of the face are generally pronounced, which makes them easy to detect, and the computation is simple and efficient.
Specifically, referring to FIG. 2, which is a flow chart of step S1 of Embodiment 1, step S1 includes:
S11: detecting, in each video frame extracted at a preset frame interval from the face video of the face under test, the key-point positions corresponding to the part motion;
Referring to FIG. 3, a 68-point model of the face under test: the dlib library, a cross-platform general-purpose library written in C++, is applied to the consecutive or skipped frames extracted from the face video to perform face detection and facial key-point detection, yielding the 68 key points of each extracted video frame; the key-point positions corresponding to the desired part motion can then be taken from these 68 key points.
S12: determining the part motion from the degree of change of those key-point positions across the extracted video frames.
In step S3 of Embodiment 1, a preferred way of setting the weight of each part motion is according to how visually pronounced the motion is. This embodiment adopts the usual strategy: mouth motion is the most pronounced and therefore receives the largest weight, while head-motion estimation is the least accurate and therefore receives the smallest weight. The weighting strategy of this embodiment is: mouth motion > eye motion > head motion.
Alternatively, another preferred way of setting the weight of each part motion in step S3 is to adjust the weights automatically per application scenario. Specifically, in a given scenario, normal input videos of the various part motions of faces under test are collected as positive samples and attack videos as negative samples; the accuracy of each part motion is taken as (number of accepted positive samples + number of rejected negative samples) / (total positive samples + total negative samples). The part motions are then sorted by accuracy in descending order, and their weights are reassigned from largest to smallest in that order. The readjusted weights are used to compute the liveness score, which adapts the recognition result to the per-scenario accuracy of part-motion detection and improves the accuracy of the liveness recognition of this embodiment.
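The weight-readjustment procedure above can be sketched as follows. The motion names, sample counts, and the weight pool are illustrative assumptions, not values fixed by the embodiment.

```python
# Sketch of scenario-adaptive weight reassignment (illustrative values).
# accuracy = (accepted positives + rejected negatives) / (positives + negatives)

def motion_accuracy(pos_pass, pos_total, neg_reject, neg_total):
    """Per-motion accuracy as defined in the embodiment."""
    return (pos_pass + neg_reject) / (pos_total + neg_total)

def reassign_weights(accuracies, weight_pool):
    """Sort motions by accuracy (descending) and hand out the weight
    pool, largest weight first, in that order."""
    ranked = sorted(accuracies, key=accuracies.get, reverse=True)
    return dict(zip(ranked, sorted(weight_pool, reverse=True)))

acc = {
    "mouth": motion_accuracy(95, 100, 90, 100),  # 0.925
    "eye":   motion_accuracy(98, 100, 96, 100),  # 0.970
    "head":  motion_accuracy(80, 100, 70, 100),  # 0.750
}
new_weights = reassign_weights(acc, weight_pool=[3, 2, 1])
# In this hypothetical scenario eye motion is most reliable,
# so it now receives the largest weight.
```

With these sample counts the ranking becomes eye > mouth > head, so the default mouth-first weighting is overridden for this scenario.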
Either of the above two preferred ways of setting the weight of each part motion falls within the protection scope of this embodiment.
Specifically, referring to FIG. 4, which is a flow chart of step S4, step S4 includes the steps:
S41: computing a liveness confidence for the face under test as the ratio of the liveness score to the maximum liveness score;
S42: when the liveness confidence is not less than a preset value, determining that the liveness score is not less than the preset threshold;
S43: judging the face under test to be a live body when its liveness score is not less than the preset threshold.
Specifically, in step S41 the maximum liveness score is the highest value obtainable when this embodiment recognizes the face under test, and the liveness confidence of the face under test is computed by the following formula:
f = (s / s_max) × 100%
where s denotes the liveness score, s_max the maximum liveness score, and f the liveness confidence, with 0 < f < 1;
Let e denote the preset value. When f ≥ e, i.e. the liveness confidence is not less than the preset value, the liveness score is determined to be not less than the preset threshold and the face under test is judged to be a live body; when f < e, i.e. the liveness confidence is less than the preset value, the liveness score is determined to be less than the preset threshold and the face under test is judged not to be a live body.
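The decision rule of steps S41–S43 can be sketched as follows; the grading tiers at the end illustrate the liveness-grading extension, and their boundaries are assumptions, not values from the embodiment.

```python
def liveness_confidence(s, s_max):
    """f = s / s_max, the liveness score normalized by the maximum score."""
    return s / s_max

def judge(s, s_max, e):
    """Live body iff the confidence is not less than the preset value e."""
    return liveness_confidence(s, s_max) >= e

def grade(f):
    """Illustrative liveness grading built on the confidence
    (tier boundaries are assumptions)."""
    if f >= 0.9:
        return "high"
    if f >= 0.7:
        return "medium"
    return "low"

f = liveness_confidence(5, 6)   # ≈ 0.8333
live = judge(5, 6, e=0.8)       # True
```

The same confidence value thus serves both the binary live/non-live decision and, optionally, a graded result.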
The liveness confidence obtained from the liveness score can be extended further: this embodiment can use it to establish a grading system for liveness judgment and liveness grading, yielding richer liveness recognition results.
In detail, with reference to FIG. 3, the specific process in step S12 of determining the part motion from the degree of change of the acquired key-point positions is as follows.
Mouth-motion detection: key points 61–68 of the obtained 68-point face model represent the mouth of the face under test. The mouth length is defined as the maximum x coordinate of these eight key points minus the minimum x coordinate, and the mouth width as the maximum y coordinate minus the minimum y coordinate. The mouth value is the mouth length divided by the mouth width. Thresholds a1 and a2 are set with a1 < a2: a mouth value below a1 indicates that the mouth is open, and a mouth value above a2 indicates that the mouth is closed. If, among the video frames extracted from the face video of the face under test, some frames indicate an open mouth and other frames indicate a closed mouth, the mouth is judged to be in motion.
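A minimal sketch of this mouth-motion rule, assuming landmarks are given as (x, y) tuples; the threshold values a1 = 2.0 and a2 = 3.0 and the sample coordinates are illustrative assumptions (the embodiment does not fix them).

```python
def mouth_value(points):
    """points: the 8 mouth key points (61-68) as (x, y) tuples.
    Value = length (x extent) / width (y extent); a small value
    means a tall mouth, i.e. open."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) / (max(ys) - min(ys))

def mouth_state(points, a1=2.0, a2=3.0):
    """'open' below a1, 'closed' above a2, otherwise indeterminate."""
    v = mouth_value(points)
    if v < a1:
        return "open"
    if v > a2:
        return "closed"
    return "unknown"

def mouth_moving(frames):
    """Motion iff both an open-mouth frame and a closed-mouth
    frame occur among the extracted frames."""
    states = {mouth_state(f) for f in frames}
    return "open" in states and "closed" in states

# Hypothetical landmark sets: a closed mouth (small y extent)
# and an open mouth (large y extent).
closed_pts = [(0, 0), (60, 0), (30, 15), (30, 5), (10, 7), (50, 7), (20, 3), (40, 12)]
open_pts   = [(0, 0), (60, 0), (30, 40), (30, 5), (10, 7), (50, 7), (20, 3), (40, 12)]
moving = mouth_moving([closed_pts, open_pts])   # True
```

Because both an open state and a closed state appear across the frames, the mouth is judged to be in motion.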
Eye-motion detection: key points 37–48 of the obtained 68-point face model represent the eyes of the face under test, where key points 37–42 represent the right eye and 43–48 the left eye. The right-eye length is defined as the maximum x coordinate of the six right-eye key points minus the minimum x coordinate, and the right-eye width as the maximum y coordinate minus the minimum y coordinate; the right-eye value is the right-eye length divided by the right-eye width, and the left-eye value is obtained likewise. Preferably, the eye value is defined as the average of the left-eye and right-eye values. Thresholds b1 and b2 are set with b1 < b2: an eye value below b1 indicates that the eyes are open, and an eye value above b2 indicates that the eyes are closed. If, among the video frames extracted from the face video of the face under test, some frames indicate open eyes and other frames indicate closed eyes, the eyes are judged to be in motion.
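The eye value follows the same length/width construction; a short sketch, with illustrative thresholds b1 = 4.0 and b2 = 7.0 and hypothetical coordinates (the embodiment fixes neither).

```python
def eye_value(pts6):
    """pts6: the six key points of one eye as (x, y) tuples.
    Value = length (x extent) / width (y extent); a large value
    means a flat eye, i.e. closed."""
    xs = [p[0] for p in pts6]
    ys = [p[1] for p in pts6]
    return (max(xs) - min(xs)) / (max(ys) - min(ys))

def eye_state(left6, right6, b1=4.0, b2=7.0):
    """Preferred variant: average the left- and right-eye values,
    then threshold ('open' below b1, 'closed' above b2)."""
    v = (eye_value(left6) + eye_value(right6)) / 2
    if v < b1:
        return "open"
    if v > b2:
        return "closed"
    return "unknown"

# Hypothetical six-point eyes: open (tall) vs. closed (flat).
open_eye   = [(0, 0), (30, 0), (5, 4), (25, 4), (15, 10), (15, -2)]   # ratio 2.5
closed_eye = [(0, 0), (30, 0), (5, 1), (25, 1), (15, 3), (15, -1)]    # ratio 7.5
```

An "open" frame and a "closed" frame among the extracted frames would then mark the eyes as in motion, exactly as in the mouth case.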
In this embodiment, besides the preferred implementation that averages the left-eye and right-eye values into a single eye value for judging the motion, the right-eye value and/or the left-eye value can be used directly to judge the corresponding right-eye and/or left-eye motion. That is, eye motion can be split into four flows: left eye then right eye, right eye then left eye, left eye only, and right eye only. With more eye-motion flows, the whole liveness process becomes more variable, which further improves the security of liveness detection.
Head-motion detection: the six left-eye key points, the six right-eye key points, and key points 34, 49, and 55 of the obtained 68-point face model are used to detect head motion. Point A is defined by taking the average x coordinate of the six left-eye key points as its x coordinate and their average y coordinate as its y coordinate; point B is defined likewise for the right eye; and key points 34, 49, and 55 of the 68-point model are taken as points C, D, and E. Points A through E form a five-point model of the facial feature points. The pinhole camera model in OpenCV, an open-source image library, is then applied to this five-point model to obtain the orientation of the face in three-dimensional space: the yaw angle and the pitch angle. Four kinds of head motion are considered: turning left, turning right, raising the head, and lowering the head. Thresholds c1 and c2 are set with c1 < c2: yaw < c1 indicates a left turn, and yaw > c2 a right turn. Thresholds d1 and d2 are set with d1 < d2: pitch < d1 indicates a lowered head, and pitch > d2 a raised head. When the yaw value lies between c1 and c2 and d1 < pitch < d2, the head faces straight ahead. If, among the video frames extracted from the face video of the face under test, some frames indicate a raised head and other frames indicate a head facing straight ahead, the face under test has performed a head-raising action, i.e. the head is judged to be in motion; by analogy, head-lowering, left-turn, and right-turn actions of the face under test can likewise be detected to judge head motion.
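Only the threshold logic over the yaw/pitch angles is sketched here; in practice the angles would come from a pose solver such as OpenCV's cv2.solvePnP applied to the five-point model A–E, and the threshold values below (in degrees) are illustrative assumptions.

```python
def head_state(yaw, pitch, c1=-15.0, c2=15.0, d1=-10.0, d2=10.0):
    """Classify head pose from yaw/pitch in degrees.
    Thresholds c1 < c2 (yaw) and d1 < d2 (pitch) are assumed values."""
    if yaw < c1:
        return "left"
    if yaw > c2:
        return "right"
    if pitch < d1:
        return "down"
    if pitch > d2:
        return "up"
    return "frontal"

def head_moving(poses):
    """poses: (yaw, pitch) per extracted frame. Motion iff both a
    frontal frame and at least one non-frontal frame occur."""
    states = {head_state(y, p) for y, p in poses}
    return "frontal" in states and len(states) > 1

moving = head_moving([(0, 0), (0, 20)])   # frontal, then head raised
```

A frame sequence containing both a frontal pose and, say, a raised-head pose is judged as head motion; a sequence of identical poses is not.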
Correspondingly, step S2 obtains the corresponding motion scores from the part motions determined by the above detection processes, specifically:
For mouth motion: if the mouth is in motion, the mouth-motion score is 1; if the mouth is not in motion, the score is 0.
For eye motion, the corresponding motion score is obtained as follows:
if the eyes are judged to be in motion, the eye-motion score is 1; if the eyes are judged not to be in motion, the score is 0.
For head motion: if the head of the face under test performs any one of the head-raising, head-lowering, left-turn, or right-turn actions, the head is judged to be in motion and the head-motion score is 1; if it performs none of these actions, the head is not in motion and the score is 0.
In a specific implementation, the 68 facial key points are first obtained for each video frame extracted at the preset frame interval from the face video of the face under test; from these, the eye, mouth, and head key-point positions corresponding to the eye, mouth, and head motions to be detected are taken, determining the eye, mouth, and head states of each frame. The eye, mouth, and head motions are then determined from the states observed across the extracted frames, and a motion score is obtained for each part motion: 1 if the part is in motion, otherwise 0. The weighted sum of the part motion scores is computed as the liveness score, and finally the liveness confidence is computed as the ratio of the liveness score to the maximum liveness score. When the liveness confidence is not less than the preset value, the liveness score is determined to be not less than the preset threshold and the face under test is judged to be a live body; otherwise, the face under test is judged not to be a live body.
This embodiment can be applied on many kinds of devices; an application on a mobile phone is taken as an example. During liveness recognition on the phone, a random sequence of liveness actions is requested — for example, the face under test is asked to turn the head left, blink, and open the mouth in turn. Suppose the preset weights are w1 = 3 for the mouth motion (opening the mouth), w2 = 2 for the eye motion (blinking), and w3 = 1 for the head motion (turning left); the maximum liveness score is then s_max = 3×1 + 2×1 + 1×1 = 6. Suppose the detected scores are 1 for opening the mouth, 1 for blinking, and 0 for turning the head left. The liveness score s is the weighted sum of the part motion scores: s = 3×1 + 2×1 + 1×0 = 5. Finally, the liveness confidence is f = s / s_max = 5/6 = 83.33%. If the preset value e is set to 80%, the face under test is judged to be a live body, with a liveness confidence of 83.33%.
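The phone-side example can be reproduced numerically; the figures below are exactly those in the text.

```python
# Weighted score fusion for the phone-side example.
weights = {"mouth": 3, "eye": 2, "head": 1}   # preset weights w1, w2, w3
scores  = {"mouth": 1, "eye": 1, "head": 0}   # detected motion scores

s_max = sum(weights.values())                          # 6: all motions detected
s = sum(weights[m] * scores[m] for m in weights)       # 3*1 + 2*1 + 1*0 = 5
f = s / s_max                                          # 5/6 ≈ 83.33%

e = 0.80                                               # preset value (80%)
print(f"score={s}, confidence={f:.2%}, live={f >= e}")
# score=5, confidence=83.33%, live=True
```

Since 83.33% ≥ 80%, the face is judged to be a live body, matching the worked example.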
This embodiment overcomes the single-algorithm, low-security limitation of the prior art and offers strong extensibility. Detection of the part motions of the face under test can be realized from two-dimensional images, so the hardware requirements of the device are modest. Moreover, this embodiment detects eye, mouth, and head motion for liveness recognition; these motions are pronounced, so the motion judgment is highly accurate. Weighting the different part motions before fusing the scores yields high liveness recognition accuracy, and detecting multiple part motions helps improve security.
Embodiment 2 of the liveness recognition method of the present invention: for its main flow, see steps S1–S4 of Embodiment 1 in FIG. 1; for the steps included in step S4 of this embodiment, see the flow of steps S41–S43 of Embodiment 1 in FIG. 4; the setting of the motion weights in step S3 may also follow Embodiment 1. Details are not repeated here.
Step S1 of Embodiment 2 follows the flow of FIG. 2 in Embodiment 1 and likewise includes steps S11–S12:
S11: detecting, in each video frame extracted at a preset frame interval from the face video of the face under test, the key-point positions corresponding to the part motion;
Referring to FIG. 3, a 68-point model of the face under test: the dlib library, a cross-platform general-purpose library written in C++, is applied to the consecutive or skipped frames extracted from the face video to perform face detection and facial key-point detection, yielding the 68 key points of each extracted video frame; the key-point positions corresponding to the desired part motion can then be taken from these 68 key points.
S12: determining the part motion from the degree of change of those key-point positions across the extracted video frames.
The difference is that in Embodiment 2, with reference to FIG. 3, the specific implementation in step S12 of determining the part motion from the degree of change of the acquired key-point positions is as follows.
Mouth-motion detection: key points 61–68 of the obtained 68-point face model represent the mouth of the face under test, and a mouth-state classification model trained in advance with an SVM classifier predicts the mouth state of each frame of the face video. The model is pre-trained as follows: key points 61–68 of the 68-point face model are taken as the mouth features; a number of face photos with open mouths are manually selected and labeled with mouth state 1, and a number of face photos with closed mouths are manually selected and labeled with mouth state 0; the SVM classifier is then trained on these to obtain the mouth-state classification model. If the predicted mouth states of the extracted video frames contain both 0 and 1, the mouth is judged to be in motion; otherwise the mouth is judged not to be in motion.
In another implementation, key points 61–68 of the obtained 68-point face model represent the mouth of the face under test, and a mouth-state classification model trained in advance with a soft-max regression classifier predicts a mouth-state score for each frame of the face video. The model is pre-trained as follows: a number of face photos are labeled according to how far the mouth is open, i.e. each mouth is assigned a state score by its degree of opening; the scores may be divided into 10 levels with values between 0 and 1, so that a closed mouth scores 0, a fully open mouth scores 1, and a half-open mouth scores 0.5. With this model, mouth-state scores are obtained for the video frames extracted from the face video of the face under test; if the difference between the maximum and minimum mouth-state scores exceeds a preset threshold, the mouth is judged to be in motion; otherwise the mouth is judged not to be in motion.
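The final motion decisions of both variants reduce to simple rules over the per-frame predictions; the classifier training itself (keypoint features plus an SVM or soft-max regression model) is omitted here, and the sample states, scores, and threshold are illustrative.

```python
def moving_from_binary_states(states):
    """SVM variant: per-frame states are 0 (closed) or 1 (open);
    motion iff both values occur among the extracted frames."""
    return 0 in states and 1 in states

def moving_from_graded_scores(scores, threshold):
    """Soft-max variant: per-frame scores in [0, 1] grade the degree
    of opening; motion iff the score range exceeds the threshold."""
    return max(scores) - min(scores) > threshold

print(moving_from_binary_states([0, 0, 1, 0]))           # True
print(moving_from_graded_scores([0.1, 0.5, 0.9], 0.5))   # True: range 0.8
print(moving_from_graded_scores([0.4, 0.5], 0.5))        # False: range 0.1
```

The graded variant is less brittle than the binary one: a slow, partial opening still produces a large score range even if no frame crosses a hard open/closed boundary.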
Eye-motion detection: key points 37–48 of the obtained 68-point face model represent the eyes of the face under test, where key points 37–42 represent the right eye and 43–48 the left eye. An eye-state classification model trained in advance with an SVM classifier predicts the eye state of each frame of the face video. The model is pre-trained as follows: key points 37–48 of the 68-point face model are taken as the eye features; a number of face photos with open eyes are manually selected and labeled with eye state 1, and a number of face photos with closed eyes are manually selected and labeled with eye state 0; the SVM classifier is then trained on these to obtain the eye-state classification model. If the predicted eye states of the extracted video frames contain both 0 and 1, the eyes are judged to be in motion; otherwise the eyes are judged not to be in motion.
In another embodiment, the twelve keypoints 37-48 of the obtained 68-point face model represent the eyes of the face to be tested, and an eye-state classification model trained in advance with a soft-max regression classifier predicts an eye-state score for each frame of the face video of the face to be tested. The pre-training process of this model is as follows: a number of face photos are annotated according to the degree of eye opening, that is, each eye is given a state score according to how far it is open. The score may be divided into 10 levels with values between 0 and 1; closed eyes then score 0, maximally open eyes score 1, and half-open eyes score 0.5. Using the pre-trained eye-state classification model, eye-state scores are obtained for the video frames extracted from the face video of the face to be tested; when the difference between the maximum and the minimum of these eye-state scores is greater than a preset threshold, the eyes are judged to have moved; otherwise they have not.
In this second embodiment, besides the preferred implementation that defines the eye value as the average of the left-eye and right-eye values and judges eye movement from that combined value, the right-eye value and/or the left-eye value can also be used directly to judge the corresponding right-eye and/or left-eye movement; that is, eye movement is expanded into four flows: left eye then right eye, right eye then left eye, left eye only, and right eye only. With more eye-movement flows, the living body procedure as a whole becomes more variable, which further increases the security of living body detection.
Head movement comprises four cases: turning the head left, turning it right, raising it and lowering it. Taking head raising as an example, the head-movement detection process is as follows: a head-state classification model trained in advance with an SVM classifier predicts the head state of each frame of the face video of the face to be tested. The pre-training process of this model is as follows: fifteen keypoint positions of the 68-point face model, namely the six keypoints representing the left eye, the six keypoints representing the right eye, and keypoints 34, 49 and 55, represent the head features of the face to be tested; a number of face photos with the head raised are manually selected and labeled with head state 1, and a number of face photos with the head in the normal forward position are manually selected and labeled with head state 0; an SVM classifier is then trained on them to obtain the head-state classification model. If the head states of the extracted video frames include both 0 and 1, the head is judged to have moved; otherwise it has not.
In another embodiment, the same fifteen keypoint positions of the obtained 68-point face model (the six keypoints representing the left eye, the six keypoints representing the right eye, and keypoints 34, 49 and 55) represent the head of the face to be tested, and a head-state classification model trained in advance with a soft-max regression classifier predicts a head-state score for each frame of the face video. The pre-training process of this model is as follows: a number of face photos are annotated according to the degree of head raising, that is, each head is given a state score according to how far it is raised. The score may be divided into 10 levels with values between 0 and 1; a head facing normally forward then scores 0, a maximally raised head scores 1, and a half-raised head scores 0.5. Using the pre-trained head-state classification model, head-state scores are obtained for the video frames extracted from the face video of the face to be tested; when the difference between the maximum and the minimum of these head-state scores is greater than a preset threshold, the head is judged to have moved; otherwise it has not.
Similarly, the detection processes for the other three head movements (turning left, turning right and lowering the head) are analogous to the head-raising detection process described above and are not repeated here.
Correspondingly, in step S2 the corresponding motion scores are obtained from the part movements determined by the detection processes above; specifically:
For mouth movement: if the mouth is judged to have moved, the mouth-movement motion score is 1; if not, it is 0.

For eye movement: if the eyes are judged to have moved, the eye-movement motion score is 1; if not, it is 0.

For head movement: if the head is judged to have moved, the head-movement motion score is 1; if not, it is 0.
In this embodiment, step S1 may also obtain the degree of movement of each part, in which case step S2 assigns a motion score between 0 and 1 based on that degree rather than only the two values 1 and 0; this alternative implementation indicates not only whether a part moved but also how much it moved.
In a specific implementation, the 68 face keypoints are first obtained for each video frame extracted from the face video of the face to be tested at every preset number of frames, from which the eye, mouth and head keypoint positions to be detected are obtained, thereby determining the eye, mouth and head states of each frame. Eye movement, mouth movement and head movement are then determined from the eye, mouth and head states across the extracted video frames, and the corresponding motion score is obtained for each part movement. Next, the weighted sum of the part motion scores is computed; this sum is the living body recognition score. Finally, the living body recognition confidence is computed as the ratio of the living body recognition score to the maximum living body recognition score. When the confidence is not less than a preset value, the living body recognition score is not less than the preset threshold, and the face to be tested is judged to be a living body; otherwise it is judged to be a non-living body.
This second embodiment can be applied on many kinds of device; an implementation scenario on a mobile phone is taken as an example. During living body recognition on the phone, a random sequence of living body actions is presented, for example requiring the face to be tested to turn the head left, blink and open the mouth. Suppose the preset part-movement weights are w1 = 3 for the mouth movement (mouth opening), w2 = 2 for the eye movement (blinking) and w3 = 1 for the head movement (turning left). The maximum living body recognition score is then s_max = 3*1 + 2*1 + 1*1 = 6. Suppose the detected mouth-opening score is 1, the blink score is 1 and the head-turn score is 0. The living body recognition score s is the weighted sum of the part motion scores: s = 3*1 + 2*1 + 1*0 = 5. Finally, the living body recognition confidence is f = s/s_max = 5/6 = 83.33%. If the preset value e is set to 80%, the face to be tested is judged to be a living body, with a confidence of 83.33%.
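The weighted score fusion and confidence computation of this scenario can be sketched as follows; the dictionaries and the function name are illustrative assumptions:

```python
def living_body_confidence(weights, scores):
    """Weighted fusion of part motion scores (steps S2-S4).

    weights: per-part weights, e.g. mouth 3, eye 2, head 1.
    scores:  per-part motion scores in [0, 1].
    Returns (living body recognition score s, confidence s / s_max).
    """
    s_max = sum(weights.values())                        # highest possible score
    s = sum(weights[part] * scores[part] for part in weights)
    return s, s / s_max

weights = {'mouth': 3, 'eye': 2, 'head': 1}
scores = {'mouth': 1, 'eye': 1, 'head': 0}               # head turn not detected
s, f = living_body_confidence(weights, scores)           # s = 5, f = 5/6
```

With e = 80%, f = 5/6 ≈ 83.33% is not less than e, so the face is judged to be a living body, matching the figures above.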
This second embodiment solves the prior-art problems of a single algorithm and low security, and is highly extensible. Detection of the part movements of the face to be tested can be performed on two-dimensional images, so the hardware requirements on the device are low. In addition, this embodiment performs living body recognition by detecting eye movement, mouth movement and head movement; these parts move conspicuously, so movement can be judged accurately. Weighting the different part movements and then fusing their scores gives high living body recognition accuracy, and detecting multiple part movements helps improve security.
Embodiment 3 of the living body recognition method of the present invention is now described. The main flow of this third embodiment follows steps S1-S4 of Embodiment 1 in Fig. 1, and the sub-steps of step S4 of this third embodiment follow the flow of steps S41-S43 of Embodiment 1 shown in Fig. 4; for these parts, refer to Embodiment 1 above, and they are not repeated here.
Generally, the eye, mouth and head movements of a face are pronounced, which makes them easy to detect, and the computation is simple and efficient. In this third embodiment, the part movements of the face to be tested detected in step S1 include eye movement, mouth movement and head movement, and additionally at least one of facial movement, eyebrow movement and forehead movement.
Detecting at least two part movements of the face to be tested in step S1 includes detecting, for each video frame extracted from the face video at every preset number of frames, the keypoint positions corresponding to the part movements. See Fig. 3, which shows the 68-point model of the face to be tested. Specifically, the dlib library is used on the extracted consecutive or skipped frames for face detection and face keypoint detection of the face to be tested, yielding the 68 keypoints of each extracted video frame; the keypoint positions corresponding to the required part movements can then be taken from these 68 keypoints. In addition, step S1 includes face detection on each video frame of the face to be tested to obtain the face bounding rectangle, shown as rectangle HIJK in Fig. 3.
In this third embodiment, the weights corresponding to the part movements in step S3 are preferably set according to how conspicuous each part movement is. This embodiment adopts the usual strategy, weighting the part movements as mouth movement > eye movement > head movement; the weights set for the at least one of facial movement, eyebrow movement and forehead movement are all smaller than those of the mouth, eye and head movements.
Alternatively, another preferred implementation of setting the weights in step S3 adjusts the part-movement weights automatically for different application scenarios. Specifically, in a given scenario, normal input videos of the various part movements of faces to be tested are collected as positive samples and attack videos as negative samples, and (number of positive samples passed + number of negative samples rejected) / (total positive samples + total negative samples) is taken as the detection accuracy of that part movement. The part movements are then sorted by accuracy in descending order, and the weights are reassigned in the same descending order. The readjusted weights are used to compute the living body recognition score, so the recognition result adapts to the part-movement detection accuracy of each scenario, increasing the accuracy of the living body recognition result of this embodiment.
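The per-scenario weight readjustment can be sketched as follows; the accuracy figures and function name are illustrative assumptions:

```python
def readjust_weights(weights, accuracy):
    """Reassign the existing weight values so that the most accurately
    detected part movement in the current scenario receives the largest
    weight, the next most accurate the next largest, and so on.

    weights:  current per-part weights.
    accuracy: per-part (positives passed + negatives rejected) / total.
    """
    parts_by_accuracy = sorted(accuracy, key=accuracy.get, reverse=True)
    weight_values = sorted(weights.values(), reverse=True)
    return dict(zip(parts_by_accuracy, weight_values))

# In a dim scene, suppose eye detection proves most reliable:
accuracy = {'mouth': 0.82, 'eye': 0.95, 'head': 0.70}
new_weights = readjust_weights({'mouth': 3, 'eye': 2, 'head': 1}, accuracy)
# eye now carries weight 3, mouth 2, head 1
```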
Either of the two preferred implementations above for setting the weight corresponding to each part movement falls within the protection scope of this embodiment.
For detecting the mouth, eye and head movements of the face to be tested in step S1, and for obtaining the motion score corresponding to each part movement in step S2, refer to the corresponding processes in Embodiments 1 and 2 of the living body recognition method of the present invention, which are not repeated here. Besides those implementations, this third embodiment may also detect mouth movement and eye movement in the following alternative ways:
An alternative implementation of the mouth-movement detection process: for each video frame extracted from the face video at every preset number of frames, the mouth position of the face to be tested is detected and the mean gray level of the mouth position is computed; if the mean gray level is less than a preset mouth gray-level threshold, the mouth is closed, otherwise it is open. This alternative exploits the fact that an open mouth exposes the teeth, which are largely white and thus have high gray values, so the mean gray level of an open mouth is high while that of a closed mouth is low; the mouth state is therefore recognized from the mean gray level of the mouth, and mouth movement is judged accordingly. Among the extracted video frames of the face video, if some frames show the mouth open and other frames show it closed, the mouth is judged to have moved.
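The mean-gray decision can be sketched as follows; the threshold of 90 is an illustrative stand-in for the preset mouth gray-level threshold:

```python
def mouth_state_from_gray(mouth_pixels, gray_threshold=90):
    """Classify the mouth as open or closed from its mean gray level.

    mouth_pixels: gray values (0-255) of the detected mouth region; an
    open mouth exposes bright teeth, raising the mean gray level.
    """
    mean_gray = sum(mouth_pixels) / len(mouth_pixels)
    return 'closed' if mean_gray < gray_threshold else 'open'
```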
Correspondingly, in this alternative implementation the mouth-movement motion score is obtained as follows: if the mouth is judged to have moved, the score is 1; otherwise the mouth is judged not to have moved and the score is 0.
Another alternative implementation of the mouth-movement detection process: besides opening and closing, mouth movement may also include movement of the mouth corners; for example, when a face smiles, the two mouth corners spread outward toward the cheeks. Keypoint 55 of the obtained 68-point face model represents the left mouth corner and keypoint 49 the right mouth corner. Taking the mouth-corner positions in the first frame of the face video of the face to be tested as the reference, the distances moved by the left and right mouth corners in the subsequently extracted frames are computed; if both distances are greater than a preset threshold, the mouth-movement state is judged to be a smile, otherwise it is the normal state. Among the extracted video frames, if some frames show the smile state and other frames show the normal state, the mouth is judged to have moved.
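The corner-displacement test can be sketched as follows; the 8-pixel threshold and function name are illustrative assumptions:

```python
import math

def smile_state(first_corners, frame_corners, move_threshold=8.0):
    """Detect the smile state from mouth-corner displacement.

    first_corners: ((x, y) of keypoint 55, (x, y) of keypoint 49) in the
                   first frame of the face video, used as the reference.
    frame_corners: the same two corner points in a later extracted frame.
    Both corners must move farther than the preset threshold for the
    state to be judged a smile.
    """
    distances = [math.dist(ref, cur)
                 for ref, cur in zip(first_corners, frame_corners)]
    return 'smile' if all(d > move_threshold for d in distances) else 'normal'
```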
An alternative implementation of the eye-movement detection process, described for Asian subjects, whose pupils are generally black and whose eyelid skin is yellowish: for each video frame extracted from the face video at every preset number of frames, the eye position of the face to be tested is detected, the eyeball position is determined from the eye position, and the mean gray level of the eyeball position is computed; if this mean is less than a preset eyeball gray-level threshold, the eye is open, otherwise it is closed. This alternative exploits the different mean gray levels detected at the eyeball position when the eye is open versus closed: for Asian subjects the mean gray level at the eyeball position is typically low when the eye is open and high when it is closed. Among the extracted video frames, if some frames show the eyes open and other frames show them closed, the eyes are judged to have moved.
Correspondingly, in this alternative implementation the eye-movement motion score is obtained as follows: if the eyes are judged to have moved, the score is 1; if not, the score is 0.
Another alternative implementation of the eye-movement detection process: for each video frame extracted from the face video at every preset number of frames, the eyeball center of the eye of the face to be tested is detected and its relative position within the eye is computed; if the distance between this relative position and the normal relative position of the eyeball center within the eye is greater than a preset value, the eyeball is not in its normal position, otherwise it is. Among the extracted video frames, if some frames show the eyeball out of its normal position and other frames show it in the normal position, the eye movement of the face to be tested is an eyeball rotation, and the eyes are judged to have moved; otherwise they have not.
Detecting the part movements of the face to be tested in step S1 of this third embodiment further includes detecting at least one of facial movement, eyebrow movement and forehead movement; the processes for detecting them are as follows:
The facial-movement detection process: the eyes, mouth and face region of the face to be tested are determined, and the ratio of the sum of the eye area and the mouth area to the face-region area is computed; if this ratio lies within a preset range, the face state is normal, otherwise it is a grimace. Among the extracted video frames of the face video, if some frames show the grimace state and other frames show the normal state, the face is judged to have moved; facial movement here includes grimacing. This embodiment defines the grimace state as the ratio of the sum of the eye area and the mouth area to the face-region area falling outside the preset range, and the normal state otherwise; when both the grimace state and the normal state are detected, the face is judged to have made a grimace, i.e. the face has moved. As an example of computing the areas: the eye area is the eye length times the eye width, the mouth area is the mouth length times the mouth width, and the face-region area is the area of the face rectangle HIJK.
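The area-ratio test can be sketched as follows; the ratio range (0.02, 0.08) is an illustrative stand-in for the preset range, and the areas follow the length-times-width approximation above:

```python
def face_state(eye_area, mouth_area, face_area, ratio_range=(0.02, 0.08)):
    """Classify the face as normal or grimace from the ratio
    (eye area + mouth area) / face-region area.
    """
    ratio = (eye_area + mouth_area) / face_area
    return 'normal' if ratio_range[0] <= ratio <= ratio_range[1] else 'grimace'

# Neutral face: eyes 60*10 = 600, mouth 50*20 = 1000, face 200*200 = 40000
# gives ratio 0.04 (normal); a wide-open grimace with mouth 80*40 = 3200
# gives ratio 0.095 (grimace).
```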
Correspondingly, the facial-movement motion score is obtained as follows: if the face is judged to have moved, the score is 1; otherwise the face is judged not to have moved and the score is 0.
The eyebrow-movement detection process: the five keypoints 18-22 of the obtained 68-point face model represent the right eyebrow and the five keypoints 23-27 the left eyebrow. A curve is fitted numerically to each eyebrow; the curvature at keypoint 20 of the right eyebrow is taken as the right-eyebrow feature value, the curvature at keypoint 25 of the left eyebrow as the left-eyebrow feature value, and the average of the two as the eyebrow feature value. If the eyebrow feature value is greater than a preset threshold, the eyebrows are in the raised state; otherwise they are normal. Among the extracted video frames, if some frames show the eyebrows raised and other frames show them normal, the eyebrows are judged to have moved; otherwise they have not.
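One possible numerical fit for the curvature feature is sketched below, assuming a quadratic fit through the five brow keypoints; the specification does not fix the fitting method, so this is only one reasonable choice:

```python
import numpy as np

def eyebrow_feature(brow_points):
    """Curvature-based feature value for one eyebrow.

    brow_points: the five (x, y) keypoints of one brow (18-22 or 23-27).
    A quadratic y = a*x**2 + b*x + c is least-squares fitted, and the
    plane-curve curvature |2a| / (1 + (2*a*x + b)**2) ** 1.5 is
    evaluated at the middle keypoint (keypoint 20 or 25).
    """
    xs, ys = np.asarray(brow_points, dtype=float).T
    a, b, _c = np.polyfit(xs, ys, 2)
    slope = 2 * a * xs[2] + b
    return abs(2 * a) / (1 + slope ** 2) ** 1.5
```

A flat brow yields a feature value near 0, while an arched (raised) brow yields a clearly larger value that can be compared against the preset threshold.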
Correspondingly, the eyebrow-movement motion score is obtained as follows: if the eyebrows are judged to have moved, the score is 1; if not, the score is 0.
The forehead-movement detection process: the forehead position is determined from the obtained 68-point face model, the Sobel operator is then applied to compute the Sobel values of the forehead region, and the variance of those Sobel values is taken as the forehead wrinkle value. The Sobel value of a pixel here is the result of convolving the region of the same size as the convolution kernel, centered on that pixel, with the vertical-direction kernel. Among the extracted video frames of the face video, if the forehead wrinkle value of some frames is greater than a first preset threshold and that of other frames is less than a second preset threshold, the forehead is judged to have moved; otherwise it has not. As an example of determining the forehead region: the forehead usually refers to the area of the face above the eyebrows, so based on this definition the eyebrow keypoint positions can be obtained first and the forehead region determined from the face rectangle and the eyebrow keypoint positions, as shown by rectangle HOPK in Fig. 3.
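The wrinkle-value computation can be sketched as follows; the explicit per-pixel loop is an illustrative implementation of the vertical-Sobel response described above:

```python
import numpy as np

def forehead_wrinkle_value(gray):
    """Variance of vertical-Sobel responses over the forehead region.

    gray: 2-D array of gray levels for the forehead rectangle (HOPK in
    Fig. 3). Each interior pixel's Sobel value is the sum-product of its
    3x3 neighbourhood with the vertical Sobel kernel; wrinkled skin
    produces strong, uneven edge responses and hence a large variance.
    """
    kernel = np.array([[-1, -2, -1],
                       [ 0,  0,  0],
                       [ 1,  2,  1]], dtype=float)
    h, w = gray.shape
    sobel_values = [
        float((gray[i - 1:i + 2, j - 1:j + 2] * kernel).sum())
        for i in range(1, h - 1)
        for j in range(1, w - 1)
    ]
    return float(np.var(sobel_values))
```

A uniform (smooth) forehead gives a wrinkle value of 0, while a region with a horizontal intensity edge gives a positive value.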
Correspondingly, the forehead-movement motion score is obtained as follows: if the forehead is judged to have moved, the score is 1; if not, the score is 0.
In this third embodiment, besides the implementation above that directly yields a binary moved/not-moved motion score for each part movement, a motion score between 0 and 1 may instead be obtained according to the degree of movement of each part, rather than only the two values 1 and 0; this alternative indicates not only whether a part moved but also how much. The third embodiment implemented with this alternative also falls within the protection scope of the present invention.
In a specific implementation, each video frame extracted from the face video of the face to be tested at every preset number of frames is first processed to obtain the face keypoints, from which the keypoint positions of each part movement, and hence the feature states of the corresponding parts, are obtained. The movement of each part is then judged from the part feature states across the extracted frames, and the corresponding motion score is obtained. Next, the weighted sum of the part motion scores is computed; this sum is the living body recognition score. Finally, the living body recognition confidence is computed as the ratio of the living body recognition score to the maximum living body recognition score; when the confidence is not less than a preset value, the living body recognition score is not less than the preset threshold and the face to be tested is judged to be a living body; otherwise it is judged to be a non-living body.
本实施例三解决了现有技术中算法单一,安全性不高的问题,可扩展性强;对于待测人脸的部位运动的检测可以通过二维图像实现,对设备的硬件要求不高;另外,在本实施例三中采用对眼部运动、嘴部运动和头部运动的检测来进行活体识别,这几个部位的运动效果明显,运动判断的准确度高;同时扩展了面部运动、眉毛运动和额头运动这几个部位运动的检测,提高了识别结果的准确性;采用对不同部位运动加权再进行分数融合,活体识别准确度高;多种部位运动的检测,有利于提高安全性。The third embodiment solves the prior-art problems of a single algorithm and low security, and is highly extensible. The detection of the part motions of the face to be tested can be realized from two-dimensional images, so the hardware requirements on the device are low. In addition, the third embodiment uses the detection of eye motion, mouth motion and head motion for living body recognition; the motion of these parts is pronounced, so motion judgment is highly accurate. The detection of facial motion, eyebrow motion and forehead motion is further added, which improves the accuracy of the recognition result. Weighting the motions of different parts and then fusing the scores gives high living body recognition accuracy, and detecting the motions of multiple parts helps to improve security.
本发明一种活体识别系统提供的实施例,参见图5,图5为本实施例的结构示意图,本实施例包括:An embodiment of the present invention provides a living body identification system. Referring to FIG. 5, FIG. 5 is a schematic structural diagram of the embodiment. The embodiment includes:
至少2个部位运动检测单元1,每一部位运动检测单元1用于检测待测人脸对应的部位运动的情况,图5中,部位运动检测单元1a和部位运动检测单元1b表示检测两不同部位运动的两部位运动检测单元1。At least two part motion detecting units 1, each part motion detecting unit 1 being used to detect the motion of the corresponding part of the face to be tested. In FIG. 5, part motion detecting unit 1a and part motion detecting unit 1b represent two part motion detecting units 1 that detect the motions of two different parts.
部位运动分值单元2,用于基于每一部位运动的情况获取待测人脸的每一部位运动对应的运动分值;The part motion score unit 2 is configured to obtain a motion score corresponding to each part of the motion of the face to be tested based on the motion of each part;
活体识别分值计算单元3,用于计算所获取的每一部位运动对应的运动分值加权后的总和,并将计算得到的总和作为活体识别分值;其中,活体识别分值计算单元3已预设与每一部位运动相对应的权值。The living body recognition score calculation unit 3 is configured to calculate the weighted sum of the motion scores corresponding to the acquired part motions, and to use the calculated sum as the living body recognition score; wherein the living body recognition score calculation unit 3 is preset with a weight corresponding to each part motion.
活体判断单元4,用于判定活体识别分值不小于预设阈值的待测人脸为活体。The living body judging unit 4 is configured to determine that the human face to be tested whose living body recognition score is not less than a preset threshold is a living body.
其中,至少2个部位运动检测单元1对应检测的至少两部位运动包括眼部运动、嘴部运动、头部运动、眉毛运动、额头运动和面部运动中的至少两部位运动。The at least two part motions correspondingly detected by the at least two part motion detecting units 1 include at least two of eye motion, mouth motion, head motion, eyebrow motion, forehead motion and facial motion.
优选的,每一部位运动检测单元1包括:Preferably, each part of the motion detecting unit 1 comprises:
部位检测模块11,用于对待测人脸的人脸视频每隔预设帧数所抽取的每一视频帧检测部位运动对应的部位关键点位置;The part detecting module 11 is configured to detect, for each video frame extracted from the face video of the face to be tested at intervals of a preset number of frames, the part key point positions corresponding to the part motion;
部位运动情况获取模块12,用于通过抽取的每一视频帧的部位关键点位置的变化程度来确定部位运动的情况。The part motion condition obtaining module 12 is configured to determine the motion of the part through the degree of change of the part key point positions across the extracted video frames.
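A minimal sketch of what modules 11 and 12 might compute, assuming key point coordinates per extracted frame are already available (the patent does not name a landmark detector, and the helper names and threshold here are hypothetical):

```python
def keypoint_change(frames_keypoints):
    """Measure the degree of change of one part's key point positions
    across the extracted video frames (module 12's criterion).

    frames_keypoints: list of frames, each a list of (x, y) key points
    for the part, in the same order in every frame.
    """
    baseline = frames_keypoints[0]
    max_change = 0.0
    for frame in frames_keypoints[1:]:
        # Mean Euclidean displacement of the part's key points vs. the first frame.
        change = sum(
            ((x - bx) ** 2 + (y - by) ** 2) ** 0.5
            for (x, y), (bx, by) in zip(frame, baseline)
        ) / len(baseline)
        max_change = max(max_change, change)
    return max_change

def part_moved(frames_keypoints, threshold):
    """The part is deemed to have moved when the change exceeds a threshold."""
    return keypoint_change(frames_keypoints) > threshold

still = [[(0.0, 0.0), (1.0, 0.0)]] * 3
moved = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 2.0), (1.0, 2.0)]]
print(part_moved(still, 1.0), part_moved(moved, 1.0))  # False True
```

The resulting boolean (or the raw change value, for the graded variant) is what the part motion score unit 2 would then convert into a motion score.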
活体识别分值计算单元3中与每一部位运动相对应的权值为根据每一部位运动的明显度设定;或,活体识别分值计算单元3中与每一部位运动相对应的权值为根据在当前应用场景下每一部位运动的准确率设定。In the living body recognition score calculation unit 3, the weight corresponding to each part motion is set according to the visibility of each part motion; or, the weight corresponding to each part motion is set according to the accuracy of each part motion in the current application scenario.
活体判断单元4包括:The living body judging unit 4 includes:
活体识别置信度计算模块41,用于通过活体识别分值占活体识别总分的比值计算待测人脸的活体识别置信度;The living body recognition confidence calculation module 41 is configured to calculate a living body recognition confidence of the face to be tested by using a ratio of the living body recognition score to the total score of the living body recognition;
活体判断模块42,用于当活体识别置信度不小于预设值时,确定活体识别分值不小于预设阈值,判定活体识别分值不小于预设阈值的待测人脸为活体。The living body judging module 42 is configured to: when the living body recognition confidence is not less than the preset value, determine that the living body recognition score is not less than the preset threshold, and judge that the face to be tested whose living body recognition score is not less than the preset threshold is a living body.
具体实施时,首先,通过每一部位运动检测单元1的部位检测模块11检测所抽取的每一视频帧中对应部位的关键点位置,并通过部位运动情况获取模块12确定部位运动的运动情况,然后通过部位运动分值单元2基于部位运动的情况获取部位运动的运动分值;然后,通过活体识别分值计算单元3对获取的每一部位运动的运动分值进行加权后求和作为活体识别分值;最后,通过活体判断单元4的活体识别置信度计算模块41利用活体识别分值占活体识别总分的比值计算待测人脸的活体识别置信度,并通过活体判断模块42判定计算所得的活体识别置信度不小于预设值的待测人脸为活体。In a specific implementation, the part detecting module 11 of each part motion detecting unit 1 first detects the key point positions of the corresponding part in each extracted video frame, and the part motion condition obtaining module 12 determines the motion of the part; the part motion score unit 2 then obtains the motion score of the part motion based on that motion. Next, the living body recognition score calculation unit 3 weights and sums the motion scores of the acquired part motions as the living body recognition score. Finally, the living body recognition confidence calculation module 41 of the living body judging unit 4 calculates the living body recognition confidence of the face to be tested as the ratio of the living body recognition score to the living body recognition total score, and the living body judging module 42 judges the face to be tested to be a living body when the calculated living body recognition confidence is not less than the preset value.
本实施例采用至少2个部位运动检测单元解决了现有技术中算法单一,安全性不高的问题,可扩展性强,且基于人脸的部位运动的检测可以通过二维图像实现,对硬件要求不高,另外,通过活体识别分值计算单元对不同部位运动加权再进行分数融合,活体识别准确度高,获得了活体识别准确率高、硬件要求低和安全性高的有益效果。By using at least two part motion detecting units, this embodiment solves the prior-art problems of a single algorithm and low security and is highly extensible; the detection of face part motions can be realized from two-dimensional images, so the hardware requirements are low. In addition, the living body recognition score calculation unit weights the motions of different parts and then fuses the scores, giving high living body recognition accuracy. The beneficial effects of high recognition accuracy, low hardware requirements and high security are thus obtained.
以上所述是本发明的优选实施方式,应当指出,对于本技术领域的普通技术人员来说,在不脱离本发明原理的前提下,还可以做出若干改进和润饰,这些改进和润饰也视为本发明的保护范围。 The above are preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and such improvements and refinements are also regarded as falling within the protection scope of the present invention.

Claims (10)

  1. 一种活体识别方法,其特征在于,所述活体识别方法包括步骤:A living body identification method, characterized in that the living body identification method comprises the steps of:
    检测待测人脸的至少两部位运动的情况;Detecting the movement of at least two parts of the face to be tested;
    基于每一所述部位运动的情况获取所述待测人脸的每一部位运动对应的运动分值;Acquiring a motion score corresponding to each part of the motion of the face to be tested based on the motion of each of the parts;
    计算每一所述部位运动对应的运动分值加权后的总和,并将计算得到的所述总和作为活体识别分值;其中,每一所述部位运动已预设相应的权值;Calculating a weighted sum of the motion scores corresponding to each of the part motions, and using the calculated sum as a living body recognition score; wherein each of the part motions has preset a corresponding weight;
    判定所述活体识别分值不小于预设阈值的所述待测人脸为活体。The face to be tested whose living body recognition score is not less than a preset threshold is determined to be a living body.
  2. 如权利要求1所述的一种活体识别方法,其特征在于,所述至少两部位运动包括眼部运动、嘴部运动、头部运动、眉毛运动、额头运动和面部运动中的至少两部位运动。A living body recognition method according to claim 1, wherein said at least two parts of motion include at least two parts of eye movement, mouth movement, head movement, eyebrow movement, forehead movement, and facial movement .
  3. 如权利要求1所述的一种活体识别方法,其特征在于,所述检测待测人脸的至少两部位运动的情况包括步骤:The living body identification method according to claim 1, wherein the detecting the movement of at least two parts of the face to be tested comprises the steps of:
    对所述待测人脸的人脸视频每隔预设帧数所抽取的每一视频帧检测所述部位运动对应的部位关键点位置;Detecting a key point position of the part corresponding to the part motion for each video frame extracted by the face video of the face to be tested;
    通过所述抽取的每一视频帧的部位关键点位置的变化程度来确定所述部位运动的情况。The motion of the part is determined by the degree of change in the position of the key point of each of the extracted video frames.
  4. 如权利要求1所述的一种活体识别方法,其特征在于,每一所述部位运动相对应的权值为根据所述每一部位运动的明显度设定;或,每一所述部位运动相对应的权值为根据在当前应用场景下每一所述部位运动的准确率设定。A living body identification method according to claim 1, wherein a weight corresponding to each of said part movements is set according to a degree of visibility of said each part movement; or each of said part movements The corresponding weights are set according to the accuracy of each part of the motion in the current application scenario.
  5. 如权利要求1所述的一种活体识别方法,其特征在于,确定所述活体识别分值不小于预设阈值包括步骤:The living body identification method according to claim 1, wherein the determining that the living body recognition score is not less than a preset threshold comprises the steps of:
    通过所述活体识别分值占活体识别总分的比值计算所述待测人脸的活体识别置信度;Calculating, by the ratio of the living body recognition score to the total score of the living body recognition, the living body recognition confidence of the face to be tested;
    当所述活体识别置信度不小于预设值时,确定所述活体识别分值不小于预设阈值。When the living body recognition confidence is not less than a preset value, determining that the living body recognition score is not less than a preset threshold.
  6. 一种活体识别系统,其特征在于,所述活体识别系统包括: A living body identification system, characterized in that the living body identification system comprises:
    至少2个部位运动检测单元,每一所述部位运动检测单元用于检测待测人脸对应的部位运动的情况;At least two parts motion detecting units, each of the part motion detecting units is configured to detect a motion of a part corresponding to the face to be tested;
    部位运动分值获取单元,用于基于每一所述部位运动的情况获取所述待测人脸的每一部位运动对应的运动分值;a part motion score obtaining unit, configured to acquire a motion score corresponding to each part of the motion of the face to be tested based on the motion of each of the parts;
    活体识别分值计算单元,用于计算每一所述部位运动对应的运动分值加权后的总和,并将计算得到的所述总和作为活体识别分值;其中,所述活体识别分值计算单元已预设与每一所述部位运动相对应的权值;a living body recognition score calculation unit, configured to calculate a weighted sum of motion scores corresponding to each of the part motions, and use the calculated sum as a living body recognition score; wherein the living body recognition score calculation unit The weight corresponding to each of the part movements has been preset;
    活体判断单元,用于判定所述活体识别分值不小于预设阈值的所述待测人脸为活体。The living body judging unit is configured to determine that the human face to be tested whose living body recognition score is not less than a preset threshold is a living body.
  7. 如权利要求6所述的一种活体识别系统,其特征在于,至少2个所述部位运动检测单元中对应检测的至少两所述部位运动包括眼部运动、嘴部运动、头部运动、眉毛运动、额头运动和面部运动中的至少两部位运动。The living body identification system according to claim 6, wherein the at least two part motions correspondingly detected by the at least two part motion detecting units include at least two of eye movement, mouth movement, head movement, eyebrow movement, forehead movement and facial movement.
  8. 如权利要求6所述的一种活体识别系统,其特征在于,每一所述部位运动检测单元包括:A living body identification system according to claim 6, wherein each of said part motion detecting units comprises:
    部位检测模块,用于对所述待测人脸的人脸视频每隔预设帧数所抽取的每一视频帧检测所述部位运动对应的部位关键点位置;a part detecting module, configured to detect a key point position of the part corresponding to the part motion for each video frame extracted by the face video of the face to be tested;
    部位运动情况获取模块,用于通过所述抽取的每一视频帧的部位关键点位置的变化程度来确定所述部位运动的情况。The part motion condition obtaining module is configured to determine the motion of the part by the degree of change of the position of the key point of each part of the extracted video frame.
  9. 如权利要求6所述的一种活体识别系统,其特征在于,所述活体识别分值计算单元中与每一所述部位运动相对应的权值为根据所述每一部位运动的明显度设定;或,所述活体识别分值计算单元中与每一所述部位运动相对应的权值为根据在当前应用场景下每一所述部位运动的准确率设定。The living body identification system according to claim 6, wherein the weight corresponding to each part motion in the living body recognition score calculation unit is set according to the visibility of each part motion; or, the weight corresponding to each part motion in the living body recognition score calculation unit is set according to the accuracy of each part motion in the current application scenario.
  10. 如权利要求6所述的一种活体识别系统,其特征在于,所述活体判断单元包括:The living body identification system according to claim 6, wherein the living body determining unit comprises:
    活体识别置信度计算模块,用于通过所述活体识别分值占活体识别总分的比值计算所述待测人脸的活体识别置信度;a biometric recognition confidence calculation module, configured to calculate a living body recognition confidence of the human face to be tested by using a ratio of the living body recognition score to a living body recognition total score;
    活体判断模块,用于当所述活体识别置信度不小于预设值时,确定所述活体识别分值不小于预设阈值,判定所述活体识别分值不小于预设阈值的所述待测人脸为活体。 a living body judging module, configured to: when the living body recognition confidence is not less than a preset value, determine that the living body recognition score is not less than a preset threshold, and judge that the face to be tested whose living body recognition score is not less than the preset threshold is a living body.
PCT/CN2017/104612 2017-06-02 2017-09-29 Living body recognition method and system WO2018218839A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710406488.1 2017-06-02
CN201710406488.1A CN107358152B (en) 2017-06-02 2017-06-02 Living body identification method and system

Publications (1)

Publication Number Publication Date
WO2018218839A1 true WO2018218839A1 (en) 2018-12-06

Family

ID=60272209

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/104612 WO2018218839A1 (en) 2017-06-02 2017-09-29 Living body recognition method and system

Country Status (2)

Country Link
CN (1) CN107358152B (en)
WO (1) WO2018218839A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110321849A (en) * 2019-07-05 2019-10-11 腾讯科技(深圳)有限公司 Image processing method, device and computer readable storage medium
CN111523344A (en) * 2019-02-01 2020-08-11 上海看看智能科技有限公司 Human body living body detection system and method
CN113221771A (en) * 2021-05-18 2021-08-06 北京百度网讯科技有限公司 Living body face recognition method, living body face recognition device, living body face recognition equipment, storage medium and program product

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740429A (en) * 2017-11-30 2019-05-10 沈阳工业大学 Smiling face's recognition methods based on corners of the mouth coordinate mean variation
CN107977640A (en) * 2017-12-12 2018-05-01 成都电科海立科技有限公司 A kind of acquisition method based on vehicle-mounted recognition of face image collecting device
CN108446690B (en) * 2018-05-31 2021-09-14 北京工业大学 Human face in-vivo detection method based on multi-view dynamic features
CN109582139A (en) * 2018-11-21 2019-04-05 广东智媒云图科技股份有限公司 A kind of machine is interactive to start triggering method and system
CN109784302B (en) * 2019-01-28 2023-08-15 深圳信合元科技有限公司 Face living body detection method and face recognition device
TWI734454B (en) * 2020-04-28 2021-07-21 鴻海精密工業股份有限公司 Identity recognition device and identity recognition method
CN111860455B (en) * 2020-08-04 2023-08-18 中国银行股份有限公司 Living body detection method and device based on HTML5 page

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103440479A (en) * 2013-08-29 2013-12-11 湖北微模式科技发展有限公司 Method and system for detecting living body human face
CN104794464A (en) * 2015-05-13 2015-07-22 上海依图网络科技有限公司 In vivo detection method based on relative attributes
CN105224921A (en) * 2015-09-17 2016-01-06 桂林远望智能通信科技有限公司 A kind of facial image preferentially system and disposal route
CN105243378A (en) * 2015-11-13 2016-01-13 清华大学 Method and device of living body face detection on the basis of eyes information
CN105426815A (en) * 2015-10-29 2016-03-23 北京汉王智远科技有限公司 Living body detection method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100514353C (en) * 2007-11-26 2009-07-15 清华大学 Living body detecting method and system based on human face physiologic moving
CN104951730B (en) * 2014-03-26 2018-08-31 联想(北京)有限公司 A kind of lip moves detection method, device and electronic equipment
CN105989264B (en) * 2015-02-02 2020-04-07 北京中科奥森数据科技有限公司 Biological characteristic living body detection method and system
CN105335719A (en) * 2015-10-29 2016-02-17 北京汉王智远科技有限公司 Living body detection method and device
CN105243376A (en) * 2015-11-06 2016-01-13 北京汉王智远科技有限公司 Living body detection method and device
CN105740688B (en) * 2016-02-01 2021-04-09 腾讯科技(深圳)有限公司 Unlocking method and device

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111523344A (en) * 2019-02-01 2020-08-11 上海看看智能科技有限公司 Human body living body detection system and method
CN111523344B (en) * 2019-02-01 2023-06-23 上海看看智能科技有限公司 Human body living body detection system and method
CN110321849A (en) * 2019-07-05 2019-10-11 腾讯科技(深圳)有限公司 Image processing method, device and computer readable storage medium
CN110321849B (en) * 2019-07-05 2023-12-22 腾讯科技(深圳)有限公司 Image data processing method, device and computer readable storage medium
CN113221771A (en) * 2021-05-18 2021-08-06 北京百度网讯科技有限公司 Living body face recognition method, living body face recognition device, living body face recognition equipment, storage medium and program product
CN113221771B (en) * 2021-05-18 2023-08-04 北京百度网讯科技有限公司 Living body face recognition method, device, apparatus, storage medium and program product

Also Published As

Publication number Publication date
CN107358152A (en) 2017-11-17
CN107358152B (en) 2020-09-08

Similar Documents

Publication Publication Date Title
WO2018218839A1 (en) Living body recognition method and system
CN108182409B (en) Living body detection method, living body detection device, living body detection equipment and storage medium
CN107346422B (en) Living body face recognition method based on blink detection
WO2020119450A1 (en) Risk identification method employing facial image, device, computer apparatus, and storage medium
JP5010905B2 (en) Face recognition device
KR20220150868A (en) Method of motion vector and feature vector based fake face detection and apparatus for the same
US8891819B2 (en) Line-of-sight detection apparatus and method thereof
CN103440479B (en) A kind of method and system for detecting living body human face
US11715231B2 (en) Head pose estimation from local eye region
CN106682578B (en) Weak light face recognition method based on blink detection
CN111767900B (en) Face living body detection method, device, computer equipment and storage medium
CN110223322B (en) Image recognition method and device, computer equipment and storage medium
US20210271865A1 (en) State determination device, state determination method, and recording medium
CN107330370B (en) Forehead wrinkle action detection method and device and living body identification method and system
JP6822482B2 (en) Line-of-sight estimation device, line-of-sight estimation method, and program recording medium
CN109978884A (en) More people&#39;s image methods of marking, system, equipment and medium based on human face analysis
Rezaei et al. 3D cascade of classifiers for open and closed eye detection in driver distraction monitoring
Singh et al. Lie detection using image processing
CN104008364A (en) Face recognition method
CN103544478A (en) All-dimensional face detection method and system
CN111860394A (en) Gesture estimation and gesture detection-based action living body recognition method
CN108108651B (en) Method and system for detecting driver non-attentive driving based on video face analysis
Zhang et al. A novel efficient method for abnormal face detection in ATM
WO2023068956A1 (en) Method and system for identifying synthetically altered face images in a video
CN113408389A (en) Method for intelligently recognizing drowsiness action of driver

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17911871

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 03.04.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 17911871

Country of ref document: EP

Kind code of ref document: A1