Detailed Description
In order to make the objects, technical solutions and advantageous effects of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In order to illustrate the technical scheme of the invention, the following description is made by specific examples.
Embodiment one:
Referring to fig. 1, a face living body detection method provided by the first embodiment of the invention includes the following steps. It should be noted that, if substantially the same results are obtained, the face living body detection method of the present invention is not limited to the flow sequence shown in fig. 1.
S101, detecting faces in the images.
In the first embodiment of the present invention, S101 may specifically include the following steps:
S1011, acquiring an image;
for example, acquiring a frame of image captured by a camera. The camera may be a built-in camera of the face recognition device or an external camera connected to it; the face recognition device may be a mobile terminal (such as a mobile phone or a tablet computer) or a desktop computer, etc.
S1012, detecting a human face in the image;
in the first embodiment of the invention, the VJ algorithm (please refer to Rapid object detection using a boosted cascade of simple features (P.Viola, M.Jones, proceedings of the 2001IEEE Computer Society Conference on Computer Vision and Pattern Recognition,2001)) can be adopted to perform face detection, the algorithm combines Haar features and Adaboost classifiers to perform face detection, and an integral diagram is adopted to accelerate feature extraction, and the strong classifiers constructed by Adaboost are cascaded, so that the detection speed can be greatly accelerated while the face detection performance is improved, and the method has the advantages of high operation efficiency and low resource occupancy rate, and is suitable for performing real-time face detection on embedded terminals such as mobile phones and tablet computers.
S1013, judging whether the size of the detected face is matched with a preset window, and if so, executing S102.
In order to improve the accuracy of the face living body detection, a preset window is provided, and the user is required to place the face in the preset window while performing the instructed actions. The preset window may be circular, oval, rectangular, etc.
The coordinates of the vertex at the upper left corner of the circumscribed rectangular frame of the preset window are denoted (X1, Y1), and the coordinates of the vertex at the lower right corner are denoted (X2, Y2). In the first embodiment of the present invention, X1 = 69, Y1 = 169, X2 = 254 and Y2 = 408, but other values may be set. Assume that the coordinates of the vertex at the upper left corner of the rectangular frame of the detected face region are denoted (x1, y1), and the coordinates of the vertex at the lower right corner are denoted (x2, y2). Then, the overlapping area A of the detected face region and the preset window may be expressed as:
A = (min(X2, x2) − max(X1, x1) + 1) × (min(Y2, y2) − max(Y1, y1) + 1), where max and min represent the maximum and minimum operations, respectively.
The coincidence degree I of the detected face region and the preset window is then calculated from the overlap area A.
If I is less than a preset value, the detected face size does not match the preset window and the flow returns to S1011; otherwise, S102 is executed. The preset value may be set to 0.4, but other empirical values may also be used.
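The matching test of S1013 can be sketched as follows. Normalising the overlap area A by the face area to obtain I is an assumption made for illustration; the function name and the example coordinates are likewise illustrative:

```python
def window_match(face_box, win_box, threshold=0.4):
    """Decide whether a detected face box fits the preset window.

    face_box, win_box: (left, top, right, bottom) corner coordinates.
    Returns the coincidence degree I and the match decision.
    Assumption: I = overlap area / face area.
    """
    x1, y1, x2, y2 = face_box
    X1, Y1, X2, Y2 = win_box
    w = min(X2, x2) - max(X1, x1) + 1
    h = min(Y2, y2) - max(Y1, y1) + 1
    overlap = max(w, 0) * max(h, 0)           # overlap area A (0 if disjoint)
    face_area = (x2 - x1 + 1) * (y2 - y1 + 1)
    coincidence = overlap / face_area          # coincidence degree I
    return coincidence, coincidence >= threshold
```

For example, with the embodiment's window (69, 169, 254, 408), a face box lying entirely inside the window gives I = 1 and a match; a box outside it gives I = 0 and no match.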
S102, generating a random action instruction related to the human face.
In the first embodiment of the present invention, the random action command related to the face may include one or any combination of head shaking, nodding, blinking, and mouth opening.
S103, acquiring an image sequence.
The image sequence includes the images captured while the user performs the action indicated by the random action instruction. Considering the waiting and completion time of the action, the length of the image sequence acquired in the first embodiment of the invention is 100 frames; other empirical values may also be used.
S104, detecting the face action of the user according to the acquired image sequence: statistics related to each random action instruction are calculated according to the positions of the face key points, the relative variation of all statistics related to each random action instruction is calculated, and whether the face action of the user is consistent with the random action instruction is judged according to the relative variation; if so, the current face is judged to be a living body.
In the first embodiment of the present invention, S104 may specifically include the following steps:
S1041, reading a frame of image from the acquired image sequence and then executing S1042; if all the images in the image sequence have been read, directly executing S1045.
S1042, locating key points of the face.
Currently, locating face key points generally requires locating 68 key points; however, most of these key points are of no use for face living body detection, and locating the redundant key points increases the resource occupancy and reduces the operation efficiency. Some schemes locate only 12 key points, but these key points do not express the eye and mouth regions sufficiently: for example, only one key point is located on each of the upper eyelid, the lower eyelid, the upper lip and the lower lip, and these parts have no distinctive local feature, so a positioning error easily causes the states of the eyes and the mouth to be misjudged. In this way, the accuracy of the face living body detection is liable to be lowered.
The embodiment of the invention locates 19 face key points for the four actions of head shaking, nodding, blinking and mouth opening: 6 key points each for the left eye, the right eye and the mouth, and 1 key point at the nose tip. Specifically: the upper eyelid and the lower eyelid of the left eye each have 2 key points, and the two corners of the left eye each have 1 key point; the upper eyelid and the lower eyelid of the right eye each have 2 key points, and the two corners of the right eye each have 1 key point; the upper lip and the lower lip each have 2 key points, and the two corners of the mouth each have 1 key point. The eyes and the mouth have more key points mainly because the motion amplitude of blinking and mouth opening is relatively small, so the positioning precision required for the eye and mouth key points is higher. Providing 2 key points on each of the upper eyelid, the lower eyelid, the upper lip and the lower lip not only avoids misjudging the state of the eyes or the mouth because of the positioning error of a single key point, but also helps distinguish the shape differences of the eyes and mouths of different users. The relative position changes of the nose tip, the eyes and the mouth can be used to judge the head shaking and nodding actions. Alternatively, 25 face key points may be located, namely 8 key points each for the left eye, the right eye and the mouth, and 1 key point at the nose tip.
Specifically: the upper eyelid and the lower eyelid of the left eye each have 3 key points, and the two corners of the left eye each have 1 key point; the upper eyelid and the lower eyelid of the right eye each have 3 key points, and the two corners of the right eye each have 1 key point; the upper lip and the lower lip each have 3 key points, and the two corners of the mouth each have 1 key point. Of course, 18 or 24 face key points may also be located, i.e., omitting the key point at the nose tip. Other numbers of face key points are also possible, as long as the left eye, the right eye and the mouth each have at least 6 key points.
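One possible indexing of the 19 key points, consistent with the coordinate names used in the statistics below (the numbering itself is an assumption for illustration, not fixed by the patent):

```python
# Hypothetical index layout: point i has coordinates (xi, yi).
KEYPOINTS_19 = {
    "right_eye": {"right_corner": 1, "upper_lid": (2, 3), "left_corner": 4, "lower_lid": (5, 6)},
    "left_eye":  {"right_corner": 7, "upper_lid": (8, 9), "left_corner": 10, "lower_lid": (11, 12)},
    "nose_tip":  13,
    "mouth":     {"right_corner": 14, "upper_lip": (15, 16), "left_corner": 17, "lower_lip": (18, 19)},
}
```

This gives 6 key points per eye, 6 for the mouth and 1 for the nose tip, 19 in total.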
The face key points are located using an active shape model (ASM) method. In the embodiment of the invention, the key point positions of 500 face images are marked manually and used as the training data set.
Because the face actions are mainly performed within the preset window, in order to reduce the operation amount, the face key points are located only within the preset window.
S1043, calculating statistics related to the random action instruction generated at present according to the positions of the key points of the human face.
The invention calculates only the statistics related to the currently generated random action instruction; statistics related to other random action instructions need not be calculated, and there is no need to describe different statistics by means of a state machine or the like. Thus, the operation amount can be greatly reduced and the operation efficiency improved. For example, if the random action instruction generated in S102 is head shaking, then only the statistics related to head shaking are calculated in S1043.
For the head shaking and nodding actions, a common method is to calculate the three Euler angles (pitch, yaw, roll) of the face pose; calculating these three values involves complex angle calculations and matrix operations, so the computational complexity is high. Because the invention only needs to judge whether the face action of the user is consistent with the random action instruction, it does not need to estimate the specific pose of the current face. Therefore, the invention calculates statistics directly from the face key point positions during head shaking or nodding, and judges whether the face action of the user is consistent with the random action instruction according to the relative variation.
Specifically, when the currently generated random action instruction is head shaking, the statistic U1 related to head shaking is calculated according to the positions of the face key points, where x1 is the x-axis coordinate value of the right corner of the right eye, x10 is the x-axis coordinate value of the left corner of the left eye, and x13 is the x-axis coordinate value of the nose tip. When the face pose is quasi-frontal, U1 is near 1; when the head shakes, U1 moves away from 1.
When the currently generated random action instruction is nodding, the statistic U2 related to nodding is calculated according to the positions of the face key points, where y13 is the y-axis coordinate of the nose tip, y4 is the y-axis coordinate of the left corner of the right eye, y7 is the y-axis coordinate of the right corner of the left eye, y14 is the y-axis coordinate of the right corner of the mouth, and y17 is the y-axis coordinate of the left corner of the mouth. When the face pose is quasi-frontal, U2 is near 1; when the head nods, U2 moves away from 1.
When the currently generated random action instruction is blinking, the statistic U3 related to blinking is calculated according to the positions of the face key points, where (x2, y2) and (x3, y3) are the coordinates of the 2 key points of the upper eyelid of the right eye, (x5, y5) and (x6, y6) are the coordinates of the 2 key points of the lower eyelid of the right eye, (x4, y4) are the coordinates of the left corner of the right eye, (x1, y1) are the coordinates of the right corner of the right eye, (x8, y8) and (x9, y9) are the coordinates of the 2 key points of the upper eyelid of the left eye, (x11, y11) and (x12, y12) are the coordinates of the 2 key points of the lower eyelid of the left eye, (x10, y10) are the coordinates of the left corner of the left eye, and (x7, y7) are the coordinates of the right corner of the left eye. The larger the value of U3, the greater the opening degree of the eyes; conversely, the smaller the opening degree of the eyes.
When the currently generated random action instruction is mouth opening, the statistic U4 related to mouth opening is calculated according to the positions of the face key points, where (x15, y15) and (x16, y16) are the coordinates of the 2 key points of the upper lip, (x18, y18) and (x19, y19) are the coordinates of the 2 key points of the lower lip, (x14, y14) are the coordinates of the right corner of the mouth, and (x17, y17) are the coordinates of the left corner of the mouth. The larger the value of U4, the greater the opening degree of the mouth; conversely, the smaller the opening degree of the mouth.
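The four statistics can be sketched as below. The patent's exact formulas are not reproduced here; these ratio-style definitions are assumptions built only from the coordinate roles listed above (U1 and U2 compare the nose tip's position against eye and mouth landmarks; U3 and U4 are eyelid/lip separation normalised by eye/mouth width), chosen so that U1 and U2 sit near 1 for a quasi-frontal face:

```python
from math import hypot

def dist(p, q):
    """Euclidean distance between two key points."""
    return hypot(p[0] - q[0], p[1] - q[1])

def shake_stat(pts):
    """U1 (assumed form): ratio of the nose tip's horizontal distances
    to the outer corners of the two eyes; ~1 when the face is frontal."""
    return (pts[13][0] - pts[1][0]) / (pts[10][0] - pts[13][0])

def nod_stat(pts):
    """U2 (assumed form): vertical distance eyes->nose over nose->mouth."""
    eyes_y = (pts[4][1] + pts[7][1]) / 2
    mouth_y = (pts[14][1] + pts[17][1]) / 2
    return (pts[13][1] - eyes_y) / (mouth_y - pts[13][1])

def blink_stat(pts):
    """U3 (assumed form): mean eyelid separation over eye width,
    averaged across both eyes; larger when the eyes are open wider."""
    right = (dist(pts[2], pts[6]) + dist(pts[3], pts[5])) / (2 * dist(pts[1], pts[4]))
    left = (dist(pts[8], pts[12]) + dist(pts[9], pts[11])) / (2 * dist(pts[7], pts[10]))
    return (right + left) / 2

def mouth_stat(pts):
    """U4 (assumed form): lip separation over mouth width."""
    return (dist(pts[15], pts[19]) + dist(pts[16], pts[18])) / (2 * dist(pts[14], pts[17]))
```

Here `pts` maps key-point indices (1 to 19, following the coordinate names in the text) to (x, y) coordinates. Each statistic needs only a handful of additions and divisions, which is what keeps the per-frame cost low.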
Therefore, in the first embodiment of the invention, calculating the statistics related to the currently generated random action instruction has low computational complexity and a small operation amount, and can be performed quickly on embedded terminals such as mobile phones.
S1044, caching the statistics related to the currently generated random action instruction.
S1045, after the image sequence is read, reading all statistics related to the random action instructions of the cache, and respectively calculating the relative variation of all statistics related to each random action instruction.
Assume that the random action instruction is of type k (k = 1 represents head shaking, k = 2 represents nodding, k = 3 represents blinking, and k = 4 represents mouth opening), the corresponding statistic is Uk, and the number of statistics in the cache space is N (N ≤ 100, because the length of the image sequence acquired in S103 is 100 frames in the present invention and key point positioning may fail in S1042). All statistics for each random action instruction can be connected into a curve that records how the statistic changes during the face action. The peaks and troughs of the curve reflect the extreme states of the face action, for example: when the head shaking motion reaches the leftmost position, U1 is minimum, reaching the trough; when it reaches the rightmost position, U1 is maximum, reaching the peak. When the nodding motion reaches the uppermost position, U2 is minimum, reaching the trough; when it reaches the lowest position, U2 is maximum, reaching the peak. When the eye opening degree is maximum, U3 reaches the peak; when the eyes are closed, U3 reaches the trough. When the mouth opening degree is maximum, U4 reaches the peak; when the mouth is closed, U4 reaches the trough.
In order to reduce the data calculation error, the embodiment of the invention first filters the statistics, with a filtering window of 3 (other empirical values such as 4 or 5 may also be adopted) and a mean filtering method (other filtering methods may also be adopted): each cached statistic is replaced by the mean of the statistics within the filtering window centered on it, giving the filtered statistics U′k(i), i = 1, ..., N, where N is the number of statistics in the cache space.
considering the continuity of the face motion, the maximum value and the minimum value are used to replace the wave crest and the wave trough for the sake of simplicity, and are respectively recorded asAnd->
Considering that the opening degree of the eyes, the closing degree of the mouth and the relative positions of the facial features differ among users even in the quasi-frontal pose, the mean value of all statistics related to each random action instruction is taken as the user's reference value Uk,ref.
The relative variation ΔUk of all statistics related to each random action instruction is then calculated from Uk,max, Uk,min and the reference value Uk,ref. The physical meaning of the relative variation is that the more pronounced the face action, the larger the difference between Uk,max and Uk,min, and the larger ΔUk. Thus, the relative variation ΔUk can be used as the basis for the action decision.
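A sketch of S1045, assuming the relative variation is the max-min spread of the filtered curve normalised by the user's baseline (the text fixes the ingredients — mean filter, maximum, minimum and mean reference value — but not how they are combined, so this combination is an assumption):

```python
def relative_change(stats, window=3):
    """Mean-filter the cached statistics, then measure the spread of
    the filtered curve relative to the user's own baseline."""
    n = len(stats)
    filtered = []
    for i in range(n):
        lo = max(0, i - window // 2)       # window shrinks at the edges
        hi = min(n, i + window // 2 + 1)
        filtered.append(sum(stats[lo:hi]) / (hi - lo))
    ref = sum(filtered) / n                # user-specific reference value
    return (max(filtered) - min(filtered)) / ref
```

A flat curve (no face action) gives a relative change of 0; the stronger the action, the larger the spread and hence the larger the returned value.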
S1046, when the relative variation of all statistics related to the random action instruction is greater than or equal to a preset threshold value of the action corresponding to the random action instruction, judging that the face action of the user is consistent with the random action instruction, and judging that the current face is a living body, otherwise, judging that the face action of the user is inconsistent with the random action instruction.
The first embodiment of the invention adopts a simple threshold decision strategy. Let Tk be the preset threshold of the k-th type of action (k = 1 for head shaking, k = 2 for nodding, k = 3 for blinking, and k = 4 for mouth opening). When the relative variation ΔUk is greater than or equal to the preset threshold Tk, the face action of the user is judged to be consistent with the random action instruction; otherwise, it is judged to be inconsistent. Tk is an empirical value; in the first embodiment of the invention, T1 = T2 = 0.6, T3 = 0.3 and T4 = 0.9. Because the invention judges whether the face action of the user is consistent with the random action instruction according to the relative variation, complex angle and matrix operations are avoided and the operation efficiency is improved. Since the first embodiment of the invention only calculates the mean, maximum and minimum of each statistic and uses a simple threshold decision strategy, the efficiency is greatly improved compared with machine learning methods such as classifiers, no training data is needed, and the resource occupancy is very small.
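The threshold decision of S1046 can be written directly with the empirical thresholds given above (the function and table names are illustrative):

```python
# k = 1 shake, 2 nod, 3 blink, 4 open mouth; Tk values from embodiment one.
THRESHOLDS = {1: 0.6, 2: 0.6, 3: 0.3, 4: 0.9}

def action_matches(k, delta_u):
    """True when the relative variation reaches the action's threshold,
    i.e. the user's face action is judged consistent with instruction k."""
    return delta_u >= THRESHOLDS[k]
```

For example, a blink (k = 3) with a relative variation of 0.35 passes, while a head shake (k = 1) with 0.5 does not.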
Embodiment two:
The second embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the face living body detection method according to the first embodiment of the present invention.
Embodiment III:
Fig. 2 shows a specific block diagram of a face recognition device according to a third embodiment of the present invention. A face recognition device 100 includes: one or more processors 101, a memory 102, and one or more computer programs, wherein the processors 101 and the memory 102 are connected through a bus, and the one or more computer programs are stored in the memory 102 and configured to be executed by the one or more processors 101; when the processor 101 executes the computer programs, the steps of the face living body detection method provided in the first embodiment of the present invention are implemented.
In the third embodiment of the present invention, the face recognition device may be a mobile terminal (e.g., a mobile phone, a tablet computer, etc.) or a desktop computer, etc.
In the invention, statistics related to each random action instruction are calculated according to the positions of the face key points, the relative variation of all statistics related to each random action instruction is calculated, whether the face action of the user is consistent with the random action instruction is judged according to the relative variation, and if so, the current face is judged to be a living body. Therefore, the face living body detection method does not need to describe different statistics by means of a state machine or the like, can greatly reduce the operation amount, reduce the complexity of the algorithm and improve its operation efficiency, and can realize efficient face living body detection on resource-constrained embedded terminals such as mobile phones and tablets.
Those of ordinary skill in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the related hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include: a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and the like.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.