CN111539249A - Multi-factor human face in-vivo detection system and method - Google Patents


Info

Publication number
CN111539249A
CN111539249A
Authority
CN
China
Prior art keywords
detection
factor
human face
face
vivo detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010167011.4A
Other languages
Chinese (zh)
Inventor
王子龙
李秋衡
王毅刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202010167011.4A priority Critical patent/CN111539249A/en
Publication of CN111539249A publication Critical patent/CN111539249A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40Spoof detection, e.g. liveness detection
    • G06V40/45Detection of the body part being alive
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Abstract

The invention provides a multi-factor human face in-vivo detection system and method, belonging to the technical field of face recognition. The system comprises a human face detection module, a human face image preprocessing module, a data storage module, a random number generation module and a multi-factor in-vivo detection module. The method comprises the following steps: collecting face image information and video information and sending them to the face image preprocessing module and the data storage module; the face image preprocessing module preprocesses the received face image information and video information and judges from the processing result whether the face information meets the requirements; the random number generation module randomly generates multi-factor in-vivo detection parameters; and the multi-factor in-vivo detection module performs multi-factor human face in-vivo detection according to the parameters generated by the random number generation module. The method offers many combinations of challenge actions and a richer challenge pool, greatly increasing the difficulty of attack.

Description

Multi-factor human face in-vivo detection system and method
Technical Field
The invention belongs to the technical field of face recognition, and particularly relates to a multi-factor face in-vivo detection system and method.
Background
The user identity authentication system is the most important, first line of defense of a security system; classical user authentication methods include secret keys, security tokens, and human biometric features. Among user identity authentication methods, authentication based on biometric features has the advantages of convenience and good usability and requires no memorization by the user; it has become a major research direction in user identity authentication and has received attention from information security researchers in academia and industry from both theoretical and practical angles. Face authentication is a biometric recognition technology that identifies a person by facial features. It follows the way people memorize different faces and the corresponding identities, does not depend on extra special hardware, is easy to deploy, authenticates non-intrusively, and gives a good user experience. The application of face authentication in real life has therefore attracted wide attention: it has been ported to mobile device platforms (mobile phones, tablet computers, etc.) to protect many user-centered systems, such as access control systems, mobile phone operating systems, and PC operating systems, and is also used to protect secure payment functions, such as Alipay, the ATMs of major banks, and Apple Pay. Increasingly novel attack means pose considerable challenges to face authentication systems.
Face authentication widely protects personal information systems and electronic payments, and has become an alternative to password authentication because nothing needs to be memorized. However, face authentication is threatened by spoofing attacks and by leakage of face feature templates. In a spoofing attack, an adversary presents a photo or video of the victim's face; the face feature templates stored in the authentication system can leak through illegal access or system vulnerabilities. Face recognition research is widely applied in real life — bank login by face, Alipay face payment, and face-based attendance all rely on it — and well-known companies at home and abroad, such as Google, Microsoft, Alibaba, iFlytek, and Face++, have successively made achievements in face recognition, improving recognition rate and robustness by building deeper networks, at the cost of additional high-performance hardware support. Existing face detection systems provide many detection methods for face recognition, but their security against photo- and video-based attacks is low, and many systems still struggle to resist the latest attacks based on virtual 3D faces.
Therefore, the application provides a multi-factor human face in-vivo detection system and method.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a multi-factor human face in-vivo detection system and a multi-factor human face in-vivo detection method.
In order to achieve the above purpose, the invention provides the following technical scheme:
the multi-factor human face in-vivo detection system comprises a human face detection module, a human face image preprocessing module, a data storage module, a random number generation module and a multi-factor in-vivo detection module;
the face detection module is used for acquiring face image information and video information and sending the acquired face image information and video information to the face image preprocessing module and the data storage module;
the face image preprocessing module is used for processing the received face image information and video information and judging whether the face information meets the requirements or not according to the processing result;
the data storage module is used for storing face image information and video information;
the random number generation module is used for randomly generating multi-factor in-vivo detection parameters;
and the multi-factor in-vivo detection module is used for carrying out multi-factor human face in-vivo detection according to the multi-factor in-vivo detection parameters generated by the random number generation module.
Preferably, the face detection module includes a display device, a camera, and an image sensor.
Preferably, the face image preprocessing module preprocesses the received face image information and video information by using a neural network.
Preferably, the criteria by which the face image preprocessing module judges whether the face information meets the requirements are that there is exactly one face in the display device and that this face occupies the center of the display device.
Preferably, the detection method of the multi-factor in-vivo detection module comprises a face key point detection method, a method based on mouth opening detection, a method based on head shaking detection, a method based on eye blinking in-vivo detection, a method based on human eye movement trajectory in-vivo detection, a method based on facial expression in-vivo detection, and a method based on face back and forth movement in-vivo detection.
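As a hedged illustration of one entry in this method pool — mouth-opening detection — the following minimal sketch computes a mouth aspect ratio over landmark coordinates. The landmark source, the specific four-point scheme, and the 0.5 threshold are illustrative assumptions, not taken from the patent:

```python
import math

def mouth_aspect_ratio(top, bottom, left, right):
    """Vertical lip gap divided by mouth width, from four (x, y) landmarks."""
    return math.dist(top, bottom) / math.dist(left, right)

def mouth_open(top, bottom, left, right, threshold=0.5):
    """Heuristic mouth-opening check: a ratio above the threshold counts as open."""
    return mouth_aspect_ratio(top, bottom, left, right) > threshold

closed = mouth_open((0, 10), (0, 12), (-10, 11), (10, 11))  # small vertical gap
opened = mouth_open((0, 5), (0, 20), (-10, 12), (10, 12))   # wide vertical gap
```

The blinking-based method in the same list can reuse the identical ratio over eye landmarks, with a threshold crossed downward when the eye closes.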
The invention also provides a detection method of the multi-factor human face in-vivo detection system, which comprises the following steps:
step 1, the face detection module collects face image information and video information and sends the collected face image information and video information to the face image preprocessing module and the data storage module;
step 2, the face image preprocessing module processes the received face image information and video information and judges whether the face information meets the requirements according to the processing result; if so, proceed to the next step; otherwise, return to the previous step to re-acquire information;
and step 3: the random number generation module randomly generates multi-factor in-vivo detection parameters;
and 4, step 4: and the multi-factor in-vivo detection module performs multi-factor human face in-vivo detection according to the multi-factor in-vivo detection parameters generated by the random number generation module.
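Steps 1 to 4 can be sketched as a small orchestration loop; the function parameters are hypothetical stand-ins for the modules described above, not an implementation from the patent:

```python
def run_liveness_session(capture, preprocess, gen_params, detect, max_retries=3):
    """Step 1: capture; step 2: preprocess and re-acquire on failure;
    step 3: generate random detection parameters; step 4: run detection."""
    for _ in range(max_retries):
        frames = capture()                 # step 1: face detection module
        face = preprocess(frames)          # step 2: preprocessing module
        if face is None:                   # requirements not met: re-acquire
            continue
        params = gen_params()              # step 3: random number module
        return detect(face, params)        # step 4: multi-factor detection
    return False
```

The retry path mirrors step 2's "otherwise, return to the previous step" branch; `max_retries` bounds it so the loop cannot run forever.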
Preferably, the step 2 includes:
step 2.1: judging whether a face exists in the video or not;
step 2.2: and carrying out normalization processing on the face image information.
Preferably, the multi-factor in-vivo detection parameters in step 3 correspond to actions the user must perform to interact with the system, including shaking the head, blinking, opening the mouth, emotional changes, gaze direction, and movement of the handheld mobile device.
Preferably, in step 3, the video collected before in-vivo detection starts is taken and its last frame is converted to binary; the random number generated by Java's SecureRandom function is also converted to binary and XORed with the binary of the picture, the picture's binary being truncated to match the length of the random number generated by the function.
Preferably, during the random living body detection of step 4, when the accumulated score of the detection results exceeds a set threshold, the system judges that detection has passed; when detection passes, the system enters the interface associated with the user. When the accumulated score fails to reach the threshold more than a set number of times, the user's account is locked and is unlocked by an additional method, such as username-and-password or dynamic (one-time) verification.
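A minimal sketch of this score-accumulation and lockout policy follows; the threshold values, the `(passed, confidence)` result encoding, and the string return codes are illustrative assumptions:

```python
class LivenessSession:
    """Accumulate per-challenge scores; pass at a threshold, lock after repeated failure."""

    def __init__(self, pass_threshold=6, max_failures=3):
        self.pass_threshold = pass_threshold
        self.max_failures = max_failures
        self.failures = 0

    def evaluate(self, results):
        """results: list of (passed, confidence) pairs, one per challenge."""
        score = sum(conf for passed, conf in results if passed)
        if score >= self.pass_threshold:
            return "pass"            # enter the user's interface
        self.failures += 1
        if self.failures >= self.max_failures:
            return "locked"          # unlock via username-password or similar
        return "retry"
```

Each detection mode contributes its confidence weight only when its challenge succeeds, so higher-confidence modes (e.g. distance-based distortion) move the session toward the threshold faster.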
The multi-factor human face in-vivo detection system and method provided by the invention have the following beneficial effects:
(1) the invention belongs to a living body detection method based on corresponding challenges, has better robustness to illumination compared with other two methods (a living body detection method based on human face three-dimensional characteristics and a living body detection method based on texture analysis), does not depend on ideal illumination conditions, and does not need to add an additional camera in mobile hardware equipment.
(2) Considering that existing challenge-response in-vivo detection methods include only a few challenge actions — shaking the head, opening the mouth, and nodding — so that attackers can easily collect video material in advance, this method includes more challenge actions, including shaking the head, blinking, opening the mouth, nodding, and various facial expressions; attackers cannot easily collect so many attack materials.
(3) Many existing challenge-response systems do not present challenges randomly, so an attacker can guess in advance which challenge action the system will ask the user to complete, and the attack succeeds. In the present invention, the actions in the action pool are numbered one by one, each action number is converted into a five-bit binary number, and the system generates a random binary sequence in which every five bits correspond to one action, ensuring the randomness of the challenge actions.
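The five-bit encoding described above can be sketched as follows. The action pool below is a hypothetical abbreviation, and Python's `secrets` module stands in for whatever secure generator the system uses:

```python
import secrets

# Hypothetical action pool; the patent numbers each action and encodes it in five bits
ACTIONS = ["shake_head", "blink", "open_mouth", "nod", "smile",
           "look_left", "look_right", "move_closer", "move_away"]

def random_challenge_sequence(n_challenges):
    """Draw random action indices, encode each as a five-bit binary string,
    and pair the code with the action it decodes back to."""
    seq = []
    for _ in range(n_challenges):
        idx = secrets.randbelow(len(ACTIONS))
        code = format(idx, "05b")            # e.g. 3 -> "00011"
        seq.append((code, ACTIONS[int(code, 2)]))
    return seq
```

Five bits address up to 32 actions, which is consistent with the pool of 28 methods described later in the embodiment.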
Drawings
FIG. 1 is a flowchart of a multi-factor face in-vivo detection method according to embodiment 1 of the present invention;
FIG. 2 is a flowchart of a multi-factor liveness detection system login;
FIG. 3 is a flow chart of system face detection;
FIG. 4 is an explanatory view of a neuronal cell;
FIG. 5 is a schematic diagram of a neural network;
fig. 6 is a diagram of a neural network architecture.
Detailed Description
In order that those skilled in the art will better understand the technical solutions of the present invention and can practice the same, the present invention will be described in detail with reference to the accompanying drawings and specific examples. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
Example 1
The invention provides a multi-factor human face in-vivo detection system, which comprises a human face detection module, a human face image preprocessing module, a data storage module, a random number generation module and a multi-factor in-vivo detection module;
the face detection module is used for acquiring face image information and video information and sending the acquired face image information and video information to the face image preprocessing module and the data storage module; in this embodiment, the face detection module includes a display device, a camera, and an image sensor.
The face image preprocessing module is used for processing the received face image information and video information and judging whether the face information meets the requirements or not according to the processing result; in this embodiment, the facial image preprocessing module performs the desired processing on the received facial image information and video information by using the neural network, and the criterion that the facial image preprocessing module determines whether the facial information meets the requirement is that there is only one face in the display device, and one face occupies the center of the display device.
The data storage module is used for storing the face image information and the video information;
the random number generation module is used for randomly generating multi-factor in-vivo detection parameters;
and the multi-factor in-vivo detection module is used for carrying out multi-factor human face in-vivo detection according to the multi-factor in-vivo detection parameters generated by the random number generation module. In this embodiment, the detection methods of the multi-factor in-vivo detection module include a face key point detection method, a method based on mouth-opening detection, a method based on head-shaking detection, a method based on eye-blink in-vivo detection, a method based on eye-movement-trajectory in-vivo detection, a method based on facial-expression in-vivo detection, and a method based on forward-and-backward face movement in-vivo detection. This embodiment accurately locates 83 key points of the human face through a cascaded convolutional neural network, covering the face contour, eyebrows, mouth, nose, eyes, and other key points.
This embodiment also provides a detection method for the multi-factor human face in-vivo detection system. As shown in fig. 1, the general flow is as follows. At the start, a camera in the hardware device is called to capture video, and the system checks whether the video contains a face; when no face is detected, the system prompts the user through the display device to move into the area the camera can acquire. The system is divided into two parts. The first part is user registration: when a user enters the system for the first time, the background database contains no data for that user, so the system prompts the user to register, which includes registering a username and password and collecting the user's face information; after collection, the face information is stored in the background data system, and the user then proceeds to log in. The second part is login: most users are already registered, and the system holds the detailed information they need for login. When logging in, the user may choose face-detection login or username-and-password login. When the user chooses face detection, the system background generates random numbers, each corresponding to an action the user must perform to interact with the system — shaking the head, blinking, opening the mouth, showing an emotion, directing the gaze, or moving the handheld mobile device (which provides built-in motion sensors). After the user completes the corresponding actions, each detection mode is evaluated for security: modes shown to be less secure are scored lower, while the newest mode — based on the different distortion of the face at different distances from the camera, which differs in robustness — is rated highest in the recognition process. During the randomly ordered living body detection, when the accumulated score of the detection results exceeds a set threshold, the system judges that detection has passed and enters the interface associated with the user; when the accumulated score fails to reach the threshold more than a set number of times, the user's information is locked and must be unlocked by an additional method. The in-vivo detection method of this system thus has many variants, and random selection among them makes the in-vivo detection system more robust.
The specific detection process of the detection method of the multi-factor human face in-vivo detection system provided by the embodiment comprises the following steps:
step 1, a face detection module collects face image information and video information and sends the collected face image information and video information to a face image preprocessing module and a data storage module;
step 2, the face image preprocessing module processes the received face image information and video information and judges whether the face information meets the requirements according to the processing result; if so, proceed to the next step; otherwise, return to the previous step to re-acquire information;
and step 3: a random number generation module randomly generates multi-factor in-vivo detection parameters;
and 4, step 4: and the multi-factor in-vivo detection module performs multi-factor human face in-vivo detection according to the multi-factor in-vivo detection parameters generated by the random number generation module.
In this embodiment, step 2 includes:
step 2.1: judge whether a face exists in the video. The system performs living body detection based on human faces, so it must first be ensured that the image sensor can collect face information. Detecting whether a video segment or a picture contains a face is a mature task for neural networks: with face databases such as FDDB, WIDER FACE, YaleB, MIT, ORL and AR available for training, whether a picture or a video segment contains faces, and how many, can be determined quickly and with high output accuracy. This detection is necessary because, when no face appears in the captured video or photo, there is no way to perform the subsequent living body detection; therefore, before living body detection, the system checks that the image sensor can detect face information and requires exactly one face occupying the central position of the display device. After the camera is started, its direction is adjusted, the video stream is collected and stored, framing is performed, and the face position in each frame is marked. In mature Python code, opencv calls the camera and the face_recognition library performs face detection; the cascade classifier is haarcascade_frontalface_default.xml. The specific implementation flow is shown in fig. 3;
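The "exactly one face, centred" requirement can be checked over the bounding boxes returned by any detector (for example, the haarcascade_frontalface_default.xml cascade named above). The 20% centre tolerance below is an assumption for illustration:

```python
def face_ok(boxes, frame_w, frame_h, tol=0.2):
    """Return True only if exactly one (x, y, w, h) box was detected and its
    centre lies within tol of the frame centre on both axes."""
    if len(boxes) != 1:
        return False
    x, y, w, h = boxes[0]
    cx, cy = x + w / 2, y + h / 2
    return (abs(cx - frame_w / 2) <= tol * frame_w and
            abs(cy - frame_h / 2) <= tol * frame_h)
```

Keeping the criterion separate from the detector lets the same check run over Haar-cascade boxes, face_recognition locations, or any other detector's output.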
step 2.2: the normalization processing of the face image information specifically comprises the following steps:
In the process of acquiring a face image, image quality is affected by objective factors (illumination, pose, physical vibration of the hardware) and by involuntary movements of the face region such as shaking and blinking. During face living body detection, the originally collected video stream is first split into frames, and each frame is then corrected and normalized with respect to the original face position. At the initial stage, the user's face region is required to be in the middle of the device and to fill a certain area; since biological actions such as head rotation and blinking affect the video quality, the image is rotated according to the positions of the face key points.

Denote the eyebrow-centre position as B(x_b, y_b) and the nose-centre position as N(x_n, y_n). The two points determine a straight line and hence its slope, from which the inclination angle α of the image is calculated as:

    α = arctan( (y_n − y_b) / (x_n − x_b) )

After the inclination angle of the face is calculated, the images of subsequent frames are rotated and corrected by this angle. Many excellent algorithms exist for image rotation, among them affine transformation and the rotation matrix; the rotation matrix is used here to preprocess the image. While an image is displayed, the upper-left corner is by default the origin of the coordinate system, but for the calculation the origin must be moved to the centre of the image and the Y-axis flipped. Suppose there is a point p(x_p, y_p) in an image of width w and height h; the transformed point p′(x′_p, y′_p) is:

    x′_p = x_p − w/2,    y′_p = h/2 − y_p

The rotation angle α of the picture was confirmed above from the two key-point coordinates. After the origin transformation, rotating by α with the rotation matrix yields p″(x″_p, y″_p):

    x″_p = x′_p cos α − y′_p sin α
    y″_p = x′_p sin α + y′_p cos α

Denote the width and height of the rotated picture as w′ and h′. A point p(x_p, y_p) in the original picture is thus finally mapped to P(X_P, Y_P) by:

    X_P = x″_p + w′/2,    Y_P = h′/2 − y″_p
At this point the image containing the face has been preprocessed; the images are then all scaled to the same size.
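The tilt-angle and rotation formulas above can be traced per point in a few lines of pure Python; a real pipeline would typically apply them as one affine warp (e.g. OpenCV's `cv2.warpAffine`) rather than point by point:

```python
import math

def tilt_angle(brow, nose):
    """Inclination angle of the line through the eyebrow centre B and nose centre N."""
    return math.atan2(nose[1] - brow[1], nose[0] - brow[0])

def rotate_point(x, y, w, h, w2, h2, alpha):
    """Three-step mapping from the text: shift the origin to the image centre
    with the Y-axis flipped, rotate by alpha, then map into the rotated image
    of width w2 and height h2."""
    xc, yc = x - w / 2, h / 2 - y                        # centre origin, flip Y
    xr = xc * math.cos(alpha) - yc * math.sin(alpha)     # rotation matrix
    yr = xc * math.sin(alpha) + yc * math.cos(alpha)
    return xr + w2 / 2, h2 / 2 - yr                      # back to pixel coords
```

With `alpha = 0` and unchanged dimensions the mapping is the identity, which is a quick way to check the sign conventions of the Y-flip.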
In this embodiment, the multi-factor in-vivo detection parameters in step 3 correspond to actions the user must perform to interact with the system, including shaking the head, blinking, opening the mouth, emotional changes, gaze direction, and movement of the handheld mobile device.
In this embodiment, step 3 takes the video collected before in-vivo detection starts, extracts the last frame as a picture and converts it to binary, converts the random number generated by Java's SecureRandom function to binary, and XORs the two, matching only the binary length of the random number generated by the function. With the generated random number, the operation Out mod n is performed, where Out is the random number greater than zero produced by the Java code and n is the number of methods in the multi-factor living body detection pool. Blinking can be normalized to blinking two to five times, with confidence 2; head shaking has four directions, with confidence 2; mouth opening one to three times, with confidence 2. The eye-trajectory living body detection can be constrained to clockwise rotation, counter-clockwise rotation, or simply gazing at a given place on the screen, with the screen divided into four blocks; two of these are selected for living body detection, and the confidence of these eight methods can be set to 2. The method based on photo distortion can be set as two detection methods — moving the device from the nearest position at which the face fills the screen out to a certain distance, and moving it in from a distance until the face fills the screen — with confidence 3. For emotion detection, seven detection methods can be listed based on the seven basic expressions, but their confidence is set to 1 because of robustness and user-experience problems. The methods are ordered in sequence, the value of n is 28, the number finally generated corresponds to a detection method, and the system calls the test model of that detection method.
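The Out mod n draw over the weighted method pool can be sketched as follows. The pool below is an abbreviated, hypothetical stand-in for the 28 methods, and Python's `secrets` module stands in for Java's SecureRandom:

```python
import secrets

# (method, confidence) pairs, abbreviated from the pool described in the text
METHODS = [("blink_2_to_5", 2), ("shake_head_4dir", 2), ("open_mouth_1_to_3", 2),
           ("gaze_clockwise", 2), ("gaze_counterclockwise", 2), ("gaze_quadrant", 2),
           ("move_close_to_far", 3), ("move_far_to_close", 3),
           ("expression_happy", 1), ("expression_surprise", 1)]

def pick_method():
    """Out mod n: a random non-negative integer reduced modulo the pool size
    selects the detection method whose test model the system then calls."""
    out = secrets.randbits(32)          # Out, from a secure generator
    return METHODS[out % len(METHODS)]
```

Note that `out % len(METHODS)` is only uniform when the generator's range is a multiple of the pool size; `secrets.randbelow(len(METHODS))` avoids that modulo bias and may be preferable in practice.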
In this embodiment, during the random living body detection, when the accumulated score of the detection results exceeds a certain threshold, the system judges that detection has passed; when detection passes, the system enters the interface associated with the user, and when the accumulated score fails to reach the threshold more than a certain number of times, the user's information is locked and is unlocked by an additional method.
In addition, in this embodiment, login is required when entering the system. The system is divided into two parts. The first part is user registration: when a user enters the system for the first time, the background database has no data for that user, so the system prompts the user to register, which includes registering a username and password and collecting the user's face information; after collection, the face information is stored in the background data system, and the user then logs in. The second part is login: most users are registered in the system, which holds the detailed information they need for login; when logging in, the user can choose face-detection login or username-and-password login.
When automatic face detection is used for login, the next-level operation flow is entered. When the password is forgotten, the system prompts the user to input the username and collects a face photo for verification and recovery by e-mail or SMS; the specific flow chart is shown in fig. 2.
On the basis of the human face, the multi-factor in-vivo detection system provided by the invention detects face liveness through human-computer interaction. There are many face living body detection methods, from the simplest opening of the mouth, blinking and head shaking, through motion-based extensions, to current real-time detection based on eye gaze, analysis of facial expression, and the newest methods based on how different face key points change with the distance between the face and the camera, which resist the most classical photo, video and 3D virtual-face attacks. With many living body detection methods available, a corresponding method pool can be built: the newest secure random number generation draws random numbers that index into the method pool in real time, a challenge is presented to the user, and the judgment is made from the user's response, achieving the goal of living body detection. The more methods there are, and the more sources of randomness — from random sequences and pseudo-random algorithms to newly researched quantum random numbers — the more the pool can keep expanding and the more choices exist, so the randomness of the system is stronger and the system is more resistant to photo, video and 3D face attacks.
This embodiment designs a secure, robust and practical multi-factor living body detection system to resist spoofing attacks and meet the requirements of various platforms, especially mobile terminal platforms. The challenge-response-based living body detection method realizes a face living body detection system that is easy to port and can resist video and 3D face attacks; on terminal devices of ever-improving performance, strong computing power and fast feedback make human-computer interaction more convenient, so the system is more complete and better suited to continuously developing terminal devices.
The multi-factor human face in-vivo detection system provided by this embodiment is ported to a mobile phone terminal and tested as follows:
Android + TensorFlow + CNN + MNIST handwritten digit recognition was implemented. Since the liveness detection system proposed here is ultimately intended to be ported to mobile phone terminals, chiefly Android and iOS, the trained model is called on the terminal by the detection APP, and the most classical handwritten-digit-recognition framework is used to test the overhead on the terminal, especially memory, battery and time consumption. The test setup was as follows. Environment configuration: TensorFlow 1.2.0, Python 3.6, Python IDE PyCharm 2017.2; compilation environment: Android IDE Android Studio 3.0. Early research on neural network models began with studies of the visual system of cats: a neural network is composed of many neuron cells, each neuron responding to a different feature vector, so a large number of neurons jointly judge and recognize in order to complete the overall function. Fig. 4 shows the formula expression of a neuron cell:
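Fig. 4 itself is not reproduced here; the standard formula expression of a single neuron, which fig. 4 presumably shows, is:

```latex
y = f\left(\sum_{i=1}^{n} w_i x_i + b\right)
```

where $x_i$ are the inputs, $w_i$ the connection weights, $b$ the bias, and $f$ the activation function (ReLU in the model below).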
The training and inspection module mainly aims to generate a pb file for testing: after the network is built and trained with the TensorFlow Python API, the network topology and related parameter information are saved. There are a considerable number of implementation choices; besides CNN, architectures such as RNN and FCNN can be applied. The most classical convolutional neural network model is used here; its general framework is shown in fig. 5:
The test framework designed below offers two CNN-related function interfaces, tf.layers.conv2d and tf.nn.conv2d. Here tf.layers.conv2d is used as the back-end process; its filters parameter is set as an integer, and the filter tensor is 4-dimensional. The structure used in the network model example is shown in fig. 6:
(1) Convolutional layer #1: 32 filters of 5 × 5, with the ReLU activation function
(2) Pooling layer #1: 2 × 2 filter for max pooling, with a stride of 2
(3) Convolutional layer #2: 64 filters of 5 × 5, with the ReLU activation function
(4) Pooling layer #2: 2 × 2 filter for max pooling, with a stride of 2
(5) Fully connected layer #1: 1024 neurons, with the ReLU activation function and a dropout rate of 0.3 (to avoid overfitting, 30 percent of the neurons are randomly dropped during training)
(6) Fully connected layer #2 (logits layer): 10 neurons, each corresponding to one of the digit categories (0-9).
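The layer list above fixes the tensor shapes for a 28 × 28 MNIST input. A small pure-Python sketch walks through them, assuming "same" padding for the convolutions (so the spatial size is preserved, which yields the commonly cited 7 × 7 × 64 flattened size; "same" padding is an assumption, since the text does not state it):

```python
def same_conv(size, stride=1):
    """Output side length of a 'same'-padded convolution."""
    return -(-size // stride)  # ceiling division

def pool(size, k=2, stride=2):
    """Output side length of a max-pooling layer (valid padding)."""
    return (size - k) // stride + 1

side = 28                # MNIST input: 28 x 28 x 1
side = same_conv(side)   # conv1: 32 filters 5x5 -> 28 x 28 x 32
side = pool(side)        # pool1: 2x2, stride 2  -> 14 x 14 x 32
side = same_conv(side)   # conv2: 64 filters 5x5 -> 14 x 14 x 64
side = pool(side)        # pool2: 2x2, stride 2  -> 7 x 7 x 64
flat = side * side * 64  # 3136 features feed the 1024-unit dense layer
```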
The final experimental result data are shown in table 1:
table 1 handset test model overhead
(Table 1 appears only as an image in the original publication; its data are not reproduced here.)
The terminal-based model consumption test shows that, as the pixel count increases, the time consumed per unit area grows proportionally, the number of tests per unit of battery charge decreases, and consumption stays within a controllable range, so terminal-based model invocation for detection is feasible. As terminals continue to develop and computing power keeps increasing, model invocation and result feedback will improve greatly, and user experience will be continuously enhanced.
Analysis of test results
Using pre-recorded video as input to the model, the accuracy achieved by the various liveness detection methods is shown in table 2:
TABLE 2 System test result output
(Table 2 appears only as an image in the original publication; its data are not reproduced here.)
The time overhead covers the time consumed by model detection and the time used for system login: the former is measured from video input to result output, the latter from the start of detection to successful detection, both averaged over the number of detections, as shown in table 3:
TABLE 3 System test time overhead
(Table 3 appears only as an image in the original publication; its data are not reproduced here.)
According to the experimental results, the recognition rate of expressions among the liveness detection methods is not especially high; the analysis attributes this mainly to face jitter and to individual differences in how expressions are performed, which are not especially pronounced. In gaze detection, the camera's resolution is not especially high, so the captured video stream has low resolution, and moving the eyes according to the on-screen guidance causes involuntary head shifts; model training should later be strengthened to improve accuracy. The other detection modes perform better, but gaze-based detection, head shaking, and distortion-photo detection require the user to stay in the middle of the screen and to minimize deflection and movement, so the user's movements during interaction tend to be relatively slow, owing to hardware limitations (the web_app runs on Windows 7 with 8 GB of RAM and an i5 CPU, without discrete graphics card support). Photo- and video-based attacks were also carried out: the pass rate of photo attacks is zero because a photo cannot interact with the system, and video attacks fail because the corresponding actions cannot be performed in real time as the system demands.
Regarding detection time, real-time testing in the development environment does not perform especially well, particularly for gaze detection and for head-shake detection of distortion photos. In gaze detection the user must concentrate and move the gaze according to the screen prompts, so interaction is slow, but robustness is strong; in head-shake and distortion-photo detection the user's interaction movements are large, so the main time overhead is interactive, and the background computation differs little between methods. Time overheads such as hardware-induced delay can therefore be resolved by porting to better-equipped hardware.
Once the system is further perfected and moved to equipment with better hardware, it will have a very good practical effect: it effectively resists photo- and video-based attacks and has strong robustness. Since the user interaction experience is also good, the multi-factor liveness detection system can be widely used in practical applications.
Through in-depth research on face liveness detection methods, it is found that the existing methods, such as blinking, head shaking, mouth opening and expression analysis, all judge on the basis of facial key points; in the key-point detection process, the accuracy of the key points therefore directly influences the accuracy of the later detection results. A single existing detection, however, can be attacked with pre-recorded video and has certain security problems. This work integrates the existing liveness detection methods, proposes a liveness detection method based on gaze tracking, and deeply analyzes how the face changes in terms of facial key points. Each liveness detection method is given a trust score: through comprehensive analysis of the security, accuracy and user experience of the system, each method is assigned a confidence between 1 (lowest) and 3 (highest). When the accumulated confidence value exceeds 2.5, the user passes liveness detection; otherwise, if the threshold is not reached within three detections, the user is prohibited from logging in via liveness detection.
The innovation points of the embodiment are as follows:
1. Research on liveness detection methods, improving some of their algorithms to raise robustness. Mouth opening is detected from the horizontal and vertical key points of the mouth by judging relative distance: when the relative position exceeds a certain threshold in 5 consecutive frames, the mouth is judged to be open. Blinking is detected in the same way, and expression is judged from the key points of the human eye. Head-shake detection mainly computes a rotation matrix and an offset vector from consecutive frames for analysis. Gaze tracking checks whether the vector formed by the gaze direction and the screen region being watched stays consistent within a certain threshold; the activity of the eye is detected by having it follow clockwise and anticlockwise rotations on the screen, thereby judging the liveness of the face. In the liveness detection method based on the face moving back and forth in front of the camera, the optical distortion principle is used to analyze the consistency of the movement and distortion processes during the back-and-forth motion to obtain the detection result.
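A minimal sketch of the open-mouth judgment described above; the landmark names and the threshold are illustrative assumptions, since the patent does not give exact key-point indices or threshold values:

```python
def mouth_aspect(landmarks):
    """Vertical mouth opening relative to horizontal mouth width.
    `landmarks` maps hypothetical point names to (x, y) pixel coordinates."""
    (tx, ty), (bx, by) = landmarks["top_lip"], landmarks["bottom_lip"]
    (lx, ly), (rx, ry) = landmarks["left_corner"], landmarks["right_corner"]
    vertical = ((tx - bx) ** 2 + (ty - by) ** 2) ** 0.5
    horizontal = ((lx - rx) ** 2 + (ly - ry) ** 2) ** 0.5
    return vertical / horizontal

def mouth_open(frames, threshold=0.5, needed=5):
    """Declare 'mouth open' when the ratio exceeds the threshold in
    `needed` consecutive frames, as the relative-distance rule describes."""
    run = 0
    for lm in frames:
        run = run + 1 if mouth_aspect(lm) > threshold else 0
        if run >= needed:
            return True
    return False
```

Normalizing the vertical opening by the mouth width makes the judgment insensitive to how close the face is to the camera, which is why a relative rather than absolute distance is used.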
2. Random number generation is a major highlight of the system. The system's robustness is improved by analyzing the existing random number generation methods, which include random sequences, true random numbers, pseudo-random numbers and quantum random numbers, the last being the strongest. In particular, even the system's builders cannot predict which action the system will require of the user next, so the security is better.
3. Building the liveness detection system. First, data from thirty people were collected, including head-shake data, blink data, eye gaze trajectories, and video of moving back and forth in front of the camera; the expression training library is the Kaggle facial expression recognition dataset. Model training is based on TensorFlow, on Ubuntu 16.04 with an NVIDIA GeForce GTX TITAN X graphics card and Python 2.7. The system is integrated into a login interface that comprises username-and-key login and active face login. Active face login performs well, taking about 3 s, the time being spent mainly on face movement and gaze tracking; it is more robust, so it makes some compromise in user experience. The system was successfully built, runs, and detects, making the liveness detection system more mature and practical.
4. Considering that a single challenge action has a certain probability of being hit by video material an attacker prepared in advance, and that different challenge actions are differently hard for an attacker to acquire beforehand, the invention assigns different trust degrees to different challenge actions. An evaluation system is set up: when a user logs in with the face, completing a challenge action converts that action's trust degree into a confidence score; the user must reach the confidence threshold within three challenges, and exceeding the threshold means passing liveness detection. With this scoring system a user is no longer locked out by a single failed detection, which is friendlier. Because the randomly generated challenge actions carry different trust degrees and success in a challenge involves chance, the number of challenge actions completed per login varies from one to three, the combinations of challenge actions are various, the challenge pool is richer, and the attack difficulty is greatly increased.
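A sketch of the scoring session described above. The per-action confidence values are illustrative; only the 1-3 confidence range, the 2.5 threshold, and the three-challenge limit come from the text:

```python
import secrets

# Hypothetical confidence weights per challenge action (3 = most trusted,
# 1 = least trusted); the assignments here are illustrative.
CONFIDENCE = {
    "gaze_tracking": 3,
    "move_toward_camera": 3,
    "shake_head": 2,
    "blink": 1,
    "open_mouth": 1,
}
THRESHOLD = 2.5
MAX_CHALLENGES = 3

def liveness_session(respond):
    """Issue up to MAX_CHALLENGES securely random challenges.
    `respond(action)` returns True if the user completed the action.
    Pass once the accumulated confidence score exceeds THRESHOLD."""
    score = 0.0
    actions = list(CONFIDENCE)
    for _ in range(MAX_CHALLENGES):
        action = actions[secrets.randbelow(len(actions))]
        if respond(action):
            score += CONFIDENCE[action]
        if score > THRESHOLD:
            return True
    return False
```

Note that a user who draws a high-trust challenge may pass after a single action, while low-trust challenges require several successes, which is exactly why the number of actions per login varies from one to three.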
The above-mentioned embodiments are only preferred embodiments of the present invention, and the scope of the present invention is not limited thereto, and any simple modifications or equivalent substitutions of the technical solutions that can be obviously obtained by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention.

Claims (10)

1. A multi-factor human face in-vivo detection system is characterized by comprising a human face detection module, a human face image preprocessing module, a data storage module, a random number generation module and a multi-factor in-vivo detection module;
the face detection module is used for acquiring face image information and video information and sending the acquired face image information and video information to the face image preprocessing module and the data storage module;
the face image preprocessing module is used for preprocessing the received face image information and video information and judging whether the face information meets the requirements or not according to the processing result;
the data storage module is used for storing face image information and video information;
the random number generation module is used for randomly generating multi-factor in-vivo detection parameters;
and the multi-factor in-vivo detection module is used for carrying out multi-factor human face in-vivo detection according to the multi-factor in-vivo detection parameters generated by the random number generation module.
2. The multi-factor human face in-vivo detection system according to claim 1, wherein the human face detection module comprises a display device, a camera and an image sensor.
3. The multi-factor human face in-vivo detection system according to claim 1, wherein the human face image preprocessing module performs the processing on the received human face image information and video information by using a neural network.
4. The multi-factor human face in-vivo detection system according to claim 2, wherein the criteria by which the human face image preprocessing module determines whether the human face information meets the requirements are that there is only one human face in the display device and that the face occupies the central position of the display device.
5. The multi-factor in-vivo human face detection system of claim 1, wherein the detection methods of the multi-factor in-vivo human face detection module comprise a human face key point detection method, a method based on open mouth detection, a method based on head shaking detection, a method based on blink in-vivo detection, a method based on human eye to realize motion trajectory in-vivo detection, a method based on facial expression in-vivo detection, and a method based on human face forward and backward movement in-vivo detection.
6. The detection method of the multi-factor human face in-vivo detection system according to any one of claims 1 to 5, characterized by comprising the following steps:
step 1, the face detection module collects face image information and video information and sends the collected face image information and video information to the face image preprocessing module and the data storage module;
step 2, the face image preprocessing module preprocesses the received face image information and video information and judges whether the face information meets the requirements according to the processing result; if so, carrying out the next step; otherwise, returning to the previous step to acquire information again;
and step 3: the random number generation module randomly generates multi-factor in-vivo detection parameters;
and 4, step 4: and the multi-factor in-vivo detection module performs multi-factor human face in-vivo detection according to the multi-factor in-vivo detection parameters generated by the random number generation module.
7. The multi-factor human face in-vivo detection method according to claim 6, wherein the step 2 comprises:
step 2.1: judging whether a face exists in the video or not;
step 2.2: and carrying out normalization processing on the face image information.
8. The multi-factor human face in-vivo detection method as claimed in claim 6, wherein the multi-factor in-vivo detection parameters in step 3 correspond to actions that the user needs to complete interaction with the system, including shaking head, blinking, opening mouth, emotional changes, eye sight direction, and movement of the portable mobile device.
9. The multi-factor human face live body detection method according to claim 8, wherein in step 3, before the live body detection starts, video is collected and the photo of the last frame is obtained; the photo is converted into binary, the random number generated by the secure_random function of Java is converted into binary, and an exclusive-or operation is performed between the two, the binary length of the random number generated by the function being matched to that of the photo.
10. The multi-factor human face in-vivo detection method according to claim 6, wherein in the randomly performed in-vivo detection of step 4, when the accumulated level of the detection results exceeds a certain threshold, the system judges that the detection is passed and enters the interface related to the user; when the accumulated detection level fails to reach the threshold more than a certain number of times within the three detections, the user information is locked and must be unlocked through an additional method.

Publications (1)

Publication Number Publication Date
CN111539249A true CN111539249A (en) 2020-08-14



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination