Detailed Description
To make the technical solutions and advantages of the present application clearer, exemplary embodiments of the present application are described in further detail below with reference to the accompanying drawings. The described embodiments are only a part of the embodiments of the present application, not an exhaustive list of all embodiments. The embodiments of the present application, and the features of those embodiments, may be combined with each other where no conflict arises.
At present, face recognition is more and more widely applied, but it has a core security problem: face spoofing, in which the face recognition system may be tricked by face photographs, face videos, or 3D facial masks.
To solve the face spoofing problem and improve the safety of the face recognition system, an embodiment of the present application provides a cloud-based human face living body detection method. A plurality of first face images of a user are continuously collected; after each first face image is determined to be a living body image, whether micro-motions exist in the plurality of consecutive first face images is identified; and if micro-motions exist, it is confirmed that the living body detection of the user's face passes. By performing living body detection on the user through both living body identification and micro-motion identification, the accuracy of the detection is effectively improved, the behavior of deceiving the face recognition system with face photos or face videos is prevented, the function of distinguishing a real person from a dummy is achieved, and information security is guaranteed.
Referring to fig. 1, the cloud-based human face living body detection method provided in this embodiment includes the following steps.
Step 101: determining that the user meets the distance requirement.
Common intrusion means for face recognition are usually a printed photo, a mobile phone screen, a computer screen, or a 3D facial mask containing a face image or a face video, and these intrusion tools usually show characteristic differences from a normal living face. In order to better recognize these differences, this solution first imposes a requirement on the distance between the user (i.e., the face) and the recognition device (e.g., a camera), and recognizes the characteristic differences on the basis of keeping the camera and the face at a suitable distance.
Implementations for determining that a user satisfies a distance requirement include, but are not limited to:
Step 1: acquiring a second face image of the user.
The second face image is an image used for adjusting the distance of the user, and is different from the images used for subsequent face recognition.
Step 2: acquiring a face region in the second face image.
Step 3: determining the user distance according to the face region.
Specifically, the user distance may be determined according to the proportion of the face region in the second face image. Alternatively, the distance between preset parts of the face may be extracted from the face region, and the user distance determined according to the ratio of this distance to the width or height of the second face image.
Step 4: if the user distance matches the distance requirement, determining that the user meets the distance requirement.
Step 5: if the user distance does not match the distance requirement, guiding the user to move so as to meet the distance requirement.
Specifically, a prompt (e.g., a voice prompt or a text prompt) may be sent to guide the user to adjust his or her position. After the adjustment, steps 1 to 3 are executed again to determine whether the adjusted distance matches the distance requirement; if it does, step 4 is executed, otherwise step 5 is executed again. These steps are repeated until the user meets the distance requirement.
For example, when this embodiment is executed, face detection is first performed to obtain the face region. The distance of the face can be roughly estimated from the size of the face region and its proportion in the image; if the proportion is within a suitable range, the distance is considered to be within the optimal range, otherwise the user is reminded to move closer to or farther from the camera according to the value of the ratio. In addition, some key feature parts (points) of the face can be detected, as shown in fig. 2, and the distance can be determined according to the ratio of the distance between key parts (points) to the image width; for example, the two eyes are detected first, and the ratio of the distance between the centers of the two eyes to the image width is computed.
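A minimal sketch of this ratio check follows (in Python; the ratio bounds and the helper name are illustrative assumptions, not values specified by this embodiment):

```python
def check_distance(eye_left, eye_right, image_width,
                   min_ratio=0.12, max_ratio=0.25):
    """Estimate whether the user is at a suitable distance from the
    camera based on the ratio of the inter-eye distance to the image
    width. Returns 'ok', 'closer' (user should move closer), or
    'farther' (user should move away). Ratio bounds are illustrative."""
    # Euclidean distance between the two eye centers (2D pixel coords).
    dx = eye_right[0] - eye_left[0]
    dy = eye_right[1] - eye_left[1]
    eye_dist = (dx * dx + dy * dy) ** 0.5
    ratio = eye_dist / image_width
    if ratio < min_ratio:   # face appears small -> user is too far away
        return 'closer'
    if ratio > max_ratio:   # face appears large -> user is too close
        return 'farther'
    return 'ok'
```

For instance, with eye centers 80 pixels apart in a 640-pixel-wide image, the ratio is 0.125 and the distance is accepted; the same check drives the "move closer"/"move farther" prompts of step 5.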
When the user distance is calculated, the 2D coordinates of the key points can be obtained; the Euler angles and the 3D translation (Tx, Ty, Tz) of the face relative to the camera are then obtained through the solvePnP algorithm, the 3D distance is further derived, and it is then judged whether the distance is in a suitable range.
After determining whether the distance is within the suitable range, the user may also be prompted about the face posture (e.g., roll, pitch, yaw) and the position in the 2D image (left, right, up, down, etc.) according to the detected position and posture.
The prompts to the user here can be given by voice or in text form on the image.
Here, roll is the rotation about the Z axis, also called the roll angle; pitch is the rotation about the X axis, also called the pitch angle; and yaw is the rotation about the Y axis, also called the yaw angle.
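The solvePnP computation itself requires a computer vision library, but the extraction of the three Euler angles from the resulting 3×3 rotation matrix can be sketched in plain Python. The decomposition order R = Rz(roll)·Ry(yaw)·Rx(pitch) and the use of degrees are assumptions for illustration:

```python
import math

def rotation_to_euler(R):
    """Decompose a 3x3 rotation matrix (nested lists), assumed to be
    R = Rz(roll) * Ry(yaw) * Rx(pitch), into Euler angles in degrees:
    (pitch about X, yaw about Y, roll about Z), matching the axis
    naming used in the text."""
    sy = math.hypot(R[0][0], R[1][0])
    if sy > 1e-6:  # normal case, away from gimbal lock
        pitch = math.atan2(R[2][1], R[2][2])
        yaw = math.atan2(-R[2][0], sy)
        roll = math.atan2(R[1][0], R[0][0])
    else:          # gimbal lock: yaw is near +/-90 degrees
        pitch = math.atan2(-R[1][2], R[1][1])
        yaw = math.atan2(-R[2][0], sy)
        roll = 0.0
    return tuple(math.degrees(a) for a in (pitch, yaw, roll))
```

The identity matrix yields all-zero angles (face looking straight at the camera), which is the pose the prompts steer the user toward.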
Step 102: continuously acquiring a plurality of first face images of the user.
After it is confirmed that the user meets the distance requirement, a plurality of consecutive face images of the user, namely the first face images, are collected. The first face images are the basis for performing living body detection on the user's face.
Step 103: determining whether each first face image is a living body image.
For any one of the first face images, if the first face image is determined to be a living body image, it is stored into an image sequence. If the next first face image is also a living body image, it is likewise stored into the image sequence, and this process is repeated until living body detection has been performed on all the first face images.
If any first face image is determined to be a non-living body image, the process is terminated, the image sequence is emptied, and the living body detection of the user's face does not pass.
Non-living body images include, but are not limited to: photos (e.g., printed photos, photos on a mobile phone screen, photos on a computer screen), videos (e.g., videos on a mobile phone screen, videos on a computer screen), and facial masks (e.g., 3D facial masks).
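The buffering flow of step 103 can be sketched as follows; the `is_live` predicate stands in for the per-image living body classifier and is a hypothetical placeholder:

```python
def build_live_sequence(frames, is_live):
    """Accumulate consecutive live frames into an image sequence.
    If any frame is classified as non-live, empty the sequence and
    report failure, mirroring the flow of step 103."""
    sequence = []
    for frame in frames:
        if is_live(frame):
            sequence.append(frame)
        else:
            sequence.clear()        # a spoofed frame invalidates the run
            return sequence, False  # living body detection does not pass
    return sequence, True
```

Only when every collected frame passes the single-image check does the accumulated sequence move on to the micro-motion recognition of step 104.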
Filtering of a single image may be achieved by step 103.
Specifically, each single image is classified and judged by a machine learning method.
For example, a deep-learning CNN (convolutional neural network) is used for the classification, such as the widely used ResNet classification network.
Various possible spoofing samples are first collected for training, for example divided into the categories printed photo / mobile phone screen / computer screen / 3D facial mask / normal face.
After the CNN training is finished, each first face image is classified with the trained network model and weights; the category with the highest output probability is taken as the result, and a threshold may additionally be set for further judgment, for example requiring the maximum probability to be larger than a set value.
If the classification result is a normal face, step 104 is executed to perform classification and judgment on the image sequence. If the classification result is any other category, the image sequence is emptied and the whole detection process is restarted.
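The argmax-plus-threshold decision described above might look as follows; the category names and the 0.5 threshold are illustrative assumptions, not values fixed by this embodiment:

```python
CATEGORIES = ['printed_photo', 'phone_screen', 'computer_screen',
              '3d_mask', 'normal_face']

def classify_with_threshold(probs, threshold=0.5):
    """Pick the category with the highest output probability; if that
    maximum probability does not exceed the threshold, reject the
    image as too uncertain (treated as not a normal face)."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] <= threshold:
        return None  # below the set value: filtered out
    return CATEGORIES[best]
```

A confident normal-face prediction lets the flow proceed to step 104; any other outcome, including an uncertain one, restarts the detection process.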
Step 104: identifying whether micro-motions exist in the plurality of consecutive first face images.
If micro-motions exist, step 105 is executed.
If no micro-motions exist, the flow is terminated, the image sequence is emptied, and the living body detection of the user's face does not pass.
After step 103 is executed, a printed photo, mobile phone screen, computer screen, 3D facial mask, or the like containing a face image or a face video can be recognized; however, if the pass/fail decision of the living body detection depended on this conclusion alone, misjudgments could still occur.
During the whole face recognition process, a person may make many involuntary micro-motions, such as slight changes of the eyes and mouth, movement or deformation of the facial muscles, or slight shaking of the head, and the accuracy of the living body detection can be further improved by recognizing these micro-motions.
Specifically, the image sequence classification filtering is performed through step 104.
If the image sequence reaches a certain length, image sequence classification filtering is performed: the image sequence is input into a deep neural network for direct classification and judgment, and the output falls into two categories, normal face and abnormal face.
The deep neural network may be directly based on a 3D convolutional neural network, or a general 2D convolutional neural network such as ResNet may be adopted, except that the network input is stacked sequential image data, as shown in fig. 3.
A general ResNet classification network takes a 1-channel or 3-channel input; after the image sequence is stacked, taking color images as an example, the input is equivalent to N × 3 channel data.
Where N is the length of the image sequence input to the deep neural network, i.e., the number of first face images in the image sequence input to the deep neural network.
For example, if the image sequence is input into the deep neural network to be directly classified and determined, N is the number of all the first face images in the image sequence.
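The N × 3 channel stacking can be sketched with plain nested lists, one H × W × 3 list per color frame (a real implementation would operate on tensors; this pure-Python form is for illustration only):

```python
def stack_sequence(images):
    """Stack N color images of shape H x W x 3 (given as nested lists)
    along the channel axis, producing an H x W x (3*N) input for the
    sequence classifier; len(result[y][x]) == 3 * len(images)."""
    h, w = len(images[0]), len(images[0][0])
    return [[[c for img in images for c in img[y][x]]  # concat channels
             for x in range(w)]
            for y in range(h)]
```

For two frames, each pixel position thus carries 6 channel values, and in general an N-frame sequence yields the N × 3 channel data described above.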
The 3D convolutional neural network or 2D convolutional neural network is then trained on the collected samples of the two categories. After the training is finished, the input image sequence is judged directly with the trained model and weights: the category with the highest output probability is the result, and a threshold may be set for further filtering.
When the image sequence is directly input into the deep neural network for classification and judgment: if the final output is a normal face, it is determined that micro-motions exist, and step 105 is executed so that the living body detection passes; otherwise, it is determined that no micro-motions exist, the flow is terminated, the image sequence is emptied, the living body detection of the user's face does not pass, and the whole detection flow is restarted.
Step 105: confirming that the living body detection of the user's face passes.
This completes the execution of the cloud-based human face living body detection method of this embodiment.
Referring to the flow shown in fig. 4, the cloud-based face live detection method of the present embodiment is described again.
First, face distance detection is performed to remind the user to keep a suitable distance from the camera, which facilitates the subsequent living body detection. Then single face images are collected and classified to judge whether each image is a printed photo, a mobile phone screen, a computer screen, a 3D facial mask, or a normal face, so as to filter out abnormal faces. Finally, the consecutive image sequence that passed the single-image filtering is classified to judge whether it comes from a real person.
Beneficial effects:
In the embodiment of the present application, a plurality of first face images of a user are continuously collected; after each first face image is determined to be a living body image, whether micro-motions exist in the plurality of consecutive first face images is identified; and if micro-motions exist, it is confirmed that the living body detection of the user's face passes. By performing living body detection through both living body identification and micro-motion identification, the accuracy of the detection is effectively improved, the behavior of deceiving the face recognition system with face photos or face videos is prevented, the function of distinguishing a real person from a dummy is achieved, and information security is guaranteed.
Based on the same concept, an embodiment of the present application further provides an electronic device, with reference to fig. 5, the electronic device includes:
a memory 501 and one or more processors 502; the memory, the processor, and a transceiver component 503 are connected through a communication bus (in this embodiment of the present application, the communication bus is described as an I/O bus); and the memory stores instructions for performing the following steps:
continuously acquiring a plurality of first face images of a user;
after each first face image is determined to be a living body image, identifying whether micro-motions exist in a plurality of continuous first face images or not;
and if the micro-motion exists, confirming that the human face living body detection of the user passes.
Optionally, before the acquiring the plurality of first face images in succession, the method further includes:
it is determined that the user satisfies the distance requirement.
Optionally, determining that the user satisfies the distance requirement comprises:
acquiring a second face image of the user;
acquiring a face area in a second face image;
determining a user distance according to the face area;
and if the user distance is matched with the distance requirement, determining that the user meets the distance requirement.
Optionally, determining the user distance according to the face region includes:
determining the user distance according to the proportion of the face area in the second face image; or,
and extracting the distance between preset parts of the human face from the human face region, and determining the user distance according to the ratio of the distance to the width and the height of the second human face image.
Optionally, after determining the user distance according to the face region, the method further includes:
and if the distance of the user is not matched with the distance requirement, guiding the user to move so as to meet the distance requirement.
Optionally, after the multiple first face images of the user are continuously acquired, the method further includes:
determining whether each first face image is a living body image;
for any first face image, if the first face image is determined to be a living body image, storing the first face image into an image sequence; and if the first face image is determined to be a non-living body image, terminating the process, emptying the image sequence, and determining that the living body detection of the user's face does not pass.
Optionally, the non-living body image comprises: photos, videos, and facial masks.
Optionally, the micro-motions comprise micro-changes of face organs, micro-changes of facial muscles, and micro-movements of the face.
Optionally, after identifying whether the micro-motion exists in the plurality of consecutive first face images, the method further includes:
if the micro-motion does not exist, the flow is terminated, the first face image in the image sequence is emptied, and the user face living body detection is not passed.
It will be appreciated that, in practice, the above-described transceiver component 503 is not necessarily required for achieving the basic objectives of the present application.
In yet another aspect, embodiments of the present application further provide a computer program product for use in conjunction with an electronic device including a display, the computer program product including a computer-readable storage medium and a computer program mechanism embedded therein, the computer program mechanism including instructions for performing the following steps:
continuously acquiring a plurality of first face images of a user;
after each first face image is determined to be a living body image, identifying whether micro-motions exist in a plurality of continuous first face images or not;
and if the micro-motion exists, confirming that the human face living body detection of the user passes.
Optionally, before the acquiring the plurality of first face images in succession, the method further includes:
it is determined that the user satisfies the distance requirement.
Optionally, determining that the user satisfies the distance requirement comprises:
acquiring a second face image of the user;
acquiring a face area in a second face image;
determining a user distance according to the face area;
and if the user distance is matched with the distance requirement, determining that the user meets the distance requirement.
Optionally, determining the user distance according to the face region includes:
determining the user distance according to the proportion of the face area in the second face image; or,
and extracting the distance between preset parts of the human face from the human face region, and determining the user distance according to the ratio of the distance to the width and the height of the second human face image.
Optionally, after determining the user distance according to the face region, the method further includes:
and if the distance of the user is not matched with the distance requirement, guiding the user to move so as to meet the distance requirement.
Optionally, after the multiple first face images of the user are continuously acquired, the method further includes:
determining whether each first face image is a living body image;
for any first face image, if the first face image is determined to be a living body image, storing the first face image into an image sequence; and if the first face image is determined to be a non-living body image, terminating the process, emptying the image sequence, and determining that the living body detection of the user's face does not pass.
Optionally, the non-living body image comprises: photos, videos, and facial masks.
Optionally, the micro-motions comprise micro-changes of face organs, micro-changes of facial muscles, and micro-movements of the face.
Optionally, after identifying whether the micro-motion exists in the plurality of consecutive first face images, the method further includes:
if the micro-motion does not exist, the flow is terminated, the first face image in the image sequence is emptied, and the user face living body detection is not passed.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.