CN110363111B - Face living body detection method, device and storage medium based on lens distortion principle - Google Patents


Info

Publication number
CN110363111B
CN110363111B (application CN201910567529.4A)
Authority
CN
China
Prior art keywords
distance
living body
face
short
long
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910567529.4A
Other languages
Chinese (zh)
Other versions
CN110363111A (en)
Inventor
王义文
王健宗
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority: CN201910567529.4A
Publication of CN110363111A
Application granted
Publication of CN110363111B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/166 Detection; Localisation; Normalisation using acquisition arrangements
    • G06V40/168 Feature extraction; Face representation
    • G06V40/172 Classification, e.g. identification
    • G06V40/40 Spoof detection, e.g. liveness detection
    • G06V40/45 Detection of the body part being alive
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention provides a face living body detection method based on the lens distortion principle, comprising the following steps: acquiring one long-distance face image and one short-distance face image of the user to be detected with a camera device; performing face key point detection on the obtained long-distance and short-distance face images through the Dlib library to obtain a multidimensional feature vector of the two face images; and classifying the multidimensional feature vector with a living body detection classifier to obtain the output living body detection result. The invention also provides an electronic device and a computer-readable storage medium. By combining the distortion characteristics of an optical lens with a neural network to realize living body detection, the invention is applicable to various scenes, is not affected by lighting and similar factors, and has strong generalization capability.

Description

Face living body detection method, device and storage medium based on lens distortion principle
Technical Field
The present invention relates to the field of biometric recognition technologies, and in particular to a face living body detection method and apparatus based on the lens distortion principle, and a computer-readable storage medium.
Background
Nowadays, with the rapid development of face detection and face verification technologies, more and more face-unlocking applications appear in daily life and work. Therefore, how to guarantee the security of face verification becomes more and more important. Living body verification is used to judge whether a real person, a photo or a video is in front of the lens, thereby ensuring the security of face verification.
In living body verification with conventional algorithms, the differences between living and non-living bodies (such as color texture, non-rigid motion deformation, material, and image or video quality) are considered: features are extracted from the image with manually designed filters, made into positive and negative samples, and then fed into an SVM for training, so as to judge whether the target is a living body or a non-living body. However, conventional algorithms have the following disadvantages. First, filter design is complex: an engineer must manually adjust parameters, and the final filter can only be obtained through repeated attempts. Second, the generalization ability of conventional learning methods is poor, because video acquisition involves many environmental variations such as lighting, occlusion, angle, highlights and shadows, and a conventional method cannot design a corresponding filter for every case. Therefore, conventional algorithms cannot be widely applied to various scenes.
At present, more and more research applies deep learning networks to living body detection. A CNN model can extract image features and analyze them automatically, without manual debugging; moreover, a deep learning network can observe features that humans cannot, and the whole network is a black box that learns from data. Some current mainstream algorithms select a suitable classification network such as VGG16, ResNet or DenseNet, mix and shuffle the positive and negative samples, and feed them into the neural network for training, after which inference can be performed. However, deep learning still has loopholes, and such systems can still be defeated by shaking a video or a picture in front of the camera.
Therefore, a high-security face living body detection method is needed.
Disclosure of Invention
The invention provides a face living body detection method based on the lens distortion principle, an electronic device and a computer-readable storage medium, whose main purpose is to realize living body detection by combining the distortion characteristics of an optical lens with a neural network, so that the method is not affected by lighting and similar factors, is applicable to various scenes, and has strong generalization capability.
In order to achieve the above object, the present invention provides a method for detecting a human face living body based on a lens distortion principle, the method comprising:
S110, acquiring one long-distance face image and one short-distance face image of the user to be detected by using a camera device; S120, performing face key point detection on the obtained long-distance and short-distance face images of the user to be detected through the Dlib library to obtain a multidimensional feature vector of the two face images; S130, classifying the multidimensional feature vector by using a living body detection classifier to obtain the output living body detection result.
Preferably, before the step of S130, the method further includes: the living body detection classifier is obtained through a training step, wherein the training step comprises the following steps:
s210, positive training samples and negative training samples of each legal user are obtained; the positive training samples are long-distance face images and short-distance face images of living bodies of the legal users, and the negative training samples are long-distance face images and short-distance face images of non-living bodies of the legal users; and S220, performing neural network training on the positive training sample and the negative training sample to obtain a living body detection classifier.
Preferably, the training step further includes:
s310, obtaining a long-distance face image and a short-distance face image of a legal user living body;
s320, detecting key points of the human face of the obtained long-distance face image and the short-distance face image of the user to be detected through a Dlib database, and selecting a plurality of groups of key points respectively;
s330, obtaining distance data among each group of key points in the long distance by using the extracted multiple groups of key points in the long distance; obtaining distance data between each group of key points at short distance by using the extracted groups of key points at short distance;
s340, obtaining a multidimensional feature vector by using the obtained distance data between each group of key points at long distance and short distance;
s350, inputting the multidimensional feature vector into a living body detection classifier to perform neural network training.
Preferably, the neural network has a five-layer structure, the fifth layer being an output layer which outputs the detection result.
Preferably, the output layer performs binary classification using a sigmoid function to determine whether the user to be detected is a living body or a non-living body.
In addition, to achieve the above object, the present invention also provides an electronic device comprising a memory, a processor and a camera device, wherein the memory stores a face living body detection program which, when executed by the processor, implements the following steps:
S110, acquiring one long-distance face image and one short-distance face image of the user to be detected by using a camera device; S120, performing face key point detection on the obtained long-distance and short-distance face images of the user to be detected through the Dlib library to obtain a multidimensional feature vector of the two face images; S130, classifying the multidimensional feature vector by using a living body detection classifier to obtain the output living body detection result.
Preferably, before the step of S130, the method further includes: the living body detection classifier is obtained through a training step, wherein the training step comprises the following steps:
s210, positive training samples and negative training samples of each legal user are obtained; the positive training samples are long-distance face images and short-distance face images of living bodies of the legal users, and the negative training samples are long-distance face images and short-distance face images of non-living bodies of the legal users; and S220, performing neural network training on the positive training sample and the negative training sample to obtain a living body detection classifier.
Preferably, the training step further includes:
s310, obtaining a long-distance face image and a short-distance face image of a legal user living body;
s320, detecting key points of the human face of the obtained long-distance face image and the short-distance face image of the user to be detected through a Dlib database, and selecting a plurality of groups of key points respectively;
s330, obtaining distance data among each group of key points in the long distance by using the extracted multiple groups of key points in the long distance; obtaining distance data between each group of key points at short distance by using the extracted groups of key points at short distance;
s340, obtaining a multidimensional feature vector by using the obtained distance data between each group of key points at long distance and short distance;
s350, inputting the multidimensional feature vector into a living body detection classifier to perform neural network training.
Preferably, the neural network has a five-layer structure, wherein the fifth layer is an output layer that performs binary classification using a sigmoid function to determine whether the user to be detected is a living body or a non-living body.
In addition, in order to achieve the above object, the present invention also provides a computer-readable storage medium including therein a face in-vivo detection program based on a lens distortion principle, which when executed by a processor, implements the steps of the face in-vivo detection method based on the lens distortion principle as described above.
The invention provides a face living body detection method based on the lens distortion principle, an electronic device and a computer-readable storage medium. A large amount of positive and negative sample data is put into a neural network for training, where the long-distance and short-distance face samples of each legal user's living body serve as one class of training samples, and the long-distance and short-distance face samples of non-living bodies (i.e. photos or videos) serve as the other class; the neural network can then learn the difference in key point distance ratios between living and non-living bodies and infer accurately. By combining the distortion characteristics of an optical lens with a neural network to realize living body detection, the invention is applicable to various scenes and is not affected by lighting and similar factors; as long as a face can be detected normally, the distances between key points can be calculated, without relying on other, easily disturbed features, so the method has strong generalization capability.
Drawings
FIG. 1 is a flowchart of a face living body detection method based on the lens distortion principle according to a preferred embodiment of the present invention;
FIG. 2 is a flowchart of a preferred embodiment of the training method of the living body detection classifier of the present invention;
FIG. 3 is a flowchart of a preferred embodiment of the training method for training the neural network with training samples according to the present invention;
FIG. 4 is a schematic diagram of the application environment of the face living body detection method based on the lens distortion principle according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of face key point detection according to the present invention;
FIG. 6 is a schematic diagram of the fully connected layers extracting the 29-dimensional feature vector according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of the neural network according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The invention provides a human face living body detection method based on a lens distortion principle. Referring to fig. 1, a flowchart of a face living body detection method based on the principle of lens distortion according to a preferred embodiment of the present invention is shown. The method may be performed by an apparatus, which may be implemented in software and/or hardware.
In this embodiment, the face living body detection method based on the lens distortion principle includes: step S110-step S130.
In step S110, the image capturing device is used to obtain one long-distance facial image and one short-distance facial image of the user to be detected.
When the camera device captures a real-time image, it sends the image to the processor. After receiving the real-time image, the processor first obtains the picture size and creates a grayscale image of the same size; converts the acquired color image into that grayscale image, allocating the memory space at the same time; equalizes the grayscale histogram to reduce the amount of grayscale information and increase detection speed; loads the training library, detects the face in the picture, and returns an object containing the face information, from which the position of the face is obtained and the number of faces recorded; finally, the head region is extracted and saved, completing the real-time face image extraction process.
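The grayscale conversion and histogram equalization just described can be sketched as follows. This is a minimal pure-Python illustration rather than the patent's implementation; the 0.299/0.587/0.114 luma weights are a common convention assumed here.

```python
# Sketch of the preprocessing described above: convert an RGB image to
# grayscale, then equalize its histogram to spread out the gray levels.
# Pure Python, for illustration only.

def to_grayscale(rgb_image):
    """rgb_image: list of rows, each row a list of (r, g, b) tuples."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]

def equalize_histogram(gray_image, levels=256):
    """Classic histogram equalization on a grayscale image."""
    pixels = [p for row in gray_image for p in row]
    n = len(pixels)
    # Histogram and cumulative distribution function.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, running = [0] * levels, 0
    for i in range(levels):
        running += hist[i]
        cdf[i] = running
    cdf_min = next(c for c in cdf if c > 0)
    # Map each gray level through the normalized CDF.
    def remap(p):
        return round((cdf[p] - cdf_min) / max(n - cdf_min, 1) * (levels - 1))
    return [[remap(p) for p in row] for row in gray_image]

rgb = [[(10, 10, 10), (10, 10, 10)], [(200, 200, 200), (250, 250, 250)]]
gray = to_grayscale(rgb)
flat = equalize_histogram(gray)
```

In practice an image library would perform these steps; the sketch only shows the two transformations on a tiny 2x2 image.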
It should be noted that, the processing of the real-time image further includes preprocessing such as scaling, cropping, flipping, and warping the sample image.
Step S120, performing face key point detection on the obtained long-distance and short-distance face images of the user to be detected through the Dlib library to obtain a multidimensional feature vector of the two face images; in this embodiment, the multidimensional feature vector is a 29-dimensional feature vector.
Next, the change in the distances between face key points when the face is at the two distances from the lens is extracted from the real-time images using the Dlib library.
Face key points are extracted using the Dlib library. Dlib is a relatively mature face detection library: an open-source C++ toolkit containing machine learning algorithms, with both C++ and Python interfaces, so face detection and key point detection can be performed conveniently. Using Dlib, the face position is obtained and the positions of 68 key points are obtained at the same time; each key point can be labeled. Key point detection uses a gradient boosting decision tree face key point detector, pre-trained in Dlib, to detect the face key points in a picture and obtain their position information.
The extraction of the key points is first carried out according to a preset template picture, which contains information such as skin color, eyebrows and the key point positions of a target person; the key point position information consists of the target coordinate values of the eye, nose, mouth and face frames of the target person.
The Dlib library can extract 68 face key points, but only 58 of them are selected in the invention. Referring to fig. 5, a schematic diagram of the 68 key points of a face is shown; removing the 10 key points in the box (namely key points 9, 28, 29, 30, 31, 34, 52, 58, 63 and 67) yields the 58 key points selected by the invention. These 58 key points lie off the central axis of the face and are distributed in bilateral symmetry about that axis, so each key point has a symmetric counterpart; that is, the 58 key points form 29 groups, each group comprising two mutually symmetric key points (for example, key point 7 and key point 11 in fig. 5 form one group). A 29-dimensional feature vector is therefore obtained from the 29 groups of key points.
In the invention, only these 58 key points are used: whether the user is a real person or a photo or video is judged from the change in the distance between each group of bilaterally symmetric key points when the face is at a long distance and at a short distance from the lens. The photographs of the user's face obtained at long distance and at short distance both include the facial contour, eyebrows, nose, eyes and mouth. The long distance between the face and the lens may be 50-70 cm, and the short distance may be 30-50 cm.
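The key point selection just described amounts to simple index bookkeeping over Dlib's 68-point numbering (1-based, as in fig. 5). A sketch follows; only the counts are checked, because the exact symmetric-pair table is not spelled out in the text.

```python
# Sketch of the key point selection described above: drop the 10 points
# on the face's central axis from Dlib's 68-point scheme (1-based
# numbering as in fig. 5), leaving 58 points that form 29 symmetric pairs.

AXIS_POINTS = {9, 28, 29, 30, 31, 34, 52, 58, 63, 67}  # removed central-axis points

def select_off_axis_points(n_points=68):
    """Return the 1-based indices of the key points kept by the method."""
    return [i for i in range(1, n_points + 1) if i not in AXIS_POINTS]

kept = select_off_axis_points()
n_pairs = len(kept) // 2  # each kept point has a mirror counterpart
```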
And step S130, classifying and identifying the 29-dimensional feature vector by using a living body detection classifier to obtain an output living body detection result.
When the input image is a non-living body, face key points can still be detected; however, a photograph and a video are both 2-dimensional, so even when an attacker moves the screen closer, the distances between each group of bilaterally symmetric key points keep the same proportions. Illustratively, the ratio between the nose key points and the eye key points stays the same. Therefore, a large amount of positive and negative sample data is put into the neural network for training; the network can learn the difference in key point distance ratios between living and non-living bodies, and thus infer correctly.
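This invariance can be checked with a toy pinhole-camera model (u = f*X/Z). All numbers below (focal length, key point coordinates, distances) are made up for illustration: for a flat photo every point shares one depth, so all pairwise distances shrink by the same factor between the near and far shots, while for a 3-D face the nose sits closer to the lens than the eyes and the ratios drift apart.

```python
# Toy pinhole-projection check of the claim above: far-to-near ratios of
# pairwise key point distances are identical for a flat (2-D) target but
# not for a 3-D face. All numbers are illustrative.
import math

F = 500.0  # hypothetical focal length in pixels

def project(point, extra_depth):
    """Project a 3-D point (x, y, z) pushed extra_depth mm from the lens."""
    x, y, z = point
    return (F * x / (z + extra_depth), F * y / (z + extra_depth))

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def pair_ratio(p, q, near=400.0, far=600.0):
    """Far-to-near ratio of the projected distance between points p and q."""
    return dist(project(p, far), project(q, far)) / dist(project(p, near), project(q, near))

# Flat photo: every point at the same depth relative to the sheet.
photo_eyes = pair_ratio((-30, 0, 0), (30, 0, 0))
photo_nose = pair_ratio((30, 0, 0), (0, -20, 0))

# 3-D face: the nose tip sits 25 mm closer to the lens than the eyes.
face_eyes = pair_ratio((-30, 0, 0), (30, 0, 0))
face_nose = pair_ratio((30, 0, 0), (0, -20, -25))
```

For the photo both ratios equal 400/600 exactly; for the 3-D face the eye-to-nose ratio deviates measurably, which is exactly the signal the classifier learns.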
Referring to fig. 2, fig. 2 is a flowchart showing a preferred embodiment of the training method of the predetermined living body detection classifier according to the present invention.
Before the step of S130, the method further includes: the living body detection classifier is obtained through a training step, wherein the training step comprises the following steps: steps S210-S220.
S210, positive training samples and negative training samples of each legal user are obtained; the positive training samples are long-distance face images and short-distance face images of living bodies of the legal users, and the negative training samples are long-distance face images and short-distance face images of non-living bodies of the legal users;
and S220, training the neural network by using the positive training sample and the negative training sample to obtain the living body detection classifier.
That is, the method of the invention puts a large amount of positive and negative sample data into the neural network for training, where the long-distance and short-distance face samples of each legal user's living body serve as one class of training samples, and the long-distance and short-distance face samples of non-living bodies (i.e. photos or videos) serve as the other class; the neural network can then learn the difference in key point distance ratios between living and non-living bodies and infer accurately.
Referring to fig. 3, fig. 3 is a flowchart of a training method for training a neural network using a training sample according to a preferred embodiment of the present invention. The training step of training the neural network on the training sample further comprises: steps S310-S350.
S310, obtaining a long-distance face image and a short-distance face image of a legal user's living body; S320, detecting the obtained long-distance and short-distance face images through the Dlib library and selecting 29 groups of key points from each; S330, obtaining the distance between each group of key points at long distance using the 29 groups of long-distance key points, and the distance between each group of key points at short distance using the 29 groups of short-distance key points; S340, obtaining a 29-dimensional feature vector from the obtained distances between each group of key points at long and short distance; S350, inputting the 29-dimensional feature vector into the living body detection classifier for neural network training.
An exemplary description follows, taking key points 7 and 11 as an example. The distance between this group of key points extracted from the long-distance face image is denoted L_far; the distance between the same group of key points extracted from the short-distance face image is denoted L_near; and the feature value is L_far / L_near. There are 29 such feature values in total, i.e. the 29-dimensional feature vector.
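The feature construction in this example can be sketched as follows; the landmark coordinates and the single pair (7, 11) are illustrative, and each component of the vector is L_far / L_near for one symmetric pair.

```python
# Sketch of building the distance-ratio feature vector described above.
# Landmarks are dicts mapping key point index -> (x, y); the coordinates
# and the pair list here are made-up illustrations.
import math

def pair_distance(landmarks, i, j):
    (x1, y1), (x2, y2) = landmarks[i], landmarks[j]
    return math.hypot(x1 - x2, y1 - y2)

def ratio_feature_vector(far_landmarks, near_landmarks, pairs):
    """One L_far / L_near component per symmetric key point pair."""
    return [pair_distance(far_landmarks, i, j) / pair_distance(near_landmarks, i, j)
            for (i, j) in pairs]

# Illustrative data for the single pair (7, 11) used in the example above.
far = {7: (100.0, 200.0), 11: (140.0, 200.0)}   # 40 px apart at long range
near = {7: (80.0, 260.0), 11: (144.0, 260.0)}   # 64 px apart at close range
vec = ratio_feature_vector(far, near, [(7, 11)])
```

With the full 29 pairs the same call yields the 29-dimensional vector fed to the classifier.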
Therefore, the 29-dimensional feature vectors of the positive and negative samples are put into the neural network for training; the network can learn that living and non-living bodies have different key point distance ratios, and can thus correctly judge whether the user to be detected is a living body or a non-living body.
Referring to fig. 6, fig. 6 is a schematic diagram of the full connection layer of the present invention for extracting 29-dimensional feature vectors.
Since there are only 29 dimensions of data, the input dimension is small; therefore, feature extraction is performed with fully connected layers, in which every unit of one layer is connected to every unit of the next layer, so that the features of the same key point at long distance and at short distance are preserved as much as possible. The neural network can automatically learn the changes between key points caused by distortion, namely that the distance between the nose key points and the distances between the other face key points, as mentioned before, change with the distance to the lens.
Referring to fig. 7, fig. 7 is a schematic structural diagram of the neural network of the present invention. The neural network adopts a 5-layer structure: the 29-dimensional key point data is input; the first layer uses 64 neurons, the second layer 128, the third layer 256 and the fourth layer 512; the last layer outputs the result, performing binary classification with a sigmoid function to determine whether the user to be detected is a living body or a non-living body.
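A minimal forward pass through this 29-64-128-256-512-1 structure can be sketched as follows. The weights are placeholders, and the ReLU hidden activation is an assumption, since the text only specifies the sigmoid at the output.

```python
# Minimal pure-Python forward pass through the 29-64-128-256-512-1
# architecture described above. Weights are placeholders; a real
# classifier would learn them from the positive/negative samples.
import math, random

LAYER_SIZES = [29, 64, 128, 256, 512, 1]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)  # assumed hidden activation; the text only fixes the output

def make_layers(seed=0):
    """Random placeholder weight matrices, one per layer transition."""
    rng = random.Random(seed)
    return [[[rng.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]
            for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]

def forward(features, layers):
    """features: 29-dimensional ratio vector -> liveness score in (0, 1)."""
    h = features
    for depth, weights in enumerate(layers):
        z = [sum(w * x for w, x in zip(row, h)) for row in weights]
        act = sigmoid if depth == len(layers) - 1 else relu
        h = [act(v) for v in z]
    return h[0]

layers = make_layers()
prob = forward([0.67] * 29, layers)
```

A trained classifier would threshold this score (e.g. at 0.5) to decide living versus non-living.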
Note that the sigmoid function, f(x) = 1/(1 + e^(-x)), is the nonlinear function of the neurons; the purpose of activation in a neural network is to introduce nonlinearity, and there are many choices for the specific nonlinear form. One advantage of sigmoid is that its output range is bounded, so the data does not easily diverge during propagation; the corresponding disadvantage is that the gradient is very small near saturation. A further advantage of sigmoid is that its output range is (0, 1), so it can be used as the output layer, with the output representing a probability.
The invention provides a face living body detection method based on a lens distortion principle, which is applied to an electronic device 4. Referring to fig. 4, an application environment diagram of a preferred embodiment of a face living body detection method based on a lens distortion principle according to the present invention is shown.
In this embodiment, the electronic device 4 may be a terminal device having computing capability, such as a server, a smart phone, a tablet computer, a portable computer or a desktop computer.
The electronic device 4 includes: a processor 42, a memory 41, an imaging device 43, a network interface 44, and a communication bus 45.
The memory 41 includes at least one type of readable storage medium, which may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card or a card-type memory. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 4, such as a hard disk of the electronic device 4. In other embodiments, the readable storage medium may also be an external memory of the electronic device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a Flash Card provided on the electronic device 4.
In the present embodiment, the readable storage medium of the memory 41 is generally used to store the face living body detection program 40, the living body detection classifier and the like installed in the electronic device 4. The memory 41 may also be used to temporarily store data that has been output or is to be output.
The processor 42 may in some embodiments be a Central Processing Unit (CPU), microprocessor or other data processing chip for executing program code or processing data stored in the memory 41, such as the face living body detection program 40.
The image pickup device 43 may be a part of the electronic device 4 or may be independent of the electronic device 4. In some embodiments, the electronic device 4 is a terminal device with a camera, such as a smart phone, a tablet computer, a portable computer, etc., and the camera 43 is the camera of the electronic device 4. In other embodiments, the electronic device 4 may be a server, and the image capturing device 43 is independent of the electronic device 4 and connected to the electronic device 4 through a network, for example, the image capturing device 43 is installed in a specific location, such as an office location or a monitoring area, and captures a real-time image of a target entering the specific location in real time, and transmits the captured real-time image to the processor 42 through the network.
The network interface 44 may optionally comprise a standard wired interface or a wireless interface (e.g., a WI-FI interface), and is typically used to establish a communication connection between the electronic device 4 and other electronic devices.
The communication bus 45 is used to enable connection communication between these components.
Fig. 4 shows only the electronic device 4 with components 41-45, but it should be understood that not all of the illustrated components need be implemented; more or fewer components may be implemented instead.
In a specific embodiment of the invention, the electronic device 4 may further comprise a user interface. The user interface may comprise an input unit such as a keyboard (Keyboard), a voice input device such as a microphone with a voice recognition function, and a voice output device such as a speaker or earphones; optionally, it may also comprise a standard wired interface and a wireless interface.
The electronic device 4 may furthermore comprise a display, which may also be referred to as a display screen or display unit. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-control liquid crystal display, an Organic Light-Emitting Diode (OLED) touch device, or the like. The display is used for displaying information processed in the electronic device 4 and for displaying a visualized user interface.
In addition, the electronic device 4 further comprises a touch sensor. The area provided by the touch sensor for the user to perform a touch operation is referred to as a touch area. Further, the touch sensors described herein may be resistive touch sensors, capacitive touch sensors, and the like. The touch sensor may include not only a contact type touch sensor but also a proximity type touch sensor. Furthermore, the touch sensor may be a single sensor or may be a plurality of sensors arranged in an array, for example.
The area of the display of the electronic device 4 may be the same as or different from the area of the touch sensor. Optionally, the display is stacked with the touch sensor to form a touch display screen, through which the electronic device 4 detects touch operations triggered by the user.
Optionally, the electronic device 4 may further include a Radio Frequency (RF) circuit, a sensor, an audio circuit, etc., which are not described herein.
In the embodiment of the apparatus shown in fig. 4, the memory 41, as a computer storage medium, may include an operating system and the face living body detection program 40; the processor 42 performs the following steps when executing the face living body detection program 40 stored in the memory 41:
S110, acquiring one long-distance face image and one short-distance face image of a user to be detected by using a camera device;
S120, detecting face key points in the obtained long-distance face image and short-distance face image of the user to be detected through a Dlib database, to obtain multidimensional feature vectors of the two face images of the user to be detected;
S130, classifying and identifying the multidimensional feature vectors by using a living body detection classifier to obtain an output living body detection result.
In the electronic device provided in the above embodiment, a large number of long-distance and short-distance images of living bodies and non-living bodies of legal users are fed into the neural network for training. The neural network learns how the ratios of key-point distances at long and short range differ between living and non-living bodies, and can therefore make an accurate estimate.
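The patent gives no worked numbers for this distortion effect. The following simplified pinhole-camera sketch (all coordinates, distances, and key-point choices are hypothetical assumptions, not taken from the embodiment) illustrates why key-point distance ratios shift with capture distance for a 3D face but stay constant for a flat photograph:

```python
import numpy as np

def project(point3d, cam_z, focal=1.0):
    """Pinhole projection of a 3D point (x, y, z) seen from camera distance cam_z."""
    x, y, z = point3d
    return focal * np.array([x, y]) / (cam_z - z)

def keypoint_ratio(nose_depth, cam_z):
    """Ratio of nose-to-left-eye distance over eye-to-eye distance in the image."""
    left_eye  = (-0.04, 0.0, 0.0)           # face-plane points (meters, hypothetical)
    right_eye = ( 0.04, 0.0, 0.0)
    nose_tip  = ( 0.01, 0.0, nose_depth)    # lateral offset so depth affects the ratio
    p_l, p_r, p_n = (project(p, cam_z) for p in (left_eye, right_eye, nose_tip))
    return np.linalg.norm(p_n - p_l) / np.linalg.norm(p_r - p_l)

ratio_near_3d   = keypoint_ratio(0.03, cam_z=0.3)  # live face, short-distance capture
ratio_far_3d    = keypoint_ratio(0.03, cam_z=1.0)  # live face, long-distance capture
ratio_near_flat = keypoint_ratio(0.0,  cam_z=0.3)  # printed photo (all points planar)
ratio_far_flat  = keypoint_ratio(0.0,  cam_z=1.0)
```

With these made-up coordinates, the live face's ratio changes by roughly 1-2% between the 0.3 m and 1.0 m captures, while the planar target's ratio is identical at both ranges; this difference is the signal the classifier learns.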
In other embodiments, the face living body detection program 40 may also be partitioned into one or more modules, which are stored in the memory 41 and executed by the processor 42 to complete the present invention. A module in the present invention refers to a series of computer program instruction segments capable of performing a specified function.
The face living body detection program 40 may be divided into an acquisition module, a key point identification module, and a living body detection module. The functions or operation steps performed by these modules are similar to those described above and will not be repeated in detail here. For example: the acquisition module is configured to acquire a real-time image captured by the image pickup device 43 and extract a real-time face image from it by using a face recognition algorithm; the key point identification module is used to extract 29 groups of face key points from the real-time face image through the Dlib database and form a 29-dimensional feature vector; and the living body detection module compares the obtained feature vector with the feature vectors in the living body detection classifier, so as to infer whether the user to be detected is a living body.
In addition, an embodiment of the present invention also proposes a computer-readable storage medium that includes a face living body detection program, which, when executed by a processor, implements the following operations:
S110, acquiring one long-distance face image and one short-distance face image of a user to be detected by using a camera device;
S120, detecting face key points in the obtained long-distance face image and short-distance face image of the user to be detected through a Dlib database, to obtain multidimensional feature vectors of the two face images of the user to be detected;
S130, classifying and identifying the multidimensional feature vectors by using a living body detection classifier to obtain an output living body detection result.
In one embodiment of the present invention, prior to step S130, the method further includes obtaining the living body detection classifier through a training step, wherein the training step comprises:
S210, obtaining positive training samples and negative training samples of each legal user, wherein the positive training samples are long-distance and short-distance face images of the living body of the legal user, and the negative training samples are long-distance and short-distance face images of a non-living body of the legal user; and S220, performing neural network training on the obtained positive and negative training samples to obtain the living body detection classifier.
The training step further comprises:
s310, obtaining a long-distance face image and a short-distance face image of a legal user living body;
S320, detecting face key points in the obtained long-distance face image and short-distance face image through the Dlib database, and selecting a plurality of groups of key points from each;
S330, obtaining distance data between each group of key points at long distance by using the extracted multiple groups of key points at long distance, and obtaining distance data between each group of key points at short distance by using the extracted multiple groups of key points at short distance;
s340, obtaining a multidimensional feature vector by using the obtained distance data between each group of key points at long distance and short distance;
s350, inputting the multidimensional feature vector into a living body detection classifier to perform neural network training.
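Steps S320-S340 can be sketched as follows. This is a minimal illustration assuming the key points have already been detected (for example with Dlib's 68-point shape predictor); the four key-point index pairs below are hypothetical placeholders standing in for the 29 groups of key points selected in the embodiment:

```python
import numpy as np

# Hypothetical key-point pairs (Dlib 68-point indexing): outer eye corners,
# nose width, mouth width, nose bridge to chin. The embodiment uses 29 groups.
KEYPOINT_PAIRS = [(36, 45), (31, 35), (48, 54), (27, 8)]

def group_distances(landmarks, pairs=KEYPOINT_PAIRS):
    """S330: Euclidean distance between each selected group (pair) of key points."""
    pts = np.asarray(landmarks, dtype=float)
    return np.array([np.linalg.norm(pts[i] - pts[j]) for i, j in pairs])

def build_feature_vector(landmarks_long, landmarks_short):
    """S340: concatenate the long- and short-distance distance data into one vector."""
    return np.concatenate([group_distances(landmarks_long),
                           group_distances(landmarks_short)])
```

The resulting vector (here 8-dimensional, 2 x 4 pairs) is what step S350 feeds into the neural network for training.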
The neural network adopted in this embodiment has a five-layer structure: 64 neurons in the first layer, 128 in the second, 256 in the third, 512 in the fourth, and a fifth output layer that outputs the detection result.
In a specific embodiment of the present invention, the output layer performs binary classification using a sigmoid function, determining whether the user to be detected is a living body or a non-living body.
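A minimal sketch of this five-layer structure, written as a plain NumPy forward pass with random placeholder weights standing in for trained parameters (the 58-dimensional input size is an assumption, not stated in the patent; a real system would use a trained model from a deep learning framework):

```python
import numpy as np

# Layer widths from the embodiment: 64 -> 128 -> 256 -> 512 hidden neurons,
# then a single sigmoid output neuron. The 58-dim input is an assumption.
LAYER_SIZES = [58, 64, 128, 256, 512, 1]

def init_params(sizes, rng):
    """Random placeholder weights and zero biases for each layer."""
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def classify(features, params):
    """Forward pass: four ReLU hidden layers, then a sigmoid output (S130)."""
    relu = lambda z: np.maximum(z, 0.0)
    a = np.asarray(features, dtype=float)
    for w, b in params[:-1]:
        a = relu(a @ w + b)
    w, b = params[-1]
    score = 1.0 / (1.0 + np.exp(-(a @ w + b)))   # sigmoid score in (0, 1)
    return score.item()

rng = np.random.default_rng(0)
params = init_params(LAYER_SIZES, rng)
score = classify(np.ones(58), params)
is_living = score > 0.5   # binary decision: living vs. non-living
```

The sigmoid output maps the network's activation to a score in (0, 1), which is thresholded at 0.5 to produce the living/non-living decision.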
The specific embodiments of the computer readable storage medium of the present invention are substantially the same as the specific embodiments of the face living body detection method and the electronic device based on the lens distortion principle, and are not described herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements includes not only those elements but also other elements not expressly listed or inherent to it. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, apparatus, article, or method that comprises the element.
The foregoing embodiment numbers of the present invention are merely for description and do not indicate the relative merits of the embodiments. From the above description of the embodiments, those skilled in the art will understand that the methods of the embodiments may be implemented by software plus a necessary general hardware platform, or by hardware alone, although in many cases the former is preferred. On this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied as a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, or optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the method of the embodiments of the present invention.
The foregoing description covers only preferred embodiments of the present invention and is not intended to limit its scope; any equivalent structure or equivalent process transformation made using the contents of this specification, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (8)

1. A human face living body detection method based on a lens distortion principle, which is applied to an electronic device, and is characterized in that the method comprises the following steps:
s110, acquiring one long-distance face image and one short-distance face image of a user to be detected by using a camera device;
s120, detecting key points of human faces of the obtained long-distance face image and the obtained short-distance face image of the user to be detected through a Dlib database to obtain multidimensional feature vectors of the two face images of the user to be detected;
s130, classifying and identifying the multidimensional feature vectors by using a living body detection classifier to obtain an output living body detection result;
prior to the step of S130, the method further includes: acquiring the living body detection classifier through a training step; wherein the training step comprises:
s210, positive training samples and negative training samples of each legal user are obtained; the positive training samples are long-distance face images and short-distance face images of living bodies of the legal users, and the negative training samples are long-distance face images and short-distance face images of non-living bodies of the legal users;
and S220, performing neural network training on the positive training sample and the negative training sample to obtain a living body detection classifier.
2. The method for in-vivo detection of a face based on the principle of lens distortion according to claim 1, wherein the training step further comprises:
s310, obtaining a long-distance face image and a short-distance face image of a legal user living body;
s320, detecting key points of the human face of the obtained long-distance face image and the short-distance face image of the user to be detected through a Dlib database, and selecting a plurality of groups of key points respectively;
s330, obtaining distance data among each group of key points in the long distance by using the extracted multiple groups of key points in the long distance; obtaining distance data between each group of key points at short distance by using the extracted groups of key points at short distance;
s340, obtaining a multidimensional feature vector by using the obtained distance data between each group of key points at long distance and short distance;
s350, inputting the multidimensional feature vector into a living body detection classifier to perform neural network training.
3. The method for detecting human face living body based on the principle of lens distortion according to claim 1, wherein,
the neural network comprises a five-layer structure, wherein the fifth layer is an output layer, and the output layer outputs a detection result.
4. A face living body detection method based on a lens distortion principle according to claim 3, wherein the output layer performs binary classification using a sigmoid function to determine whether the user to be detected is a living body or a non-living body.
5. An electronic device, characterized by comprising a memory, a processor and an image pickup device, wherein the memory includes a face living body detection program which, when executed by the processor, implements the following steps:
s110, acquiring one long-distance face image and one short-distance face image of a user to be detected by using a camera device;
s120, detecting key points of human faces of the obtained long-distance face image and the obtained short-distance face image of the user to be detected through a Dlib database to obtain multidimensional feature vectors of the two face images of the user to be detected;
s130, classifying and identifying the multidimensional feature vectors by using a living body detection classifier to obtain an output living body detection result; wherein, the living body detection classifier is obtained through the following training steps:
s210, obtaining positive training samples and negative training samples of each legal user, wherein the positive training samples are long-distance face images and short-distance face images of living bodies of the legal users; the negative training sample is a long-distance face image and a short-distance face image of a non-living body of the legal user;
and S230, performing neural network training on the positive training sample and the negative training sample to obtain a living body detection classifier.
6. The electronic device of claim 5, wherein the training step further comprises:
s310, obtaining a face image of a legal user living body at a long distance and a face image of the legal user living body at a short distance;
s320, detecting the obtained long-distance face image and short-distance face image of the user to be detected through a Dlib database, and selecting a plurality of groups of key points respectively;
s330, obtaining distance data among each group of key points in the long distance by using the extracted multiple groups of key points in the long distance; obtaining distance data between each group of key points at short distance by using the extracted groups of key points at short distance;
s340, obtaining a multidimensional feature vector by using the obtained distance data between each group of key points at long distance and short distance;
s350, inputting the multidimensional feature vector into a living body detection classifier to perform neural network training.
7. The electronic device of claim 5, wherein the neural network comprises a five-layer structure whose fifth layer is an output layer, and the output layer performs binary classification using a sigmoid function to determine whether the user to be detected is a living body or a non-living body.
8. A computer-readable storage medium, wherein a face in-vivo detection program based on a lens distortion principle is included in the computer-readable storage medium, and the face in-vivo detection program, when executed by a processor, implements the steps of the face in-vivo detection method based on a lens distortion principle as set forth in any one of claims 1 to 4.
CN201910567529.4A 2019-06-27 2019-06-27 Face living body detection method, device and storage medium based on lens distortion principle Active CN110363111B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910567529.4A CN110363111B (en) 2019-06-27 2019-06-27 Face living body detection method, device and storage medium based on lens distortion principle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910567529.4A CN110363111B (en) 2019-06-27 2019-06-27 Face living body detection method, device and storage medium based on lens distortion principle

Publications (2)

Publication Number Publication Date
CN110363111A CN110363111A (en) 2019-10-22
CN110363111B true CN110363111B (en) 2023-08-25

Family

ID=68215807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910567529.4A Active CN110363111B (en) 2019-06-27 2019-06-27 Face living body detection method, device and storage medium based on lens distortion principle

Country Status (1)

Country Link
CN (1) CN110363111B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111091112B (en) * 2019-12-30 2021-10-15 支付宝实验室(新加坡)有限公司 Living body detection method and device
CN112699811B (en) * 2020-12-31 2023-11-03 中国联合网络通信集团有限公司 Living body detection method, living body detection device, living body detection apparatus, living body detection storage medium, and program product
CN114743253B (en) * 2022-06-13 2022-08-09 四川迪晟新达类脑智能技术有限公司 Living body detection method and system based on distance characteristics of key points of adjacent faces

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105023010A (en) * 2015-08-17 2015-11-04 中国科学院半导体研究所 Face living body detection method and system
CN106372629A (en) * 2016-11-08 2017-02-01 汉王科技股份有限公司 Living body detection method and device
CN107590430A (en) * 2017-07-26 2018-01-16 百度在线网络技术(北京)有限公司 Biopsy method, device, equipment and storage medium
CN109858375A (en) * 2018-12-29 2019-06-07 深圳市软数科技有限公司 Living body faces detection method, terminal and computer readable storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8306265B2 (en) * 2009-01-12 2012-11-06 Eastman Kodak Company Detection of animate or inanimate objects


Also Published As

Publication number Publication date
CN110363111A (en) 2019-10-22

Similar Documents

Publication Publication Date Title
Singh et al. Face detection and recognition system using digital image processing
CN108764091B (en) Living body detection method and apparatus, electronic device, and storage medium
US8792722B2 (en) Hand gesture detection
US10534957B2 (en) Eyeball movement analysis method and device, and storage medium
US8750573B2 (en) Hand gesture detection
US9436883B2 (en) Collaborative text detection and recognition
CN104364798B (en) System and method for face verification
TW202006602A (en) Three-dimensional living-body face detection method, face authentication recognition method, and apparatuses
WO2019033572A1 (en) Method for detecting whether face is blocked, device and storage medium
CN106408037B (en) Image recognition method and device
WO2019033571A1 (en) Facial feature point detection method, apparatus and storage medium
CN110363111B (en) Face living body detection method, device and storage medium based on lens distortion principle
CN105631406B (en) Image recognition processing method and device
US20220309836A1 (en) Ai-based face recognition method and apparatus, device, and medium
WO2016084072A1 (en) Anti-spoofing system and methods useful in conjunction therewith
Almeida et al. Detecting face presentation attacks in mobile devices with a patch-based CNN and a sensor-aware loss function
Vazquez-Fernandez et al. Built-in face recognition for smart photo sharing in mobile devices
KR20170006355A (en) Method of motion vector and feature vector based fake face detection and apparatus for the same
US11176679B2 (en) Person segmentations for background replacements
JP6351243B2 (en) Image processing apparatus and image processing method
WO2019033570A1 (en) Lip movement analysis method, apparatus and storage medium
CN107766864B (en) Method and device for extracting features and method and device for object recognition
JP2014041477A (en) Image recognition device and image recognition method
CN110008943B (en) Image processing method and device, computing equipment and storage medium
JP2015197708A (en) Object identification device, object identification method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant