CN112597885A - Face living body detection method and device, electronic equipment and computer storage medium


Info

Publication number: CN112597885A
Application number: CN202011526858.3A
Authority: CN (China)
Other languages: Chinese (zh)
Legal status: Pending
Prior art keywords: face image, living body, face, training sample, detected
Inventors: 聂凤梅, 李骊
Assignee: Beijing HJIMI Technology Co Ltd
Application filed by Beijing HJIMI Technology Co Ltd
Priority to CN202011526858.3A
Publication of CN112597885A

Classifications

    • G06V40/40 Spoof detection, e.g. liveness detection
    • G06V40/45 Detection of the body part being alive
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06V40/161 Human faces: Detection; Localisation; Normalisation
    • G06V40/172 Human faces: Classification, e.g. identification

Abstract

The application provides a face living body detection method and device, an electronic device, and a computer storage medium. The face living body detection method comprises: first, acquiring a face image to be detected; then, inputting the face image to be detected into a face living body detection model to obtain a living body prediction value of the face image. The face living body detection model is obtained by training a deep learning model with a plurality of face image training samples, which comprise a plurality of living face images and a plurality of non-living face images; the deep learning model contains a classification model structure, in which each dense network layer is a network structure fusing original convolution and dilated convolution. Finally, if the living body prediction value of the face image to be detected is greater than a threshold value, the face in the image is determined to be a living body. In this way, both detection speed and detection accuracy are taken into account during face living body detection.

Description

Face living body detection method and device, electronic equipment and computer storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for detecting a living human face, an electronic device, and a computer storage medium.
Background
Since commercial face recognition systems entered the market, face recognition has been widely applied in fields such as identity authentication, electronic commerce, human-computer interaction, and information security. Face living body detection has therefore become increasingly important: in many scenarios requiring face recognition, living body detection must be performed first, otherwise the recognition result is meaningless. For example, in a face payment scenario, the face image must be guaranteed to come from a real person before recognition; otherwise, person A holding a photograph of person B might pass face recognition, creating a serious security risk.
At present, face living body detection generally uses either a texture-feature-based method or a temporal method. The texture-feature-based method performs living body detection on a single face image frame; although it is relatively fast, its accuracy cannot be guaranteed. The temporal method performs living body detection on video (consecutive frames); although its accuracy is high, it is relatively slow. Existing face living body detection methods therefore cannot balance detection speed and accuracy on mobile devices.
Disclosure of Invention
In view of the above, the present application provides a face living body detection method and apparatus, an electronic device, and a computer storage medium, which balance detection speed and accuracy during face living body detection.
The application provides a face living body detection method in a first aspect, which comprises the following steps:
acquiring a human face image to be detected;
inputting the face image to be detected into a face living body detection model to obtain a living body prediction value of the face image to be detected; the face living body detection model is obtained by training a deep learning model with a plurality of face image training samples; the training samples comprise a plurality of living face images and a plurality of non-living face images; the deep learning model contains a classification model structure; and each dense network layer in the classification model structure is a network structure fusing original convolution and dilated convolution;
and if the living body prediction value of the face image to be detected is greater than the threshold value, determining that the face in the image to be detected is a living body.
Optionally, the method for constructing the human face living body detection model includes:
constructing a training sample set of the face image; wherein the training sample set of the facial images comprises a plurality of training samples of the facial images; the training samples of the plurality of face images comprise face images of a plurality of living bodies and face images of a plurality of non-living bodies;
for each face image training sample, inputting the training sample into a plurality of consecutive dense network layers in the deep learning model to obtain a first prediction result of whether the training sample is a living body and a second prediction result of whether the training sample is a living body in a plurality of scenes;
calculating a final loss function value of the training sample according to the first prediction result of whether the training sample is a living body, the second prediction result of whether the training sample is a living body in a plurality of scenes, the output value of each dense network layer, the real result of whether the training sample is a living body, and the real result of whether the training sample is a living body in a plurality of scenes;
and if the final loss function value does not meet the preset convergence condition, adjusting parameters in the deep learning model until the final loss function value calculated by the adjusted deep learning model meets the preset convergence condition, and taking the deep learning model as a human face living body detection model.
Optionally, calculating the final loss function value of the face image training sample according to the first prediction result of whether the training sample is a living body, the second prediction result of whether the training sample is a living body in multiple scenes, the output value of each dense network layer, the real result of whether the training sample is a living body, and the real result of whether the training sample is a living body in multiple scenes includes:
concatenating the output values of each dense network layer to obtain a concatenated feature map;
inputting the concatenated feature map into a spatial attention output layer to obtain a third prediction result of whether the training sample of the face image is a living body;
determining a first loss function value of the training sample of the face image according to a third prediction result of whether the training sample of the face image is a living body and a real result of whether the training sample of the face image is a living body;
determining a second loss function value of the training sample of the face image according to a first prediction result of whether the training sample of the face image is a living body and a real result of whether the training sample of the face image is a living body;
determining a third loss function value of the training sample of the face image according to a second prediction result of whether the training sample of the face image is a living body in a plurality of scenes and a real result of whether the training sample of the face image is a living body in a plurality of scenes;
and determining a sum of the first loss function value, the second loss function value, and the third loss function value as the final loss function value.
Optionally, the deep learning model is a deep learning model combining a classification model structure and a spatial attention output layer, and the inputting the face image to be detected into a face living body detection model to obtain a living body prediction value of the face image to be detected includes:
inputting the face image to be detected into a plurality of consecutive dense network layers in the face living body detection model to obtain an output value of each dense network layer;
concatenating the output values of each dense network layer to obtain a concatenated feature map;
and inputting the concatenated feature map into the spatial attention output layer to obtain the living body prediction value of the face image to be detected.
Optionally, the deep learning model is a deep learning model combining a classification model structure and a spatial attention output layer, and the inputting the face image to be detected into a face living body detection model to obtain a living body prediction value of the face image to be detected includes:
inputting the face image to be detected into a plurality of consecutive dense network layers in the face living body detection model to obtain an output value of each dense network layer;
concatenating the output values of each dense network layer to obtain a concatenated feature map;
inputting the concatenated feature map into the spatial attention output layer to obtain a first living body prediction value of the face image to be detected;
taking the output value of the last of the plurality of consecutive dense network layers in the face living body detection model as a second living body prediction value of the face image to be detected;
and determining the living body prediction value of the face image to be detected according to the first living body prediction value and the second living body prediction value.
The present application provides in a second aspect a living human face detection apparatus, including:
the acquisition unit is used for acquiring a face image to be detected;
the first input unit is used for inputting the face image to be detected into a face living body detection model to obtain a living body prediction value of the face image to be detected; the face living body detection model is obtained by training a deep learning model with a plurality of face image training samples; the training samples comprise a plurality of living face images and a plurality of non-living face images; the deep learning model contains a classification model structure; and each dense network layer in the classification model structure is a network structure fusing original convolution and dilated convolution;
and the first determining unit is used for determining the human face in the image to be detected as the living body if the living body prediction value of the human face image to be detected is greater than a threshold value.
Optionally, the construction unit of the human face living body detection model includes:
the training sample set constructing unit is used for constructing a training sample set of the face image; wherein the training sample set of the facial images comprises a plurality of training samples of the facial images; the training samples of the plurality of face images comprise face images of a plurality of living bodies and face images of a plurality of non-living bodies;
the second input unit is used for inputting the training samples of the face images to a plurality of continuous dense network layers in a deep learning model aiming at the training samples of each face image to obtain a first prediction result of whether the training samples of the face images are living bodies and a second prediction result of whether the training samples of the face images are living bodies in a plurality of scenes;
a calculating unit, configured to calculate the final loss function value of the face image training sample according to the first prediction result of whether the training sample is a living body, the second prediction result of whether the training sample is a living body in multiple scenes, the output value of each dense network layer, the real result of whether the training sample is a living body, and the real result of whether the training sample is a living body in multiple scenes;
and the adjusting unit is used for adjusting parameters in the deep learning model if the final loss function value does not meet the preset convergence condition until the final loss function value calculated by the adjusted deep learning model meets the preset convergence condition, and taking the deep learning model as a human face living body detection model.
Optionally, the computing unit includes:
the first merging unit is used for concatenating the output values of each dense network layer to obtain a concatenated feature map;
the third input unit is used for inputting the concatenated feature map into a spatial attention output layer to obtain a third prediction result of whether the training sample of the face image is a living body;
a second determining unit, configured to determine a first loss function value of the training sample of the face image according to a third prediction result of whether the training sample of the face image is a living body and a real result of whether the training sample of the face image is a living body;
the second determining unit is further configured to determine a second loss function value of the training sample of the face image according to a first prediction result of whether the training sample of the face image is a living body and a real result of whether the training sample of the face image is a living body;
the second determining unit is further configured to determine a third loss function value of the training sample of the face image according to a second prediction result of whether the training sample of the face image is a living body in multiple scenes and a real result of whether the training sample of the face image is a living body in multiple scenes;
a third determining unit configured to determine a sum of the first loss function value, the second loss function value, and the third loss function value as the final loss function value.
Optionally, the deep learning model is a deep learning model combining a classification model structure and a spatial attention output layer, and the first input unit includes:
the first input subunit is used for inputting the face image to be detected to a plurality of continuous dense network layers in a face living body detection model to obtain an output value of each dense network layer;
a second merging unit, configured to concatenate the output values of each dense network layer to obtain a concatenated feature map;
the first input subunit is further configured to input the concatenated feature map into the spatial attention output layer to obtain the living body prediction value of the face image to be detected.
Optionally, the deep learning model is a deep learning model combining a classification model structure and a spatial attention output layer, and the first input unit includes:
the first input subunit is used for inputting the face image to be detected to a plurality of continuous dense network layers in a face living body detection model to obtain an output value of each dense network layer;
a second merging unit, configured to concatenate the output values of each dense network layer to obtain a concatenated feature map;
the first input subunit is further configured to input the concatenated feature map into the spatial attention output layer to obtain a first living body prediction value of the face image to be detected;
the first input subunit is further configured to take the output value of the last of the plurality of consecutive dense network layers in the face living body detection model as a second living body prediction value of the face image to be detected;
and the fourth determining unit is used for determining the living body prediction value of the face image to be detected according to the first living body prediction value and the second living body prediction value.
A third aspect of the present application provides an electronic device comprising:
one or more processors;
a storage device having one or more programs stored thereon;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of the first aspects.
A fourth aspect of the present application provides a computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method according to any one of the first aspect.
In view of the above, in the face living body detection method and apparatus, electronic device, and computer storage medium provided by the present application, the face living body detection method comprises: first, acquiring a face image to be detected; then, inputting the face image to be detected into a face living body detection model to obtain a living body prediction value of the face image, the model being obtained by training a deep learning model with a plurality of face image training samples comprising a plurality of living face images and a plurality of non-living face images, the deep learning model containing a classification model structure in which each dense network layer is a network structure fusing original convolution and dilated convolution; and finally, if the living body prediction value of the face image to be detected is greater than a threshold value, determining that the face in the image is a living body. Both detection speed and detection accuracy are thereby taken into account during face living body detection.
Drawings
To more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are merely embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a specific flowchart of a human face live detection method according to an embodiment of the present application;
FIG. 2 is a diagram of a dense network layer in the prior art;
fig. 3 is a schematic diagram of a dense network layer according to an embodiment of the present application;
fig. 4 is a flowchart of a method for constructing a human face live detection model according to an embodiment of the present application;
FIG. 5 is a flow chart of a method for calculating a final loss function value according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a plurality of spatial attention output layers according to an embodiment of the present application;
fig. 7 is a detailed flowchart of a living human face detection method according to another embodiment of the present application;
fig. 8 is a detailed flowchart of a living human face detection method according to another embodiment of the present application;
fig. 9 is a schematic view of a living human face detection apparatus according to another embodiment of the present application;
fig. 10 is a schematic view of an electronic device for implementing a face live detection method according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first", "second", and the like in this application are only used to distinguish different devices, modules, or units, and do not limit the order or interdependence of the functions they perform. The terms "include", "comprise", or any variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a(n) …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes the element.
The embodiment of the application provides a face living body detection method, as shown in fig. 1, specifically comprising the following steps:
s101, obtaining a face image to be detected.
Specifically, in a scenario requiring face living body recognition, for example identity authentication, human-computer interaction, or video monitoring, an image containing a face is first captured by a camera or similar device, and the face image to be detected is then extracted from the captured image by an existing face detection system.
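Step S101 can be made concrete with a short sketch. The following is a minimal example, assuming OpenCV and its bundled Haar cascade stand in for the camera and the "existing face detection system"; the patent does not name a specific detector, so these choices are illustrative only.

```python
# Hypothetical sketch of step S101: capture a frame and crop the first face.
# OpenCV's Haar cascade is an assumption; any face detection system may be used.
import cv2

def acquire_face_image(camera_id: int = 0):
    """Return the cropped face image to be detected, or None."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(camera_id)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        return None
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    return frame[y:y + h, x:x + w]
```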
S102, inputting the face image to be detected into a face living body detection model to obtain a living body prediction value of the face image to be detected.
The face living body detection model is obtained by training a deep learning model with a plurality of face image training samples; the training samples comprise a plurality of living face images and a plurality of non-living face images; the deep learning model contains a classification model structure; and each dense network layer in the classification model structure is a network structure fusing original convolution and dilated convolution (also called atrous or hole convolution).
Dilated convolution can enlarge the receptive field of a convolution neuron without increasing the amount of calculation. Its principle is as follows:

$$y[i] = \sum_{k} x[i + r \cdot k]\, w[k]$$

where y denotes the output of the 1-dimensional convolution, y[i] denotes the i-th output neuron, x denotes the input feature, x[i + r·k] denotes the element with index i + r·k, w[k] denotes the k-th filter weight, and r denotes the dilation rate. Ordinary convolution, i.e., the original convolution, can be regarded as convolution with a dilation rate of 1. When r is not 1, the input feature is in effect sampled at intervals of r - 1 and then convolved, so dilated convolution expands the receptive field of the neuron through sampling, without increasing the amount of calculation.
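To make the formula tangible, the following numpy sketch implements the 1-dimensional dilated convolution exactly as written above; r = 1 reproduces ordinary convolution, while r = 2 spans five input positions with the same three weights. The function name and test values are illustrative.

```python
# y[i] = sum_k x[i + r*k] * w[k], valid positions only.
import numpy as np

def dilated_conv1d(x: np.ndarray, w: np.ndarray, r: int = 1) -> np.ndarray:
    n_out = len(x) - r * (len(w) - 1)
    return np.array([sum(x[i + r * k] * w[k] for k in range(len(w)))
                     for i in range(n_out)])

x = np.arange(10, dtype=float)
w = np.array([1.0, 0.0, -1.0])
print(dilated_conv1d(x, w, r=1))  # ordinary convolution: 3-wide receptive field
print(dilated_conv1d(x, w, r=2))  # dilated, r = 2: 5-wide receptive field, same cost
```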
Each dense network layer in a prior-art classification model is implemented with several original convolutions, as shown in fig. 2, where "Filter concatenation" denotes feature concatenation, "Previous layer" denotes the preceding network layer, and "conv" denotes convolution. To obtain different receptive fields, the existing dense network layer needs one 3×3 original convolution and two consecutive 3×3 original convolutions, corresponding to the left and right parts of fig. 2 respectively. Fig. 3 is a schematic diagram of the dense network layer of the present application, whose network structure fuses original convolution and dilated convolution; "atrous conv, r = 2" denotes a dilated convolution with dilation rate 2. This layer needs only one 3×3 original convolution and one dilated convolution to achieve the effect of the dense network layer in fig. 2. That is, compared with the prior-art dense network layer, the fused structure obtains a receptive field of the same size while reducing the number of parameters and the amount of calculation of the model.
It should be noted that the number of dense network layers and the number of features in each dense network layer may be changed according to the actual application, and are not limited here. Similarly, the dilation rate of the dilated convolution may be changed according to the actual application and is not limited here. A possible rendering of such a layer is sketched below.
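A hedged PyTorch sketch of the fused dense network layer of fig. 3: one 3×3 original convolution in parallel with one 3×3 dilated convolution (r = 2, i.e., a 5×5 receptive field replacing the two stacked 3×3 convolutions of fig. 2), with both branch outputs densely concatenated to the layer input. The 1×1 bottlenecks, channel counts, and normalization are illustrative assumptions, not specified by the patent.

```python
import torch
import torch.nn as nn

class FusedDenseLayer(nn.Module):
    """Sketch of a dense layer fusing original and dilated convolution."""
    def __init__(self, in_ch: int, growth: int = 32):
        super().__init__()
        self.branch_orig = nn.Sequential(
            nn.Conv2d(in_ch, growth, kernel_size=1),              # bottleneck
            nn.Conv2d(growth, growth, kernel_size=3, padding=1))  # original conv
        self.branch_dilated = nn.Sequential(
            nn.Conv2d(in_ch, growth, kernel_size=1),
            nn.Conv2d(growth, growth, kernel_size=3,
                      padding=2, dilation=2))                     # dilated conv, r = 2
        self.bn = nn.BatchNorm2d(in_ch + 2 * growth)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # dense connectivity: concatenate the input with both branch outputs
        out = torch.cat([x, self.branch_orig(x), self.branch_dilated(x)], dim=1)
        return self.relu(self.bn(out))
```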
Optionally, in another embodiment of the present application, an implementation manner of the method for constructing a living human face detection model, as shown in fig. 4, includes:
s401, constructing a training sample set of the face image.
The training sample set of the face images comprises a plurality of training samples of the face images; the training samples of the plurality of face images comprise face images of a plurality of living bodies and face images of a plurality of non-living bodies.
Specifically, a data acquisition system may be used to capture images containing faces in various scenarios, for example under different lighting conditions, at different distances from the camera, with different attack modes, and in different postures and angles, which are not limited here. The face images are then extracted with a face detection system. Each face image training sample is labeled; for example, if the training sample is a living body, its label gt_label_1 is set to 1, and if it is a non-living body, gt_label_1 is set to 0.
In addition, to enable the deep learning model to learn the essential features of a living face, i.e., features that do not change with the scene, and to increase the generalization ability of the model across different scenes, an auxiliary classification task is added during training: living bodies have label 0 in all scenes, while non-living bodies have a different label per scene. Assuming three scenes, the non-living labels in the three scenes are 1, 2, and 3 respectively. For example, if a training sample is a non-living body in scene a, its label gt_label_2 is set to a; if it is a non-living body in scene b, gt_label_2 is set to b; if it is a non-living body in scene c, gt_label_2 is set to c; and whenever the training sample is a living body, gt_label_2 is set to 0.
Thus each face image training sample carries two labels: one indicating whether the sample is a living body or a non-living body, and one indicating that the sample is a living body (0) or a non-living body in a particular scene, as sketched below.
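The two-label scheme can be summarised in a few lines of Python; the scene identifiers and their non-living codes follow the example above (scenes a, b, c mapping to 1, 2, 3), and the helper name is hypothetical.

```python
# gt_label_1: 1 = living body, 0 = non-living body.
# gt_label_2: 0 for a living body in any scene; per-scene code otherwise.
SCENE_TO_NONLIVE_LABEL = {"a": 1, "b": 2, "c": 3}  # example scenes from the text

def make_labels(is_live: bool, scene: str):
    gt_label_1 = 1 if is_live else 0
    gt_label_2 = 0 if is_live else SCENE_TO_NONLIVE_LABEL[scene]
    return gt_label_1, gt_label_2

print(make_labels(True, "b"))   # (1, 0)
print(make_labels(False, "b"))  # (0, 2)
```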
S402, aiming at the training sample of each face image, inputting the training sample of the face image into a plurality of continuous dense network layers in the deep learning model, and obtaining a first prediction result whether the training sample of the face image is a living body or not and a second prediction result whether the training sample of the face image is a living body or not in a plurality of scenes.
S403, calculating to obtain a final loss function value of the training sample of the face image according to whether the training sample of the face image is a first prediction result of a living body, whether the training sample of the face image is a second prediction result of the living body in a plurality of scenes, an output value of each dense network layer, whether the training sample of the face image is a real result of the living body, and whether the training sample of the face image is a real result of the living body in a plurality of scenes.
It should be noted that, as the number of dense network layers in the deep learning model increases, the size of each layer's output value may change; dense network layers with the same output specification may then be treated as a single layer, and only one of them included in the calculation, to further reduce the amount of computation.
Optionally, in another embodiment of the present application, an implementation manner of step S403, as shown in fig. 5, includes:
and S501, combining the output values of each dense network layer to obtain a series characteristic diagram.
And S502, inputting the series-connected feature maps into a space attention output layer to obtain a third prediction result of whether the training sample of the face image is a living body.
The structure of the spatial attention output layer and the corresponding loss function are beneficial to improving the accuracy of the classification model without increasing the calculation amount.
The dotted line in fig. 6 is a schematic diagram of the spatial attention output layer, feature represents the output feature of the dense network layer in front of the spatial attention output layer, and after a plurality of features are connected, that is, the features are merged to obtain a serial feature map, which is input to the spatial attention output layer, it should be noted that when a plurality of features are connected and input to the spatial attention output layer, the features of which layers are used for connection may be selected as needed, and therefore, the number of feature channels participating in connection may be different; k represents the number of categories of the classification task; ConvBlock represents a spatial attention map; the Spatial attribute map represents a Spatial attention map, the size of the Spatial attention map is m multiplied by n, the Spatial attention map represents a matrix with m rows and n columns, the size of each value can reflect the importance degree of the corresponding position feature, and the sum of all pixels in the Spatial attention map is 1; spatial logs represent Spatial logic, the scale of which is a 3-dimensional matrix of m × n × K, and each position in the Spatial logic corresponds to K values, and it can be understood that there are m × n 1-dimensional vectors of length K in total, that is, m × n squares are directly opposite to one another when viewed from the front side of the image to the direction penetrating the paper surface, and each square represents a vector of length K.
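One plausible PyTorch reading of this layer is sketched below: a convolution block reduces the concatenated features to a single-channel map that is softmax-normalised over the m × n positions (so all pixels sum to 1), a 1×1 convolution produces the m × n × K spatial logits, and the prediction is the attention-weighted sum of the logits. The exact ConvBlock composition is not given in the text, so the layer sizes here are assumptions.

```python
import torch
import torch.nn as nn

class SpatialAttentionOutputLayer(nn.Module):
    """Sketch of the spatial attention output layer of fig. 6 (sizes assumed)."""
    def __init__(self, in_ch: int, num_classes: int):
        super().__init__()
        self.attn_conv = nn.Conv2d(in_ch, 1, kernel_size=3, padding=1)  # ConvBlock
        self.logit_conv = nn.Conv2d(in_ch, num_classes, kernel_size=1)  # spatial logits

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        b, _, m, n = feats.shape
        attn = torch.softmax(self.attn_conv(feats).view(b, -1), dim=1)  # pixels sum to 1
        attn = attn.view(b, 1, m, n)                  # spatial attention map, m x n
        logits = self.logit_conv(feats)               # b x K x m x n spatial logits
        return (attn * logits).sum(dim=(2, 3))        # b x K aggregated class scores

# usage: feats = torch.cat([f1, f2, f3], dim=1) after resizing the selected
# dense-layer outputs to a common m x n, then score = layer(feats)
```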
S503, determining a first loss function value of the training sample of the face image according to whether the training sample of the face image is a third prediction result of the living body and whether the training sample of the face image is a real result of the living body.
Specifically, the first loss function value of the face image training sample may be calculated with the following preset formula:

$$L_{SAOL} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i\log y_i^{SAOL} + (1-y_i)\log\left(1-y_i^{SAOL}\right)\right]$$

where L_SAOL denotes the first loss function value, N denotes the number of face image training samples, y_i denotes the real result corresponding to the i-th face image training sample, and y_i^SAOL denotes the third prediction result of whether the i-th training sample is a living body.
S504, determining a second loss function value of the training sample of the face image according to the first prediction result of whether the training sample of the face image is a living body and the real result of whether the training sample of the face image is a living body.
Specifically, the second loss function value of the face image training sample may be calculated with the following preset formula:

$$L_{CLS} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i\log y_i^{P} + (1-y_i)\log\left(1-y_i^{P}\right)\right]$$

where L_CLS denotes the second loss function value, N denotes the number of face image training samples, y_i denotes the real result corresponding to the i-th face image training sample, and y_i^P denotes the first prediction result of whether the i-th training sample is a living body.
And S505, determining a third loss function value of the training sample of the face image according to a second prediction result of whether the training sample of the face image is a living body in a plurality of scenes and a real result of whether the training sample of the face image is a living body in a plurality of scenes.
Specifically, the third loss function value of the face image training sample may be calculated with the following preset formula:

$$L_{arc\_face} = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j=1,\,j\neq y_i}^{n}e^{s\cos\theta_j}}$$

$$\cos\theta_j=\frac{W_j^{T}x_i}{\|W_j\|\,\|x_i\|}$$

where L_arc_face denotes the third loss function value; N denotes the number of face image training samples; s and m are empirical parameters that can be set and changed according to the actual application; n is the total number of categories; y_i denotes the real result corresponding to the i-th face image training sample; and the second formula is the constraint condition of the first, in which || · || is the two-norm operation, W_j denotes the weight vector corresponding to the j-th category label, x_i denotes the input feature vector corresponding to the i-th face image training sample, and θ_j denotes the angle between x_i and the weight vector corresponding to the j-th category.
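A hedged PyTorch sketch of this third, ArcFace-style loss follows; s = 30 and m = 0.5 are common defaults standing in for the empirical parameters, and the clamp guards acos numerically.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcFaceLoss(nn.Module):
    """Sketch of L_arc_face; s, m and the class count are empirical parameters."""
    def __init__(self, feat_dim: int, num_classes: int, s: float = 30.0, m: float = 0.5):
        super().__init__()
        self.W = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.s, self.m = s, m

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # cos(theta_j) = W_j^T x_i / (||W_j|| ||x_i||) via L2 normalisation
        cos = F.linear(F.normalize(x), F.normalize(self.W)).clamp(-1 + 1e-7, 1 - 1e-7)
        theta = torch.acos(cos)
        onehot = F.one_hot(y, cos.size(1)).bool()
        # add the angular margin m only at the target class y_i, then scale by s
        logits = torch.where(onehot, torch.cos(theta + self.m), cos) * self.s
        return F.cross_entropy(logits, y)  # softmax form of L_arc_face
```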
S506, the sum of the first loss function value, the second loss function value, and the third loss function value is used as a final loss function value.
It should be noted that, according to the actual situation, weight parameters may be set for the first loss function value, the second loss function value, and the third loss function value, respectively, and the initial weight parameter defaults to 1, which is not limited herein.
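In code, the final loss is then just the (optionally weighted) sum, e.g.:

```python
# All weights default to 1, as stated above; names are illustrative.
def final_loss(l_saol, l_cls, l_arc, w1=1.0, w2=1.0, w3=1.0):
    return w1 * l_saol + w2 * l_cls + w3 * l_arc
```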
And S404, judging whether the final loss function value meets a preset convergence condition.
The preset convergence condition is preset by a technician and can be adjusted according to an actual application situation, an application scenario, and the like, and is not limited herein.
Specifically, if it is determined that the final loss function value does not satisfy the preset convergence condition, step S405 is executed; if the final loss function value is determined to satisfy the predetermined convergence condition, step S406 is executed.
And S405, adjusting parameters in the deep learning model.
And S406, taking the deep learning model as a human face living body detection model.
It can be understood that, in this embodiment, a preset maximum number of training rounds may be used instead: the deep learning model is trained continuously until the maximum number of rounds is reached, and the model at that point is used as the face living body detection model. A sketch combining both stopping rules follows.
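The following sketch of the fig. 4 procedure assumes `model`, a `train_loader` yielding images with their two labels, and a `compute_final_loss` helper as above; the optimizer and learning rate are illustrative choices, not from the patent.

```python
import torch

def train(model, train_loader, compute_final_loss,
          eps: float = 1e-4, max_rounds: int = 100):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for epoch in range(max_rounds):                     # preset maximum rounds
        epoch_loss = 0.0
        for images, gt_label_1, gt_label_2 in train_loader:
            loss = compute_final_loss(model, images, gt_label_1, gt_label_2)
            opt.zero_grad()
            loss.backward()                             # adjust model parameters
            opt.step()
            epoch_loss += loss.item()
        if epoch_loss / len(train_loader) < eps:        # preset convergence condition
            break
    return model                                        # face living body detection model
```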
S103, judging whether the living body prediction value of the face image to be detected is larger than a threshold value.
The threshold is obtained as follows: after training of the face living body detection model is completed, all data in a verification set are input into the trained model to obtain all outputs, and the outputs are normalized to between 0 and 1; starting from 0 and increasing in steps of 1/10000, the living body detection rate under each candidate threshold is counted, and the threshold at which the living body detection rate is largest is taken as the final living body detection threshold of the model. The verification set of face images may be constructed in the same way as the training sample set of face images, and the two sets do not overlap.
It should be understood that the method of determining the threshold is not limited to the above; for example, the threshold at which the equal error rate is reached on the verification set may be taken as the final threshold, or the threshold at which the FAR equals a specific value may be taken.
The step size and the start and end values of the threshold sweep may likewise be determined according to requirements when the threshold is determined.
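The threshold sweep can be sketched as follows; since the patent does not define the "living body detection rate" precisely, classification accuracy is used here as an assumed stand-in.

```python
import numpy as np

def find_threshold(scores: np.ndarray, labels: np.ndarray, step: float = 1e-4):
    """Sweep thresholds from 0 in steps of 1/10000 over verification outputs."""
    best_t, best_rate = 0.0, -1.0
    for t in np.arange(0.0, 1.0 + step, step):
        rate = np.mean((scores > t) == (labels == 1))  # assumed detection rate
        if rate > best_rate:
            best_t, best_rate = t, rate
    return best_t
```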
Specifically, if the living body prediction value of the face image to be detected is judged to be greater than the threshold value, step S104 is executed; if it is judged to be not greater than the threshold value, step S105 is executed.
And S104, determining the human face in the image to be detected as a living body.
And S105, determining that the human face in the image to be detected is a non-living body.
According to the above scheme, the face living body detection method provided by the application first acquires a face image to be detected; then inputs it into a face living body detection model to obtain a living body prediction value, the model being obtained by training a deep learning model with a plurality of face image training samples comprising living and non-living face images, the deep learning model containing a classification model structure in which each dense network layer fuses original convolution and dilated convolution; and finally, if the living body prediction value is greater than the threshold value, determines that the face in the image is a living body. Both detection speed and detection accuracy are thereby taken into account during face living body detection.
Optionally, in another embodiment of the present application, an implementation manner of the face live detection method, as shown in fig. 7, includes:
and S701, acquiring a face image to be detected.
It should be noted that the specific implementation process of step S701 is the same as the specific implementation process of step S101, and reference may be made to this.
S702, inputting the face image to be detected into a plurality of continuous dense network layers in the face living body detection model to obtain an output value of each dense network layer.
The face living body detection model is obtained by training a deep learning model with a plurality of face image training samples; the training samples comprise a plurality of living face images and a plurality of non-living face images; the deep learning model combines a classification model structure and a spatial attention output layer; and each dense network layer in the classification model structure is a network structure fusing original convolution and dilated convolution.
And S703, concatenating the output values of each dense network layer to obtain a concatenated feature map.
It should be noted that the specific implementation process of step S703 is the same as the specific implementation process of step S501, and reference may be made to this.
And S704, inputting the concatenated feature map into the spatial attention output layer to obtain the living body prediction value of the face image to be detected.
It should be noted that, although step S704 is a step in the actual application process of the face live detection model, and step S502 is a step in the face live detection model construction process, the specific implementation process of step S704 is the same as that of step S502 described above, and therefore, reference may be made to each other.
S705, judging whether the living body prediction value of the face image to be detected is larger than a threshold value.
And S706, determining the human face in the image to be detected as a living body.
And S707, determining that the human face in the image to be detected is a non-living body.
It should be noted that the specific implementation process of steps S705 to S707 is the same as the specific implementation process of steps S103 to S105, and may be referred to each other.
According to the above scheme, the face living body detection method provided by the application first acquires a face image to be detected, then inputs it into a plurality of consecutive dense network layers in the face living body detection model to obtain the output value of each dense network layer, where the model is obtained by training a deep learning model with a plurality of face image training samples comprising living and non-living face images, the deep learning model combines a classification model structure and a spatial attention output layer, and each dense network layer fuses original convolution and dilated convolution; concatenates the output values of each dense network layer to obtain a concatenated feature map; inputs the concatenated feature map into the spatial attention output layer to obtain the living body prediction value of the face image to be detected; and finally, if the living body prediction value is greater than the threshold value, determines that the face in the image is a living body. Both detection speed and detection accuracy are thereby taken into account.
Optionally, in another embodiment of the present application, an implementation manner of the face live detection method, as shown in fig. 8, includes:
and S801, acquiring a face image to be detected.
It should be noted that the specific implementation process of step S801 is the same as the specific implementation process of step S101, and reference may be made to this.
S802, inputting the face image to be detected to a plurality of continuous dense network layers in the face living body detection model to obtain an output value of each dense network layer.
The face living body detection model is obtained by training a deep learning model with a plurality of face image training samples; the training samples comprise a plurality of living face images and a plurality of non-living face images; the deep learning model combines a classification model structure and a spatial attention output layer; and each dense network layer in the classification model structure is a network structure fusing original convolution and dilated convolution.
And S803, concatenating the output values of each dense network layer to obtain a concatenated feature map.
It should be noted that the specific implementation process of step S803 is the same as the specific implementation process of step S501, and reference may be made to this.
S804, inputting the concatenated feature map into the spatial attention output layer to obtain a first living body prediction value of the face image to be detected.
It should be noted that, although step S804 is a step in the actual application of the face living body detection model and step S502 is a step in the model construction process, the specific implementation of step S804 is the same as that of step S502 described above, so reference may be made between them.
And S805, taking the output value of the last of the plurality of consecutive dense network layers in the face living body detection model as a second living body prediction value of the face image to be detected.
It should be noted that the second living body prediction value may also be equal to the living body prediction value of the face image to be detected output in step S102.
And S806, determining the living body prediction value of the face image to be detected according to the first living body prediction value and the second living body prediction value.
Specifically, the average of the first living body prediction value and the second living body prediction value may be used as the living body prediction value of the face image to be detected; alternatively, a weighted average of the two may be used, as sketched below. The options are varied and not limited here, and the weights may be adjusted according to actual use.
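Step S806 then reduces to a one-liner; the weight value here is illustrative.

```python
def fuse_scores(p1: float, p2: float, w: float = 0.5) -> float:
    """Weighted average of the two values; w = 0.5 gives the plain average."""
    return w * p1 + (1.0 - w) * p2
```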
S807, judging whether the living body prediction value of the face image to be detected is larger than a threshold value.
And S808, determining the human face in the image to be detected as a living body.
And S809, determining the human face in the image to be detected as a non-living body.
It should be noted that the specific implementation procedure of steps S807 to S809 is the same as the specific implementation procedure of steps S103 to S105, and reference may be made between them.
According to the above scheme, the face living body detection method provided by the application first acquires a face image to be detected, then inputs it into a plurality of consecutive dense network layers in the face living body detection model to obtain the output value of each dense network layer, where the model is obtained by training a deep learning model with a plurality of face image training samples comprising living and non-living face images, the deep learning model combines a classification model structure and a spatial attention output layer, and each dense network layer fuses original convolution and dilated convolution; concatenates the output values of each dense network layer to obtain a concatenated feature map; inputs the concatenated feature map into the spatial attention output layer to obtain a first living body prediction value of the face image to be detected; takes the output value of the last dense network layer as a second living body prediction value; determines the living body prediction value of the face image to be detected according to the first and second living body prediction values; and finally, if the living body prediction value is greater than the threshold value, determines that the face in the image is a living body. Both detection speed and detection accuracy are thereby taken into account.
Another embodiment of the present application provides a human face live detection device, as shown in fig. 9, specifically including:
an obtaining unit 901, configured to obtain a face image to be detected.
The first input unit 902 is configured to input the face image to be detected into the face living body detection model, so as to obtain a living body prediction value of the face image to be detected.
The face living body detection model is obtained by training a deep learning model with a plurality of face image training samples; the training samples comprise a plurality of living face images and a plurality of non-living face images; the deep learning model contains a classification model structure; and each dense network layer in the classification model structure is a network structure fusing original convolution and dilated convolution.
The first determining unit 903 is configured to determine that a face in the image to be detected is a living body if the predicted value of the living body of the face image to be detected is greater than a threshold value.
For a specific working process of the unit disclosed in the above embodiment of the present application, reference may be made to the content of the corresponding method embodiment, as shown in fig. 1, which is not described herein again.
According to the above scheme, in the face living body detection apparatus provided by the application, the acquisition unit 901 first acquires a face image to be detected; the first input unit 902 then inputs the face image into the face living body detection model to obtain its living body prediction value, the model being obtained by training a deep learning model with a plurality of face image training samples comprising living and non-living face images, the deep learning model containing a classification model structure in which each dense network layer fuses original convolution and dilated convolution; finally, if the living body prediction value is greater than the threshold value, the first determining unit 903 determines that the face in the image to be detected is a living body. Both detection speed and detection accuracy are thereby taken into account.
Optionally, in another embodiment of the present application, an implementation manner of the construction unit of the living human face detection model includes:
and the training sample set constructing unit is used for constructing a training sample set of the face image.
The training sample set of the face images comprises a plurality of training samples of the face images; the training samples of the plurality of face images comprise face images of a plurality of living bodies and face images of a plurality of non-living bodies.
And the second input unit is used for inputting, for each face image training sample, the training sample into a plurality of consecutive dense network layers in the deep learning model to obtain a first prediction result of whether the training sample is a living body and a second prediction result of whether the training sample is a living body in a plurality of scenes.
And the calculating unit is used for calculating the final loss function value of the face image training sample according to the first prediction result of whether the training sample is a living body, the second prediction result of whether the training sample is a living body in a plurality of scenes, the output value of each dense network layer, the real result of whether the training sample is a living body, and the real result of whether the training sample is a living body in a plurality of scenes.
And the adjusting unit is used for adjusting parameters in the deep learning model if the final loss function value does not meet the preset convergence condition until the final loss function value calculated by the adjusted deep learning model meets the preset convergence condition, and taking the deep learning model as the human face living body detection model.
For a specific working process of the unit disclosed in the above embodiment of the present application, reference may be made to the content of the corresponding method embodiment, as shown in fig. 4, which is not described herein again.
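To make the interplay of the second input unit, the calculation unit, and the adjustment unit concrete, the following is a hypothetical training-loop sketch. The batch structure, the convergence test (mean epoch loss below a tolerance), and the `final_loss` argument are all assumptions; one possible form of `final_loss` is sketched after the calculation-unit embodiment below.

```python
import torch

def train_liveness_model(model, loader, optimizer, final_loss,
                         tol=1e-4, max_epochs=100):
    """Hypothetical sketch of the construction unit's training loop."""
    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for images, live_labels, scene_labels in loader:
            # Forward pass through the consecutive dense network layers:
            # first prediction (liveness), second prediction (liveness in
            # several scenes), and each layer's output for the attention
            # branch of the loss.
            first_pred, scene_pred, layer_outputs = model(images)
            loss = final_loss(first_pred, scene_pred, layer_outputs,
                              live_labels, scene_labels)
            optimizer.zero_grad()
            loss.backward()   # adjust the deep learning model's parameters
            optimizer.step()
            epoch_loss += loss.item()
        # Preset convergence condition (assumed form): stop once the mean
        # final loss over an epoch falls below the tolerance.
        if epoch_loss / max(len(loader), 1) < tol:
            break
    return model
```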
Optionally, in another embodiment of the present application, an implementation of the calculation unit specifically includes:
A first merging unit, configured to concatenate the output values of each dense network layer to obtain a concatenated feature map.
A third input unit, configured to input the concatenated feature map into the spatial attention output layer to obtain a third prediction result of whether the training sample of the face image is a living body.
A second determining unit, configured to determine a first loss function value of the training sample of the face image according to the third prediction result of whether the training sample is a living body and the real result of whether the training sample is a living body.
The second determining unit is further configured to determine a second loss function value of the training sample of the face image according to the first prediction result of whether the training sample is a living body and the real result of whether the training sample is a living body.
The second determining unit is further configured to determine a third loss function value of the training sample of the face image according to the second prediction result of whether the training sample is a living body in a plurality of scenes and the real result of whether the training sample is a living body in the plurality of scenes.
A third determining unit, configured to determine the sum of the first loss function value, the second loss function value, and the third loss function value as the final loss function value.
For the specific working process of the units disclosed in the above embodiment of the present application, reference may be made to the corresponding method embodiment shown in fig. 5, which is not repeated here.
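Under the same assumptions, the three loss terms above might be combined as follows; this sketch also yields the `final_loss` used in the training loop earlier. Closing over the spatial attention output layer and using binary cross-entropy for every term are assumptions: the embodiment only specifies which predictions and ground truths feed each term, and that the three values are summed.

```python
import torch
import torch.nn.functional as F

def make_final_loss(attention_head):
    """Builds a final_loss compatible with the training-loop sketch above
    (the closure over the spatial attention output layer is assumed)."""
    def final_loss(first_pred, scene_pred, layer_outputs,
                   live_truth, scene_truth):
        # Concatenate each dense network layer's output into one feature
        # map and obtain the third prediction from the attention layer.
        feat = torch.cat(layer_outputs, dim=1)
        third_pred = attention_head(feat)
        # First loss: third prediction vs. the liveness ground truth.
        l1 = F.binary_cross_entropy(third_pred, live_truth)
        # Second loss: first prediction vs. the liveness ground truth.
        l2 = F.binary_cross_entropy(first_pred, live_truth)
        # Third loss: per-scene prediction vs. per-scene ground truth
        # (both assumed to be (batch, num_scenes) tensors in [0, 1]).
        l3 = F.binary_cross_entropy(scene_pred, scene_truth)
        # Final loss function value: the sum of the three values.
        return l1 + l2 + l3
    return final_loss
```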
Optionally, in another embodiment of the present application, where the deep learning model combines the classification model structure with a spatial attention output layer, an implementation of the first input unit includes:
A first input subunit, configured to input the face image to be detected into a plurality of consecutive dense network layers in the face living body detection model to obtain an output value of each dense network layer.
A second merging unit, configured to concatenate the output values of each dense network layer to obtain a concatenated feature map.
The first input subunit is further configured to input the concatenated feature map into the spatial attention output layer to obtain the living body prediction value of the face image to be detected.
For the specific working process of the units disclosed in the above embodiment of the present application, reference may be made to the corresponding method embodiment shown in fig. 7, which is not repeated here.
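A sketch of this inference path, under the same PyTorch assumptions as above, might look as follows. The internals of the spatial attention output layer (a one-by-one convolution producing a spatial mask, mean pooling, and a linear classifier) are illustrative guesses, since the embodiment does not describe the layer's structure.

```python
import torch
import torch.nn as nn

class SpatialAttentionHead(nn.Module):
    """Assumed form of the spatial attention output layer: a learned
    spatial mask reweights the concatenated feature map, which is then
    pooled and mapped to a single liveness probability."""

    def __init__(self, channels: int):
        super().__init__()
        self.mask = nn.Conv2d(channels, 1, kernel_size=1)  # spatial weights
        self.fc = nn.Linear(channels, 1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        attn = torch.sigmoid(self.mask(feat))             # (B, 1, H, W)
        weighted = (feat * attn).mean(dim=(2, 3))         # (B, C) pooled
        return torch.sigmoid(self.fc(weighted)).squeeze(1)  # (B,) in (0, 1)

def predict_liveness(dense_layers, head, image):
    # Run the consecutive dense network layers, keeping every output.
    outputs, x = [], image
    for layer in dense_layers:
        x = layer(x)
        outputs.append(x)
    # Concatenate the per-layer outputs along the channel dimension to
    # obtain the concatenated feature map, then apply the attention head.
    feat = torch.cat(outputs, dim=1)
    return head(feat)  # living body prediction value
```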
Optionally, in another embodiment of the present application, where the deep learning model combines the classification model structure with a spatial attention output layer, another implementation of the first input unit includes:
A first input subunit, configured to input the face image to be detected into a plurality of consecutive dense network layers in the face living body detection model to obtain an output value of each dense network layer.
A second merging unit, configured to concatenate the output values of each dense network layer to obtain a concatenated feature map.
The first input subunit is further configured to input the concatenated feature map into the spatial attention output layer to obtain a first living body prediction value of the face image to be detected.
The first input subunit is further configured to take the output value of the last dense network layer among the plurality of consecutive dense network layers in the face living body detection model as a second living body prediction value of the face image to be detected.
A fourth determining unit, configured to determine the living body prediction value of the face image to be detected according to the first living body prediction value and the second living body prediction value.
For the specific working process of the units disclosed in the above embodiment of the present application, reference may be made to the corresponding method embodiment shown in fig. 8, which is not repeated here.
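The embodiment leaves the combination rule to the fourth determining unit. A simple average followed by the threshold test of the first determining unit is one plausible reading, sketched below; both the averaging and the 0.5 default threshold are assumptions.

```python
import torch

def is_live(first_pred: torch.Tensor, second_pred: torch.Tensor,
            threshold: float = 0.5) -> torch.Tensor:
    # Combine the two liveness prediction values (averaging is assumed)
    # and compare against the threshold, as in the first determining unit.
    score = 0.5 * (first_pred + second_pred)
    return score > threshold
```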
Another embodiment of the present application provides an electronic device, as shown in fig. 10, including:
one or more processors 1001; and
a storage device 1002 on which one or more programs are stored.
When the one or more programs are executed by the one or more processors 1001, the one or more processors 1001 implement the method of any of the above embodiments.
Another embodiment of the present application provides a computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method as described in any of the above embodiments.
In the above embodiments disclosed in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus and method embodiments described above are illustrative only, as the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present disclosure may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part. The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a live broadcast device, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description of the disclosed embodiments enables those skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A face living body detection method is characterized by comprising the following steps:
acquiring a face image to be detected;
inputting the face image to be detected into a face living body detection model to obtain a living body prediction value of the face image to be detected; wherein the face living body detection model is obtained by training a deep learning model with training samples of a plurality of face images; the training samples of the face images comprise a plurality of living face images and a plurality of non-living face images; the deep learning model is a deep learning model containing a classification model structure; and each dense network layer in the classification model structure is a network structure that fuses standard convolution and dilated convolution;
and if the living body prediction value of the face image to be detected is greater than a threshold, determining that the face in the image to be detected is a living body.
2. The face living body detection method according to claim 1, wherein the construction method of the face living body detection model comprises the following steps:
constructing a training sample set of face images; wherein the training sample set of face images comprises training samples of a plurality of face images; the training samples comprise a plurality of living face images and a plurality of non-living face images;
for the training sample of each face image, inputting the training sample of the face image into a plurality of consecutive dense network layers in a deep learning model to obtain a first prediction result of whether the training sample of the face image is a living body and a second prediction result of whether the training sample of the face image is a living body in a plurality of scenes;
calculating a final loss function value of the training sample of the face image according to the first prediction result of whether the training sample of the face image is a living body, the second prediction result of whether the training sample of the face image is a living body in a plurality of scenes, an output value of each dense network layer, a real result of whether the training sample of the face image is a living body, and a real result of whether the training sample of the face image is a living body in a plurality of scenes;
and if the final loss function value does not satisfy a preset convergence condition, adjusting parameters in the deep learning model until the final loss function value calculated by the adjusted deep learning model satisfies the preset convergence condition, and taking the deep learning model as the face living body detection model.
3. The method according to claim 2, wherein the calculating a final loss function value of the training sample of the face image according to the first prediction result of whether the training sample of the face image is a living body, the second prediction result of whether the training sample of the face image is a living body in a plurality of scenes, the output value of each dense network layer, the real result of whether the training sample of the face image is a living body, and the real result of whether the training sample of the face image is a living body in a plurality of scenes comprises:
concatenating the output values of each dense network layer to obtain a concatenated feature map;
inputting the concatenated feature map into a spatial attention output layer to obtain a third prediction result of whether the training sample of the face image is a living body;
determining a first loss function value of the training sample of the face image according to a third prediction result of whether the training sample of the face image is a living body and a real result of whether the training sample of the face image is a living body;
determining a second loss function value of the training sample of the face image according to a first prediction result of whether the training sample of the face image is a living body and a real result of whether the training sample of the face image is a living body;
determining a third loss function value of the training sample of the face image according to a second prediction result of whether the training sample of the face image is a living body in a plurality of scenes and a real result of whether the training sample of the face image is a living body in a plurality of scenes;
and determining a sum of the first loss function value, the second loss function value, and the third loss function value as the final loss function value.
4. The method according to claim 1, wherein the deep learning model is a deep learning model combining a classification model structure and a spatial attention output layer, and the inputting the face image to be detected into the face living body detection model to obtain the living body prediction value of the face image to be detected comprises:
inputting the face image to be detected into a plurality of consecutive dense network layers in the face living body detection model to obtain an output value of each dense network layer;
concatenating the output values of each dense network layer to obtain a concatenated feature map;
and inputting the concatenated feature map into the spatial attention output layer to obtain the living body prediction value of the face image to be detected.
5. The method according to claim 1, wherein the deep learning model is a deep learning model combining a classification model structure and a spatial attention output layer, and the inputting the face image to be detected into the face living body detection model to obtain the living body prediction value of the face image to be detected comprises:
inputting the face image to be detected into a plurality of consecutive dense network layers in the face living body detection model to obtain an output value of each dense network layer;
concatenating the output values of each dense network layer to obtain a concatenated feature map;
inputting the concatenated feature map into the spatial attention output layer to obtain a first living body prediction value of the face image to be detected;
taking the output value of the last dense network layer among the plurality of consecutive dense network layers in the face living body detection model as a second living body prediction value of the face image to be detected;
and determining the living body prediction value of the face image to be detected according to the first living body prediction value and the second living body prediction value.
6. A face living body detection device, comprising:
the acquisition unit is used for acquiring a face image to be detected;
the first input unit is used for inputting the face image to be detected into a face living body detection model to obtain a living body prediction value of the face image to be detected; wherein the face living body detection model is obtained by training a deep learning model with training samples of a plurality of face images; the training samples of the face images comprise a plurality of living face images and a plurality of non-living face images; the deep learning model is a deep learning model containing a classification model structure; and each dense network layer in the classification model structure is a network structure that fuses standard convolution and dilated convolution;
and the first determining unit is used for determining that the face in the image to be detected is a living body if the living body prediction value of the face image to be detected is greater than a threshold.
7. The face living body detection device according to claim 6, wherein the construction unit of the face living body detection model comprises:
the training sample set construction unit is used for constructing a training sample set of face images; wherein the training sample set of face images comprises training samples of a plurality of face images; the training samples comprise a plurality of living face images and a plurality of non-living face images;
the second input unit is used for, for the training sample of each face image, inputting the training sample of the face image into a plurality of consecutive dense network layers in a deep learning model to obtain a first prediction result of whether the training sample of the face image is a living body and a second prediction result of whether the training sample of the face image is a living body in a plurality of scenes;
the calculation unit is used for calculating a final loss function value of the training sample of the face image according to the first prediction result of whether the training sample of the face image is a living body, the second prediction result of whether the training sample of the face image is a living body in a plurality of scenes, an output value of each dense network layer, a real result of whether the training sample of the face image is a living body, and a real result of whether the training sample of the face image is a living body in a plurality of scenes;
and the adjustment unit is used for adjusting parameters in the deep learning model if the final loss function value does not satisfy a preset convergence condition, until the final loss function value calculated by the adjusted deep learning model satisfies the preset convergence condition, and taking the deep learning model as the face living body detection model.
8. The face living body detection device according to claim 7, wherein the calculation unit comprises:
the first merging unit is used for concatenating the output values of each dense network layer to obtain a concatenated feature map;
the third input unit is used for inputting the concatenated feature map into a spatial attention output layer to obtain a third prediction result of whether the training sample of the face image is a living body;
a second determining unit, configured to determine a first loss function value of the training sample of the face image according to a third prediction result of whether the training sample of the face image is a living body and a real result of whether the training sample of the face image is a living body;
the second determining unit is further configured to determine a second loss function value of the training sample of the face image according to a first prediction result of whether the training sample of the face image is a living body and a real result of whether the training sample of the face image is a living body;
the second determining unit is further configured to determine a third loss function value of the training sample of the face image according to a second prediction result of whether the training sample of the face image is a living body in multiple scenes and a real result of whether the training sample of the face image is a living body in multiple scenes;
a third determining unit configured to determine a sum of the first loss function value, the second loss function value, and the third loss function value as the final loss function value.
9. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-5.
10. A computer storage medium, having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method of any of claims 1 to 5.
CN202011526858.3A 2020-12-22 2020-12-22 Face living body detection method and device, electronic equipment and computer storage medium Pending CN112597885A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011526858.3A CN112597885A (en) 2020-12-22 2020-12-22 Face living body detection method and device, electronic equipment and computer storage medium

Publications (1)

Publication Number Publication Date
CN112597885A (en) 2021-04-02

Family

ID=75200061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011526858.3A Pending CN112597885A (en) 2020-12-22 2020-12-22 Face living body detection method and device, electronic equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN112597885A (en)

Patent Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017000116A1 (en) * 2015-06-29 2017-01-05 北京旷视科技有限公司 Living body detection method, living body detection system, and computer program product
US20180025217A1 (en) * 2016-07-22 2018-01-25 Nec Laboratories America, Inc. Liveness detection for antispoof face recognition
CN107358157A (en) * 2017-06-07 2017-11-17 阿里巴巴集团控股有限公司 A kind of human face in-vivo detection method, device and electronic equipment
US20190026575A1 (en) * 2017-07-20 2019-01-24 Baidu Online Network Technology (Beijing) Co., Ltd. Living body detecting method and apparatus, device and storage medium
CN107679477A (en) * 2017-09-27 2018-02-09 深圳市未来媒体技术研究院 Face depth and surface normal Forecasting Methodology based on empty convolutional neural networks
CN108596041A (en) * 2018-03-28 2018-09-28 中科博宏(北京)科技有限公司 A kind of human face in-vivo detection method based on video
CN108898112A (en) * 2018-07-03 2018-11-27 东北大学 A kind of near-infrared human face in-vivo detection method and system
CN109255322A (en) * 2018-09-03 2019-01-22 北京诚志重科海图科技有限公司 A kind of human face in-vivo detection method and device
CN109409322A (en) * 2018-11-09 2019-03-01 北京京东尚科信息技术有限公司 Biopsy method, device and face identification method and face detection system
WO2020199577A1 (en) * 2019-03-29 2020-10-08 北京市商汤科技开发有限公司 Method and device for living body detection, equipment, and storage medium
CN111860078A (en) * 2019-04-30 2020-10-30 北京眼神智能科技有限公司 Face silence living body detection method and device, readable storage medium and equipment
CN110427828A (en) * 2019-07-05 2019-11-08 中国平安人寿保险股份有限公司 Human face in-vivo detection method, device and computer readable storage medium
CN110674759A (en) * 2019-09-26 2020-01-10 深圳市捷顺科技实业股份有限公司 Monocular face in-vivo detection method, device and equipment based on depth map
CN110929569A (en) * 2019-10-18 2020-03-27 平安科技(深圳)有限公司 Face recognition method, device, equipment and storage medium
CN111310724A (en) * 2020-03-12 2020-06-19 苏州科达科技股份有限公司 In-vivo detection method and device based on deep learning, storage medium and equipment
CN111539942A (en) * 2020-04-28 2020-08-14 中国科学院自动化研究所 Method for detecting face depth tampered image based on multi-scale depth feature fusion
CN112001240A (en) * 2020-07-15 2020-11-27 浙江大华技术股份有限公司 Living body detection method, living body detection device, computer equipment and storage medium
CN112070158A (en) * 2020-09-08 2020-12-11 哈尔滨工业大学(威海) Facial flaw detection method based on convolutional neural network and bilateral filtering

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
ABDULKADIR ŞENGÜR et al.: "Deep Feature Extraction for Face Liveness Detection", 2018 International Conference on Artificial Intelligence and Data Processing (IDAP), 24 January 2019 (2019-01-24), pages 4 *
KOSHY, RANJANA, et al.: "Enhanced Deep Learning Architectures for Face Liveness Detection for Static and Video Sequences", Entropy, vol. 22, no. 10, 21 October 2020 (2020-10-21), pages 1186 *
TONG Yueyang: "Research on Live Face Detection Algorithms Based on Convolutional Neural Networks", China Masters' Theses Full-text Database, Information Science and Technology, no. 6, 15 June 2019 (2019-06-15), pages 138-614 *
WU Xiaoli: "Research on Vision-Based Face Liveness Detection Methods", China Masters' Theses Full-text Database, Information Science and Technology, no. 3, 15 March 2022 (2022-03-15), pages 138-2421 *
JIANG Xinkui: "Research on Liveness Recognition Methods Combined with Face Detection", China Masters' Theses Full-text Database, Information Science and Technology, no. 8, 15 August 2019 (2019-08-15), pages 138-1071 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112990090A (en) * 2021-04-09 2021-06-18 北京华捷艾米科技有限公司 Face living body detection method and device
CN113052144A (en) * 2021-04-30 2021-06-29 平安科技(深圳)有限公司 Training method, device and equipment of living human face detection model and storage medium
CN113052144B (en) * 2021-04-30 2023-02-28 平安科技(深圳)有限公司 Training method, device and equipment of living human face detection model and storage medium
CN113221842A (en) * 2021-06-04 2021-08-06 第六镜科技(北京)有限公司 Model training method, image recognition method, device, equipment and medium
CN113221842B (en) * 2021-06-04 2023-12-29 第六镜科技(北京)集团有限责任公司 Model training method, image recognition method, device, equipment and medium
CN113283376A (en) * 2021-06-10 2021-08-20 泰康保险集团股份有限公司 Face living body detection method, face living body detection device, medium and equipment
CN113283376B (en) * 2021-06-10 2024-02-09 泰康保险集团股份有限公司 Face living body detection method, face living body detection device, medium and equipment
WO2023000792A1 (en) * 2021-07-22 2023-01-26 京东科技控股股份有限公司 Methods and apparatuses for constructing living body identification model and for living body identification, device and medium
CN114821823A (en) * 2022-04-12 2022-07-29 马上消费金融股份有限公司 Image processing, training of human face anti-counterfeiting model and living body detection method and device
CN114821823B (en) * 2022-04-12 2023-07-25 马上消费金融股份有限公司 Image processing, training of human face anti-counterfeiting model and living body detection method and device
CN115131880A (en) * 2022-05-30 2022-09-30 上海大学 Multi-scale attention fusion double-supervision human face in-vivo detection method

Similar Documents

Publication Publication Date Title
CN112597885A (en) Face living body detection method and device, electronic equipment and computer storage medium
CN109902546B (en) Face recognition method, face recognition device and computer readable medium
CN108062562B (en) Object re-recognition method and device
US20170345181A1 (en) Video monitoring method and video monitoring system
CN110705478A (en) Face tracking method, device, equipment and storage medium
US11816880B2 (en) Face recognition method and apparatus, computer device, and storage medium
CN109766785B (en) Living body detection method and device for human face
CN108491848B (en) Image saliency detection method and device based on depth information
WO2020199611A1 (en) Liveness detection method and apparatus, electronic device, and storage medium
CN111368672A (en) Construction method and device for genetic disease facial recognition model
CN106709404A (en) Image processing device and image processing method
CN111368751A (en) Image processing method, image processing device, storage medium and electronic equipment
US10915739B2 (en) Face recognition device, face recognition method, and computer readable storage medium
CN109784277B (en) Emotion recognition method based on intelligent glasses
CN104850857B (en) Across the video camera pedestrian target matching process of view-based access control model spatial saliency constraint
CN111325107B (en) Detection model training method, device, electronic equipment and readable storage medium
CN111062263A (en) Method, device, computer device and storage medium for hand pose estimation
CN107704813A (en) A kind of face vivo identification method and system
CN110728242A (en) Image matching method and device based on portrait recognition, storage medium and application
Xia et al. Face recognition and application of film and television actors based on Dlib
CN113128428B (en) Depth map prediction-based in vivo detection method and related equipment
CN113205072A (en) Object association method and device and electronic equipment
CN113793251A (en) Pose determination method and device, electronic equipment and readable storage medium
CN110929583A (en) High-detection-precision face recognition method
CN112990009A (en) End-to-end-based lane line detection method, device, equipment and storage medium

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination