CN114299567A - Model training method, living body detection method, electronic device, and storage medium - Google Patents

Model training method, living body detection method, electronic device, and storage medium

Info

Publication number
CN114299567A
CN114299567A (Application CN202111463661.4A)
Authority
CN
China
Prior art keywords
feature vector
class
living body
image
original images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111463661.4A
Other languages
Chinese (zh)
Other versions
CN114299567B (en)
Inventor
王军华
付贤强
朱海涛
户磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Dilusense Technology Co Ltd
Original Assignee
Beijing Dilusense Technology Co Ltd
Hefei Dilusense Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dilusense Technology Co Ltd, Hefei Dilusense Technology Co Ltd filed Critical Beijing Dilusense Technology Co Ltd
Priority to CN202111463661.4A priority Critical patent/CN114299567B/en
Publication of CN114299567A publication Critical patent/CN114299567A/en
Application granted granted Critical
Publication of CN114299567B publication Critical patent/CN114299567B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The embodiment of the invention relates to the field of image processing and discloses a model training method, a living body detection method, an electronic device, and a storage medium. Image samples of living bodies and non-living bodies containing human faces, together with class labels of the image samples, are obtained; the class labels include a plurality of class labels belonging to living bodies and a plurality of class labels belonging to non-living bodies. A feature extraction model is constructed that takes an image sample as input and outputs the feature vector of the image sample; a classifier is constructed that takes the feature vector output by the feature extraction model as input and outputs the probability that the feature vector belongs to each class label. The feature extraction model and the classifier are jointly trained to obtain the trained feature extraction model and the trained classifier. Through the carefully designed data labeling scheme and the new loss function, the scheme greatly improves the generalization capability of the trained model.

Description

Model training method, living body detection method, electronic device, and storage medium
Technical Field
The present invention relates to the field of image processing, and in particular, to a method for model training and in-vivo detection, an electronic device, and a storage medium.
Background
Existing face living body detection algorithms are mainly deep-learning-based: sample data of living bodies and prostheses is collected in advance and used to train a deep learning model, helping the model learn to extract features that distinguish living bodies from non-living bodies.
The conventional training method is as follows: the sample data is divided into the two classes of living bodies and non-living bodies, and L = 2 different class labels, such as 0 and 1, are given as supervision information during training; training runs for E rounds in total, M data items out of all the samples are used in each round, the M data items are randomly divided into K batches within a round, each batch contains B data items, and the K batches are then used in sequence to train the feature extraction model. For each data item, the features extracted by the feature extraction model are passed through a classifier to obtain the probability of belonging to each label, and cross-entropy loss is then used to measure the difference between the predicted probability and the actual situation so as to optimize the model parameters.
This training approach simply divides faces into living bodies and prostheses and ignores the finer subdivisions within each; for example, the differences between age groups and ethnicities among living bodies are very large, and non-living bodies are even more diverse because of the variety of prosthesis types, such as A4 paper, photos, mobile phone screen images, clothing, latex hoods, silicone hoods, plastic masks, and the like. Large amounts of training data are easy to collect for some prostheses but difficult for others. Under these conditions, the conventional training method often performs poorly on prostheses with few samples, or on prosthesis types that do not appear in the training data at all, so the trained face living body detection algorithm generalizes poorly to prostheses of unknown types.
Disclosure of Invention
The embodiment of the invention aims to provide a model training method, a living body detection method, an electronic device, and a storage medium, in which the generalization capability of the trained model is greatly improved through a carefully designed data labeling scheme and a new loss function.
In order to solve the above technical problem, an embodiment of the present invention provides a model training method, including:
acquiring image samples of living bodies and non-living bodies containing human faces and class labels of the image samples; the category labels include a plurality of category labels belonging to living bodies, and a plurality of category labels belonging to non-living bodies;
taking the image sample as input and the feature vector of the image sample as output to construct a feature extraction model;
taking a feature vector output by the feature extraction model as input, and taking the probability that the feature vector belongs to each class label as output to construct a classifier;
and performing combined training on the feature extraction model and the classifier, wherein a loss function in the combined training is constructed on the basis of a first loss between a feature vector output by the feature extraction model and a class center feature vector of a class label to which the feature vector belongs and a second loss between a prediction class output by the classifier and the class label.
The embodiment of the invention also provides a living body detection method, which comprises the following steps:
processing the face image to be detected by adopting the feature extraction model and the classifier obtained by the model training method through combined training to obtain the feature vector of the face image and the probability of the feature vector belonging to each class of labels;
determining a first probability that the face image belongs to the living body based on a cosine value of an included angle between the feature vector and class center feature vectors of various classes of labels belonging to the living body;
determining a second probability that the face image belongs to the living body based on the probability that the feature vector belongs to each class label of the living body;
and determining the final probability that the face image belongs to the living body based on the first probability and the second probability.
An embodiment of the present invention also provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a model training method as described above, and a liveness detection method as described above.
Embodiments of the present invention also provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the model training method as described above, and the in-vivo detection method as described above.
Compared with the prior art, the embodiment of the invention obtains image samples of living bodies and non-living bodies containing human faces, together with class labels of the image samples; the class labels include a plurality of class labels belonging to living bodies and a plurality of class labels belonging to non-living bodies. A feature extraction model is constructed that takes an image sample as input and outputs the feature vector of the image sample; a classifier is constructed that takes the feature vector output by the feature extraction model as input and outputs the probability that the feature vector belongs to each class label. The feature extraction model and the classifier are then jointly trained, and the loss function in the joint training is constructed on the basis of a first loss between a feature vector output by the feature extraction model and the class-center feature vector of the class label to which that feature vector belongs, and a second loss between the prediction class output by the classifier and the class label. Unlike the traditional approach of giving image samples only the two class labels "living body" and "non-living body", this scheme assigns image samples a plurality of class labels belonging to living bodies and a plurality of class labels belonging to non-living bodies, so that the finer subdivisions within living bodies and non-living bodies can be further mined. Meanwhile, because the loss function of the joint training is constructed from the first loss between the feature vector extracted by the feature extraction model and the class-center feature vector of the class label to which it belongs and the second loss between the prediction class output by the classifier and the class label, the feature extraction capability and the classification prediction capability of the model are greatly improved, the generalization capability of the trained model is further improved, and the ability to judge non-living-body types with few or even no samples in the training set is markedly improved.
Drawings
FIG. 1 is a detailed flow diagram of a model training method according to an embodiment of the invention;
FIG. 2 is a detailed flow chart of an image sample acquisition method according to an embodiment of the invention;
fig. 3 is a detailed flowchart of a first loss acquisition method according to an embodiment of the present invention;
FIG. 4 is a detailed flowchart of a living body detection method according to an embodiment of the invention;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention will be described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the embodiments in order to provide a better understanding of the present application; however, the technical solutions claimed in the present application can be implemented without these technical details, and with various changes and modifications based on the following embodiments.
An embodiment of the present invention relates to a model training method, and as shown in fig. 1, the model training method provided in this embodiment includes the following steps.
Step 101: acquiring living body and non-living body image samples containing human faces and class labels of the image samples; the category labels include a plurality of category labels belonging to living bodies, and a plurality of category labels belonging to non-living bodies.
Specifically, original images of living bodies and non-living bodies containing human faces can be acquired by photographing or the like to form the image samples for model training, and each image sample is labeled with its category in advance to obtain its class label. Unlike the training process of a conventional living body detection algorithm, the class labels in this embodiment are not limited to two labels defined according to whether an image sample belongs to a living body or a non-living body; instead, a plurality of refined class labels are defined under the living-body and non-living-body categories, that is, the class labels include a plurality of class labels belonging to living bodies and a plurality of class labels belonging to non-living bodies. The plurality of class labels belonging to living bodies may be divided according to one or more classification dimensions, the plurality of class labels belonging to non-living bodies may likewise be divided according to one or more classification dimensions, and the classification dimensions used for living bodies and non-living bodies need not be identical.
In one example, as shown in FIG. 2, this step may be implemented by the following substeps.
Substep 1011: an original image of a living body and a non-living body including a face of a person is acquired.
Specifically, the original images of the living body and the non-living body including the face of a person can be acquired by photographing or the like.
Substep 1012: an original image belonging to a living body is labeled based on a plurality of category labels defined in advance by the age group of the living body.
Specifically, living bodies may be divided into 5 categories by age group, with a class label set for each category, for example: ages 0-6 are labeled 100, ages 7-12 are 101, ages 13-50 are 102, ages 51-70 are 103, and ages over 70 are 104. Then, for an original image belonging to a living body, a class label is set for the original image according to the age group of the face in the original image.
Substep 1013: the original image belonging to the non-living body is labeled based on a plurality of category labels defined in advance by the material of the non-living body.
Specifically, non-living bodies (i.e., "prostheses") may be divided into a plurality of categories by material, with a class label set for each category, for example: 2D prostheses are divided into 7 categories by material (the specific number depends on the actual materials): unknown material is 200, color A4 paper is 201, black-and-white A4 paper is 202, color photo is 203, black-and-white photo is 204, color coated paper is 205, and black-and-white coated paper is 206; 3D prostheses are divided into 5 categories by material (the specific number depends on the actual materials): unknown material is 300, plastic 3D prosthesis is 301, latex 3D prosthesis is 302, silicone 3D prosthesis is 303, and resin 3D prosthesis is 304. Then, for an original image belonging to a non-living body, a class label is set for the original image according to the material of the face in the original image.
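For illustration only, the example labels above can be gathered into a lookup table; the following Python sketch is hypothetical (the label values come from the examples in this description, while the key names are invented and not part of the patent).

```python
# Hypothetical label map built from the example labels in this description.
LIVE_LABELS = {            # living bodies, divided by age group
    "age_0_6": 100, "age_7_12": 101, "age_13_50": 102,
    "age_51_70": 103, "age_over_70": 104,
}
SPOOF_2D_LABELS = {        # 2D prostheses, divided by material
    "unknown_2d": 200, "color_a4": 201, "bw_a4": 202,
    "color_photo": 203, "bw_photo": 204,
    "color_coated_paper": 205, "bw_coated_paper": 206,
}
SPOOF_3D_LABELS = {        # 3D prostheses, divided by material
    "unknown_3d": 300, "plastic": 301, "latex": 302,
    "silicone": 303, "resin": 304,
}
```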
Substep 1014: extracting a specified number of original images from the original images as image samples; the original images with the specified number cover all the category labels, and the number of the original images corresponding to each category label is the same.
Specifically, after the original image is acquired and the class label corresponding to the original image is marked, the original image for model training needs to be extracted from the original image as an image sample. In order to improve the generalization capability of the model, a specified number of images can be extracted from the original images, the category labels corresponding to the extracted original images need to cover all predefined category labels, and the number of the original images corresponding to each category label is the same.
The process of extracting an image sample can be realized, for example, by the following steps.
Step one: for the original images of any one class label, randomly take m original images from them, and determine the magnitude relation between m and the quotient obtained by dividing the specified number by the total number of class labels.
Specifically, assume that when training the model, the number of image samples required for one round of training is specified as M, and the number of classes covered by all the original images (i.e., the total number of class labels) is c. In order to improve the generalization capability of the model, the same number of original images needs to be extracted under each class label as image samples, so the number of original images extracted under each class label should be M/c.
When extracting original images for any one class label, m original images belonging to that class label can be randomly taken from the original images, and then the magnitude relation between m and M/c is judged.
Step two: if m is greater than the quotient, delete some of the m original images so that the number of remaining original images equals the quotient.
Specifically, when m is greater than M/c, the number of original images taken under the current class label exceeds the number to be extracted; in this case, a portion (m - M/c) of the taken original images needs to be deleted, so that the number of remaining original images under the current class label equals M/c.
Step three: if m is smaller than the quotient, randomly take additional original images from the original images of that class label so that the total number of taken original images equals the quotient.
Specifically, when m is smaller than M/c, the number of original images taken under the current class label is less than the number to be extracted; in this case, a further (M/c - m) original images need to be taken from (all) the original images of that class label, so that after re-extraction the total number of original images taken under the current class label equals M/c. The additional original images can be obtained by random sampling.
Step four: take the M original images selected across all categories as M image samples, randomly shuffle their order, and divide them evenly into M/B batches, where B is the batch size.
Specifically, for each class label, the specified number (M/c) of original images under that class label is extracted by the method of steps one to three, giving M original images in total as the M image samples; the extracted M image samples are randomly shuffled, and every B samples form one batch, yielding M/B batches, where B is the batch size. In the subsequent model training, image samples are selected batch by batch, and the training pass over one batch of image samples serves as one training period.
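A minimal sketch of this class-balanced sampling is given below, assuming the original images are grouped by class label in a dictionary; the function and variable names are illustrative, not from the patent.

```python
import random

def build_balanced_batches(images_by_label, m_total, batch_size):
    """Hedged sketch of steps one to four: draw M/c original images per class
    label (re-sampling when a label has too few), shuffle, and split into
    batches of size B. Structure and names are illustrative only."""
    labels = list(images_by_label)
    per_label = m_total // len(labels)                       # M / c images per class label
    samples = []
    for label in labels:
        pool = images_by_label[label]
        taken = random.sample(pool, min(len(pool), per_label))  # steps one and two: take at most M/c
        while len(taken) < per_label:                            # step three: take more if short
            taken.append(random.choice(pool))
        samples.extend((img, label) for img in taken)
    random.shuffle(samples)                                   # step four: shuffle all M samples
    return [samples[i:i + batch_size]                         # M / B batches of size B each
            for i in range(0, len(samples), batch_size)]
```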
Step 102: and taking the image sample as input, and taking the feature vector of the image sample as output to construct a feature extraction model.
Specifically, a conventional deep learning network E (referred to as "model E" for short) is constructed as the feature extraction model, and the trainable parameters of model E are denoted W_E. The input of model E is an image sample containing a face, and the output is an n-dimensional feature vector v, where n is a hyper-parameter set empirically, for example n = 128.
Step 103: and taking the feature vector output by the feature extraction model as input, and taking the probability that the feature vector belongs to each class label as output to construct a classifier.
Specifically, a conventional deep learning network C (referred to as "model C" for short) is constructed as the classifier, and the trainable parameters of model C are denoted W_C. The input of model C is the n-dimensional feature vector v output by the feature extraction model (model E) in step 102, and the output is a c-dimensional vector p (c equals the total number of class labels), where p_{j,i} denotes the probability that the feature vector corresponding to the j-th image sample belongs to the i-th class label.
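As a non-authoritative illustration of the shapes involved, model E and model C could be sketched in PyTorch as follows; only the input/output dimensions (an n = 128-dimensional feature v and a c-dimensional probability vector p) follow the description, while the layer choices and the value c = 17 (the count of the example labels above) are assumptions.

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """'Model E' sketch: image sample -> n-dimensional feature vector v."""
    def __init__(self, n=128):
        super().__init__()
        self.backbone = nn.Sequential(             # illustrative layers, not specified by the patent
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, n),
        )

    def forward(self, x):                          # x: (B, 3, H, W) face image samples
        return self.backbone(x)                    # v: (B, n)

class Classifier(nn.Module):
    """'Model C' sketch: feature vector v -> probability over c class labels."""
    def __init__(self, n=128, c=17):               # c = 17 assumes the example label set above
        super().__init__()
        self.fc = nn.Linear(n, c)

    def forward(self, v):
        return torch.softmax(self.fc(v), dim=1)    # p: (B, c); p[j, i] = prob. sample j has label i
```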
Step 104: and performing combined training on the feature extraction model and the classifier, wherein a loss function in the combined training is constructed on the basis of a first loss between a feature vector output by the feature extraction model and a class center feature vector of a class label to which the feature vector belongs and a second loss between a prediction class output by the classifier and the class label.
Specifically, the constructed feature extraction model (model E) and the classifier (model C) are jointly trained by using an image sample to obtain the trained feature extraction model and classifier. The loss function in the process of performing the joint training can be constructed based on a first loss between the feature vector output by the feature extraction model and the class center feature vector of the class label to which the feature vector belongs, and a second loss between the prediction class output by the classifier and the class label.
The methods of constructing the first loss and the second loss will be described below, respectively.
As shown in fig. 3, the first loss construction process can be implemented as follows.
Step 201: calculating a feature vector v and a class center feature vector v of a class label to which the feature vector belongs by the following formula (1)CFD (v) from the first to the second.
Figure BDA0003389535240000051
Specifically, each class label may be defined with a central feature with n-128 dimensions (dimensions of the feature vector output by the feature extraction model), that is, a class-center feature vector. Class-centric feature vectors such as the ith class label may be denoted as
Figure BDA0003389535240000052
The initial value is a random value. If the number of the image samples in a batch is B, the feature vector v epsilon R output after the feature extraction modeln×BWherein R is a real number.
Obtaining each feature vector v and class center feature vector v of class label to which the feature vector belongsCFThe distance D (v) between the two vectors is obtained by calculation according to the formula (1).
In one example, before step 201 is performed, each class-center feature vector v_CF can be updated as follows.
Each class-center feature vector v_CF used in the current training period is updated by formulas (2) and (3), where v_CF is the updated class-center feature vector, the class-center feature vector before updating is the one used in the previous training period, a is a hyper-parameter, b is the number of image samples under the same class label, and e_k is the vector difference between the feature vector v_k of the k-th image sample under that class label and the class-center feature vector before updating.
Specifically, suppose the B image samples used in the current training period (batch) involve ct class labels (ct ≤ c, where c is the total number of class labels). For the image samples under each of these ct class labels, the class-center feature vector under that label is updated to the new value v_CF by formula (2), where the class-center feature vector before updating is the one used in the previous training period.
The vector difference e_k between the feature vector v_k of the k-th image sample under the same class label and the corresponding class-center feature vector before updating is calculated by formula (4).
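The update formulas themselves appear only as images in this publication. Under the assumption of a standard center-loss-style moving-average update, formulas (2) and (4) could plausibly read as follows; this is a hedged reconstruction, not the verbatim patent formulas.

```latex
% Assumed reconstruction, not the verbatim patent formulas:
e_k = v_k - \tilde{v}_{CF}
\qquad
v_{CF} = \tilde{v}_{CF} + a \cdot \frac{1}{b}\sum_{k=1}^{b} e_k
% \tilde{v}_{CF}: class-center feature vector before updating (previous training period)
% a: hyper-parameter;  b: number of image samples under the same class label
```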
Step 202: the first loss is calculated by the following formula (5).
Figure BDA00033895352400000510
Wherein, WEExtracting trainable parameters, L, of the model E for the featuresE(WE) For the first loss, B is the batch size of the image sample, D (v)j) Is the jth image sample v in a batch of image samplesjThe corresponding distance.
Specifically, a feature vector v and a class center feature vector v of a class label to which the feature vector belongs are obtainedCFAfter the distance d (v), the first loss can be calculated by equation (5). First loss may be for trainable parameters W of the feature extraction modelEConstraining to train trainable parameters W in the model E along the direction of high similarity between the class center feature vector of a certain feature vector and the class center feature vector of the class label of the certain feature vector and low similarity between the class center feature vectors of the class labels of the certain feature vector and the class center feature vector of the class labels of the non-certain feature vectorE
The second loss can be constructed as follows.
The second loss is calculated by formula (6), where W_C denotes the trainable parameters of the classifier C, L_C(W_C) is the second loss, B is the batch size of the image samples, c is the total number of class labels, y_{j,i} is the actual probability that the j-th image sample in a batch of image samples belongs to the i-th class label, and p_{j,i} is the predicted probability that the j-th image sample belongs to the i-th class label.
The second loss constrains the trainable parameters W_C of classifier C, so that model C is trained in the direction of a high predicted probability for the class label to which a feature vector belongs and low predicted probabilities for the class labels to which it does not belong.
On this basis, when the feature extraction model and the classifier are jointly trained, the loss function used during the joint training can be constructed by formula (7):
loss = L_E(W_E) + L_C(W_C) ……………………… (7)
where loss is the loss value during joint training, L_E(W_E) is the first loss, and L_C(W_C) is the second loss.
Specifically, after the joint-training loss is computed by formula (7), the parameters (W_E, W_C) of model E and model C are optimized by a conventional deep learning network optimization method, i.e., by minimizing the loss with respect to (W_E, W_C).
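As an illustration of how formulas (5), (6) and (7) could be combined in training code, the sketch below assumes D(v) is a squared Euclidean distance to the class center and the second loss is standard cross-entropy; the patent gives these formulas only as images, so this is a hedged reading rather than the verbatim method, and all names are illustrative.

```python
import torch.nn.functional as F

def joint_loss(features, labels, class_centers, logits):
    """Hedged reading of loss = L_E(W_E) + L_C(W_C) (formula (7)).

    Assumptions (the loss formulas are given only as images in the patent):
      - D(v) is taken as the squared Euclidean distance to the class center;
      - the second loss is taken as standard cross-entropy over the c labels.
    features:      (B, n) feature vectors v output by model E
    labels:        (B,)   class-label indices of the batch
    class_centers: (c, n) class-center feature vectors v_CF
    logits:        (B, c) classifier outputs of model C before softmax
    """
    centers = class_centers[labels]                              # v_CF of each sample's own label
    first_loss = ((features - centers) ** 2).sum(dim=1).mean()   # L_E: mean assumed D(v_j) over batch
    second_loss = F.cross_entropy(logits, labels)                # L_C: cross-entropy (assumed form)
    return first_loss + second_loss                              # formula (7)
```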
compared with the related art, the embodiment obtains the image samples of the living body and the non-living body containing the human face and the class labels of the image samples; the category labels include a plurality of category labels belonging to living bodies, and a plurality of category labels belonging to non-living bodies; taking an image sample as input and a feature vector of the image sample as output, and constructing a feature extraction model; taking a feature vector output by the feature extraction model as input, taking the probability that the feature vector belongs to each class of label as output, and constructing a classifier; and performing combined training on the feature extraction model and the classifier, wherein a loss function in the combined training is constructed on the basis of a first loss between a feature vector output by the feature extraction model and a class center feature vector of a class label to which the feature vector belongs and a second loss between a prediction class output by the classifier and the class label. The scheme is different from the traditional category labels with living bodies and non-living bodies as the image samples, and uses a plurality of category labels belonging to the living bodies and a plurality of category labels belonging to the non-living bodies as the category labels for the image samples, so that the subdivision conditions of the living bodies and the non-living bodies can be further mined. Meanwhile, when the feature extraction model and the classifier are jointly trained, a loss function of the joint training is constructed according to a first loss between the feature vector extracted by the feature extraction model and the class center feature vector of the class label to which the feature vector belongs and a second loss between the prediction class output by the classifier and the class label, so that the feature extraction capability and the classification prediction capability of the model can be greatly improved, the generalization capability of the trained model is further improved, and the judgment capability of the non-living body type with less or even no samples in the training set is remarkably improved.
Another embodiment of the present invention relates to a living body detection method implemented based on the above-described model training method. As shown in fig. 4, the living body detecting method includes the following steps.
Step 301: and processing the face image to be detected by adopting a feature extraction model and a classifier obtained by joint training of a model training method to obtain a feature vector of the face image and the probability of the feature vector belonging to each class of labels.
Specifically, the feature extraction model E obtained by training with the model training method is used to perform feature extraction on the face image to be detected, so as to obtain a feature vector v corresponding to the face image. And classifying the feature vector v corresponding to the face image to be detected output by the feature extraction model E by using the classifier C obtained by training by using the model training method to obtain the probability p that the feature vector belongs to each class of label.
Step 302: and determining a first probability that the face image belongs to the living body based on the cosine value of an included angle between the feature vector and the class center feature vector of each class of label belonging to the living body.
Specifically, the cosine of the angle between the feature vector v obtained by feature extraction and the class-center feature vector v_CF^i of each class label belonging to living bodies is computed, where v_CF^i denotes the class-center feature vector of the i-th class label belonging to living bodies, that is, the i-th label among the living-body class labels {100, 101, 102, 103, 104}.
The first probability p_E is then calculated according to formula (8), in which max_i denotes taking the maximum of the corresponding cosine values over the plurality of class labels belonging to living bodies.
In other words, after the cosine of the angle between the feature vector v of the face image to be detected and the class-center feature vector of each class label belonging to living bodies is obtained, the first probability p_E can be obtained according to formula (8).
Step 303: and determining a second probability that the face image belongs to the living body based on the probability that the feature vector belongs to each class label of the living body.
Specifically, after the probability p_i that the feature vector belongs to each class label of the living body is obtained, the second probability p_C can be calculated according to the following formula (9):
p_C = Σ p_i ……………………… (9)
wherein the i-th class label ranges over the living-body class labels {100, 101, 102, 103, 104}.
Step 304: and determining the final probability that the face image belongs to the living body based on the first probability and the second probability.
Specifically, after the first probability p_E and the second probability p_C are obtained, the final probability P can be calculated according to the following formula (10):
P = d × p_E + (1 - d) × p_C ……………………… (10)
where d is a hyper-parameter, empirically set to 0.1.
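As a non-authoritative sketch of steps 301 to 304, the following Python fragment combines the two probabilities; it assumes formula (8) reads as the maximum cosine similarity over the living-body class centers (the formula itself is given only as an image), and the function and variable names are illustrative.

```python
import torch.nn.functional as F

def live_probability(feature, probs, live_centers, live_label_ids, d=0.1):
    """Hedged sketch of steps 302-304.

    Assumption: formula (8), shown only as an image, is read here as
    p_E = max_i cos(v, v_CF^i) over the living-body class labels; formulas
    (9) and (10) follow the text directly. All names are illustrative.
    feature:        (n,)   feature vector v of the face image to be detected
    probs:          (c,)   classifier output p over all class labels
    live_centers:   (k, n) class-center feature vectors of the living-body labels
    live_label_ids: index list of the living-body labels within probs
    """
    cos = F.cosine_similarity(feature.unsqueeze(0), live_centers, dim=1)
    p_e = cos.max().item()                      # first probability (formula (8), as assumed)
    p_c = probs[live_label_ids].sum().item()    # second probability (formula (9))
    return d * p_e + (1.0 - d) * p_c            # final probability (formula (10))
```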
Compared with the prior art, the embodiment of the invention processes the face image to be detected through the feature extraction model and the classifier obtained by the combined training of the model training method to obtain the feature vector of the face image and the probability of the feature vector belonging to each class of labels; determining a first probability that the face image belongs to the living body based on cosine values of included angles between the feature vectors and class center feature vectors of various classes of labels belonging to the living body; determining a second probability that the face image belongs to the living body based on the probability that the feature vector belongs to each class label of the living body; and determining the final probability that the face image belongs to the living body based on the first probability and the second probability.
In the scheme, the adopted feature extraction model and the classifier are obtained by joint training of the image samples marked by the plurality of category labels belonging to the living body and the plurality of category labels belonging to the non-living body, so that the subdivision conditions of the living body and the non-living body can be further detected. Meanwhile, when the feature extraction model and the classifier are jointly trained, a loss function of the joint training is constructed according to a first loss between the feature vector extracted by the feature extraction model and the class center feature vector of the class label to which the feature vector belongs and a second loss between the prediction class output by the classifier and the class label, so that the feature extraction capability and the classification prediction capability of the model can be greatly improved, the generalization capability of the trained model is further improved, and the judgment capability of the non-living body type with less or even no samples in the training set is remarkably improved. On the basis, when the living body detection is carried out, the probability that the face to be detected belongs to the living body is judged together based on the first probability obtained by the feature extraction model and the second probability obtained by the classifier, so that the accuracy of the living body detection is improved.
Another embodiment of the invention relates to an electronic device, as shown in FIG. 5, comprising at least one processor 402; and a memory 401 communicatively coupled to the at least one processor 402; wherein the memory 401 stores instructions executable by the at least one processor 402, the instructions being executable by the at least one processor 402 to enable the at least one processor 402 to perform any of the method embodiments described above.
Where the memory 401 and the processor 402 are coupled by a bus, which may include any number of interconnected buses and bridges that couple one or more of the various circuits of the processor 402 and the memory 401 together. The bus may also connect various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor 402 is transmitted over a wireless medium through an antenna, which further receives the data and transmits the data to the processor 402.
The processor 402 is responsible for managing the bus and general processing and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And memory 401 may be used to store data used by processor 402 in performing operations.
Another embodiment of the present invention relates to a computer-readable storage medium storing a computer program. The computer program realizes any of the above-described method embodiments when executed by a processor.
That is, as can be understood by those skilled in the art, all or part of the steps of the methods in the embodiments described above may be implemented by a program instructing the relevant hardware; the program is stored in a storage medium and includes several instructions for causing a device (such as a single-chip microcomputer or a chip) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the invention, and that various changes in form and details may be made therein without departing from the spirit and scope of the invention in practice.

Claims (10)

1. A method of model training, comprising:
acquiring image samples of living bodies and non-living bodies containing human faces and class labels of the image samples; the category labels include a plurality of category labels belonging to living bodies, and a plurality of category labels belonging to non-living bodies;
taking the image sample as input and the feature vector of the image sample as output to construct a feature extraction model;
taking a feature vector output by the feature extraction model as input, and taking the probability that the feature vector belongs to each class label as output to construct a classifier;
and performing combined training on the feature extraction model and the classifier, wherein a loss function in the combined training is constructed on the basis of a first loss between a feature vector output by the feature extraction model and a class center feature vector of a class label to which the feature vector belongs and a second loss between a prediction class output by the classifier and the class label.
2. The method of claim 1, wherein the obtaining of image samples of living and non-living subjects including a human face and class labels of the image samples comprises:
acquiring original images of a living body and a non-living body containing a human face;
labeling an original image belonging to a living body based on a plurality of category labels predefined according to the age group of the living body;
labeling an original image belonging to a non-living body based on a plurality of category labels predefined according to non-living body materials;
extracting a specified number of original images from the original images as the image samples;
and the original images with the specified number cover all the class labels, and the number of the original images corresponding to each class label is the same.
3. The method of claim 2, wherein the extracting a specified number of original images from the original images as the image samples comprises:
randomly taking m original images from the original images aiming at the original images of any category of labels, and judging the magnitude relation between m and a quotient value obtained by dividing the designated number by the total category label number;
if the m is larger than the quotient value, deleting partial original images from the m original images to enable the number of the remaining original images to be equal to the quotient value;
if m is smaller than the quotient value, randomly taking part of original images from the original images of the class labels again to enable the total number of the original images to be equal to the quotient value;
and taking the M original images of all the selected categories as M image samples to randomly disorder the sequence, and averagely dividing the M original images into M/B batches, wherein B is the batch size.
4. The method of claim 1, wherein the first loss is constructed by:
calculating, by formula (1), the distance D(v) between the feature vector v and the class-center feature vector v_CF of the class label to which the feature vector belongs;
calculating the first loss by formula (5);
wherein W_E denotes the trainable parameters of the feature extraction model E, L_E(W_E) is the first loss, B is the batch size of the image samples, and D(v_j) is the distance corresponding to the j-th image sample v_j in a batch of image samples.
5. The method of claim 4, wherein before calculating, by formula (1), the distance D(v) between the feature vector v and the class-center feature vector v_CF of the class label to which the feature vector belongs, the method comprises:
updating each class-center feature vector v_CF used in the current training period by formulas (2) and (3);
wherein v_CF is the updated class-center feature vector, the class-center feature vector before updating is the one used in the previous training period, a is a hyper-parameter, b is the number of image samples under the same class label, and e_k is the vector difference between the feature vector v_k of the k-th image sample under the same class label and the corresponding class-center feature vector before updating.
6. The method of claim 4, wherein the second loss is constructed by:
calculating the second loss by formula (6);
wherein W_C denotes the trainable parameters of the classifier C, L_C(W_C) is the second loss, B is the batch size of the image samples, c is the total number of class labels, y_{j,i} is the actual probability that the j-th image sample in a batch of image samples belongs to the i-th class label, and p_{j,i} is the predicted probability that the j-th image sample belongs to the i-th class label.
7. The method of claim 1, wherein the loss function in the joint training is constructed by the following formula:
loss = L_E(W_E) + L_C(W_C)
wherein loss is the loss value during the joint training, L_E(W_E) is the first loss, and L_C(W_C) is the second loss.
8. A method of in vivo detection, comprising:
processing a face image to be detected by using a feature extraction model and a classifier obtained by joint training according to the model training method of any one of claims 1 to 7 to obtain a feature vector of the face image and the probability of the feature vector belonging to each class of labels;
determining a first probability that the face image belongs to the living body based on a cosine value of an included angle between the feature vector and class center feature vectors of various classes of labels belonging to the living body;
determining a second probability that the face image belongs to the living body based on the probability that the feature vector belongs to each class label of the living body;
and determining the final probability that the face image belongs to the living body based on the first probability and the second probability.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the model training method of any one of claims 1 to 7 and the liveness detection method of claim 8.
10. A computer-readable storage medium, storing a computer program, wherein the computer program, when executed by a processor, implements the model training method according to any one of claims 1 to 7 and the in-vivo detection method according to claim 8.
CN202111463661.4A 2021-12-02 2021-12-02 Model training method, living body detection method, electronic device, and storage medium Active CN114299567B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111463661.4A CN114299567B (en) 2021-12-02 2021-12-02 Model training method, living body detection method, electronic device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111463661.4A CN114299567B (en) 2021-12-02 2021-12-02 Model training method, living body detection method, electronic device, and storage medium

Publications (2)

Publication Number Publication Date
CN114299567A true CN114299567A (en) 2022-04-08
CN114299567B CN114299567B (en) 2022-11-18

Family

ID=80965390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111463661.4A Active CN114299567B (en) 2021-12-02 2021-12-02 Model training method, living body detection method, electronic device, and storage medium

Country Status (1)

Country Link
CN (1) CN114299567B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115761411A (en) * 2022-11-24 2023-03-07 北京的卢铭视科技有限公司 Model training method, living body detection method, electronic device, and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112329696A (en) * 2020-11-18 2021-02-05 携程计算机技术(上海)有限公司 Face living body detection method, system, equipment and storage medium
CN113609944A (en) * 2021-07-27 2021-11-05 东南大学 Silent in-vivo detection method
CN113705383A (en) * 2021-08-12 2021-11-26 南京英诺森软件科技有限公司 Cross-age face recognition method and system based on ternary constraint

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112329696A (en) * 2020-11-18 2021-02-05 携程计算机技术(上海)有限公司 Face living body detection method, system, equipment and storage medium
CN113609944A (en) * 2021-07-27 2021-11-05 东南大学 Silent in-vivo detection method
CN113705383A (en) * 2021-08-12 2021-11-26 南京英诺森软件科技有限公司 Cross-age face recognition method and system based on ternary constraint

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XIANG XU ET AL.: "On Improving Temporal Consistency for Online Face Liveness Detection System", 《2021 ICCVW》 *
游锦成: "Research on Face Recognition Technology Based on Deep Learning", China Masters' Theses Full-text Database, Information Science and Technology *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115761411A (en) * 2022-11-24 2023-03-07 北京的卢铭视科技有限公司 Model training method, living body detection method, electronic device, and storage medium
CN115761411B (en) * 2022-11-24 2023-09-01 北京的卢铭视科技有限公司 Model training method, living body detection method, electronic device, and storage medium

Also Published As

Publication number Publication date
CN114299567B (en) 2022-11-18

Similar Documents

Publication Publication Date Title
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN107506761B (en) Brain image segmentation method and system based on significance learning convolutional neural network
CN108182441B (en) Parallel multichannel convolutional neural network, construction method and image feature extraction method
CN109558942B (en) Neural network migration method based on shallow learning
WO2019015246A1 (en) Image feature acquisition
CN108090472B (en) Pedestrian re-identification method and system based on multi-channel consistency characteristics
CN109376796A (en) Image classification method based on active semi-supervised learning
CN112308862A (en) Image semantic segmentation model training method, image semantic segmentation model training device, image semantic segmentation model segmentation method, image semantic segmentation model segmentation device and storage medium
CN109145944B (en) Classification method based on longitudinal three-dimensional image deep learning features
CN104268552B (en) One kind is based on the polygonal fine classification sorting technique of part
CN112633382A (en) Mutual-neighbor-based few-sample image classification method and system
Song et al. Hybrid deep autoencoder with Curvature Gaussian for detection of various types of cells in bone marrow trephine biopsy images
CN110705489B (en) Training method and device for target recognition network, computer equipment and storage medium
CN110414541B (en) Method, apparatus, and computer-readable storage medium for identifying an object
CN112819821A (en) Cell nucleus image detection method
CN114299567B (en) Model training method, living body detection method, electronic device, and storage medium
CN107729863B (en) Human finger vein recognition method
CN113870254A (en) Target object detection method and device, electronic equipment and storage medium
CN113313169A (en) Training material intelligent identification method, device and equipment based on deep learning
CN108460406B (en) Scene image attribute identification method based on minimum simplex fusion feature learning
CN116188428A (en) Bridging multi-source domain self-adaptive cross-domain histopathological image recognition method
CN114913404A (en) Model training method, face image living body detection method, electronic device and storage medium
CN112347879B (en) Theme mining and behavior analysis method for video moving target
CN115063374A (en) Model training method, face image quality scoring method, electronic device and storage medium
CN113420636A (en) Nematode identification method based on deep learning and threshold segmentation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
TA01 Transfer of patent application right

Effective date of registration: 20220627

Address after: 230091 room 611-217, R & D center building, China (Hefei) international intelligent voice Industrial Park, 3333 Xiyou Road, high tech Zone, Hefei, Anhui Province

Applicant after: Hefei lushenshi Technology Co.,Ltd.

Address before: 100083 room 3032, North B, bungalow, building 2, A5 Xueyuan Road, Haidian District, Beijing

Applicant before: BEIJING DILUSENSE TECHNOLOGY CO.,LTD.

Applicant before: Hefei lushenshi Technology Co.,Ltd.

GR01 Patent grant
GR01 Patent grant