CN114299567B - Model training method, living body detection method, electronic device, and storage medium

Model training method, living body detection method, electronic device, and storage medium

Info

Publication number
CN114299567B
Authority
CN
China
Prior art keywords
feature vector
class
image
living body
loss
Prior art date
Legal status
Active
Application number
CN202111463661.4A
Other languages
Chinese (zh)
Other versions
CN114299567A (en)
Inventor
王军华
付贤强
朱海涛
户磊
Current Assignee
Hefei Dilusense Technology Co Ltd
Original Assignee
Hefei Dilusense Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hefei Dilusense Technology Co Ltd
Priority to CN202111463661.4A
Publication of CN114299567A
Application granted
Publication of CN114299567B

Landscapes

  • Image Analysis (AREA)

Abstract

The embodiment of the invention relates to the field of image processing and discloses a model training method, a living body detection method, an electronic device, and a storage medium. Image samples of living bodies and non-living bodies containing human faces are acquired together with the class labels of the image samples, where the class labels include a plurality of class labels belonging to living bodies and a plurality of class labels belonging to non-living bodies. A feature extraction model is constructed that takes an image sample as input and outputs the feature vector of the image sample. A classifier is constructed that takes a feature vector output by the feature extraction model as input and outputs the probability that the feature vector belongs to each class label. The feature extraction model and the classifier are then jointly trained to obtain the trained feature extraction model and the trained classifier. Through a carefully designed data-labeling scheme and a new loss function, the scheme greatly improves the generalization capability of the trained model.

Description

Model training method, living body detection method, electronic device, and storage medium
Technical Field
The present invention relates to the field of image processing, and in particular to a model training method, a living body detection method, an electronic device, and a storage medium.
Background
Existing face living body detection algorithms are mainly based on deep learning: sample data of living bodies and prostheses is collected in advance, and a deep learning model is trained on this data so that it learns to extract features that distinguish living bodies from non-living bodies.
The conventional training method is as follows: the sample data is divided into two classes, living body and non-living body, and L = 2 different class labels, such as 0 and 1, are assigned as supervision information during training. Training runs for E rounds in total; in each round M samples are used, randomly divided into K batches of B samples each, and the K batches are then used in sequence to train the feature extraction model. For each sample, the features extracted by the feature extraction model are passed through a classifier to obtain the probabilities of belonging to the different labels, and the cross-entropy loss is used to measure the difference between the predicted probabilities and the ground truth so as to optimize the model parameters.
This training approach simply divides faces into living bodies and prostheses and ignores the finer subdivisions within each class: living bodies differ greatly across age groups and types of people, and prostheses differ because they come in many types, such as A4 paper, photos, mobile phone screen images, clothes, latex masks, silicone masks, and plastic masks. For some prosthesis types a large amount of training data is easy to acquire, while for others it is difficult. In such situations the conventional training method often performs poorly on prosthesis types with few samples, or on types that do not appear in the training data at all, so the trained face living body detection algorithm generalizes poorly to unknown prosthesis types.
Disclosure of Invention
The embodiment of the invention aims to provide a model training method, a living body detection method, an electronic device, and a storage medium that greatly improve the generalization capability of the trained model through a carefully designed data-labeling scheme and a new loss function.
In order to solve the above technical problem, an embodiment of the present invention provides a model training method, including:
acquiring image samples of living bodies and non-living bodies containing human faces and class labels of the image samples; the category labels include a plurality of category labels belonging to living bodies, and a plurality of category labels belonging to non-living bodies;
taking the image sample as input, and taking the feature vector of the image sample as output to construct a feature extraction model;
taking a feature vector output by the feature extraction model as input, and taking the probability that the feature vector belongs to each class label as output to construct a classifier;
and performing joint training on the feature extraction model and the classifier, wherein the loss function used during the joint training is constructed based on a first loss between a feature vector output by the feature extraction model and the class-center feature vector of the class label to which the feature vector belongs, and a second loss between the prediction class output by the classifier and the class label.
The embodiment of the invention also provides a living body detection method, which comprises the following steps:
processing the face image to be detected with the feature extraction model and the classifier obtained by the joint training of the above model training method to obtain the feature vector of the face image and the probability that the feature vector belongs to each class label;
determining a first probability that the face image belongs to the living body based on a cosine value of an included angle between the feature vector and class center feature vectors of various classes of labels belonging to the living body;
determining a second probability that the face image belongs to the living body based on the probability that the feature vector belongs to each class label of the living body;
and determining a final probability that the face image belongs to the living body based on the first probability and the second probability.
An embodiment of the present invention also provides an electronic device, including:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the model training method described above and the living body detection method described above.
Embodiments of the present invention also provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the model training method as described above, and the in-vivo detection method as described above.
Compared with the prior art, the embodiment of the invention acquires image samples of living bodies and non-living bodies containing human faces and the class labels of the image samples, where the class labels include a plurality of class labels belonging to living bodies and a plurality of class labels belonging to non-living bodies; constructs a feature extraction model that takes an image sample as input and outputs the feature vector of the image sample; constructs a classifier that takes a feature vector output by the feature extraction model as input and outputs the probability that the feature vector belongs to each class label; and performs joint training on the feature extraction model and the classifier, wherein the loss function used during the joint training is constructed based on a first loss between a feature vector output by the feature extraction model and the class-center feature vector of the class label to which the feature vector belongs, and a second loss between the prediction class output by the classifier and the class label. Unlike the traditional approach of using only the two classes of living body and non-living body as the class labels of the image samples, the scheme uses a plurality of class labels belonging to living bodies and a plurality of class labels belonging to non-living bodies, so that the finer subdivisions of living bodies and non-living bodies can be further exploited. Meanwhile, because the loss function of the joint training is constructed from the first loss between the feature vector extracted by the feature extraction model and the class-center feature vector of the class label to which it belongs and the second loss between the prediction class output by the classifier and the class label, the feature extraction capability and the classification capability of the model are greatly improved, the generalization capability of the trained model is further improved, and the ability to judge non-living body types with few or even no samples in the training set is significantly improved.
Drawings
FIG. 1 is a detailed flow diagram of a model training method according to an embodiment of the invention;
FIG. 2 is a detailed flow diagram of an image sample acquisition method according to an embodiment of the invention;
FIG. 3 is a detailed flowchart of a first loss acquisition method according to an embodiment of the present invention;
FIG. 4 is a detailed flowchart of a living body detection method according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments of the present invention will be described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the various embodiments in order to provide a better understanding of the present application; however, the technical solutions claimed in the present application can be implemented without these technical details, and various changes and modifications may be made on the basis of the following embodiments.
An embodiment of the present invention relates to a model training method, and as shown in fig. 1, the model training method provided in this embodiment includes the following steps.
Step 101: acquiring living body and non-living body image samples containing human faces and class labels of the image samples; the category labels include a plurality of category labels belonging to living bodies, and a plurality of category labels belonging to non-living bodies.
Specifically, original images of living bodies and non-living bodies containing human faces can be acquired by photographing or the like to form image samples for model training, and each image sample is labeled in advance with its category to obtain its class label. Unlike the training process of conventional living body detection algorithms, the class labels in this embodiment are not merely the two labels of living body and non-living body; instead, a plurality of finer class labels are defined under the living body and non-living body categories, that is, the class labels include a plurality of class labels belonging to living bodies and a plurality of class labels belonging to non-living bodies. The class labels belonging to living bodies may be divided according to one or more classification dimensions, the class labels belonging to non-living bodies may likewise be divided according to one or more classification dimensions, and the classification dimensions used for living bodies and for non-living bodies need not be identical.
In one example, as shown in FIG. 2, this step may be implemented by the following substeps.
Substep 1011: an original image of a living body and a non-living body including a face of a person is acquired.
Specifically, the original images of the living body and the non-living body including the face of a person can be acquired by photographing or the like.
Sub-step 1012: an original image belonging to a living body is labeled based on a plurality of category labels defined in advance by the age group of the living body.
Specifically, living bodies can be divided into 5 categories by age group, and a class label is set for each category, for example: ages 0 to 6 are labeled 100, 7 to 12 are labeled 101, 13 to 50 are labeled 102, 51 to 70 are labeled 103, and over 70 are labeled 104. Then, for an original image belonging to a living body, a class label is set according to the age group of the face in the original image.
Substep 1013: the original image belonging to the non-living body is labeled based on a plurality of category labels defined in advance by the material of the non-living body.
Specifically, non-living bodies (i.e., "prostheses") can be divided into a plurality of categories by material, and a class label is set for each category, for example: 2D prostheses are divided into 7 classes by material (the specific number depends on the actual materials): unknown material is 200, color A4 paper is 201, black-and-white A4 paper is 202, color photo is 203, black-and-white photo is 204, color coated paper is 205, and black-and-white coated paper is 206; 3D prostheses are divided into 5 classes by material (the specific number depends on the actual materials): unknown material is 300, plastic 3D prosthesis is 301, latex 3D prosthesis is 302, silicone 3D prosthesis is 303, and resin 3D prosthesis is 304. Then, for an original image belonging to a non-living body, a class label is set according to the material of the face in the original image.
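By way of illustration only, the example label assignments above can be encoded as follows; the dictionary and function names are hypothetical and simply mirror the age groups and materials listed in this embodiment.

```python
# Hypothetical encoding of the example class labels described above.
# Living-body labels are assigned by age group; prosthesis labels by material.
LIVE_AGE_LABELS = {            # age range (inclusive) -> class label
    (0, 6): 100,
    (7, 12): 101,
    (13, 50): 102,
    (51, 70): 103,
    (71, 200): 104,            # "over 70"
}

SPOOF_2D_LABELS = {            # 2D prosthesis material -> class label
    "unknown": 200, "color_a4": 201, "bw_a4": 202,
    "color_photo": 203, "bw_photo": 204,
    "color_coated_paper": 205, "bw_coated_paper": 206,
}

SPOOF_3D_LABELS = {            # 3D prosthesis material -> class label
    "unknown": 300, "plastic": 301, "latex": 302,
    "silicone": 303, "resin": 304,
}

LIVE_LABELS = set(LIVE_AGE_LABELS.values())   # {100, 101, 102, 103, 104}


def live_label_for_age(age: int) -> int:
    """Return the living-body class label for a face of the given age."""
    for (lo, hi), label in LIVE_AGE_LABELS.items():
        if lo <= age <= hi:
            return label
    raise ValueError(f"age {age} not covered by the label scheme")
```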
Substep 1014: extracting a specified number of original images from the original images as image samples; the original images with the specified number cover all the category labels, and the number of the original images corresponding to each category label is the same.
Specifically, after the original images are acquired and labeled with their corresponding class labels, the original images used for model training need to be extracted from them as image samples. In order to improve the generalization capability of the model, a specified number of images can be extracted from the original images such that the class labels of the extracted images cover all the predefined class labels and the number of original images corresponding to each class label is the same.
For example, the process of extracting the image sample may be implemented by the following steps.
Step one: for the original images of any class label, randomly take m original images, and compare m with the quotient obtained by dividing the specified number by the total number of class labels.
Specifically, assume that the number of image samples required for one round of training is M and that the original images cover c classes (and thus the total number of class labels is c). In order to improve the generalization capability of the model, the same number of images needs to be extracted for each class label, so the number of original images extracted for each class label should be M/c.
When extracting original images for any class label, m original images belonging to that class label are randomly taken from the original images, and then the magnitude relation between m and M/c is judged.
Step two: if m is larger than the quotient, delete some of the m original images so that the number of remaining original images equals the quotient.
Specifically, when m is greater than M/c, the number of original images taken for the current class label exceeds the required number; a portion (m − M/c) of the taken original images is deleted so that the number of remaining original images for the current class label equals M/c.
Step three: if m is smaller than the quotient, randomly take additional original images from the original images of that class label so that the total number of taken original images equals the quotient.
Specifically, when m is smaller than M/c, the number of original images taken for the current class label is less than the required number; an additional portion (M/c − m) of original images is taken again from (all of) the original images of that class label so that the total number of original images taken for the current class label equals M/c. The additional original images can be acquired by random sampling.
Step four: and taking the selected M original images of all categories as M image samples to randomly disorder the sequence, and averagely dividing the M original images into M/B batches, wherein B is the batch size.
Specifically, for each class label, extracting a specified number (M/c) of original images under the corresponding class label by adopting the method from the first step to the third step, thereby obtaining the total number of the original images of the specified number M as M image samples; and randomly disordering the extracted M image samples, and forming M/B batch samples by taking every B data as a batch, wherein B is the batch size. In the subsequent model training, different batches of image samples can be selected according to batches for model training, and the training process of each batch of image samples is used as a training period.
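The class-balanced sampling of steps one to four can be sketched as follows; this is a minimal illustration assuming the samples are available as (image_path, class_label) pairs and that M is divisible by both the number of class labels c and the batch size B. All names are illustrative.

```python
import random
from collections import defaultdict


def build_balanced_batches(samples, M, B, seed=0):
    """samples: list of (image_path, class_label) pairs.

    Draws M/c samples per class label (c = number of distinct labels),
    shuffles them, and splits the M samples into M/B batches of size B,
    mirroring steps one to four above.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item in samples:
        by_label[item[1]].append(item)

    c = len(by_label)            # total number of class labels
    per_class = M // c           # required number of images per label (M/c)

    selected = []
    for label, pool in by_label.items():
        if len(pool) >= per_class:
            # step two: more images than needed -> keep exactly M/c of them
            selected.extend(rng.sample(pool, per_class))
        else:
            # step three: too few images -> draw again (with repetition) up to M/c
            selected.extend(pool)
            selected.extend(rng.choice(pool) for _ in range(per_class - len(pool)))

    rng.shuffle(selected)        # step four: random shuffle
    return [selected[i:i + B] for i in range(0, len(selected), B)]
```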
Step 102: and taking the image sample as input, and taking the feature vector of the image sample as output to construct a feature extraction model.
Specifically, a conventional deep learning network E (referred to as "model E" for short) is constructed as the feature extraction model, and its trainable parameters are denoted W_E. The input of model E is an image sample containing a face, and the output is an n-dimensional feature vector v, where n is a hyperparameter set empirically, for example n = 128.
Step 103: and taking the feature vector output by the feature extraction model as input, and taking the probability that the feature vector belongs to each class label as output to construct a classifier.
Specifically, a conventional deep learning network C (referred to as "model C" for short) is constructed as the classifier, and its trainable parameters are denoted W_C. The input of model C is the n-dimensional feature vector v output by the feature extraction model of step 102 (model E), and the output is a c-dimensional vector p, where c equals the total number of class labels and p_{j,i} represents the probability that the feature vector corresponding to the jth image sample belongs to the ith class label.
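The embodiment does not prescribe particular architectures for model E and model C; the following sketch uses a small convolutional backbone and a single linear layer purely to illustrate the input and output shapes (an n = 128-dimensional feature vector v and a c-dimensional probability vector p). The layer choices and the value of NUM_CLASSES are assumptions.

```python
import torch
import torch.nn as nn

N_FEAT = 128          # n, dimensionality of the feature vector v
NUM_CLASSES = 17      # c, total class labels (e.g. 5 live + 7 + 5 spoof labels)


class FeatureExtractor(nn.Module):               # "model E", parameters W_E
    def __init__(self, n_feat=N_FEAT):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fc = nn.Linear(64, n_feat)

    def forward(self, images):                    # images: (B, 3, H, W)
        return self.fc(self.backbone(images))     # v: (B, n)


class Classifier(nn.Module):                      # "model C", parameters W_C
    def __init__(self, n_feat=N_FEAT, num_classes=NUM_CLASSES):
        super().__init__()
        self.fc = nn.Linear(n_feat, num_classes)

    def forward(self, v):                         # v: (B, n)
        return torch.softmax(self.fc(v), dim=1)   # p: (B, c), entries p[j, i]
```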
Step 104: and performing combined training on the feature extraction model and the classifier, wherein a loss function in the combined training is constructed on the basis of a first loss between a feature vector output by the feature extraction model and a class center feature vector of a class label to which the feature vector belongs and a second loss between a prediction class output by the classifier and the class label.
Specifically, the constructed feature extraction model (model E) and classifier (model C) are jointly trained on the image samples to obtain the trained feature extraction model and classifier. The loss function used during the joint training can be constructed based on a first loss between the feature vector output by the feature extraction model and the class-center feature vector of the class label to which the feature vector belongs, and a second loss between the prediction class output by the classifier and the class label.
The methods of constructing the first loss and the second loss will be described below, respectively.
As shown in fig. 3, the first loss construction process can be implemented as follows.
Step 201: calculate, by formula (1), the distance D(v) between a feature vector v and the class-center feature vector v_CF of the class label to which the feature vector belongs.
Specifically, a central feature of n = 128 dimensions (the dimensionality of the feature vector output by the feature extraction model), i.e., a class-center feature vector, may be defined for each class label; the class-center feature vector of the ith class label may be denoted v_CF^i, and its initial value is random. If the number of image samples in a batch is B, the feature vectors output by the feature extraction model satisfy v ∈ R^(n×B), where R denotes the real numbers.
The distance D(v) between each feature vector v and the class-center feature vector v_CF of the class label to which it belongs is then obtained by formula (1).
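Because formula (1) is reproduced only as an image in the source text, its exact form is not available here; the sketch below assumes a cosine-based distance D(v) = 1 − cos(v, v_CF), which is one plausible choice consistent with the cosine similarity used later at detection time.

```python
import torch
import torch.nn.functional as F


def center_distance(v, v_cf):
    """D(v) between feature vectors v (B, n) and their class-center
    feature vectors v_cf (B, n), one center row per sample.

    Assumption: formula (1) is taken as 1 - cosine similarity; the
    original equation is available only as an image."""
    return 1.0 - F.cosine_similarity(v, v_cf, dim=1)   # shape (B,)
```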
In one example, before performing step 201, the class-center feature vectors v_CF can be updated as follows.
The class-center feature vectors v_CF used in the current training period are updated by formulas (2) and (3), where v_CF is the updated class-center feature vector, v'_CF is the class-center feature vector before updating, a is a hyper-parameter, b is the number of image samples under the same class label, and e_k is the vector difference between the feature vector v_k of the kth image sample under that class label and the corresponding class-center feature vector v'_CF.
Specifically, suppose that the B image samples used in the current training period (batch) involve ct class labels (ct ≤ c), where c is the total number of class labels. For the image samples under each of these class labels, the new class-center feature vector v_CF under that class label is updated by formula (2), where v'_CF is the old class-center feature vector under that class label, i.e., the class-center feature vector used in the previous training period.
The vector difference e_k between the feature vector v_k of the kth image sample under the same class label and the corresponding class-center feature vector v'_CF is computed by formula (4).
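Formulas (2) to (4) are likewise available only as images; the following sketch implements one exponential-moving-average style update consistent with the quantities described above (old center v'_CF, hyper-parameter a, per-class sample count b, differences e_k). The exact update in the embodiment may differ.

```python
import torch


def update_class_centers(centers, v, labels, a=0.5):
    """centers: dict {class_index: tensor of shape (n,)}, the v_CF values,
    assumed pre-initialised with a random (n,)-tensor per class index.
    v: (B, n) batch of feature vectors; labels: (B,) their class indices.

    Assumed update (EMA-style): v_CF <- v'_CF + a * mean_k(e_k),
    with e_k = v_k - v'_CF, applied per class label present in the batch."""
    with torch.no_grad():
        for label in labels.unique().tolist():
            mask = labels == label
            v_k = v[mask]                    # the b samples of this label
            old = centers[label]             # v'_CF
            e = v_k - old                    # e_k, the per-sample differences
            centers[label] = old + a * e.mean(dim=0)
    return centers
```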
Step 202: the first loss is calculated by the following formula (5).
Figure BDA00033895352400000510
Wherein, W E Extracting trainable parameters, L, of the model E for the features E (W E ) For the first loss, B is the batch size of the image sample, D (v) j ) Is the jth image sample v in a batch of image samples j The corresponding distance.
Specifically, a feature vector v and a class center feature vector v of a class label to which the feature vector belongs are obtained CF After the distance D (v) therebetween, the first loss can be calculated by equation (5). First loss may be to trainable parameters W of the feature extraction model E Constraint is carried out, so that the similarity between a certain feature vector and the class center feature vector under the class label of the feature vector is high, and the similarity between the feature vector and the class center feature vector of the class of a non-belonging label is low, and the trainable parameters W in the model E are trained E
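Reading formula (5) as the average of D(v_j) over a batch of size B, a minimal sketch of the first loss is given below; the averaging form and the cosine-based distance are assumptions, since the equation itself is not reproduced.

```python
import torch
import torch.nn.functional as F


def first_loss(v, labels, centers):
    """L_E(W_E): mean over the batch of D(v_j), the distance between each
    feature vector and the class-center feature vector of its own class label.

    Assumed form of formula (5): (1/B) * sum_j D(v_j), with D taken as the
    cosine-based distance assumed above."""
    v_cf = torch.stack([centers[int(y)] for y in labels])        # (B, n)
    d = 1.0 - F.cosine_similarity(v, v_cf, dim=1)                # D(v_j), (B,)
    return d.mean()
```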
The second loss can be constructed as follows.
The second loss is calculated by formula (6), where W_C denotes the trainable parameters of the classifier C, L_C(W_C) is the second loss, B is the batch size of the image samples, c is the total number of class labels, y_{j,i} is the actual probability that the jth image sample in a batch of image samples belongs to the ith class label, and p_{j,i} is the predicted probability that the jth image sample belongs to the ith class label.
The second loss constrains the trainable parameters W_C of the classifier C so that a feature vector has a high predicted probability for the class label to which it belongs and a low predicted probability for the class labels to which it does not belong, thereby training the trainable parameters W_C of model C.
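The quantities in formula (6) (actual probabilities y_{j,i}, predicted probabilities p_{j,i}, batch size B, c classes) match a standard cross-entropy, and the sketch below assumes exactly that form.

```python
import torch


def second_loss(p, y):
    """L_C(W_C): assumed cross-entropy between predicted probabilities
    p (B, c) and actual (one-hot or soft) label probabilities y (B, c),
    i.e. -(1/B) * sum_j sum_i y[j, i] * log(p[j, i])."""
    return -(y * torch.log(p.clamp_min(1e-12))).sum(dim=1).mean()
```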
On this basis, when the feature extraction model and the classifier are jointly trained, the loss function used in the joint training can be constructed by formula (7):
loss = L_E(W_E) + L_C(W_C)    (7)
where loss is the loss value during the joint training, L_E(W_E) is the first loss, and L_C(W_C) is the second loss.
Specifically, after the joint-training loss is computed by formula (7), the parameters W_E and W_C of model E and model C are optimized according to conventional deep learning optimization methods, i.e., W_E and W_C are updated so as to minimize loss.
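Putting the pieces together, one joint-training iteration under formula (7) could look like the following sketch, which reuses the helpers and modules sketched earlier in this description; class labels are assumed to be integer indices in [0, c), and the optimizer is assumed to cover the parameters of both model E and model C.

```python
import torch


def train_step(model_e, model_c, optimizer, images, labels, centers, num_classes):
    """One joint-training step: loss = L_E(W_E) + L_C(W_C), formula (7).

    labels: (B,) long tensor of class indices in [0, num_classes)."""
    v = model_e(images)                                    # feature vectors (B, n)
    # update the class centers first, without backpropagating through them
    centers = update_class_centers(centers, v.detach(), labels)
    p = model_c(v)                                         # class probabilities (B, c)
    y = torch.nn.functional.one_hot(labels, num_classes).float()
    loss = first_loss(v, labels, centers) + second_loss(p, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), centers
```

For example, the optimizer could be built over both models as torch.optim.Adam(list(model_e.parameters()) + list(model_c.parameters())).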
Compared with the related art, this embodiment acquires image samples of living bodies and non-living bodies containing human faces together with their class labels, where the class labels include a plurality of class labels belonging to living bodies and a plurality of class labels belonging to non-living bodies; constructs a feature extraction model that takes an image sample as input and outputs its feature vector; constructs a classifier that takes a feature vector output by the feature extraction model as input and outputs the probability that the feature vector belongs to each class label; and jointly trains the feature extraction model and the classifier, with the loss function of the joint training constructed from a first loss between a feature vector output by the feature extraction model and the class-center feature vector of the class label to which it belongs, and a second loss between the prediction class output by the classifier and the class label. Unlike the traditional approach of using only living body and non-living body as the class labels of the image samples, the scheme uses a plurality of class labels belonging to living bodies and a plurality of class labels belonging to non-living bodies, so that the finer subdivisions of living bodies and non-living bodies can be further exploited. Meanwhile, because the loss function of the joint training combines the first loss and the second loss, the feature extraction capability and the classification capability of the model are greatly improved, the generalization capability of the trained model is further improved, and the ability to judge non-living body types with few or even no samples in the training set is significantly improved.
Another embodiment of the present invention relates to a living body detection method, which is implemented based on the above-described model training method. As shown in fig. 4, the living body detecting method includes the following steps.
Step 301: and processing the face image to be detected by adopting a feature extraction model and a classifier obtained by joint training of a model training method to obtain a feature vector of the face image and the probability of the feature vector belonging to each class of labels.
Specifically, the feature extraction model E obtained by training with the above model training method is used to extract features from the face image to be detected, yielding the feature vector v corresponding to the face image. The classifier C obtained by training with the above model training method is then used to classify the feature vector v output by the feature extraction model E, yielding the probability p that the feature vector belongs to each class label.
Step 302: and determining a first probability that the face image belongs to the living body based on cosine values of included angles between the feature vectors and class center feature vectors of various classes of labels belonging to the living body.
Specifically, the feature vector v obtained by feature extraction and the class-center feature vectors of the class labels belonging to living bodies are processed to obtain the cosines of the included angles cos(v, v_CF^i), where v_CF^i is the class-center feature vector of the ith class label belonging to a living body, namely an ith class label in the set of class labels {100, 101, 102, 103, 104}.
Then, the first probability p_E is calculated according to formula (8), where max_i denotes taking, among the class labels belonging to living bodies, the maximum of cos(v, v_CF^i).
Specifically, after the cosines cos(v, v_CF^i) between the feature vector v of the face image to be detected and the class-center feature vectors of the class labels belonging to living bodies are obtained, the first probability p_E can be obtained according to formula (8).
Step 303: and determining a second probability that the face image belongs to the living body based on the probability that the feature vector belongs to each class label of the living body.
Specifically, after the probabilities p_i that the feature vector belongs to each class label of living bodies are obtained, the second probability p_C can be calculated according to the following formula (9):
p_C = Σ p_i    (9)
where the ith class label belongs to the class labels {100, 101, 102, 103, 104}.
Step 304: and determining the final probability that the face image belongs to the living body based on the first probability and the second probability.
Specifically, after the first probability p_E and the second probability p_C are obtained, the final probability P can be calculated according to the following formula (10):
P = d × p_E + (1 − d) × p_C    (10)
where d is a hyperparameter, empirically set to 0.1.
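A sketch of steps 301 to 304 at detection time is given below. It assumes that formula (8) is simply the maximum cosine similarity against the live-class centers and that class labels are integer indices; both assumptions are illustrative, since formula (8) is reproduced only as an image.

```python
import torch
import torch.nn.functional as F


def liveness_probability(model_e, model_c, image, centers, live_labels, d=0.1):
    """Return the final probability P that the face image is a living body.

    Assumptions: p_E = max_i cos(v, v_CF^i) over live class indices i
    (the exact mapping of formula (8) is not reproduced), and
    p_C = sum of the predicted probabilities of the live class labels."""
    with torch.no_grad():
        v = model_e(image.unsqueeze(0))                   # (1, n)
        p = model_c(v)[0]                                 # (c,)
        cosines = [F.cosine_similarity(v, centers[i].unsqueeze(0), dim=1).item()
                   for i in live_labels]
        p_e = max(cosines)                                # formula (8), assumed
        p_c = sum(p[i].item() for i in live_labels)       # formula (9)
        return d * p_e + (1 - d) * p_c                    # formula (10), d = 0.1
```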
Compared with the prior art, the embodiment of the invention processes the face image to be detected with the feature extraction model and the classifier obtained by the joint training of the above model training method to obtain the feature vector of the face image and the probability that the feature vector belongs to each class label; determines a first probability that the face image belongs to a living body based on the cosines of the included angles between the feature vector and the class-center feature vectors of the class labels belonging to living bodies; determines a second probability that the face image belongs to a living body based on the probability that the feature vector belongs to each class label of living bodies; and determines the final probability that the face image belongs to a living body based on the first probability and the second probability.
In this scheme, the feature extraction model and the classifier are obtained by joint training on image samples labeled with a plurality of class labels belonging to living bodies and a plurality of class labels belonging to non-living bodies, so the finer subdivisions of living bodies and non-living bodies can be detected. Meanwhile, during the joint training, the loss function is constructed from a first loss between the feature vector extracted by the feature extraction model and the class-center feature vector of the class label to which it belongs and a second loss between the prediction class output by the classifier and the class label, which greatly improves the feature extraction capability and classification capability of the model, further improves the generalization capability of the trained model, and significantly improves the ability to judge non-living body types with few or even no samples in the training set. On this basis, during living body detection, the probability that the face to be detected belongs to a living body is judged jointly from the first probability obtained via the feature extraction model and the second probability obtained via the classifier, which improves the accuracy of living body detection.
Another embodiment of the invention relates to an electronic device, as shown in FIG. 5, comprising at least one processor 402; and a memory 401 communicatively coupled to the at least one processor 402; the memory 401 stores instructions executable by the at least one processor 402, and the instructions are executed by the at least one processor 402 to enable the at least one processor 402 to perform any one of the method embodiments described above.
Where the memory 401 and the processor 402 are coupled by a bus, which may include any number of interconnected buses and bridges that couple one or more of the various circuits of the processor 402 and the memory 401 together. The bus may also connect various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor 402 is transmitted over a wireless medium through an antenna, which further receives the data and transmits the data to the processor 402.
The processor 402 is responsible for managing the bus and general processing and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And memory 401 may be used to store data used by processor 402 in performing operations.
Another embodiment of the present invention relates to a computer-readable storage medium storing a computer program. The computer program realizes any of the above-described method embodiments when executed by a processor.
That is, as can be understood by those skilled in the art, all or part of the steps in the method for implementing the embodiments described above may be implemented by a program instructing related hardware, where the program is stored in a storage medium and includes several instructions to enable a device (which may be a single chip, a chip, or the like) or a processor (processor) to execute all or part of the steps of the method described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, and various media capable of storing program codes.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the invention, and that various changes in form and details may be made therein without departing from the spirit and scope of the invention in practice.

Claims (8)

1. A method of model training, comprising:
acquiring image samples of living bodies and non-living bodies containing human faces and class labels of the image samples; the category labels include a plurality of category labels belonging to living bodies, and a plurality of category labels belonging to non-living bodies;
taking the image sample as input, and taking the feature vector of the image sample as output to construct a feature extraction model;
taking a feature vector output by the feature extraction model as input, and taking the probability that the feature vector belongs to each class label as output to construct a classifier;
performing joint training on the feature extraction model and the classifier, wherein a loss function in the joint training is constructed on the basis of a first loss between a feature vector output by the feature extraction model and a class center feature vector of a class label to which the feature vector belongs and a second loss between a prediction class output by the classifier and the class label;
the first loss is constructed by:
calculating a distance D(v) between a feature vector v and the class-center feature vector v_CF of the class label to which the feature vector belongs;
calculating the first loss from these distances;
wherein W_E denotes trainable parameters of the feature extraction model E, L_E(W_E) is the first loss, B is the batch size of the image samples, and D(v_j) is the distance corresponding to the jth image sample v_j in a batch of image samples;
before the first loss is constructed, the method further comprises:
updating the class-center feature vectors v_CF used in the current training period;
wherein v_CF is the updated class-center feature vector, v'_CF is the class-center feature vector before updating, a is a hyper-parameter, b is the number of image samples under the same class label, and e_k is the vector difference between the feature vector v_k of the kth image sample under the same class label and the corresponding class-center feature vector v'_CF.
2. The method of claim 1, wherein the obtaining of image samples of living and non-living subjects including a human face and class labels of the image samples comprises:
acquiring original images of a living body and a non-living body containing a human face;
labeling an original image belonging to a living body based on a plurality of category labels predefined according to the age group of the living body;
labeling an original image belonging to a non-living body based on a plurality of category labels predefined according to non-living body materials;
extracting a specified number of original images from the original images as the image samples;
and the original images with the specified number cover all the class labels, and the number of the original images corresponding to each class label is the same.
3. The method according to claim 2, wherein the extracting a specified number of original images from the original images as the image samples comprises:
randomly taking m original images from the original images aiming at the original images of any category of labels, and judging the magnitude relation between m and a quotient value obtained by dividing the designated number by the total category label number;
if the m is larger than the quotient value, deleting partial original images from the m original images to enable the number of the remaining original images to be equal to the quotient value;
if m is smaller than the quotient value, randomly taking part of original images from the original images of the class labels again to enable the total number of the original images to be equal to the quotient value;
and taking the selected M original images of all classes as M image samples, randomly shuffling them, and evenly dividing them into M/B batches, wherein B is the batch size.
4. The method of claim 1, wherein the second loss is constructed by:
calculating the second loss from the predicted and actual class-label probabilities;
wherein W_C denotes trainable parameters of the classifier C, L_C(W_C) is the second loss, B is the batch size of the image samples, c is the total number of class labels, y_{j,i} is the actual probability that the jth image sample in a batch of image samples belongs to the ith class label, and p_{j,i} is the predicted probability that the jth image sample belongs to the ith class label.
5. The method of claim 1, wherein the loss function in the joint training is constructed by the following formula:
loss = L_E(W_E) + L_C(W_C)
wherein loss is the loss value during the joint training, L_E(W_E) is the first loss, and L_C(W_C) is the second loss.
6. A living body detection method, comprising:
processing a face image to be detected by adopting a feature extraction model and a classifier obtained by the combined training of the model training method according to any one of claims 1 to 5 to obtain a feature vector of the face image and the probability of the feature vector belonging to each class of labels;
determining a first probability that the face image belongs to the living body based on a cosine value of an included angle between the feature vector and class center feature vectors of various classes of labels belonging to the living body;
determining a second probability that the face image belongs to the living body based on the probability that the feature vector belongs to each class label of the living body;
and determining a final probability that the face image belongs to the living body based on the first probability and the second probability.
7. An electronic device, comprising:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the model training method of any one of claims 1 to 5 and the living body detection method of claim 6.
8. A computer-readable storage medium, storing a computer program, wherein the computer program, when executed by a processor, implements the model training method according to any one of claims 1 to 5 and the in-vivo detection method according to claim 6.
CN202111463661.4A 2021-12-02 2021-12-02 Model training method, living body detection method, electronic device, and storage medium Active CN114299567B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111463661.4A CN114299567B (en) 2021-12-02 2021-12-02 Model training method, living body detection method, electronic device, and storage medium


Publications (2)

Publication Number Publication Date
CN114299567A CN114299567A (en) 2022-04-08
CN114299567B true CN114299567B (en) 2022-11-18

Family

ID=80965390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111463661.4A Active CN114299567B (en) 2021-12-02 2021-12-02 Model training method, living body detection method, electronic device, and storage medium

Country Status (1)

Country Link
CN (1) CN114299567B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115761411B (en) * 2022-11-24 2023-09-01 北京的卢铭视科技有限公司 Model training method, living body detection method, electronic device, and storage medium
CN116077066A (en) * 2023-02-10 2023-05-09 北京安芯测科技有限公司 Training method and device for electrocardiosignal classification model and electronic equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113705383A (en) * 2021-08-12 2021-11-26 南京英诺森软件科技有限公司 Cross-age face recognition method and system based on ternary constraint

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112329696A (en) * 2020-11-18 2021-02-05 携程计算机技术(上海)有限公司 Face living body detection method, system, equipment and storage medium
CN113609944A (en) * 2021-07-27 2021-11-05 东南大学 Silent in-vivo detection method

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113705383A (en) * 2021-08-12 2021-11-26 南京英诺森软件科技有限公司 Cross-age face recognition method and system based on ternary constraint

Also Published As

Publication number Publication date
CN114299567A (en) 2022-04-08


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220627

Address after: 230091 room 611-217, R & D center building, China (Hefei) international intelligent voice Industrial Park, 3333 Xiyou Road, high tech Zone, Hefei, Anhui Province

Applicant after: Hefei lushenshi Technology Co.,Ltd.

Address before: 100083 room 3032, North B, bungalow, building 2, A5 Xueyuan Road, Haidian District, Beijing

Applicant before: BEIJING DILUSENSE TECHNOLOGY CO.,LTD.

Applicant before: Hefei lushenshi Technology Co.,Ltd.

GR01 Patent grant
GR01 Patent grant