CN111881815A - Human face living body detection method based on multi-model feature migration - Google Patents

Human face living body detection method based on multi-model feature migration

Info

Publication number
CN111881815A
Authority
CN
China
Prior art keywords
model
probability
living body
yuv
face
Prior art date
Legal status
Pending
Application number
CN202010728371.7A
Other languages
Chinese (zh)
Inventor
凌康杰
王祥雪
林焕凯
刘双广
Current Assignee
Gosuncn Technology Group Co Ltd
Original Assignee
Gosuncn Technology Group Co Ltd
Priority date
Filing date
Publication date
Application filed by Gosuncn Technology Group Co Ltd filed Critical Gosuncn Technology Group Co Ltd
Priority to CN202010728371.7A priority Critical patent/CN111881815A/en
Publication of CN111881815A publication Critical patent/CN111881815A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/40 Spoof detection, e.g. liveness detection
    • G06V 40/45 Detection of the body part being alive
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation
    • G06V 40/162 Detection; Localisation; Normalisation using pixel segmentation or colour matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a human face living body detection method based on multi-model feature migration. In the training stage, visible light images from open-source or private data sets are fused and, after face detection, alignment and cropping, an RGB model and a YUV model are trained simultaneously until the models converge. In the prediction stage, the collected visible light image is input into the trained RGB model and the trained YUV model respectively to obtain the results of the two models; the final score is obtained through a model score fusion strategy, and the living body detection result is then determined from this score. The method has good generalization performance and high precision, and is suitable for industrial deployment.

Description

Human face living body detection method based on multi-model feature migration
Technical Field
The invention belongs to the technical fields of computer vision, pattern recognition, machine learning, convolutional neural networks and face recognition, and particularly relates to a human face living body detection method based on multi-model feature migration.
Background
Face recognition technology has been widely applied in fields such as security monitoring, human-computer interaction, electronic commerce and mobile payment, and face living body detection is the first gate of face recognition and a prerequisite for applying face recognition technology. The main technical solutions for current living body detection are interactive living body detection, multi-source information fusion living body detection and static image living body detection. Interactive living body detection requires user cooperation; it is inconvenient, involves cumbersome steps, is easily resisted by users and has low efficiency. Multi-source information fusion living body detection usually requires adding a depth camera, an infrared camera, a 3D camera, a microphone and the like, which not only increases hardware cost but also brings a large amount of complex 3D modeling calculation. Static image living body detection is a low-cost and fast detection method, but because current data sets are insufficient and model construction methods are inefficient, model generalization is often inadequate.
In current static image living body detection methods, a living body detection model is often built for a limited number of attack types in a single scene represented by a single data set. Such a method can meet ideal precision requirements in the laboratory, but actual scenes in industrial and practical use are far more complex: the diversity of scenes brings diverse illumination and backgrounds, and diverse attack types and attack means also exist, which poses serious challenges to the practical deployment of living body detection. In common living body attacks, the printer type and printing quality, the type, resolution and size of different display screens, and even the angle, distance and brightness of the attack and whether the display device carries a screen protector all affect living body detection. Because attack types are diverse and the image distribution within the same kind of attack varies widely, model generalization is low in real scenes. Aiming at the insufficient generalization of traditional living body detection methods, a living body detection method based on multi-model migration is proposed: a heterogeneous data set is constructed, and living body training and prediction are carried out by a multi-model feature migration and fusion method under multiple color spaces.
Patent CN109840467A uses a generative adversarial network (GAN) to generate a new training set of negative samples (the negative samples are attack images); its purpose is to address the shortage of negative samples when training a deep learning network. However, on a given data set, the GAN method can only learn the limited sample probability distribution of that data set. Therefore, for entirely new attack scenes and means, the image data generated by the network has limited representativeness and insufficient generalization capability.
Patent CN110472519A adopts a multi-model fusion method for living body detection, but its model requires natural light images and infrared images to be input to the network simultaneously as training data. In that method, infrared image acquisition is complicated and requires an infrared camera, which increases hardware cost and is not conducive to the rapid upgrade of existing face detection and recognition equipment.
In the prior art, because training data distributions are narrow and model construction methods are inefficient, static living body detection models often generalize poorly and generally cannot be used in industrial, real-world production scenes.
Disclosure of Invention
Aiming at the insufficient generalization of current static living body detection models, the invention provides a human face living body detection method based on multi-model feature migration.
The invention is realized by the following technical scheme:
A human face living body detection method based on multi-model feature migration comprises the following steps:
S1, acquiring a visible light image, wherein the visible light image contains a human face;
S2, identifying the visible light image by using a first RGB model to obtain a first living body probability;
S3, identifying the visible light image by using a second YUV model to obtain a second living body probability;
S4, determining a third living body probability according to the first living body probability and the second living body probability; and S5, judging whether the image is a living body according to the third living body probability.
Further, the first living body probability includes a negative sample probability p1 and a positive sample probability p2; the second living body probability includes a negative sample probability p3 and a positive sample probability p4.
Further, step S4 further includes: the third living body probability is the mean of the probabilities of the two models, i.e. the final negative sample probability is α × (p1 + p3) and the final positive sample probability is β × (p2 + p4), where 0 ≤ α, β ≤ 1 and α + β = 1.
Further, step S4 further includes step-by-step judgment: set the first RGB model threshold as T1 and the second YUV model threshold as T2, where 0.5 ≤ T1 < 1 and 0.5 ≤ T2 < 1. Specifically, the output of the first RGB model is judged first: if p2 < 1 - T1 or p2 > T1, the finally output third living body probability is the first living body probability output by the first RGB model; otherwise the output of the second YUV model is judged. If p4 < 1 - T2 or p4 > T2 in the second YUV model, the finally output third living body probability is the second living body probability output by the second YUV model; otherwise the third living body probability is the mean of the model probabilities, i.e. the final negative sample probability is α × (p1 + p3) and the final positive sample probability is β × (p2 + p4), where 0 ≤ α, β ≤ 1 and α + β = 1.
Further, step S5 includes: set the living body judgment threshold as T3, where 0.5 ≤ T3 < 1; if the final positive sample probability is greater than or equal to T3, the image is judged to be a positive sample; if it is less than T3, the image is judged to be a negative sample.
Further, a step S0 is included before step S1: in the training phase, visible light images from the heterogeneous data set are fused, face detection, alignment and cropping are performed, and the first RGB model and the second YUV model are trained respectively until the models converge.
Further, in step S0, the training phase includes the steps of:
S101: constructing a heterogeneous data set by collecting data sets and selecting only images or videos under visible light; the positive samples are the real samples in the heterogeneous data set, and the negative samples are the attack samples in the heterogeneous data set;
S102: data preprocessing, comprising 3 steps:
A: face detection: for video data, face detection is performed once every n frames; if a face is detected, the next step is performed, otherwise face detection is performed on the next n frames. For image data, face detection is performed directly; if a face is detected, the next step is performed, otherwise face detection is performed on the next image;
B: face alignment: the face detected in step A is aligned using a similarity transformation;
C: cropping: the face aligned in step B is cropped to an input size suitable for both the first RGB model and the second YUV model;
S103: training the first RGB model: the face images preprocessed in S102 are input into the first RGB model for training;
S104: training the second YUV model: the face images preprocessed in S102 are converted from the RGB color space to the YUV color space through color space conversion, and then input into the second YUV model for training;
S105: the two models of S103 and S104 are trained respectively; when the models converge and reach the expected accuracy on the validation set or test set, model training is complete and step S1 is entered.
Further, in step S101, the heterogeneous data set includes public or private data sets, where the public data sets include: Replay-Attack, Print-Attack, Yale-Recaptured, CASIA-MFSD, MSU-MFSD, Replay-Mobile, Mspoof, SiW, Oulu-NPU, VAD, NUAA, or CASIA-SURF.
A computer-readable storage medium has a computer program stored thereon; when executed by a processor, the program carries out the steps of the human face living body detection method based on multi-model feature migration.
A computer device comprises a memory, a processor and a computer program stored on the memory and executable on the processor; when executing the program, the processor implements the steps of the human face living body detection method based on multi-model feature migration.
The key points of the invention are as follows:
1. the construction strategy and method of the heterogeneous data set: the heterogeneous data set and the multiple models in multiple color spaces complement each other; without the heterogeneous data set, the features learned by the multiple models in multiple color spaces would be limited and monotonous, and the generalization capability would be low;
2. the model construction scheme of multi-model feature migration under the RGB and YUV color spaces: without multiple models in multiple color spaces, a single model can hardly learn the complex features of the heterogeneous data set sufficiently;
3. the model score fusion strategy: the model score fusion strategy is a key factor affecting the final effect of the multiple models; if it does not match the multiple models actually constructed, the final effect of the multiple models is often worse than that of a single model.
Compared with the prior art, the invention has at least the following beneficial effects or advantages. First, the scheme has low cost and a small amount of calculation, and existing face detection and recognition equipment with a single camera can be rapidly deployed and upgraded. The scheme adopts a static image living body detection technique that only needs a single camera; there is no cost from additional hardware such as a depth camera, infrared camera, 3D camera or microphone, and no large amount of complex 3D modeling calculation, so the scheme is low-cost and computationally light. The backbone network adopted by the scheme can be replaced by lightweight networks such as MobileNet V1, MobileNet V2 and EfficientNet according to actual requirements, further accelerating inference. Second, the living body detection model established by the scheme has strong generalization and high precision: because the training set uses a heterogeneous data set and multiple models are constructed under multiple color spaces, the generalization and precision of the model are significantly improved.
Drawings
The present invention will be described in further detail below with reference to the accompanying drawings.
FIG. 1 is a flow chart of the training phase of the present invention;
FIG. 2 is a flow chart of the prediction phase of the present invention;
fig. 3 is a flow chart of the step-by-step determination of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Aiming at the insufficient generalization of existing static living body detection models based on deep learning, a living body detection method based on multi-model feature migration is provided. By constructing and fusing heterogeneous data sets and training living body models with a multi-model feature migration and fusion method under multiple color spaces, the precision and generalization capability of the living body detection model are improved. In the training stage, visible light images from open-source or private data sets are fused and, after face detection, alignment and cropping, an RGB (red, green, blue) model and a YUV (luma, chroma) model are trained simultaneously. In the prediction stage, the collected visible light image is input into the RGB model and the YUV model respectively to obtain the results of the two models; the final score is obtained through a score fusion scheme, and the living body detection result is then determined from this score. The method has good generalization performance and high precision, and is suitable for industrial deployment.
In an embodiment of the present invention, a human face living body detection method based on multi-model feature migration is provided, which includes the steps of:
S1, acquiring a visible light image, wherein the visible light image contains a human face;
S2, identifying the visible light image by using a first RGB model to obtain a first living body probability;
S3, identifying the visible light image by using a second YUV model to obtain a second living body probability;
S4, determining a third living body probability according to the first living body probability and the second living body probability; and S5, judging whether the image is a living body according to the third living body probability.
Further, the first living body probability includes a negative sample probability p1 and a positive sample probability p2; the second living body probability includes a negative sample probability p3 and a positive sample probability p4.
Further, step S4 further includes: the third living body probability is the mean of the probabilities of the two models, i.e. the final negative sample probability is α × (p1 + p3) and the final positive sample probability is β × (p2 + p4), where 0 ≤ α, β ≤ 1 and α + β = 1.
Further, step S4 further includes step-by-step judgment: set the first RGB model threshold as T1 and the second YUV model threshold as T2, where 0.5 ≤ T1 < 1 and 0.5 ≤ T2 < 1. Specifically, the output of the first RGB model is judged first: if p2 < 1 - T1 or p2 > T1, the finally output third living body probability is the first living body probability output by the first RGB model; otherwise the output of the second YUV model is judged. If p4 < 1 - T2 or p4 > T2 in the second YUV model, the finally output third living body probability is the second living body probability output by the second YUV model; otherwise the third living body probability is the mean of the model probabilities, i.e. the final negative sample probability is α × (p1 + p3) and the final positive sample probability is β × (p2 + p4), where 0 ≤ α, β ≤ 1 and α + β = 1.
Further, step S5 includes: set the living body judgment threshold as T3, where 0.5 ≤ T3 < 1; if the final positive sample probability is greater than or equal to T3, the image is judged to be a positive sample; if it is less than T3, the image is judged to be a negative sample.
Further, a step S0 is included before step S1: in the training phase, visible light images from the heterogeneous data set are fused, face detection, alignment and cropping are performed, and the first RGB model and the second YUV model are trained respectively until the models converge.
In another embodiment, the method is divided into two steps: the first step is the construction of the living body detection model, corresponding to the training stage; the second step is the deployment of the living body detection model, corresponding to the prediction stage. The scheme of the training phase is shown in fig. 1, and the scheme of the prediction phase is shown in fig. 2.
The training phase is characterized in that a heterogeneous data set is constructed, two or more deep models with different architectures are trained simultaneously, and each deep learning model uses a different color space.
S101: constructing the heterogeneous data set. Public or private data sets are collected first, and only images or videos under visible light are selected to form the heterogeneous data set. Public data sets include, but are not limited to, the following: Replay-Attack, Print-Attack, Yale-Recaptured, CASIA-MFSD, MSU-MFSD, Replay-Mobile, Mspoof, SiW, Oulu-NPU, VAD, NUAA, CASIA-SURF, and the like. For private, specific scenarios, private data sets are collected and added. The positive samples are the real samples in the heterogeneous data set and the negative samples are the attack samples; the negative sample types are the 2D attack types among the negative samples, such as printed picture attack, tablet attack, mobile phone attack and display attack. The construction strategy of the heterogeneous data set comprises the following 2 strategies (an illustrative sampling sketch follows the two strategies).
Let the number of collected data sets be S, recorded as {D_1, D_2, D_3, ..., D_S}. Let the m-th data set D_m contain M_m positive samples, N_m negative samples and O_m negative sample types, where 1 ≤ m ≤ S and m ∈ N*. Let γ be the balance factor, where 0 ≤ γ ≤ 1.
Strategy one: in the m-th data set, γN_m/O_m samples are drawn randomly (stratified) from each negative sample type, so that γN_m negative samples are drawn in total. If γN_m > M_m, all M_m positive samples are drawn randomly (stratified); if γN_m ≤ M_m, γN_m positive samples are drawn randomly (stratified).
Strategy two: the positive samples of all data sets are grouped into one class, and the negative samples are grouped by negative sample type; suppose there are O_o negative sample classes in total, each class containing P_o negative samples. For example, if the printed picture attack appears simultaneously in the data sets {D_1, D_2, D_5}, then the printed picture attacks in the negative samples of these three data sets are grouped into one class. The construction method of strategy two is to draw γP_o samples randomly (stratified) from each negative sample class, so that γP_o·O_o negative samples are drawn in total over all the data. If γP_o·O_o is greater than the total number of positive samples, all positive samples are drawn randomly (stratified); if γP_o·O_o is less than or equal to the total number of positive samples, γP_o·O_o positive samples are drawn randomly (stratified).
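Purely as an illustration (not part of the claimed method), the following Python sketch implements strategy one under simple assumptions: each data set is represented as a dict with a 'pos' list and a 'neg' mapping from attack type to samples, and stratified sampling is approximated by drawing the same number of samples from every attack type.

    import random

    def build_strategy_one(datasets, gamma=0.5, seed=0):
        """Draw about gamma*N_m negatives (stratified by attack type) and up to the
        same number of positives from every data set D_m."""
        rng = random.Random(seed)
        positives, negatives = [], []
        for d in datasets:
            neg_types = d["neg"]                           # {attack_type: [samples]}
            n_m = sum(len(v) for v in neg_types.values())  # N_m
            per_type = int(gamma * n_m / max(1, len(neg_types)))  # gamma * N_m / O_m
            for samples in neg_types.values():
                negatives.extend(rng.sample(samples, min(per_type, len(samples))))
            want_pos = int(gamma * n_m)                    # gamma * N_m
            positives.extend(rng.sample(d["pos"], min(want_pos, len(d["pos"]))))
        return positives, negatives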
S102: data preprocessing. The data preprocessing comprises 3 steps. The first step is face detection: for video data, face detection is carried out once every n (n > 1) frames; if a face is detected, the second step is carried out, otherwise face detection is carried out on the next n frames; for image data, face detection is carried out directly; if a face is detected, the second step is carried out, otherwise face detection is carried out on the next image. The second step is face alignment: the face detected in the first step is aligned using a similarity transformation. The third step is cropping: the face aligned in the second step is cropped to an input size suitable for both the RGB model and the YUV model.
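A minimal preprocessing sketch is given below for illustration; it assumes that a separate face/landmark detector has already produced five facial landmarks, and the 112×112 crop size and the landmark template are common conventions rather than values taken from the patent.

    import cv2
    import numpy as np

    # Assumed canonical 5-point template for a 112x112 face crop (illustrative values).
    TEMPLATE_112 = np.float32([
        [38.29, 51.69], [73.53, 51.50], [56.02, 71.74],
        [41.55, 92.37], [70.73, 92.20],
    ])

    def align_and_crop(img_rgb, landmarks, size=112):
        """Second step (similarity-transform alignment) and third step (cropping)."""
        src = np.float32(landmarks)                             # 5 detected (x, y) points
        m, _ = cv2.estimateAffinePartial2D(src, TEMPLATE_112)   # rotation + uniform scale + translation
        return cv2.warpAffine(img_rgb, m, (size, size))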
S103: training the RGB model. The images preprocessed in S102 are input into the RGB model for training. The RGB model refers to a deep learning model whose input image is in the RGB color space; its network backbone may be, but is not limited to, convolutional neural networks such as VGG, GoogLeNet, ResNet, DenseNet, MobileNet V1, MobileNet V2, MobileFaceNet, EfficientNet, ShuffleNet, and variants thereof. In particular, the RGB model here is pre-trained on the ImageNet data set.
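As a hedged illustration of one training branch, the PyTorch sketch below uses a MobileNet V2 backbone pre-trained on ImageNet (both mentioned above) with a binary live/attack head; the optimizer, learning rate and epoch count are assumptions, not values specified in the patent.

    import torch
    import torch.nn as nn
    from torchvision import models

    def build_branch(num_classes=2):
        # ImageNet-pretrained MobileNet V2 backbone with a 2-class (attack/live) head
        net = models.mobilenet_v2(weights="IMAGENET1K_V1")
        net.classifier[1] = nn.Linear(net.last_channel, num_classes)
        return net

    def train_branch(net, loader, epochs=20, lr=1e-3, device="cuda"):
        net = net.to(device).train()
        opt = torch.optim.Adam(net.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            for x, y in loader:          # x: Bx3xHxW float tensor, y: 0 = attack, 1 = live
                x, y = x.to(device), y.to(device)
                opt.zero_grad()
                loss = loss_fn(net(x), y)
                loss.backward()
                opt.step()
        return net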
S104: training the YUV model. The images preprocessed in S102 are converted from the RGB color space to the YUV color space through color space conversion and input into the YUV model for training. The YUV model is a deep learning model whose input image is in a YUV color space; the YUV color space includes, but is not limited to, YCrCb, YCbCr, YPbPr, YDbDr, and the like. The network backbone of the YUV model may be, but is not limited to, convolutional neural networks such as VGG, GoogLeNet, ResNet, DenseNet, MobileNet V1, MobileNet V2, MobileFaceNet, EfficientNet, ShuffleNet, and variants thereof. In particular, the YUV color space here refers to YCrCb, and the YUV model is pre-trained on the ImageNet data set.
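The YUV branch differs only in its input color space. A minimal conversion sketch (assuming uint8 RGB face crops and the YCrCb variant named above) is shown here; the same build_branch/train_branch sketch from the previous step can then be reused on the converted images.

    import cv2
    import numpy as np

    def rgb_to_ycrcb(img_rgb: np.ndarray) -> np.ndarray:
        """Convert an HxWx3 uint8 RGB face crop to YCrCb before feeding the YUV model."""
        return cv2.cvtColor(img_rgb, cv2.COLOR_RGB2YCrCb)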
S105: finishing the model training. The two models of S103 and S104 are trained respectively; when the models converge and reach the expected precision on the validation set or test set, model training is complete and the next step is deployment in the prediction stage.
In the prediction stage, the input visible light image is preprocessed, then input into the RGB model and, after color space conversion, into the YUV model; scores are obtained from each model, the final score is obtained through the score fusion scheme, and the final living body detection result is judged from it. The inference stage comprises the following specific steps:
s201: a visible light image is input. An RGB image containing a human face is acquired by a visible light camera, and the image is an input of S202.
S202: image preprocessing. The RGB image acquired in S201 is preprocessed with the same method as in step S102.
S203: forward calculation of the RGB model. The image preprocessed in S202 is sent to the trained RGB model for forward calculation.
S204: obtaining the RGB model score. The network output after the forward calculation in S203 is obtained; let the output negative sample probability be p1 and the positive sample probability be p2, recorded as (p1, p2), where 0 ≤ p1 ≤ 1, 0 ≤ p2 ≤ 1 and p1 + p2 = 1.
S205: color space conversion. The image preprocessed in S202 is converted to the YUV color space; in particular, the YUV color space here refers to YCrCb.
S206: forward calculation of the YUV model. The image after the color space conversion in S205 is sent to the trained YUV model for forward calculation.
S207: obtaining the YUV model score. The network output after the forward calculation in S206 is obtained; let the output negative sample probability be p3 and the positive sample probability be p4, recorded as (p3, p4), where 0 ≤ p3 ≤ 1, 0 ≤ p4 ≤ 1 and p3 + p4 = 1.
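Purely as an illustration of steps S201 to S207, the sketch below chains the preprocessing, the two forward passes and the softmax scores; the helper names preprocess, rgb_net and yuv_net are assumptions, and PyTorch is assumed as in the training sketches.

    import cv2
    import torch
    import torch.nn.functional as F

    def predict_scores(img_rgb, rgb_net, yuv_net, preprocess, device="cuda"):
        face = preprocess(img_rgb)                        # S202: detection, alignment, cropping
        ycrcb = cv2.cvtColor(face, cv2.COLOR_RGB2YCrCb)   # S205: color space conversion

        def to_tensor(a):
            t = torch.from_numpy(a).permute(2, 0, 1).float() / 255.0
            return t.unsqueeze(0).to(device)

        with torch.no_grad():
            p_rgb = F.softmax(rgb_net(to_tensor(face)), dim=1)[0]   # S203-S204 -> (p1, p2)
            p_yuv = F.softmax(yuv_net(to_tensor(ycrcb)), dim=1)[0]  # S206-S207 -> (p3, p4)
        return tuple(p_rgb.tolist()), tuple(p_yuv.tolist())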
S208: model score fusion. The model score fusion strategy comprises the following 2 types:
Strategy one: the final probability output is the mean of the probabilities of the individual models. Specifically, the final negative sample probability is α × (p1 + p3) and the positive sample probability is β × (p2 + p4), recorded as (α × (p1 + p3), β × (p2 + p4)), where 0 ≤ α, β ≤ 1 and α + β = 1. Typically, α = 0.5 and β = 0.5.
Strategy two: step-by-step judgment, whose flow is shown in figure 3. Let the RGB model threshold be T1 (0.5 ≤ T1 < 1) and the YUV model threshold be T2 (0.5 ≤ T2 < 1). The output of the RGB model is judged first: if p2 < 1 - T1 or p2 > T1, the final output is the RGB model output; otherwise the output of the YUV model is judged. If p4 < 1 - T2 or p4 > T2 in the YUV model, the final output is the YUV model output; otherwise the final model output is the probability output of strategy one, i.e. the final negative sample probability is α × (p1 + p3) and the positive sample probability is β × (p2 + p4), recorded as (α × (p1 + p3), β × (p2 + p4)), where 0 ≤ α, β ≤ 1 and α + β = 1. Typically, α = 0.5 and β = 0.5.
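The two fusion strategies, together with the final threshold judgment of step S209 described next, can be sketched as follows; α = β = 0.5 matches the typical values stated above, while the 0.9 thresholds are assumed values chosen only to satisfy 0.5 ≤ T < 1.

    def fuse_mean(p_rgb, p_yuv, alpha=0.5, beta=0.5):
        # Strategy one: element-wise weighted mean -> (negative prob, positive prob)
        (p1, p2), (p3, p4) = p_rgb, p_yuv
        return (alpha * (p1 + p3), beta * (p2 + p4))

    def fuse_stepwise(p_rgb, p_yuv, t1=0.9, t2=0.9, alpha=0.5, beta=0.5):
        # Strategy two: use a confident RGB branch first, then a confident YUV branch,
        # and fall back to mean fusion when neither branch is confident.
        p2, p4 = p_rgb[1], p_yuv[1]
        if p2 < 1 - t1 or p2 > t1:
            return p_rgb
        if p4 < 1 - t2 or p4 > t2:
            return p_yuv
        return fuse_mean(p_rgb, p_yuv, alpha, beta)

    def is_live(final_probs, t3=0.9):
        # Step S209: judged live (positive sample) when the positive probability >= T3
        return final_probs[1] >= t3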
S209: judging the result of the living body detection model. Let the living body judgment threshold be T3 (0.5 ≤ T3 < 1). If the positive sample probability output by S208 is greater than or equal to T3, the image is judged to be a living body image (positive sample); if it is less than T3, the image is judged to be an attack image (negative sample). For the model thresholds above, typically T1 = T2 = T3.
The present invention also provides a computer-readable storage medium having stored thereon a computer program, wherein the program, when executed by a processor, performs the steps of the method for live human face detection with multi-model feature migration.
The invention also provides computer equipment which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the steps of the human face living body detection method of multi-model feature migration.
The above-mentioned embodiments are provided to further explain the objects, technical solutions and advantages of the present invention in detail, and it should be understood that the above-mentioned embodiments are only examples of the present invention and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the invention are also within the protection scope of the invention.

Claims (10)

1. A human face living body detection method based on multi-model feature migration, characterized by comprising the following steps:
S1, acquiring a visible light image, wherein the visible light image contains a human face;
S2, identifying the visible light image by using a first RGB model to obtain a first living body probability;
S3, identifying the visible light image by using a second YUV model to obtain a second living body probability;
S4, determining a third living body probability according to the first living body probability and the second living body probability;
and S5, judging whether the image is a living body according to the third living body probability.
2. The method of claim 1, wherein the first living body probability comprises a negative sample probability p1 and a positive sample probability p2, and the second living body probability comprises a negative sample probability p3 and a positive sample probability p4.
3. The method according to claim 2, wherein step S4 further comprises: the third living body probability is the mean of the probabilities of the two models, i.e. the final negative sample probability is α × (p1 + p3) and the final positive sample probability is β × (p2 + p4), where 0 ≤ α, β ≤ 1 and α + β = 1.
4. The method according to claim 2, wherein step S4 further comprises step-by-step judgment: setting a first RGB model threshold T1 and a second YUV model threshold T2, where 0.5 ≤ T1 < 1 and 0.5 ≤ T2 < 1; specifically, the output of the first RGB model is judged first: if p2 < 1 - T1 or p2 > T1, the finally output third living body probability is the first living body probability output by the first RGB model, otherwise the output of the second YUV model is judged; if p4 < 1 - T2 or p4 > T2 in the second YUV model, the finally output third living body probability is the second living body probability output by the second YUV model, otherwise the third living body probability is the mean of the model probabilities, i.e. the final negative sample probability is α × (p1 + p3) and the final positive sample probability is β × (p2 + p4), where 0 ≤ α, β ≤ 1 and α + β = 1.
5. The method according to claim 1, wherein step S5 comprises: setting a living body judgment threshold T3, where 0.5 ≤ T3 < 1; if the final positive sample probability is greater than or equal to T3, the image is judged to be a positive sample; if it is less than T3, the image is judged to be a negative sample.
6. The method of claim 1, further comprising a step S0 before step S1: in the training phase, visible light images from the heterogeneous data set are fused, face detection, alignment and cropping are performed, and the first RGB model and the second YUV model are trained respectively until the models converge.
7. The method of claim 6, wherein in step S0 the training phase comprises the steps of:
S101: constructing a heterogeneous data set by collecting data sets and selecting only images or videos under visible light; the positive samples are the real samples in the heterogeneous data set, and the negative samples are the attack samples in the heterogeneous data set;
S102: data preprocessing, comprising 3 steps:
A: face detection: for video data, face detection is performed once every n frames; if a face is detected, the next step is performed, otherwise face detection is performed on the next n frames; for image data, face detection is performed directly; if a face is detected, the next step is performed, otherwise face detection is performed on the next image;
B: face alignment: the face detected in step A is aligned using a similarity transformation;
C: cropping: the face aligned in step B is cropped to an input size suitable for both the first RGB model and the second YUV model;
S103: training the first RGB model: the face images preprocessed in S102 are input into the first RGB model for training;
S104: training the second YUV model: the face images preprocessed in S102 are converted from the RGB color space to the YUV color space through color space conversion and input into the second YUV model for training;
S105: the two models of S103 and S104 are trained respectively; when the models converge and reach the expected accuracy on the validation set or test set, model training is complete and step S1 is entered.
8. The method according to claim 7, wherein in step S101 the heterogeneous data set comprises public or private data sets, and the public data sets comprise: Replay-Attack, Print-Attack, Yale-Recaptured, CASIA-MFSD, MSU-MFSD, Replay-Mobile, Mspoof, SiW, Oulu-NPU, VAD, NUAA, or CASIA-SURF.
9. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, carries out the steps of the human face living body detection method based on multi-model feature migration according to any one of claims 1-8.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the human face living body detection method based on multi-model feature migration according to any one of claims 1-8.
CN202010728371.7A 2020-07-23 2020-07-23 Human face in-vivo detection method based on multi-model feature migration Pending CN111881815A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010728371.7A CN111881815A (en) 2020-07-23 2020-07-23 Human face in-vivo detection method based on multi-model feature migration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010728371.7A CN111881815A (en) 2020-07-23 2020-07-23 Human face in-vivo detection method based on multi-model feature migration

Publications (1)

Publication Number Publication Date
CN111881815A true CN111881815A (en) 2020-11-03

Family

ID=73201464

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010728371.7A Pending CN111881815A (en) 2020-07-23 2020-07-23 Human face in-vivo detection method based on multi-model feature migration

Country Status (1)

Country Link
CN (1) CN111881815A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6295369B1 (en) * 1998-11-06 2001-09-25 Shapiro Consulting, Inc. Multi-dimensional color image mapping apparatus and method
US20140270409A1 (en) * 2013-03-15 2014-09-18 Eyelock, Inc. Efficient prevention of fraud
US20180025217A1 (en) * 2016-07-22 2018-01-25 Nec Laboratories America, Inc. Liveness detection for antispoof face recognition
CN110008783A (en) * 2018-01-04 2019-07-12 杭州海康威视数字技术股份有限公司 Human face in-vivo detection method, device and electronic equipment based on neural network model
US20200005061A1 (en) * 2018-06-28 2020-01-02 Beijing Kuangshi Technology Co., Ltd. Living body detection method and system, computer-readable storage medium
CN109034102A (en) * 2018-08-14 2018-12-18 腾讯科技(深圳)有限公司 Human face in-vivo detection method, device, equipment and storage medium
CN109598242A (en) * 2018-12-06 2019-04-09 中科视拓(北京)科技有限公司 A kind of novel biopsy method
CN109840467A (en) * 2018-12-13 2019-06-04 北京飞搜科技有限公司 A kind of in-vivo detection method and system
CN111368731A (en) * 2020-03-04 2020-07-03 上海东普信息科技有限公司 Silent in-vivo detection method, silent in-vivo detection device, silent in-vivo detection equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
汪亚航; 宋晓宁; 吴小俊: "Two-stream face liveness detection network combined with mixed pooling" (结合混合池化的双流人脸活体检测网络), Journal of Image and Graphics (中国图象图形学报), no. 07 *
牛德姣, 詹永照, 宋顺林: "Face detection and tracking in real-time video images" (实时视频图像中的人脸检测与跟踪), Journal of Computer Applications (计算机应用), no. 06 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination