CN114092994A - Human face living body detection method based on multi-view feature learning - Google Patents

Human face living body detection method based on multi-view feature learning

Info

Publication number
CN114092994A
CN114092994A
Authority
CN
China
Prior art keywords
face
training
attack
model
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111192064.2A
Other languages
Chinese (zh)
Inventor
毋立芳
王竹铭
徐姚文
简萌
石戈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202111192064.2A priority Critical patent/CN114092994A/en
Publication of CN114092994A publication Critical patent/CN114092994A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention provides a face liveness detection method based on multi-view feature learning. Group-level classification is performed from multiple views such as "real + mask" and "real + video"; feature extraction models for several different views are trained to extract discriminative detection features, which are then fused by a binary classification model for real/fake classification. The invention alleviates the drop in discrimination performance that existing face liveness detection methods suffer when the set of attack types is extended, and strengthens the defence against attacks, thereby improving the accuracy of face liveness detection.

Description

Human face living body detection method based on multi-view feature learning
Technical Field
The invention belongs to the technical field of face liveness detection and relates to a face liveness detection method based on multi-view feature learning.
Background
In recent years, with the wide application of face recognition technology in identity authentication systems such as financial payment and door-access unlocking, face forgery attacks against it have become increasingly common. An attacker can easily deceive a face recognition system with presentation means such as printed photos, video replay and 3D masks, posing a serious threat to users' personal property and even to public security. Face liveness detection technology emerged in response.
However, existing methods generally treat face liveness detection as a binary classification problem between real and forged faces, classifying only from the "real vs. all attack types" view. In fact, different attack types present different forgery characteristics, and a forgery cue that detects one attack type well (a characteristic that type has and a real face lacks) may simply not exist in another attack type; trying to find common forgery cues that detect multiple attack types may yield only a compromise among the characteristics of the different attack types, rather than the optimal choice for each. On the other hand, viewed from different perspectives, each attack type also shares features with the real face. For example, real faces and masks both carry depth information, while photos and videos do not; real faces and videos both carry dynamic information, while photos and masks do not.
As a result, existing methods suffer a marked performance drop once the set of attack types is extended. For example, a model may perform well when only the photo and video attack types, which share similar forgery cues, are considered, yet its performance drops significantly once a mask attack, which differs greatly from photo and video attacks, is introduced.
Disclosure of Invention
To solve these problems, the invention provides a face liveness detection method based on multi-view feature learning, which trains models by group-level classification and extracts features using multiple views of the form "real + given attack type vs. remaining attack types", and then fuses them to discriminate real from fake faces. The invention accounts for both the differences between attack types and the commonality between each attack type and the real face; it alleviates the drop in discrimination performance that existing face liveness detection methods suffer when the attack types are extended, and strengthens the defence against attacks, thereby improving the accuracy of face liveness detection.
The method comprises the following specific steps:
(1) Select and label training samples: collect non-living face samples for training, such as photos, videos and masks, and label them by attack type; collect living face samples and label them as the real-face class;
(2) Group-level classification training from the "real + given attack type vs. remaining attack types" view: take the real-face samples together with the samples of a given attack type as the positive sample group and the samples of the remaining attack types as the negative sample group, and perform group-level classification training to obtain a feature extraction model for the "real + given attack type" view;
(3) Multi-view feature extraction: using the group-level classification training of step (2), select several different views, train the corresponding feature extraction models, and extract the detection features of those views;
(4) Train a binary classification model: take the view-specific detection features obtained in step (3) as the input of a binary classification model, and train it with the real-face class as positive samples and all non-living faces as negative samples;
(5) Face liveness detection based on multi-view feature learning: take a face image to be detected, obtain its view-specific detection features with the models from step (3), and input them to the binary classification model from step (4) to classify the face as real or fake.
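The group-level labelling of steps (1)–(2) can be sketched in a few lines. The following Python sketch is illustrative only: the class names follow the description above, but the function name and data layout are assumptions.

```python
# Group-level label assignment for multi-view training (illustrative sketch).
# Classes: "real", "photo", "video", "mask" (from step (1)).
# A view "real + X" treats real faces plus attack type X as the positive
# group and all remaining attack types as the negative group (step (2)).

ATTACK_TYPES = {"photo", "video", "mask"}

def group_label(sample_class: str, given_attack: str) -> int:
    """Return 1 (positive group) or 0 (negative group) for the
    'real + given attack type vs. remaining attack types' view."""
    if sample_class == "real" or sample_class == given_attack:
        return 1
    if sample_class in ATTACK_TYPES:
        return 0
    raise ValueError(f"unknown class: {sample_class}")

# Example: labels under the "real + mask" view.
labels_rm = {c: group_label(c, "mask") for c in ["real", "photo", "video", "mask"]}
# real and mask fall in the positive group; photo and video in the negative group.
```

Training one feature extractor per such relabelling, rather than one model against all attacks, is the core of the multi-view strategy.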
Further, in step (1), the collected non-living face samples for training comprise the three attack types photo, video and mask.
Further, the specific model design takes one of two forms:
1) in step (3), only the feature extraction models of the two views "real + mask" and "real + video" are selected;
2) the following step is added between steps (2) and (3): perform classification training with the real-face samples as positive samples and the attack samples of all types as negative samples to obtain a feature extraction model for the "real vs. all attack types" view; in step (3), the feature extraction models of the three views "real + mask", "real + video" and "real vs. all attack types" are selected.
Further, in the method, the feature extraction models for the "real + mask" and "real vs. all attack types" views adopt as backbone the CDCN model designed by Zitong Yu et al. in 2020; the feature extraction model for the "real + video" view adopts as backbone the 3D-CDCN model designed by Yaowen Xu et al. in 2021; the binary classification model is a purpose-built three-layer convolutional network in which every kernel is 3 × 3 with stride 1 and padding 1, each of the first two layers is followed by a BatchNorm layer and a ReLU layer, and the third layer is followed by a ReLU layer.
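A minimal PyTorch sketch of a fusion head matching this description (three 3 × 3 convolutions, stride 1, padding 1; BatchNorm + ReLU after each of the first two layers, ReLU after the third). The channel counts and the concatenation of the two view features are assumptions, since the description does not state them:

```python
import torch
import torch.nn as nn

class BinaryClassifier(nn.Module):
    """Three-layer convolutional fusion head: all kernels 3x3, stride 1,
    padding 1; BatchNorm+ReLU after layers 1-2, ReLU after layer 3.
    Channel counts are illustrative assumptions, not from the patent."""
    def __init__(self, in_ch: int = 128, mid_ch: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 3, stride=1, padding=1),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(),
            nn.Conv2d(mid_ch, mid_ch, 3, stride=1, padding=1),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(),
            nn.Conv2d(mid_ch, 1, 3, stride=1, padding=1),
            nn.ReLU(),
        )

    def forward(self, p_rm: torch.Tensor, p_rv: torch.Tensor) -> torch.Tensor:
        # Fuse the two view features by channel concatenation (assumption),
        # producing a single-channel score map at the same spatial size.
        x = torch.cat([p_rm, p_rv], dim=1)
        return self.net(x).squeeze(1)

# Example: two 64-channel 32x32 view features give a 32x32 score map.
f = BinaryClassifier(in_ch=128)
score_map = f(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
# score_map.shape == (1, 32, 32)
```

With padding 1 and stride 1 the spatial size is preserved, which is consistent with the 32 × 32 score map P_f described later.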
Further, in step (5), liveness is judged as follows: take the output of the binary classification model as a score map and the mean of all its elements as the classification score; if the score exceeds the classification threshold, the face image under test is judged to be a live face.
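This decision rule takes only a few lines; a NumPy sketch, with the threshold left as a parameter:

```python
import numpy as np

def is_live(score_map: np.ndarray, threshold: float) -> bool:
    """Judge liveness per step (5): average all elements of the score
    map into one classification score and compare with the threshold."""
    score = float(score_map.mean())
    return score > threshold

# Example: a 32x32 score map with mean 0.8 is judged live at threshold 0.5.
assert is_live(np.full((32, 32), 0.8), threshold=0.5)
```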
The invention has the following advantages:
1) By adopting a multi-view feature learning strategy, the method attends to the differences between attack types and to the commonality between each attack type and the real face, avoiding the loss of feature discriminability that comes from detecting multiple attacks with a single model.
2) By selecting feature extraction models for the two views "real + mask" and "real + video", the method captures both the depth and the dynamic information of the real face, describing it more comprehensively and finely and enhancing the discriminability of face liveness detection.
Description of the drawings:
FIG. 1 is a schematic diagram of a model framework of the present invention.
FIG. 2 is a diagram of group level classification from multiple views.
Detailed Description
The invention provides a human face living body detection method based on multi-view feature learning. The following describes specific implementation steps of the present invention with reference to specific examples.
Referring to fig. 1, in this example, photo, video and mask attacks are selected as the non-living face samples, and the two views "real + mask" and "real + video" are selected for feature extraction. The specific steps are as follows:
(1) Select and label training samples: collect non-living face samples for training, comprising the three attack types photo, video and mask, and label each sample by its attack type; collect living face samples and label them as the real-face class I_R. Each sample is an image sequence eight frames long.
(2) Group-level classification training from the "real + given attack type vs. remaining attack types" view: take the real-face samples together with the samples of a given attack type as the positive sample group and the samples of the remaining attack types as the negative sample group, and perform group-level classification training to obtain a feature extraction model for the "real + given attack type" view. With "real + mask" as the view, the real-face and mask samples are positive and the photo and video samples are negative; with "real + video" as the view, the real-face and video samples are positive and the photo and mask samples are negative.
(3) Train the multi-view feature extraction models: using the group-level classification training of step (2), train the feature extraction models E_rm and E_rv for the two views "real + mask" and "real + video", which extract the corresponding detection features P_rm and P_rv.
As shown in fig. 2, conventional methods classify only from the "real vs. all attack types" view, while the invention adds views such as "real + mask" and "real + video". E_rm adopts as backbone the CDCN model designed by Zitong Yu et al. in 2020; its output form and supervision are the same as CDCN's, and the output of the model's penultimate layer is additionally taken as P_rm. E_rv adopts as backbone the 3D-CDCN model designed by Yaowen Xu et al. in 2021; its output form and supervision are the same as 3D-CDCN's, and the output of its penultimate layer is additionally averaged over the temporal dimension and resampled to the same shape as P_rm, giving P_rv. That is, P_rm and P_rv both have shape 64 × 32. The input of E_rm is the first frame of the eight-frame image sequence; the input of E_rv is the full eight-frame sequence.
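The temporal reduction in the E_rv branch (averaging the eight-frame output over time, then resampling to the shape of P_rm) can be sketched as follows. This is a hedged illustration: the feature shapes are examples, and block average pooling stands in for the resampling step, whose exact method the description does not specify:

```python
import numpy as np

def reduce_temporal(feat_seq: np.ndarray, target_hw: tuple) -> np.ndarray:
    """Average a (T, C, H, W) feature sequence over time, then downsample
    spatially by block average pooling to target (H, W). Pooling is an
    assumed stand-in for the unspecified resampling step."""
    feat = feat_seq.mean(axis=0)          # (C, H, W): temporal average
    c, h, w = feat.shape
    th, tw = target_hw
    assert h % th == 0 and w % tw == 0, "integer pooling factors assumed"
    fh, fw = h // th, w // tw
    # Split each spatial axis into (blocks, block_size) and average blocks.
    return feat.reshape(c, th, fh, tw, fw).mean(axis=(2, 4))

# Example: an eight-frame 64-channel 64x64 sequence reduced to 32x32.
p_rv = reduce_temporal(np.random.rand(8, 64, 64, 64), (32, 32))
# p_rv.shape == (64, 32, 32)
```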
(4) Train the binary classification model: take the detection features P_rm and P_rv from step (3), based on the "real + mask" and "real + video" views, as the input of a binary classification model F; train F with the real-face class I_R as positive samples and the non-living faces as negative samples.
(5) Face liveness detection based on multi-view feature learning: take an eight-frame face image sequence to be detected and feed it into the two models E_rm and E_rv obtained in step (3), giving the two view-specific detection features P_rm and P_rv; then input these to the binary classification model F obtained in step (4) to obtain the binary classification output P_f, which is used for real/fake face discrimination.
The binary classification model F is a three-layer convolutional network: every kernel is 3 × 3 with stride 1 and padding 1, each of the first two layers is followed by a BatchNorm layer and a ReLU layer, and the third layer is followed by a ReLU layer. P_f has shape 32 × 32; the mean of all its elements is taken as the classification score, and if the score exceeds the classification threshold, the face image under test is judged to be a live face.
The classification threshold is selected as follows: using the attack presentation classification error rate (APCER) and the bona fide presentation classification error rate (BPCER) defined in ISO/IEC 30107-3, the standard for biometric presentation attack detection, as performance metrics, search the threshold from 0 to 1 and take the threshold at which APCER and BPCER are equal on a validation set (or the training set) as the classification threshold for the test stage. The threshold depends on the experimental setting; empirically, 0.5 can be used as a general-purpose classification threshold, with a classification error rate within 2% of that obtained with the optimal threshold.
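The equal-error threshold search described above can be sketched as follows, assuming scores in [0, 1] where higher means more likely live; APCER is then the fraction of attack samples accepted as live at a given threshold, and BPCER the fraction of live samples rejected:

```python
import numpy as np

def eer_threshold(live_scores: np.ndarray, attack_scores: np.ndarray,
                  steps: int = 1000) -> float:
    """Search thresholds in [0, 1] and return the one where APCER
    (attacks accepted as live) and BPCER (live rejected) are closest."""
    best_t, best_gap = 0.0, float("inf")
    for t in np.linspace(0.0, 1.0, steps + 1):
        apcer = float((attack_scores > t).mean())   # attack classified live
        bpcer = float((live_scores <= t).mean())    # live classified attack
        if abs(apcer - bpcer) < best_gap:
            best_gap, best_t = abs(apcer - bpcer), float(t)
    return best_t

# Example: well-separated score groups yield a threshold between them,
# where APCER = BPCER = 0.
t = eer_threshold(np.array([0.8, 0.9, 0.95]), np.array([0.1, 0.2, 0.3]))
```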
In order to prove the effectiveness of the invention, the invention is tested on a common test data set, and the result shows that the method can obtain good detection performance.
Beyond routine performance evaluation, we ran an additional comparative test: when samples of a new attack type are introduced into an existing method for retraining, its classification performance drops markedly; after applying the multi-view learning idea, without changing the model structure and modifying only the group-level classification labels, the performance drop is clearly alleviated. This comparison shows that the method effectively alleviates the drop in discrimination performance that existing face liveness detection methods suffer when the attack types are extended, strengthens the defence against attacks, and improves face liveness detection accuracy.

Claims (8)

1. A human face living body detection method based on multi-view feature learning is characterized by comprising the following steps:
(1) Select and label training samples: collect non-living face samples for training and label them by attack type; collect living face samples and label them as the real-face class;
(2) Group-level classification training from the "real + given attack type vs. remaining attack types" view: take the real-face samples together with the samples of a given attack type as the positive sample group and the samples of the remaining attack types as the negative sample group, and perform group-level classification training to obtain a feature extraction model for the "real + given attack type" view;
(3) Multi-view feature extraction: using the group-level classification training of step (2), select several different views, train the corresponding feature extraction models, and extract the detection features of those views;
(4) Train a binary classification model: take the view-specific detection features obtained in step (3) as the input of a binary classification model, and train it with the real-face class as positive samples and all non-living faces as negative samples;
(5) Face liveness detection based on multi-view feature learning: take a face image to be detected, obtain its view-specific detection features with the models from step (3), and input them to the binary classification model from step (4) to classify the face as real or fake.
2. The face live detection method based on multi-view feature learning according to claim 1, characterized in that: in the step (1), the collected non-living human face samples for training comprise three attack types of photos, videos and masks.
3. The face live detection method based on multi-view feature learning according to claim 1, characterized in that: in the step (3), only the feature extraction models of the two visual angles of 'true + mask' and 'true + video' are selected.
4. The face live detection method based on multi-view feature learning according to claim 1, characterized in that: the following step is added between the step (2) and the step (3): classification training is carried out with the real face samples as positive samples and all types of attack samples as negative samples to obtain a feature extraction model of the "real vs. all attack types" view; in the step (3), the feature extraction models of the three views "real + mask", "real + video" and "real vs. all attack types" are selected.
5. The face live detection method based on multi-view feature learning according to claim 1, characterized in that: in the step (2), a CDCN model is adopted as a backbone network by a feature extraction model based on 'real + mask' visual angle training; and the feature extraction model based on the 'real + video' view angle adopts a 3D-CDCN model as a backbone network.
6. The face live detection method based on multi-view feature learning according to claim 1, characterized in that: in the step (4), the binary classification model is a purpose-built three-layer convolutional network; the sizes of the convolution kernels are all 3 × 3 with stride 1 and padding 1, each of the first two layers is connected with a BatchNorm layer and a ReLU layer, and the third layer is connected with a ReLU layer.
7. The face live detection method based on multi-view feature learning according to claim 1, characterized in that: in the step (5), the judgment method of the face liveness detection is as follows: the output of the binary classification model is taken as a score map, the mean of all elements of the score map is taken as the classification score, 0.5 is selected as the classification threshold, and if the score is greater than the threshold, the face image to be detected is judged to be a live face.
8. The face live detection method based on multi-view feature learning according to claim 1, characterized in that: in the step (2), the CDCN model is adopted as a backbone network by the feature extraction model based on the visual angle training of the real vs. all attack types.
CN202111192064.2A 2021-10-13 2021-10-13 Human face living body detection method based on multi-view feature learning Pending CN114092994A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111192064.2A CN114092994A (en) 2021-10-13 2021-10-13 Human face living body detection method based on multi-view feature learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111192064.2A CN114092994A (en) 2021-10-13 2021-10-13 Human face living body detection method based on multi-view feature learning

Publications (1)

Publication Number Publication Date
CN114092994A true CN114092994A (en) 2022-02-25

Family

ID=80296811

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111192064.2A Pending CN114092994A (en) 2021-10-13 2021-10-13 Human face living body detection method based on multi-view feature learning

Country Status (1)

Country Link
CN (1) CN114092994A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107122709A (en) * 2017-03-17 2017-09-01 上海云从企业发展有限公司 Biopsy method and device
CN108960086A (en) * 2018-06-20 2018-12-07 电子科技大学 Based on the multi-pose human body target tracking method for generating confrontation network positive sample enhancing
CN109117755A (en) * 2018-07-25 2019-01-01 北京飞搜科技有限公司 A kind of human face in-vivo detection method, system and equipment
US20190014999A1 (en) * 2017-07-14 2019-01-17 Hong Kong Baptist University 3d mask face anti-spoofing with remote photoplethysmography
WO2019152983A2 (en) * 2018-02-05 2019-08-08 Board Of Trustees Of Michigan State University System and apparatus for face anti-spoofing via auxiliary supervision
CN110472519A (en) * 2019-07-24 2019-11-19 杭州晟元数据安全技术股份有限公司 A kind of human face in-vivo detection method based on multi-model
CN111160313A (en) * 2020-01-02 2020-05-15 华南理工大学 Face representation attack detection method based on LBP-VAE anomaly detection model
US20200175260A1 (en) * 2018-11-30 2020-06-04 Qualcomm Incorporated Depth image based face anti-spoofing
CN112990347A (en) * 2021-04-08 2021-06-18 清华大学 Sample classification method and device based on unbiased sample learning algorithm PU _ AUL
CN113312965A (en) * 2021-04-14 2021-08-27 重庆邮电大学 Method and system for detecting unknown face spoofing attack living body

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107122709A (en) * 2017-03-17 2017-09-01 上海云从企业发展有限公司 Biopsy method and device
US20190014999A1 (en) * 2017-07-14 2019-01-17 Hong Kong Baptist University 3d mask face anti-spoofing with remote photoplethysmography
WO2019152983A2 (en) * 2018-02-05 2019-08-08 Board Of Trustees Of Michigan State University System and apparatus for face anti-spoofing via auxiliary supervision
CN108960086A (en) * 2018-06-20 2018-12-07 电子科技大学 Based on the multi-pose human body target tracking method for generating confrontation network positive sample enhancing
CN109117755A (en) * 2018-07-25 2019-01-01 北京飞搜科技有限公司 A kind of human face in-vivo detection method, system and equipment
US20200175260A1 (en) * 2018-11-30 2020-06-04 Qualcomm Incorporated Depth image based face anti-spoofing
CN110472519A (en) * 2019-07-24 2019-11-19 杭州晟元数据安全技术股份有限公司 A kind of human face in-vivo detection method based on multi-model
CN111160313A (en) * 2020-01-02 2020-05-15 华南理工大学 Face representation attack detection method based on LBP-VAE anomaly detection model
CN112990347A (en) * 2021-04-08 2021-06-18 清华大学 Sample classification method and device based on unbiased sample learning algorithm PU _ AUL
CN113312965A (en) * 2021-04-14 2021-08-27 重庆邮电大学 Method and system for detecting unknown face spoofing attack living body

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王竹铭: "Research on Face Liveness Detection Algorithms Based on Inter-class Relation Learning", China Masters' Theses Full-text Database, Information Science and Technology, 15 March 2024 (2024-03-15), pages 138-1102 *

Similar Documents

Publication Publication Date Title
Nguyen et al. Modular convolutional neural network for discriminating between computer-generated images and photographic images
CN110516616A (en) A kind of double authentication face method for anti-counterfeit based on extensive RGB and near-infrared data set
CN111160286B (en) Video authenticity identification method
Abidin et al. Copy-move image forgery detection using deep learning methods: a review
Kharrazi et al. Improving steganalysis by fusion techniques: A case study with image steganography
CN113312965B (en) Face unknown spoofing attack living body detection method and system
CN114067444A (en) Face spoofing detection method and system based on meta-pseudo label and illumination invariant feature
CN1760887A (en) The robust features of iris image extracts and recognition methods
CN113128481A (en) Face living body detection method, device, equipment and storage medium
CN114663986B (en) Living body detection method and system based on double decoupling generation and semi-supervised learning
CN113361474B (en) Double-current network image counterfeiting detection method and system based on image block feature extraction
CN114842524A (en) Face false distinguishing method based on irregular significant pixel cluster
Chen et al. A study on the photo response non-uniformity noise pattern based image forensics in real-world applications
Peng et al. Face morphing attack detection and attacker identification based on a watchlist
CN112200075A (en) Face anti-counterfeiting method based on anomaly detection
CN114092994A (en) Human face living body detection method based on multi-view feature learning
US20230084980A1 (en) System for detecting face liveliness in an image
CN113723215B (en) Training method of living body detection network, living body detection method and device
CN115187789A (en) Confrontation image detection method and device based on convolutional layer activation difference
Patel et al. An optimized convolution neural network based inter-frame forgery detection model—a multi-feature extraction framework
CN117496601B (en) Face living body detection system and method based on fine classification and antibody domain generalization
Abrahim et al. Image Splicing Forgery Detection Scheme Using New Local Binary Pattern Varient
He et al. Dynamic Residual Distillation Network for Face Anti-Spoofing With Feature Attention Learning
Jia et al. Enhanced face morphing attack detection using error-level analysis and efficient selective kernel network
CN113158838B (en) Full-size depth map supervision-based face representation attack detection method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination