CN115601818B - Lightweight visible light living body detection method and device - Google Patents
- Publication number
- CN115601818B CN115601818B CN202211503095.XA CN202211503095A CN115601818B CN 115601818 B CN115601818 B CN 115601818B CN 202211503095 A CN202211503095 A CN 202211503095A CN 115601818 B CN115601818 B CN 115601818B
- Authority
- CN
- China
- Prior art keywords
- living body
- target
- face
- visible light
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06V40/168 — Human faces: feature extraction; face representation
- G06N3/04 — Neural networks: architecture, e.g. interconnection topology
- G06N3/08 — Neural networks: learning methods
- G06V10/774 — Pattern recognition or machine learning: generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V10/82 — Image or video recognition or understanding using neural networks
- G06V40/172 — Human faces: classification, e.g. identification
- G06V40/40 — Spoof detection, e.g. liveness detection
- Y02A40/70 — Adaptation technologies in livestock or poultry production
Abstract
The invention discloses a lightweight visible light living body detection method and device. The method uses a visible light living body detection model to perform living body discrimination on a human face in a visible light original image. The model comprises a deep neural network, a first fully-connected network and a second fully-connected network; during training of the model, an auxiliary monitoring network is introduced to assist the learning of green light intensity features. Based on the principle that blood flowing through the skin of a living body produces a characteristic intensity distribution in the green light band, the auxiliary monitoring network helps the deep neural network accurately extract living body features of the human face. This solves the problem that prior-art silent living body detection methods based on visible light images cannot resist 3D non-living-body attacks, improves the accuracy of living body detection, and achieves a lightweight implementation.
Description
Technical Field
The invention belongs to the technical field of image processing and target identification, and particularly relates to a light-weight visible light living body detection method and device.
Background
Liveness detection technology discriminates whether the face appearing in front of a machine is real or fake; a face presented via another medium can be defined as a false face, including printed photos, screen images, silicone masks, stereoscopic 3D figures and the like. Current mainstream liveness detection schemes include cooperative liveness detection and non-cooperative (silent) liveness detection. Cooperative liveness detection requires the user to complete specified actions according to prompts before liveness verification is performed, and can be called dynamic liveness detection. Silent liveness detection is the opposite: it judges whether the subject is a real living body without requiring cooperative actions such as blinking or mouth opening. Silent liveness detection is therefore technically more difficult to realize and faces higher accuracy requirements in practical applications; at the same time, because it performs liveness verification without the user's active participation, it provides a better user experience.
According to the imaging source, silent liveness detection is generally divided into three technical routes: infrared images, 3D structured light, and visible light images. Infrared imaging filters out light of specific wavebands and naturally resists false-face attacks based on screen imaging. 3D structured light introduces depth information and can easily distinguish false-face attacks from 2D media such as paper photos and screen images. Visible light images are discriminated mainly through detail information such as moiré patterns produced by screen re-capture and reflections from paper photos. From this analysis, compared with the other two routes, liveness detection based on visible light images can only discriminate using the information in the image itself, and therefore faces a greater challenge in real open scenes.
However, silent liveness detection based on visible light images has the advantages of fast recognition speed, simple operation and contactless use. In addition, compared with infrared imaging devices and 3D structured light imaging devices, visible light imaging devices are cheaper and more highly integrated, and mainstream face recognition systems already use visible light imaging, so research on liveness detection based on visible light imaging has important value. Meanwhile, with the popularization of technologies such as 5G and AI, face recognition has been widely applied to all kinds of interconnected devices, including edge devices. Applying face recognition on edge devices must take their limited computing power and power consumption into account, so making the algorithm lightweight is another problem that research on visible-light liveness detection must consider, in order to suit edge interconnected devices with extremely limited computing power.
Disclosure of Invention
The invention aims to overcome one or more defects in the prior art and provide a light-weight visible light living body detection method and device.
The purpose of the invention is realized by the following technical scheme:
first aspect
The invention provides a light-weight visible light living body detection method, which comprises the following steps:
s1, acquiring a visible light original image to be processed;
s2, recognizing a human face target from a visible light original image to be processed by utilizing a pre-constructed visible light living body detection model, and determining that the human face target is a living body or a non-living body;
the construction process of the visible light living body detection model is as follows:
SS1, constructing a deep neural network, wherein the deep neural network is used for acquiring a historical visible light original image, extracting a target feature in the historical visible light original image and generating a target feature matrix, the target feature comprises a green light intensity feature, and the green light intensity feature is an intensity distribution feature of green light when blood flows through skin;
SS2, constructing a first fully-connected network, wherein the first fully-connected network is used for receiving the target feature matrix and identifying the position and the size of a human face target in the target feature matrix;
SS3, extracting a face feature matrix in a target feature matrix based on the position and the size of the face target, and performing global maximization processing on the face feature matrix to obtain living body distinguishing feature vectors after the global maximization processing;
SS4, constructing a second fully-connected network, wherein the second fully-connected network is used for receiving the living body distinguishing feature vector and determining that the current face target is a living body or a non-living body according to the living body distinguishing feature vector;
SS5, training the deep neural network, the first fully-connected network and the second fully-connected network by using training samples, introducing an auxiliary monitoring network when the deep neural network is trained, taking a loss function as training constraint, obtaining network parameters of the deep neural network, the first fully-connected network, the second fully-connected network and the auxiliary monitoring network after training is finished, and then generating a visible light living body detection model based on the network parameters of the deep neural network, the first fully-connected network and the second fully-connected network;
the auxiliary supervision network is used for auxiliary supervision when the deep neural network extracts the green light intensity characteristics.
Preferably, in the SS2, the position of the human face target in the target feature matrix is identified based on a non-maximum suppression algorithm.
Preferably, the SS3 specifically includes the following sub-steps:
SS31, extracting the face feature matrix F_H × F_W × N from the target feature matrix based on the position and size of the face target;
SS32, taking the maximum value of each of the N F_H × F_W × 1 matrices, and generating the living body discrimination feature vector from the N maximum values obtained.
Preferably, in the SS4, determining that the current face target is a living body or a non-living body according to the living body discrimination feature vector specifically includes the following sub-steps:
SS41, a second full-connection network classifies the obtained living body distinguishing feature vector and outputs the probability that the current face target is a living body and the probability that the current face target is a non-living body;
SS42, if the probability that the current face target is a living body is larger than the probability that the current face target is a non-living body, determining that the current face target is a living body; and if the probability that the current face target is a living body is smaller than the probability that the current face target is a non-living body, determining that the current face target is the non-living body.
Preferably, the auxiliary supervised network comprises a supervised learning network and a first spectral feature extraction network;
the first spectral feature extraction network is used for intercepting a face image from a historical visible light original image according to the position and the size of a face target, extracting green light intensity components of the face image, and then generating green light component spatial spectral features of the face image based on Fourier transform;
and the supervised learning network is used for receiving the target feature matrix, extracting a single face feature matrix in the target feature matrix based on the position and the size of a face target, then performing learning supervision, and enabling the green light intensity feature in the single face feature matrix to approach the green light component spatial spectrum feature after the learning supervision.
Preferably, the visible light original image is an RGB three-channel image; extracting the green light intensity component of the face image specifically comprises the following sub-step: based on a first formula, converting the RGB three-channel values of each pixel point of the face image I_f into a single green component value, the first formula being as follows:
where I_f^0(m,n) denotes the value of the 0th channel of the pixel at row m, column n of the face image I_f; I_f^1(m,n) denotes the value of the 1st channel of that pixel; I_f^2(m,n) denotes the value of the 2nd channel of that pixel; and I_H(m,n) denotes the transformed value of the pixel at row m, column n.
Preferably, the fourier transform-based generation of the spatial spectral feature of the green light component of the face image specifically includes the following sub-steps:
SSS1, performing Fourier transform on the face image after the green light intensity component is extracted;
and SSS2, taking the modulus of the Fourier transform and performing normalization calculation to obtain the green light component spatial spectrum feature of the face image.
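Steps SSS1-SSS2 can be sketched in a few lines of numpy: Fourier-transform the green-intensity image, take the magnitude (modulus), and normalize. The max-based normalization below is one plausible scheme, since the patent text does not spell out the exact normalization.

```python
import numpy as np

def green_spatial_spectrum(green_img: np.ndarray) -> np.ndarray:
    """Fourier transform of a 2-D green-intensity image followed by
    modulus and normalization, as in steps SSS1-SSS2."""
    spectrum = np.fft.fftshift(np.fft.fft2(green_img))  # center low frequencies
    magnitude = np.abs(spectrum)                        # take the modulus
    # normalize to [0, 1] (an assumed normalization scheme)
    return magnitude / (magnitude.max() + 1e-8)

demo = green_spatial_spectrum(np.random.rand(256, 256))
print(demo.shape)  # → (256, 256)
```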
Preferably, in the SS1, before extracting a target feature matrix in the historical visible light original image, the deep neural network scales the received historical visible light original image, where the scaled visible light original image is 256 × 256 × 3, and the target feature matrix is 8 × 8 × 128; in the SS2, when the first fully-connected network identifies the position of the face target in the target feature matrix, the preset prior frame sizes include 192 × 192, 128 × 128, and 32 × 32.
Preferably, the loss function is L = λ1·L1 + λ2·L2 + λ3·L3 + λ4·L4, where λ1 is a preset first weight coefficient, λ2 a preset second weight coefficient, λ3 a preset third weight coefficient and λ4 a preset fourth weight coefficient; L1 denotes the classification loss when distinguishing faces from non-faces, L2 the regression loss for the position of the face target, L3 the classification loss when distinguishing living bodies from non-living bodies, and L4 the green light intensity feature learning loss;
wherein, for L1: N is the number of training samples; 1_ij indicates the true value of whether the j-th prior box in the i-th grid is responsible for detecting a face (1_ij = 1 if it is responsible, 1_ij = 0 if it is not); c_ij indicates the true value of whether the i-th grid contains the center point of the j-th prior box (c_ij = 1 if it does, c_ij = 0 otherwise); the number of grids is 64, each grid corresponding one-to-one to a feature map in the target feature matrix; ĉ_ij is the network output value for whether the i-th grid contains the center point of the j-th prior box (ĉ_ij = 1 if it does, ĉ_ij = 0 otherwise);
for L2: (x̂_ij, ŷ_ij) is the estimated center point coordinate of the j-th prior box in the i-th grid and (x_ij, y_ij) its true value; ŵ_ij and ĥ_ij are the estimated width and height of the j-th prior box in the i-th grid, and w_ij and h_ij the corresponding true values;
for L3: y is the true label of the training sample, p0 the probability that the current face target is not a living body, and p1 the probability that it is a living body;
for L4 = D(Ĝ, G): Ĝ is the output value after the auxiliary monitoring network assists the deep neural network in learning the green light intensity feature, G is the true value of the green light component spatial spectrum feature, and D(Ĝ, G) is the distance between them.
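The four-term weighted loss can be illustrated with a small numpy sketch. The weight values, the cross-entropy form for the liveness term, and the mean-squared form of the distance D are stand-in assumptions consistent with the variable definitions above, not the patent's exact formulas.

```python
import numpy as np

def total_loss(l_face, l_loc, l_live, l_green, w=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four training losses; the weights here are
    placeholders, not the patent's actual lambda coefficients."""
    return w[0]*l_face + w[1]*l_loc + w[2]*l_live + w[3]*l_green

def liveness_bce(y_true, p_live, eps=1e-12):
    """Binary cross-entropy between the liveness label y and the predicted
    living-body probability p1 (a standard stand-in for L3)."""
    return -(y_true*np.log(p_live+eps) + (1-y_true)*np.log(1-p_live+eps))

def green_feature_loss(pred_spec, true_spec):
    """Mean squared distance between predicted and true green-component
    spatial spectrum features (one plausible choice of the distance D)."""
    return float(np.mean((pred_spec - true_spec)**2))

l3 = liveness_bce(1.0, 0.9)
l4 = green_feature_loss(np.zeros(4), np.zeros(4))
print(total_loss(0.2, 0.1, l3, l4))
```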
The first aspect of the invention brings the following beneficial effects:
(1) Based on the principle that blood flowing through the skin of a living body produces a characteristic intensity distribution in the green light band, an auxiliary monitoring network is set up when the deep neural network is trained; it assists the deep neural network in accurately extracting the living body features (green light intensity features) of the human face. Thus, in addition to discriminating through detail information such as moiré patterns produced by screen re-capture and reflections from paper photos, the generated visible light living body detection model also performs liveness discrimination based on living body features of the face. This solves the problem that prior-art silent liveness detection methods based on visible light images cannot resist 3D non-living-body attacks, and improves the accuracy of living body detection;
(2) The backbone of the visible light living body detection model contains only one deep neural network, which completes both the face detection and liveness discrimination tasks. This reduces the amount of computation in the face liveness detection process and achieves a lightweight design, lowering the computing-resource requirements of the method realized by the embodiment of the invention as well as the latency of the detection process, so that detection speed is improved on top of the improved accuracy;
(3) Because the face liveness detection is lightweight, the visible light living body detection method realized by the embodiment of the invention is applicable to edge interconnected devices, while reducing the cost, volume, power consumption and latency of such devices when performing face liveness recognition.
Second aspect of the invention
A second aspect of the present invention provides a lightweight visible light living body detection device, comprising a memory and a processor, the memory storing a program implementing the lightweight visible light living body detection method according to the first aspect of the present invention, and the processor calling the program stored in the memory to perform living body detection.
The second aspect of the present invention brings about the same advantageous effects as the first aspect, and will not be described in detail herein.
Drawings
FIG. 1 is a flow chart of the lightweight visible light living body detection method;
FIG. 2 is a flow chart of a construction of a visible light living body detection model;
FIG. 3 is a schematic diagram of a visible light living body detection model;
fig. 4 is a schematic diagram of a deep neural network.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of protection of the present invention.
Example one
Referring to fig. 1 to 4, the embodiment provides a light-weight visible light living body detection method, including the following steps:
s1, acquiring a visible light original image to be processed. In this embodiment, the visible light original image is obtained from the visible light imaging device and is an RGB three-channel image.
And S2, recognizing a human face target from the original visible light image to be processed by utilizing a pre-constructed visible light living body detection model, and determining that the human face target is a living body or a non-living body.
The construction process of the visible light living body detection model is as follows:
and SS1, constructing a deep neural network, wherein the deep neural network is used for acquiring a historical visible light original image, extracting target characteristics in the historical visible light original image and generating a target characteristic matrix, the target characteristics comprise green light intensity characteristics, and the green light intensity characteristics are intensity distribution characteristics of green light when blood flows through skin.
Before the visible light original image is input into the deep neural network, it is scaled; the scaled visible light original image has a size of 256 × 256 × 3 and serves as the input image of the deep neural network. In this embodiment, the deep neural network Net(x) comprises eight feature extraction modules, which extract the target features and generate a target feature matrix of size 8 × 8 × 128, where a 1 × 1 × 128 feature vector represents the target features in a 16 × 16 image block of the input image. As shown in fig. 4, the eight feature extraction modules comprise a first 3 × 3 channel-separable convolution, a first 1 × 1 convolution, a first activation layer, a second 3 × 3 channel-separable convolution, a second 1 × 1 convolution, a second activation layer, a max-pooling layer, a channel expansion layer, and an addition layer.
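The core of each feature extraction module is a channel-separable (depthwise) convolution followed by a 1 × 1 (pointwise) convolution and an activation. A minimal numpy sketch of that sequence is given below; the stride-1 / zero-padding configuration and tensor shapes are illustrative assumptions, not the patent's exact layer hyperparameters, and the pooling, channel-expansion, and addition layers are omitted.

```python
import numpy as np

def depthwise_separable_block(x, dw_k, pw_k):
    """One stage of the feature extractor: a 3x3 channel-separable
    (depthwise) convolution, a 1x1 (pointwise) convolution, then ReLU.
    x: (H, W, C); dw_k: (3, 3, C); pw_k: (C, C_out). Stride 1, zero pad."""
    h, w, c = x.shape
    pad = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    dw = np.zeros_like(x)
    for i in range(h):                      # depthwise: each channel has its
        for j in range(w):                  # own 3x3 kernel, no channel mixing
            patch = pad[i:i+3, j:j+3, :]
            dw[i, j, :] = np.sum(patch * dw_k, axis=(0, 1))
    pw = dw @ pw_k                          # pointwise 1x1 conv mixes channels
    return np.maximum(pw, 0.0)              # ReLU activation

out = depthwise_separable_block(np.random.rand(8, 8, 4),
                                np.random.rand(3, 3, 4),
                                np.random.rand(4, 16))
print(out.shape)  # → (8, 8, 16)
```

Splitting the 3 × 3 convolution per channel and mixing channels only in the 1 × 1 step is what makes the backbone cheap enough for edge devices.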
And SS2, constructing a first full-connection network, wherein the first full-connection network is used for receiving the target feature matrix and identifying the position and the size of the human face target in the target feature matrix.
Specifically, the first fully-connected network FC1 (x) receives the target feature matrix, regresses the category, position, and size of each target, and outputs the category, position, and size of each target, where the output of the first fully-connected network FC1 (x) is denoted by Dd (i), and i =0 to 14. Since the aspect ratio of the human face in the input image is close to 1:1, when the position of the human face object in the object feature matrix is identified through the first fully connected network FC1 (x), the prior frame sizes are set to 192 × 192, 128 × 128, and 32 × 32, and the position of the human face object in the object feature matrix is identified based on the non-maximum suppression algorithm.
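Non-maximum suppression, used above to pick the face position among overlapping prior boxes, can be sketched as follows. The IoU threshold of 0.5 is illustrative; the patent does not state a value.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Non-maximum suppression over candidate face boxes.
    boxes: (N, 4) as [x1, y1, x2, y2]; returns indices of kept boxes."""
    order = np.argsort(scores)[::-1]        # highest-scoring box first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # intersection-over-union of the top box with the remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2]-boxes[i, 0]) * (boxes[i, 3]-boxes[i, 1])
        area_r = (boxes[rest, 2]-boxes[rest, 0]) * (boxes[rest, 3]-boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-8)
        order = rest[iou <= iou_thresh]     # drop boxes overlapping the winner
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], float)
print(nms(boxes, np.array([0.9, 0.8, 0.7])))  # → [0, 2]
```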
And SS3, extracting a face feature matrix in the target feature matrix based on the position and size of the face target, performing global maximization processing on the face feature matrix, and obtaining a living body distinguishing feature vector after the global maximization processing.
Optionally, SS3 specifically includes the following sub-steps:
SS31, extracting the face feature matrix F_H × F_W × N from the target feature matrix based on the position and size of the face target;
SS32, taking the maximum value of each of the N F_H × F_W × 1 matrices, and generating the living body discrimination feature vector from the N maximum values obtained; wherein the value of N is 128.
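Steps SS31-SS32 amount to a global max pooling over the spatial dimensions of the cropped face feature matrix; one line of numpy expresses it:

```python
import numpy as np

def global_max_pool(face_feat: np.ndarray) -> np.ndarray:
    """Reduce an F_H x F_W x N face feature matrix to an N-dimensional
    living body discrimination vector by taking the maximum of each
    F_H x F_W x 1 slice (steps SS31-SS32)."""
    return face_feat.max(axis=(0, 1))

vec = global_max_pool(np.random.rand(8, 8, 128))
print(vec.shape)  # → (128,)
```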
And SS4, constructing a second fully-connected network, wherein the second fully-connected network is used for receiving the living body distinguishing feature vector and determining that the current face target is a living body or a non-living body according to the living body distinguishing feature vector.
Optionally, in SS4, determining that the current face target is a living body or a non-living body according to the living body discrimination feature vector includes the following specific sub-steps:
the SS41 and the second fully-connected network FC2 (f) classify the acquired living body discrimination feature vectors, and output the probability that the current face target is a living body and the probability that the current face target is a non-living body. In this embodiment, the second fully-connected network FC2 (f) preferably outputs the probability that the current face target is a living body and the probability that the current face target is a non-living body through a softmax function, where the probability that the current face target is a living body is expressed asThe probability that the current face target is not a living body is expressed as ≥>。
SS42, if the probability that the current face target is a living body is larger than the probability that the current face target is a non-living body, determining that the current face target is a living body; and if the probability that the current face target is a living body is smaller than the probability that the current face target is a non-living body, determining that the current face target is the non-living body.
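The decision rule of SS41-SS42 can be sketched as a two-way softmax followed by a comparison. The ordering of the two classes in the logit vector is an assumption for illustration.

```python
import numpy as np

def liveness_decision(logits: np.ndarray):
    """Two-way softmax over the second fully-connected network's outputs and
    the comparison rule of SS41-SS42. logits[0] is assumed to be the
    non-living score and logits[1] the living score."""
    e = np.exp(logits - logits.max())   # numerically stable softmax
    p = e / e.sum()
    return ("living" if p[1] > p[0] else "non-living"), p

label, probs = liveness_decision(np.array([0.2, 1.5]))
print(label)  # → living
```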
SS5, training the deep neural network, the first fully-connected network and the second fully-connected network by using a training sample, introducing an auxiliary monitoring network when the deep neural network is trained, taking a loss function as the training constraint, obtaining network parameters of the deep neural network, the first fully-connected network, the second fully-connected network and the auxiliary monitoring network after the training is finished, and then generating a visible light living body detection model based on the obtained network parameters of the deep neural network, the first fully-connected network and the second fully-connected network, wherein the visible light living body detection model comprises the deep neural network, the first fully-connected network and the second fully-connected network; the auxiliary monitoring network is used for auxiliary monitoring when the deep neural network extracts the green light intensity characteristics.
Optionally, the auxiliary supervision network includes a supervised learning network and a first spectral feature extraction network.
The first spectral feature extraction network is used for intercepting a face image from a historical visible light original image according to the position and the size of the face target, extracting the green light intensity component of the face image, and then generating the green light component spatial spectrum feature of the face image based on Fourier transform. In this embodiment, the intercepted face image is first scaled and then the green light intensity component is extracted; the scaled face image has a size of 256 × 256 × 3 and is denoted face image I_f.
The supervised learning network is used for receiving the target feature matrix, extracting a single face feature matrix in the target feature matrix based on the position and the size of the face target, and then performing learning supervision, so that after the learning supervision the green light intensity feature in the single face feature matrix approaches the green light component spatial spectrum feature. In this embodiment, before learning supervision is performed, the extracted single face feature matrix is further scaled to a size of 8 × 8 × 128; the scaled matrix is then input into a supervised convolution network C(V), which performs a 1 × 1 convolution operation, and the size of the single face feature matrix after the 1 × 1 convolution operation is 8 × 8 × 1.
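The 1 × 1 convolution of the supervised convolution network C(V) can be sketched as a per-pixel dot product across the 128 channels (a NumPy illustration with a hypothetical kernel, since the real C(V) kernel is learned during training):

```python
import numpy as np

def conv1x1(feat, weights, bias=0.0):
    # feat: (H, W, C) single-face feature matrix; weights: (C,) kernel of a
    # 1x1 convolution with a single output channel. Each spatial location is
    # reduced to one value by a dot product over the channel axis.
    out = np.tensordot(feat, weights, axes=([2], [0])) + bias
    return out[..., np.newaxis]  # shape (H, W, 1)

feat = np.random.rand(8, 8, 128)   # scaled single-face feature matrix
w = np.random.rand(128) / 128.0    # hypothetical learned kernel values
mapped = conv1x1(feat, w)          # 8 x 8 x 1, matching the text above
```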
Optionally, extracting the green light intensity component from the face image I_f yields a corresponding face image I_H, where I_H has a size of 256 × 256 × 1. The extraction of the green light intensity component specifically comprises:
converting, based on a first formula, the RGB three-channel values of each pixel point of the face image I_f into single green component values, wherein the first formula is as follows:
where I_f^0(m,n) represents the value of the 0th channel of the pixel point in row m, column n of the face image I_f; I_f^1(m,n) represents the value of the 1st channel of the pixel point in row m, column n; I_f^2(m,n) represents the value of the 2nd channel of the pixel point in row m, column n; and I_H(m,n) represents the transformed value of the pixel point in row m, column n.
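Since the first formula converts the three RGB channel values into a single green component, a minimal sketch can be given (the exact formula is not reproduced above, so directly keeping the green channel is only an assumed stand-in, not the patent's formula):

```python
import numpy as np

def extract_green_component(face_rgb):
    # face_rgb: (256, 256, 3) RGB face image I_f.
    # Stand-in for the patent's first formula: keep channel 1
    # (green in RGB ordering) as the single-channel image I_H.
    return face_rgb[:, :, 1:2].astype(np.float32)  # shape (256, 256, 1)

i_f = np.zeros((256, 256, 3), dtype=np.uint8)
i_f[:, :, 1] = 200                  # hypothetical pure-green test image
i_h = extract_green_component(i_f)  # 256 x 256 x 1, matching I_H above
```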
Optionally, generating the green light component spatial spectrum feature of the face image I_H based on Fourier transform specifically comprises the following sub-steps:
SSS1, performing Fourier transform on the face image I_H obtained after extracting the green light intensity component;
SSS2, taking the modulus of the Fourier transform, performing normalization calculation, scaling the face image obtained after the normalization calculation, and obtaining the green light component spatial spectrum feature of the face image after scaling, wherein the size of the green light component spatial spectrum feature is 8 × 8 × 1.
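Sub-steps SSS1 and SSS2 can be sketched as follows (a NumPy illustration; the particular normalization and the block-average scaling are assumed choices, as the text does not fix them):

```python
import numpy as np

def green_spatial_spectrum(i_h, out_size=8):
    # i_h: (256, 256, 1) green-component image I_H.
    # SSS1: 2-D Fourier transform; SSS2: modulus, normalization,
    # then scaling down to out_size x out_size x 1.
    spec = np.abs(np.fft.fft2(i_h[:, :, 0]))     # modulus of the transform
    spec = spec / (spec.max() + 1e-12)           # normalization (one simple choice)
    b = spec.shape[0] // out_size                # block size for scaling
    small = spec.reshape(out_size, b, out_size, b).mean(axis=(1, 3))
    return small[..., np.newaxis]                # shape (8, 8, 1)

feature = green_spatial_spectrum(np.random.rand(256, 256, 1))
```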
Optionally, a loss function L = λ1·L_face + λ2·L_loc + λ3·L_live + λ4·L_green is used, where λ1 is a preset first weight coefficient, λ2 is a preset second weight coefficient, λ3 is a preset third weight coefficient, λ4 is a preset fourth weight coefficient; L_face represents the classification loss when discriminating between a face and a non-face; L_loc represents the face target position regression loss; L_live represents the classification loss when discriminating between a living body and a non-living body; and L_green represents the green light intensity feature learning loss.
Wherein, for L_face, N is the number of training samples; a first ground-truth indicator denotes whether the jth prior box in the ith grid is responsible for detecting a face, taking the value 1 if it is responsible and 0 if it is not; a second ground-truth indicator denotes whether the ith grid contains the center point of the jth prior box, taking the value 1 if it does and 0 if it does not; the number of grids is 64, and each grid corresponds one-to-one to a feature map in the target feature matrix; a corresponding output value indicates whether the ith grid contains the center point of the jth prior box, taking the value 1 if it does and 0 if it does not.
Wherein, for L_loc, the estimated center point coordinates, estimated width and estimated height of the jth prior box in the ith grid are compared with the ground-truth center point coordinates, ground-truth width and ground-truth height of the jth prior box in the ith grid.
Wherein, for L_green, the output value produced by the auxiliary supervision network while assisting the deep neural network in learning the green light intensity feature is compared with the ground-truth value of the green light component spatial spectrum feature, and L_green represents the distance between the two.
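The loss terms above can be combined as sketched below (the squared L2 distance for L_green and the weight values are assumptions for illustration; the text only specifies a weighted combination and an unspecified distance):

```python
import numpy as np

def green_feature_loss(s_hat, s_true):
    # L_green: squared L2 distance between the auxiliary network's output
    # and the true green-component spatial spectrum feature (8 x 8 x 1).
    # The squared-L2 choice is an assumption; the text only says "distance".
    return float(np.sum((s_hat - s_true) ** 2))

def total_loss(l_face, l_loc, l_live, l_green, lambdas=(1.0, 1.0, 1.0, 1.0)):
    # Weighted sum of the four loss terms with preset weight coefficients.
    return sum(w * l for w, l in zip(lambdas, (l_face, l_loc, l_live, l_green)))

lg = green_feature_loss(np.ones((8, 8, 1)), np.zeros((8, 8, 1)))
lt = total_loss(0.5, 0.25, 0.5, lg, lambdas=(1.0, 1.0, 1.0, 0.01))
```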
Embodiment 2
This embodiment provides a lightweight visible light living body detection device, comprising a memory and a processor, wherein the memory is used for storing the lightweight visible light living body detection method of the first embodiment, and the processor is used for calling the lightweight visible light living body detection method stored in the memory to perform living body detection.
The lightweight visible light living body detection method quickly and accurately judges whether a face target in a visible light original image is a living body, based on the following principles:
First, the visible light living body detection model construction stage:
After the first fully-connected network FC1(x), the second fully-connected network FC2(f) and the deep neural network Net(x) in the backbone are built, an auxiliary supervision network is introduced when the deep neural network Net(x), the first fully-connected network FC1(x) and the second fully-connected network FC2(f) are trained. With the assistance of the auxiliary supervision network, the deep neural network accurately learns the green light intensity feature of the face. When classifying whether a face is a living body, the second fully-connected network FC2(f) performs the classification based on the green light intensity feature of the face in the target feature matrix; the output probability that the current face is a living body is then compared with the probability that it is a non-living body, yielding the judgment of whether the current face is a living body.
The green light intensity feature of a face is a living body feature of the face, and it serves as a supplement to the basic features, such as moiré and reflection, on which traditional visible light living body detection methods are based.
Second, loading the visible light living body detection model on the interconnected device:

An online visible light living body detection model is arranged in the interconnected device, and comprises the deep neural network Net(x), the first fully-connected network FC1(x) and the second fully-connected network FC2(f) located in the backbone.

The interconnected device performs living body detection, which specifically includes: the deep neural network Net(x) obtains an input image, performs target feature extraction, and generates a target feature matrix; the first fully-connected network FC1(x) performs regression on the category, position and size of each target in the target feature matrix and outputs the category, position and size of each target, thereby obtaining the position and size of the face target; based on the position and size of the face target, a face feature matrix is intercepted from the target feature matrix, and a living body discrimination feature vector is generated after a global maximization operation on the face feature matrix; the living body discrimination feature vector is input into the second fully-connected network FC2(f) to classify whether the current face is a living body, which then outputs the probability P_t that the current face is a living body and the probability P_f that it is a non-living body; if P_t > P_f, the current face is a living body, and if P_t < P_f, the current face is a non-living body.
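The on-device detection flow described above can be sketched end-to-end (the networks are replaced by random stand-ins, and the face position, size, and probabilities are hypothetical, purely to show the data flow):

```python
import numpy as np

# Stand-ins for Net(x), FC1(x) and FC2(f); the real networks are learned,
# so random values are used here only to illustrate shapes and data flow.
def net(image):
    # 256 x 256 x 3 input image -> 8 x 8 x 128 target feature matrix
    return np.random.rand(8, 8, 128)

def crop_face(features, pos, size):
    # Intercept the face region out of the target feature matrix.
    (r, c), (h, w) = pos, size
    return features[r:r + h, c:c + w, :]

def liveness(image, face_pos=(2, 2), face_size=(4, 4)):
    feats = net(image)
    face = crop_face(feats, face_pos, face_size)   # face feature matrix
    vec = face.max(axis=(0, 1))                    # global max -> 128-d vector
    p_t, p_f = 0.7, 0.3                            # hypothetical FC2(f) output
    return ("living" if p_t > p_f else "non-living"), vec

label, vec = liveness(np.zeros((256, 256, 3)))
```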
The foregoing is illustrative of the preferred embodiments of this invention, and it is to be understood that the invention is not limited to the precise form disclosed herein and that various other combinations, modifications, and environments may be resorted to, falling within the scope of the concept as disclosed herein, either as described above or as apparent to those skilled in the relevant art. And that modifications and variations may be effected by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (7)
1. A light-weight visible light living body detection method is characterized by comprising the following steps:
s1, acquiring a visible light original image to be processed;
s2, recognizing a human face target from a visible light original image to be processed by utilizing a pre-constructed visible light living body detection model, and determining that the human face target is a living body or a non-living body;
the construction process of the visible light living body detection model is as follows:
SS1, constructing a deep neural network, wherein the deep neural network is used for acquiring a historical visible light original image, extracting target features in the historical visible light original image and generating a target feature matrix, the target features comprise green light intensity features, and the green light intensity features are intensity distribution features of green light when blood flows through skin;
SS2, constructing a first fully-connected network, wherein the first fully-connected network is used for receiving the target feature matrix and identifying the position and the size of a human face target in the target feature matrix;
SS3, extracting a face feature matrix in a target feature matrix based on the position and the size of the face target, and performing global maximization processing on the face feature matrix to obtain living body distinguishing feature vectors after the global maximization processing;
SS4, constructing a second fully-connected network, wherein the second fully-connected network is used for receiving the living body distinguishing feature vector and determining that the current face target is a living body or a non-living body according to the living body distinguishing feature vector;
SS5, training the deep neural network, the first fully-connected network and the second fully-connected network by using a training sample, introducing an auxiliary supervision network when the deep neural network is trained, taking a loss function as the training constraint, obtaining network parameters of the deep neural network, the first fully-connected network, the second fully-connected network and the auxiliary supervision network after the training is finished, and then generating a visible light living body detection model based on the network parameters of the deep neural network, the first fully-connected network and the second fully-connected network;
the auxiliary supervision network is used for auxiliary supervision when the deep neural network extracts the green light intensity characteristics;
the auxiliary supervision network comprises a supervision learning network and a first spectral feature extraction network;
the first spectral feature extraction network is used for intercepting a face image from a historical visible light original image according to the position and the size of a face target, extracting green light intensity components of the face image, and then generating green light component spatial spectral features of the face image based on Fourier transform;
the supervised learning network is used for receiving the target feature matrix, extracting a single face feature matrix in the target feature matrix based on the position and the size of a face target, then performing learning supervision, and enabling the green light intensity feature in the single face feature matrix to approach the green light component spatial spectrum feature after the learning supervision;
the visible light original image is an RGB three-channel image;
the method for extracting the green light intensity component of the face image specifically comprises the following substeps: based on a first formula, the face image I f The RGB three-channel numerical values of each pixel point are converted into single green component numerical values, and the first formula is as follows:
where I_f^0(m,n) represents the value of the 0th channel of the pixel point in row m, column n of the face image I_f; I_f^1(m,n) represents the value of the 1st channel of the pixel point in row m, column n; I_f^2(m,n) represents the value of the 2nd channel of the pixel point in row m, column n; and the transformed value of the pixel point in row m, column n of the face image I_f is obtained accordingly;
the Fourier transform-based generation of the green light component spatial spectrum feature of the face image specifically comprises the following sub-steps:
SSS1, performing Fourier transform on the face image with the green light intensity component extracted;
SSS2, taking the modulus of the Fourier transform, performing normalization calculation, and then obtaining the green light component spatial spectrum feature of the face image.
2. The light weight visible light living body detection method according to claim 1, wherein in the SS2, a position of the human face target in the target feature matrix is identified based on a non-maximum suppression algorithm.
3. The method for detecting a light-weighted visible light living body according to claim 1, wherein the SS3 includes the following steps:
SS31, extracting a face feature matrix F_H × F_W × N from the target feature matrix based on the position and the size of the face target;

SS32, finding the maximum value of each of the N F_H × F_W × 1 matrices respectively, and generating a living body discrimination feature vector from the N maximum values obtained.
4. The method for detecting a light-weighted visible light living body according to claim 1, wherein the SS4 determines whether the current human face target is a living body or a non-living body according to the living body discrimination feature vector, and specifically comprises the following sub-steps:
SS41, a second full-connection network classifies the obtained living body distinguishing feature vector and outputs the probability that the current face target is a living body and the probability that the current face target is a non-living body;
SS42, if the probability that the current face target is a living body is larger than the probability that the current face target is a non-living body, determining that the current face target is a living body; and if the probability that the current face target is a living body is smaller than the probability that the current face target is a non-living body, determining that the current face target is the non-living body.
5. The method for detecting a lightweight visible light living body according to claim 1,
in the SS1, before a target feature matrix in a historical visible light original image is extracted by a deep neural network, scaling the received historical visible light original image, wherein the size of the scaled visible light original image is 256 × 256 × 3, and the size of the target feature matrix is 8 × 8 × 128;
in the SS2, when the first fully-connected network identifies the position of the face target in the target feature matrix, the preset prior frame sizes include 192 × 192, 128 × 128, and 32 × 32.
6. The method for detecting a lightweight visible light living body according to claim 5,
said loss function is L = λ1·L_face + λ2·L_loc + λ3·L_live + λ4·L_green, wherein λ1 is a preset first weight coefficient, λ2 is a preset second weight coefficient, λ3 is a preset third weight coefficient, λ4 is a preset fourth weight coefficient; L_face represents the classification loss when discriminating between a face and a non-face; L_loc represents the face target position regression loss; L_live represents the classification loss when discriminating between a living body and a non-living body; and L_green represents the green light intensity feature learning loss;
wherein, for L_face, N is the number of training samples; a first ground-truth indicator denotes whether the jth prior box in the ith grid is responsible for detecting a face, taking the value 1 if it is responsible and 0 if it is not; a second ground-truth indicator denotes whether the ith grid contains the center point of the jth prior box, taking the value 1 if it does and 0 if it does not; the number of grids is 64, and each grid corresponds one-to-one to a feature map in the target feature matrix; a corresponding output value indicates whether the ith grid contains the center point of the jth prior box, taking the value 1 if it does and 0 if it does not;
wherein, for L_loc, the estimated center point coordinates, estimated width and estimated height of the jth prior box in the ith grid are compared with the ground-truth center point coordinates, ground-truth width and ground-truth height of the jth prior box in the ith grid;
wherein, for L_live, a true label of the training sample is used, P_f represents the probability that the current face target is a non-living body, and P_t represents the probability that the current face target is a living body;
wherein, for L_green, the output value produced by the auxiliary supervision network after assisting the deep neural network in learning the green light intensity feature is compared with the ground-truth value of the green light component spatial spectrum feature, and L_green represents the distance between the two.
7. A lightweight visible-light living-body detection device comprising a memory for storing the lightweight visible-light living-body detection method according to any one of claims 1 to 6, and a processor for calling the lightweight visible-light living-body detection method stored in the memory to perform living-body detection.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211503095.XA CN115601818B (en) | 2022-11-29 | 2022-11-29 | Lightweight visible light living body detection method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115601818A CN115601818A (en) | 2023-01-13 |
CN115601818B (en) | 2023-04-07
Legal Events

Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | | |
SE01 | Entry into force of request for substantive examination | | |
GR01 | Patent grant | | |