CN112766205B - Robustness silence living body detection method based on color mode image - Google Patents
- Publication number: CN112766205B (application CN202110116023.9A)
- Authority
- CN
- China
- Legal status: Expired - Fee Related (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G06V40/161 (human faces: detection; localisation; normalisation)
- G06V40/168 (human faces: feature extraction; face representation)
- G06F18/214 (pattern recognition: generating training patterns; bootstrap methods, e.g. bagging or boosting)
- G06F18/24 (pattern recognition: classification techniques)
- G06N3/08 (neural networks: learning methods)
- G06T7/136 (segmentation; edge detection involving thresholding)
- G06T7/187 (segmentation involving region growing, region merging or connected component labelling)
- G06T7/90 (determination of colour characteristics)
- G06T2207/20081 (training; learning)
- G06T2207/20084 (artificial neural networks [ANN])
- G06T2207/20112 (image segmentation details); G06T2207/20132 (image cropping)
- G06T2207/30196 (human being; person); G06T2207/30201 (face)
Abstract
The invention discloses a robust silent liveness detection method based on color mode images. The method comprises the steps of: shooting a face photo under high-energy visible fill light; acquiring a color mode image, a saturation image and a brightness image from the photo; carrying out face detection and secondary cropping on the color mode image; binarizing the cropped color mode image and taking the maximum connected region; combining the processed color mode image with the unprocessed saturation and brightness images to obtain a three-channel image; establishing a CMNet network model and training it with the three-channel images; and performing silent liveness detection with the trained CMNet network model. The invention aims to solve the problems that the prior art cannot effectively handle unknown liveness attacks and is costly; it uses the specular reflection component of the image to distinguish real faces from spoofed ones, and is simple and efficient.
Description
Technical Field
The invention relates to the field of face recognition, and in particular to a robust silent liveness detection method based on color mode images.
Background
In recent years, face recognition technology has developed rapidly. However, in many applications, such as face-recognition mobile payment and remote video account opening, it is necessary when verifying a face image to determine whether it comes from a living person rather than from a photograph or a recorded video.
Existing face liveness detection techniques fall roughly into two types: silent liveness detection and action-based liveness detection. Action-based liveness detection requires the user to complete specified facial actions, such as opening the mouth or blinking, in front of the camera. On the one hand, these facial actions can easily be reproduced by face-synthesis software, so the security level is not high enough; on the other hand, the required user cooperation makes the user experience extremely poor. Action-based detection is therefore gradually being replaced by silent detection.
Silent liveness detection can be divided into three categories according to the data used: detection based on a single-frame RGB image, detection based on multi-frame images, and detection based on multiple modalities. The single-frame method is simple and efficient, but a static RGB face image is very easy to obtain, and the textures of real and spoofed faces are strongly affected by the environment and the spoofing medium, so the method is very easy to crack and its robustness is low. Researchers subsequently proposed silent liveness detection on multiple frames, which introduces additional information, such as subtle facial movements, for detecting spoofing attacks. This method has a serious disadvantage: when an attacker replays a video of a real person, slight facial movement is also present, and multi-frame liveness detection may fail. Other researchers then proposed improving detection accuracy by introducing data of other modalities, such as depth maps and infrared images. This approach has two defects: it is ineffective against 3D liveness attacks, and it usually requires a special camera to acquire the depth and infrared images, which is expensive to manufacture and hard to popularize in practical application scenarios.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a robust silent liveness detection method based on color mode images.
In order to achieve this purpose, the invention adopts the following technical scheme:
a robust silent liveness detection method based on color mode images, comprising the following steps:
s1, shooting a human face photo under the light supplement of high-energy visible light;
s2, detecting the specular component of the photo shot in step S1, and acquiring a color mode image, a saturation image and a brightness image;
s3, carrying out face detection and secondary cutting on the color mode image acquired in the step S2;
s4, binarizing the color mode image cut in the step S3, and taking a maximum connected region;
s5, combining the color mode image processed in the step S4 with the unprocessed saturation image and the luminance image to obtain a three-channel image;
s6, building a CMNet network model, and training the CMNet network model by using the three-channel image obtained in the step S5;
and S7, performing silent live body detection by adopting the trained CMNet network model.
The invention has the following beneficial effects. The specular reflection component of the image is used to distinguish real faces from spoofed ones, and only a single RGB image of the face under monochromatic illumination is required; liveness detection is solved without resorting to multi-frame images or data of other modalities. This avoids the poor user experience and the vulnerability to replay-attack spoofing of multi-frame methods, while providing far better robustness than ordinary single-RGB-image silent liveness detection, so performance is preserved along with user experience, and the detection of unknown spoofing attacks is better than with existing liveness detection algorithms. The method also removes the need for the expensive peripherals required by multi-modal silent liveness detection, and is easy to deploy in existing face recognition systems.
Preferably, step S2 specifically includes:
the photograph taken in step S1 is transferred from the RGB color space to the HSV space, the color and illumination of the image are separated, and a color mode image, a saturation image, and a brightness image are acquired.
The preferred scheme has the following beneficial effects: compared with the RGB color space, which represents an image by three primary colors, the HSV space decomposes the image into an H component representing color, an S component representing saturation and a V component representing brightness, making the color component of the image easier to acquire and process.
Preferably, step S3 includes the following substeps:
s31, carrying out face detection on the color mode image;
and S32, extracting key points of the detected face, and secondarily cutting the face by using the coordinates of the key points, so as to reduce the width of the face and keep the height unchanged.
The preferred scheme has the following beneficial effects: the influence caused by interference information, namely the environment around the human face, is reduced.
Preferably, step S4 includes the following substeps:
s41, binarizing the color mode image cropped in step S3 with Otsu's adaptive-threshold method, specifically: first compute the gray-level histogram and probability histogram of the image, then traverse all candidate thresholds t and find the threshold that maximizes the between-class variance, and binarize the color mode image with that threshold;
and S42, acquiring connected regions contained in the color mode image subjected to the binarization processing in the step S41, calculating the areas of all the connected regions, sequencing the connected regions in sequence, reserving the connected region with the largest area, and deleting other connected regions to obtain a binarized color mode image only containing one connected region.
The preferred scheme has the following beneficial effects: binarization removes the specular reflection of the ambient light, leaving only the specular component of the fill light, which further improves the robustness of the color-mode data; computing the connected regions and keeping only the largest one removes the uncontrollable ambient specular components that exist outside the illuminated region. After this step, every specular component except that of the fill light has been eliminated.
Preferably, step S6 includes the following substeps:
s61, extracting features with DenseNet as the backbone network, then classifying the extracted features with a fully connected layer to judge whether the image is real or spoofed;
s62, combining the pixel supervision technique, taking BCELoss and cross-entropy loss as the loss functions, and supervising the training of the model to establish the CMNet network model;
and S63, selecting a stochastic gradient descent optimizer for the training process, inputting the three-channel images obtained in step S5 into the CMNet network model, and training the CMNet network model by transfer learning.
The preferred scheme has the following beneficial effects: DenseNet feeds shallow features forward as input to the deeper layers, so shallow features are well utilized, which suits situations with insufficient data. Compared with the conventional approach of treating liveness detection as plain binary classification, pixel supervision introduces supervision information at the pixel level before the binary classifier: a live photo corresponds to a 14x14 supervision matrix of all ones, and a spoofed photo to a 14x14 supervision matrix of all zeros, which aggregates the features better and helps the binary classifier produce more stable output. BCELoss is a loss function commonly used for classification in neural networks and performs well, and training the network on images processed by the preprocessing of steps S1 to S5 improves the generalization of the model as much as possible when data is scarce.
Drawings
FIG. 1 is a flow chart of a robust silence liveness detection method based on color modality images in accordance with the present invention;
FIG. 2 is a schematic diagram of the imaging results of a real face and a spoofed face under violet fill light in the embodiment of the invention;
FIG. 3 is a schematic diagram of the color mode images of a real face and a spoofed face under violet fill light in the embodiment of the invention;
FIG. 4 is a schematic diagram of key points of a face according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating a color modality image cropping result according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating a result of binarization and maximum connected region of a color mode image according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a CMNet network model structure in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, the present invention provides a robust silence live detection method based on color mode images, comprising the following steps:
s1, shooting a human face photo under the light supplement of high-energy visible light;
Violet light, the visible light with the highest energy, is adopted as additional fill light to irradiate the face, after which the face is detected and a picture is taken. According to the global illumination rendering equation, the radiance of a point in the reflected image is as follows:
L_o(p, ω_o) = L_e(p, ω_o) + ∫_Ω f_r(p, ω_i, ω_o) · L_i(p, ω_i) · cos θ dω_i
wherein L_o(p, ω_o) is the outgoing radiance finally observed at the point, p is the position of the point, ω_o is the outgoing direction, L_e(p, ω_o) is the radiance emitted by the reflecting surface itself, Ω is the hemisphere of incident directions, f_r is the scattering function, L_i is the incident radiance, ω_i is the incident direction, and θ is the angle between the incident direction and the surface normal.
The image I(x, y) can be simply split into the sum of the emission of the reflecting surface itself, the specular reflection component and the diffuse reflection component:
I(x, y) = I_i(x, y) + I_s(x, y) + I_d(x, y)
I_s(x, y) = k_s · E(x, y)
I_d(x, y) = k_d · E(x, y)
E(x, y) = E_a(x, y) + E_i(x, y)
Wherein I_i(x, y) is the emission of the reflecting surface itself: if the surface emits light, as a screen does, I_i(x, y) is the light intensity of the screen; otherwise it is 0. I_s(x, y) is the specular reflection component, determined by the incident light energy E(x, y) and the specular reflection coefficient k_s of the material. I_d(x, y) is the diffuse reflection component, determined by the incident light energy E(x, y) and the diffuse reflection coefficient k_d of the material. The incident light energy E(x, y) is the sum of the ambient light energy E_a(x, y) and the fill-light energy E_i(x, y).
For a spoofed face, such as a printed photo or a screen, the surface is smooth, so the specular component dominates its reflection. The surface of a real human face is comparatively rough, so the diffuse component dominates the reflection of a real face. By detecting the specular component, genuine and spoofed faces can therefore be distinguished. The reflection components are, however, difficult to separate directly, so a compromise is devised: additional fill light is used to highlight the specular reflection, and the specular component of the image is approximated by detecting the specular reflection of the fill light. Moreover, the larger the incident energy E(x, y), the larger the specular component I_s(x, y) and the larger the difference between real and spoofed faces, which is why violet light is selected as the additional fill light. In addition, to guarantee sufficient illumination, the user is required to stay within a limited distance of the camera; the face is then detected automatically and a photo is taken, yielding an RGB image I(x, y). Referring to Fig. 2, which shows the results of photographing real and spoofed faces under violet fill light: the left image is a real face under violet fill light, the right a spoofed face under violet fill light.
S2, detecting the mirror component of the photo shot in the step S1, and acquiring a color mode image, a saturation image and a brightness image;
in the embodiment of the present invention, step S2 specifically comprises converting the photo taken in step S1 from the RGB color space to the HSV color space, separating the color and the illumination of the image, and acquiring the hue-space image, i.e. the color mode image, together with the saturation image and the brightness image. Referring to Fig. 3, which shows the color mode images of a real face (left) and a spoofed face (right): the difference between them is obvious and can be exploited for liveness detection.
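In practice this conversion is a single call in OpenCV (cv2.cvtColor with COLOR_BGR2HSV); the decomposition can also be sketched self-containedly in NumPy as below. The function name and the value ranges chosen here (hue in degrees, S and V in [0, 1]) are illustrative choices, not taken from the patent:

```python
import numpy as np

def split_hsv(rgb: np.ndarray):
    """Split an RGB image (uint8, H x W x 3) into hue, saturation and
    brightness planes -- the patent's color mode (H), S and V images."""
    rgb = rgb.astype(np.float64) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    v = rgb.max(axis=-1)                      # brightness (value)
    c = v - rgb.min(axis=-1)                  # chroma
    s = np.where(v > 0, c / np.where(v > 0, v, 1.0), 0.0)  # saturation
    safe_c = np.where(c > 0, c, 1.0)          # guard against division by zero
    h = np.zeros_like(v)
    h = np.where((c > 0) & (r == v), ((g - b) / safe_c) % 6.0, h)
    h = np.where((c > 0) & (g == v) & (r != v), (b - r) / safe_c + 2.0, h)
    h = np.where((c > 0) & (b == v) & (r != v) & (g != v), (r - g) / safe_c + 4.0, h)
    return h * 60.0, s, v                     # hue in degrees [0, 360)
```

Note that OpenCV's 8-bit HSV scales hue to [0, 179] instead; whichever convention is used, the hue plane is the color mode image that the subsequent steps process.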
S3, carrying out face detection and secondary cutting on the color mode image acquired in the step S2;
in the embodiment of the present invention, step S3 includes the following sub-steps:
s31, carrying out face detection on the color mode image;
because the fill-light region of the photo covers only the face, the environment around the face is interference information; face detection is therefore performed on the captured photo to extract the face.
And S32, extracting key points of the detected face, and secondarily cutting the face by using the coordinates of the key points, so as to reduce the width of the face and keep the height unchanged.
Due to the changeable environment, the size of the face detection box fluctuates greatly, so the detected face is cropped a second time. Considering that the positions of the facial key points are fixed and do not change with the size of the detection box, after the face is detected its key points are extracted with the dlib library; the detection result is shown in Fig. 4. The coordinates of the 1st and 17th key points are then used to narrow the face box, reducing the width of the face while leaving the height unadjusted, which further reduces the influence of the surrounding environment. The original image and the result after secondary cropping are shown in Fig. 5, where the left side is the original color mode image and the right side is the color mode image after face detection and secondary cropping; after the second crop, the noise is clearly reduced.
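A minimal sketch of the secondary crop, assuming landmarks in the dlib 68-point convention (the patent's 1-based key points 1 and 17 are the outermost jaw points, i.e. indices 0 and 16 when zero-based); the box representation is an illustrative choice:

```python
def secondary_crop_box(box, landmarks):
    """Narrow a face detection box horizontally using the outermost jaw
    landmarks; the height (top/bottom) is kept unchanged, as in step S32.

    box       -- (left, top, right, bottom) from the face detector
    landmarks -- list of (x, y) tuples in dlib's 68-point order (0-based)
    """
    left, top, right, bottom = box
    new_left = landmarks[0][0]    # patent's key point 1: leftmost jaw point
    new_right = landmarks[16][0]  # patent's key point 17: rightmost jaw point
    return (new_left, top, new_right, bottom)
```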
S4, binarizing the color mode image cut in the step S3, and taking a maximum connected region;
in the embodiment of the present invention, step S4 includes the following sub-steps:
s41, binarizing the color mode image cropped in step S3 with Otsu's adaptive-threshold method, specifically: first compute the gray-level histogram and probability histogram of the image, then traverse all candidate thresholds t and find the threshold that maximizes the between-class variance, and binarize the color mode image with that threshold. Compared with traditional fixed-threshold binarization, dynamically recomputing the threshold for each image adapts better to the varying conditions of different images.
The incident light energy E(x, y) is the sum of the ambient light energy E_a(x, y) and the fill-light energy E_i(x, y). The fill light is additional and controllable, whereas the ambient light is an uncontrollable quantity affected by the changing environment. Although the ambient light cannot be controlled, its energy is much smaller than that of the stable fill light, so by selecting a threshold and binarizing, the influence of the ambient specular reflection is removed and only the specular component of the fill light remains, further improving the robustness of the color-mode data. Otsu's adaptive-threshold method is adopted so that the threshold is updated dynamically for each image.
And S42, acquiring connected regions contained in the color mode image subjected to the binarization processing in the step S41, calculating the areas of all the connected regions, sequencing the connected regions in sequence, reserving the connected region with the largest area, and deleting other connected regions to obtain a binarized color mode image only containing one connected region.
In addition to the fill-lit region, other regions of the face may also contain uncontrollable ambient specular components, but the area of these disturbances is much smaller than that of the controllable specular component of the fill light. To remove this interference, the connected regions of the binarized color mode image are computed, the largest connected region is retained, and the other environmental interference is removed. Referring to Fig. 6, which shows the results before and after this processing: the left side is the color mode image after secondary cropping, the right side is the image after binarization and retention of the maximum region. After processing, all ambient specular components have been removed except the specular component caused by the fill light.
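The two substeps can be sketched as follows, with a pure-NumPy Otsu threshold and a breadth-first largest-connected-component pass standing in for the usual library calls (cv2.threshold with THRESH_OTSU, and cv2.connectedComponentsWithStats):

```python
import numpy as np
from collections import deque

def otsu_threshold(gray: np.ndarray) -> int:
    """Otsu's method: pick the threshold t maximizing between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0, w1 = prob[:t + 1].sum(), prob[t + 1:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t + 1) * prob[:t + 1]).sum() / w0
        mu1 = (np.arange(t + 1, 256) * prob[t + 1:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def largest_component(binary: np.ndarray) -> np.ndarray:
    """Keep only the largest 4-connected foreground region of a 0/1 mask."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    best = []
    for i in range(h):
        for j in range(w):
            if binary[i, j] and not seen[i, j]:
                comp, q = [], deque([(i, j)])
                seen[i, j] = True
                while q:
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    out = np.zeros_like(binary)
    for y, x in best:
        out[y, x] = 1
    return out
```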
S5, combining the color mode image processed in the step S4 with the unprocessed saturation image and the luminance image to obtain a three-channel image;
when cropping, binarization and retention of the maximum connected region are performed, information is lost. For better liveness detection, the processed color mode image is recombined with the unprocessed S-space and V-space images into a three-channel image, so that information lost in preprocessing is compensated by the information of the other channels.
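The recombination of step S5 is then a simple channel stack; a sketch, with the plane order H, S, V assumed:

```python
import numpy as np

def merge_three_channel(color_mode_bin, saturation, value):
    """Stack the processed (binarized) color-mode plane with the untouched
    saturation and brightness planes into one 3-channel training image."""
    return np.stack([color_mode_bin, saturation, value], axis=-1)
```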
S6, building a CMNet network model, and training the CMNet network model by using the three-channel image obtained in the step S5;
in the embodiment of the present invention, step S6 includes the following sub-steps:
s61, extracting features by taking DenseNet as a backbone network, then classifying by utilizing the extracted features by a full connection layer, and judging whether the image is a real image or a deceptive image;
s62, combining a pixel supervision technology, taking BCELoss and Cross Engine Loss as Loss functions, and supervising the training of the model to establish a CMNet network model;
s63, selecting stochastic gradient descent as the optimizer for the training process, with a learning rate of 0.001 and a momentum of 0.99. A DenseNet121 model pre-trained on ImageNet is first downloaded; the collected color mode images are then processed with the preprocessing described above, the three-channel images obtained in step S5 are input into the CMNet network model, and the CMNet network model is trained by transfer learning.
In this embodiment, a new network model, CMNet, shown in Fig. 7, is proposed for processing color mode images, extracting features and detecting attacks; it uses the classical DenseNet as its backbone network, combined with the recent pixel supervision technique, with BCELoss as the loss. DenseNet is selected as the backbone rather than alternatives such as ResNet because it feeds shallow features forward as input to the deeper layers, so the shallow features are well utilized, which is very suitable when the amount of data is insufficient. Secondly, compared with the traditional approach of treating liveness detection as binary classification, pixel supervision introduces supervision information at the pixel level before the binary classifier: a live photo corresponds to a 14x14 supervision matrix of all ones, and a spoofed photo to a 14x14 supervision matrix of all zeros. BCELoss is a loss function commonly used for classification in neural networks and performs well.
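The supervision scheme can be sketched in NumPy as follows. The 14x14 all-ones/all-zeros targets come from the text above, while the equal weighting of the pixel-wise BCE and the class cross-entropy is an assumption; the patent does not state how the two losses are combined:

```python
import numpy as np

def pixel_target(is_live: bool, size: int = 14) -> np.ndarray:
    """14x14 supervision map: all ones for a live face, all zeros for a spoof."""
    return np.ones((size, size)) if is_live else np.zeros((size, size))

def bce(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Binary cross entropy averaged over the map (the BCELoss of step S62)."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def cmnet_loss(pixel_map, class_probs, is_live, w_pixel=0.5):
    """Combined loss: pixel-level BCE plus cross entropy on the binary class
    probabilities; the 0.5 weighting is illustrative, not from the patent."""
    label = 1 if is_live else 0
    ce = -np.log(max(class_probs[label], 1e-7))
    return w_pixel * bce(pixel_map, pixel_target(is_live)) + (1 - w_pixel) * ce
```

Confident predictions on a live sample drive both terms toward zero; a confidently wrong pixel map or class probability dominates the loss.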
And S7, deploying the trained CMNet network model on a server to perform silent liveness detection.
Referring to the following table, the performance of the invention is compared with several of the latest algorithms:
Training is carried out on data containing paper-print and screen-replay attacks, and testing on an unseen photo-print attack, in order to compare the robustness of the different algorithms to unknown attacks; the algorithm provided by the invention is clearly superior to the others. Its processing speed is also the second fastest of all the algorithms compared, needing only 80 ms per picture on an Intel Core i5-7500 CPU.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention, and that the invention is not limited to the specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and these changes and combinations remain within the scope of the invention.
Claims (4)
1. A robust silent liveness detection method based on color mode images, characterized by comprising the following steps:
S1, shooting a face photo under supplementary high-energy visible light illumination;
S2, detecting the specular component of the photo shot in step S1, and acquiring a color mode image, a saturation image and a brightness image;
s3, carrying out face detection and secondary cutting on the color mode image acquired in the step S2, comprising the following steps:
s31, carrying out face detection on the color mode image;
S32, extracting key points of the detected face, and cutting the face a second time using the key-point coordinates, reducing the width of the face region while keeping the height unchanged;
S4, binarizing the color mode image cut in step S3, and retaining the largest connected region;
S5, combining the color mode image processed in step S4 with the unprocessed saturation image and brightness image to obtain a three-channel image;
s6, building a CMNet network model, and training the CMNet network model by using the three-channel image obtained in the step S5;
S7, performing silent liveness detection using the trained CMNet network model.
2. The robust silent liveness detection method based on color mode images as claimed in claim 1, wherein step S2 specifically comprises:
transferring the photo taken in step S1 from the RGB color space to the HSV color space, separating the color and illumination of the image, and acquiring a color mode image, a saturation image and a brightness image.
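As an illustration only (not part of the claimed method), the color/illumination separation of step S2 can be sketched with the standard-library `colorsys` module: hue serves as the "color mode" channel, with saturation and value (brightness) split off separately:

```python
import colorsys

def split_hsv(rgb_image):
    """Convert an RGB image (nested lists of 0-255 triples) into separate
    hue ("color mode"), saturation, and value (brightness) channels,
    separating color from illumination as in step S2."""
    h_ch, s_ch, v_ch = [], [], []
    for row in rgb_image:
        h_row, s_row, v_row = [], [], []
        for (r, g, b) in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            h_row.append(h)  # hue in [0, 1): the "color mode" channel
            s_row.append(s)  # saturation
            v_row.append(v)  # value / brightness
        h_ch.append(h_row)
        s_ch.append(s_row)
        v_ch.append(v_row)
    return h_ch, s_ch, v_ch

# A pure red pixel: hue 0, full saturation, full brightness.
h, s, v = split_hsv([[(255, 0, 0)]])
```

A production system would use a vectorized conversion (e.g. OpenCV's `cv2.cvtColor` with `COLOR_BGR2HSV`) rather than a per-pixel Python loop; the sketch only shows which HSV component feeds which of the three channels.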
3. The method for robust silence liveness detection based on color modality images as claimed in claim 1, wherein said step S4 comprises the sub-steps of:
S41, performing binarization on the color mode image cut in step S3 using the adaptive-threshold Otsu binarization method: first computing the gray-level histogram and probability histogram of the original image, then traversing all possible thresholds t, finding the threshold at which the between-class variance is maximal, and binarizing the color mode image with that threshold;
S42, acquiring the connected regions contained in the color mode image binarized in step S41, calculating the areas of all connected regions, sorting them, retaining the connected region with the largest area and deleting the others, to obtain a binarized color mode image containing only one connected region.
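As an illustration only (not part of the claimed method), steps S41 and S42 — Otsu thresholding followed by keeping the largest connected region — can be sketched in pure Python:

```python
def otsu_threshold(gray):
    """Adaptive Otsu threshold (step S41): pick the t that maximizes
    the between-class variance of the gray-level histogram."""
    hist = [0] * 256
    n = 0
    for row in gray:
        for px in row:
            hist[px] += 1
            n += 1
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 = sum(hist[:t + 1])          # pixels at or below t
        w1 = n - w0                     # pixels above t
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum(i * hist[i] for i in range(t + 1)) / w0
        mu1 = sum(i * hist[i] for i in range(t + 1, 256)) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2  # between-class variance (scaled)
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def largest_component(binary):
    """Keep only the largest 4-connected region of 1s (step S42)."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                comp, stack = [], [(y, x)]   # flood fill from this seed
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    comp.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    out = [[0] * w for _ in range(h)]
    for cy, cx in best:
        out[cy][cx] = 1
    return out
```

In practice these operations map directly onto OpenCV's `cv2.threshold` with the `THRESH_OTSU` flag and `cv2.connectedComponentsWithStats`; the sketch spells out what those calls compute.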
4. The method according to claim 3, wherein the step S6 includes the following sub-steps:
S61, extracting features using DenseNet121 as the backbone network, then classifying with the extracted features through a fully connected layer to judge whether the image is a real image or a spoofed image;
S62, combining the pixel supervision technique, taking BCELoss and Cross Entropy Loss as the loss functions, and supervising the training of the model to establish the CMNet network model;
S63, selecting a stochastic gradient descent optimizer as the optimizer in the training process, inputting the three-channel image obtained in step S5 into the CMNet network model, and training the CMNet network model by means of transfer learning.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110116023.9A CN112766205B (en) | 2021-01-28 | 2021-01-28 | Robustness silence living body detection method based on color mode image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112766205A CN112766205A (en) | 2021-05-07 |
CN112766205B true CN112766205B (en) | 2022-02-11 |
Family
ID=75706367
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110116023.9A Expired - Fee Related CN112766205B (en) | 2021-01-28 | 2021-01-28 | Robustness silence living body detection method based on color mode image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112766205B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101059836A (en) * | 2007-06-01 | 2007-10-24 | 华南理工大学 | Human eye positioning and human eye state recognition method |
CN101976352A (en) * | 2010-10-29 | 2011-02-16 | 上海交通大学 | Various illumination face identification method based on small sample emulating and sparse expression |
CN108197534A (en) * | 2017-12-19 | 2018-06-22 | 迈巨(深圳)科技有限公司 | A kind of head part's attitude detecting method, electronic equipment and storage medium |
CN111160257A (en) * | 2019-12-30 | 2020-05-15 | 河南中原大数据研究院有限公司 | Monocular human face in-vivo detection method stable to illumination transformation |
CN111932540A (en) * | 2020-10-14 | 2020-11-13 | 北京信诺卫康科技有限公司 | CT image contrast characteristic learning method for clinical typing of new coronary pneumonia |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107590430A (en) * | 2017-07-26 | 2018-01-16 | 百度在线网络技术(北京)有限公司 | Biopsy method, device, equipment and storage medium |
Non-Patent Citations (2)
Title |
---|
Automatic Face Detection Using Color Based Segmentation; Yogesh Tayal et al.; International Journal of Scientific and Research Publications; 2012-06-30; Vol. 2, No. 6; pp. 1-7 *
Lightweight face liveness detection method based on multi-modal feature fusion; Pi Jiatian et al.; Journal of Computer Applications; 2020-12-10; Vol. 40, No. 12; pp. 3658-3665 *
Also Published As
Publication number | Publication date |
---|---|
CN112766205A (en) | 2021-05-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Matern et al. | Gradient-based illumination description for image forgery detection | |
WO2019134536A1 (en) | Neural network model-based human face living body detection | |
JP4755202B2 (en) | Face feature detection method | |
US8582875B2 (en) | Method for skin tone detection | |
CN111460931A (en) | Face spoofing detection method and system based on color channel difference image characteristics | |
CN109948566B (en) | Double-flow face anti-fraud detection method based on weight fusion and feature selection | |
CN112861671B (en) | Method for identifying deeply forged face image and video | |
CN112052830B (en) | Method, device and computer storage medium for face detection | |
CN115131880A (en) | Multi-scale attention fusion double-supervision human face in-vivo detection method | |
Tao et al. | Smoke vehicle detection based on robust codebook model and robust volume local binary count patterns | |
CN111882525A (en) | Image reproduction detection method based on LBP watermark characteristics and fine-grained identification | |
Zaidan et al. | A new hybrid module for skin detector using fuzzy inference system structure and explicit rules | |
CN112766205B (en) | Robustness silence living body detection method based on color mode image | |
CN112016437A (en) | Living body detection method based on face video key frame | |
Hadiprakoso | Face anti-spoofing method with blinking eye and hsv texture analysis | |
JP3962517B2 (en) | Face detection method and apparatus, and computer-readable medium | |
Alharbi et al. | Spoofing Face Detection Using Novel Edge-Net Autoencoder for Security. | |
Chang et al. | Image Forgery Using An Enhanced Bayesian Matting Algorithm | |
CN109961025B (en) | True and false face identification and detection method and detection system based on image skewness | |
CN116012248B (en) | Image processing method, device, computer equipment and computer storage medium | |
WO2024025134A1 (en) | A system and method for real time optical illusion photography | |
CN112818782B (en) | Generalized silence living body detection method based on medium sensing | |
Neves et al. | GAN Fingerprints in Face Image Synthesis | |
CN117541969B (en) | Pornography video detection method based on semantics and image enhancement | |
Fang et al. | Studies Advanced in Robust Face Recognition under Complex Light Intensity |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20220211 |