CN112766205B - Robust silent liveness detection method based on color-mode images - Google Patents

Robust silent liveness detection method based on color-mode images

Info

Publication number
CN112766205B
CN112766205B (application CN202110116023.9A)
Authority
CN
China
Prior art keywords
image
color mode
color
face
mode image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN202110116023.9A
Other languages
Chinese (zh)
Other versions
CN112766205A (en)
Inventor
骆春波
韦仕才
罗杨
张赟疆
徐加朗
濮希同
许燕
彭涛
刘翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202110116023.9A
Publication of CN112766205A
Application granted
Publication of CN112766205B
Expired - Fee Related
Anticipated expiration

Classifications

    • G06V 40/161 — Human faces: detection; localisation; normalisation
    • G06V 40/168 — Human faces: feature extraction; face representation
    • G06F 18/214 — Pattern recognition: generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/24 — Pattern recognition: classification techniques
    • G06N 3/08 — Neural networks: learning methods
    • G06T 7/136 — Image analysis: segmentation; edge detection involving thresholding
    • G06T 7/187 — Image analysis: segmentation involving region growing, region merging or connected component labelling
    • G06T 7/90 — Image analysis: determination of colour characteristics
    • G06T 2207/20081 — Training; learning
    • G06T 2207/20084 — Artificial neural networks [ANN]
    • G06T 2207/20132 — Image cropping
    • G06T 2207/30201 — Face


Abstract

The invention discloses a robust silent liveness detection method based on color-mode images. The method comprises: photographing a human face under high-energy visible fill light; deriving a color-mode (hue) image, a saturation image, and a brightness image from the photograph; performing face detection and secondary cropping on the color-mode image; binarizing the cropped color-mode image and keeping the largest connected region; recombining the processed color-mode image with the unprocessed saturation and brightness images into a three-channel image; building a CMNet network model and training it with the three-channel images; and performing silent liveness detection with the trained CMNet model. The invention addresses the inability of the prior art to handle unknown liveness attacks effectively, as well as its high cost: it distinguishes real faces from spoofed faces by the specular reflection component in the image, and is both simple and efficient.

Description

Robust silent liveness detection method based on color-mode images
Technical Field
The invention relates to the field of face recognition, and in particular to a robust silent liveness detection method based on color-mode images.
Background
In recent years, face recognition technology has developed rapidly. However, in many applications, such as face-recognition mobile payment and video-witness account opening, it is necessary to determine, when verifying a face image, whether it comes from a living person or from a photograph or a recorded video.
Existing face liveness-judgment techniques can be roughly divided into two types: silent liveness detection and action-based liveness detection. Action-based liveness detection requires the user to complete specified facial actions, such as opening the mouth or blinking, in front of the camera. However, these facial actions can also easily be reproduced by face-synthesis software, so the security level is not high enough; moreover, because it requires user cooperation, the user experience is poor. It is therefore gradually being replaced by silent liveness detection.
Silent liveness detection can be divided into three categories according to the data used: detection based on a single-frame RGB image, detection based on multi-frame images, and detection based on multiple modalities. The single-frame method is simple and efficient, but because a static RGB face image is very easy to obtain, and the textures of real and spoofed faces are strongly affected by the environment and the spoofing medium, it is easy to crack and its robustness is low. Researchers subsequently proposed using multiple frames for silent liveness detection, which introduces additional information, such as subtle facial movements, to detect spoofing attacks. This approach has a serious drawback, however: when an attacker replays a video of a real person, subtle facial movements are also present, and multi-frame liveness detection may fail. Other researchers therefore proposed introducing data of other modalities, such as depth maps and infrared images, to improve detection accuracy. This approach has two shortcomings: it is ineffective against 3D liveness attacks, and it usually requires a dedicated camera to acquire the depth and infrared maps, which is expensive and hard to popularize in practical application scenarios.
Disclosure of Invention
To overcome the above defects in the prior art, the invention provides a robust silent liveness detection method based on color-mode images.
In order to achieve the purpose of the invention, the invention adopts the technical scheme that:
A robust silent liveness detection method based on color-mode images comprises the following steps:
S1, photographing a human face under high-energy visible fill light;
S2, detecting the specular component of the photograph taken in step S1, and acquiring a color-mode image, a saturation image, and a brightness image;
S3, performing face detection and secondary cropping on the color-mode image acquired in step S2;
S4, binarizing the color-mode image cropped in step S3, and keeping the largest connected region;
S5, combining the color-mode image processed in step S4 with the unprocessed saturation and brightness images to obtain a three-channel image;
S6, building a CMNet network model, and training it with the three-channel images obtained in step S5;
S7, performing silent liveness detection with the trained CMNet network model.
The invention has the following beneficial effects: it distinguishes real faces from spoofed faces by the specular reflection component of the image, and needs only a single RGB image of the face taken under monochromatic illumination. Liveness detection is thus achieved without multi-frame images or data from other modalities, avoiding the poor user experience and vulnerability to replay attacks of multi-frame methods, while providing robustness far beyond that of ordinary single-RGB-image silent liveness detection. Performance is therefore preserved together with user experience, and the detection of unknown spoofing attacks is better than with existing liveness detection algorithms. At the same time, the method avoids the expensive peripherals required by multi-modality silent liveness detection and is easy to deploy in existing face recognition systems.
Preferably, step S2 specifically includes:
the photograph taken in step S1 is transferred from the RGB color space to the HSV space, the color and illumination of the image are separated, and a color mode image, a saturation image, and a brightness image are acquired.
The preferred scheme has the following beneficial effects: compared with the RGB color space, which represents an image by three primary colors, the HSV space decomposes the image into an H component representing color, an S component representing saturation, and a V component representing brightness, so the color component of the image can be acquired and processed more conveniently.
Preferably, step S3 includes the following substeps:
S31, performing face detection on the color-mode image;
S32, extracting key points from the detected face and cropping the face a second time using the key-point coordinates, reducing the face width while keeping the height unchanged.
The preferred scheme has the following beneficial effects: it reduces the influence of interference information, namely the environment around the face.
Preferably, step S4 includes the following substeps:
S41, binarizing the color-mode image cropped in step S3 with the adaptive-threshold Otsu method, specifically: first computing the gray-level histogram and probability histogram of the original image, then traversing all possible thresholds t, finding the threshold that maximizes the between-class variance, and binarizing the color-mode image with that threshold;
S42, extracting the connected regions contained in the color-mode image binarized in step S41, computing the area of every connected region, sorting them, keeping the region with the largest area, and deleting all the others, to obtain a binarized color-mode image containing exactly one connected region.
The preferred scheme has the following beneficial effects: binarization removes the specular reflection of the ambient light and leaves only the specular component of the irradiation light, further improving the robustness of the color-mode data; extracting the connected regions and keeping only the largest one removes the uncontrollable ambient specular components present outside the illuminated region. After this step, all ambient specular components have been removed, leaving only the specular component of the irradiation light.
Preferably, step S6 includes the following substeps:
S61, extracting features with DenseNet as the backbone network, then classifying the extracted features with a fully connected layer to judge whether the image is real or spoofed;
S62, combining a pixel-supervision technique and using BCELoss and Cross Entropy Loss as loss functions to supervise the training of the model, thereby establishing the CMNet network model;
S63, selecting a stochastic gradient descent optimizer for training, inputting the three-channel images obtained in step S5 into the CMNet network model, and training it by transfer learning.
The preferred scheme has the following beneficial effects: DenseNet feeds shallow features forward as input to higher layers, so shallow features are well exploited, which suits situations where data are insufficient. Compared with the conventional treatment of liveness detection as plain binary classification, pixel supervision introduces pixel-level supervision before the binary classifier: a live photo corresponds to a 14x14 supervision matrix of all 1s, and a spoofed photo to a 14x14 matrix of all 0s, which aggregates features better and helps the binary classifier produce more stable output. BCELoss is a loss function commonly used for neural-network classification and works well here. Training the network on images preprocessed as described in steps S1 to S5 improves the generalization of the model as much as possible when data are insufficient.
Drawings
FIG. 1 is a flow chart of the robust silent liveness detection method based on color-mode images according to the present invention;
FIG. 2 is a schematic diagram of the imaging results of a real face and a spoofed face under purple fill light in an embodiment of the invention;
FIG. 3 is a schematic diagram of the color-mode images of a real face and a spoofed face under purple fill light in an embodiment of the invention;
FIG. 4 is a schematic diagram of face key points in an embodiment of the invention;
FIG. 5 is a diagram illustrating the color-mode image cropping result in an embodiment of the invention;
FIG. 6 is a diagram illustrating the result of binarizing the color-mode image and keeping the largest connected region in an embodiment of the invention;
FIG. 7 is a schematic diagram of the CMNet network model structure in an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, the present invention provides a robust silent liveness detection method based on color-mode images, comprising the following steps:
S1, photographing a human face under high-energy visible fill light;
and (3) adopting purple light with the strongest energy as additional supplementary light to irradiate the human face, and then detecting the human face and taking a picture. According to the global illumination rendering equation, the energy intensity of a certain point in the reflection image is as follows:
L_o(p, ω_o) = L_e(p, ω_o) + ∫_Ω f_r(p, ω_i, ω_o) L_i(p, ω_i) cos θ dω_i

where L_o(p, ω_o) is the radiance finally observed at the point, p is the position of the point, ω_o is the outgoing direction, L_e(p, ω_o) is the emitted radiance of the reflecting surface, Ω is the hemisphere of incident directions, f_r is the scattering (BRDF) function, L_i is the incident radiance, ω_i is the incident direction, and θ is the angle between the incident direction and the surface normal.
The image I(x, y) can be simply split into the sum of the self-illumination of the reflecting surface, the specular component, and the diffuse component:

I(x, y) = I_i(x, y) + I_s(x, y) + I_d(x, y)
I_s(x, y) = k_s · E(x, y)
I_d(x, y) = k_d · E(x, y)
E(x, y) = E_a(x, y) + E_i(x, y)

where I_i(x, y) is the self-illumination of the reflecting surface: if the surface emits light, as a screen does, I_i(x, y) is the light intensity of the screen, otherwise it is 0. I_s(x, y) is the specular reflection component, determined by the incident light energy E(x, y) and the specular reflection coefficient k_s of the material. I_d(x, y) is the diffuse reflection component, determined by E(x, y) and the diffuse reflection coefficient k_d of the material. The incident light energy E(x, y) is the sum of the ambient light energy E_a(x, y) and the irradiation (fill) light energy E_i(x, y).
For spoofed faces such as photos and screens, the surface is smooth, so the specular component dominates the reflection; the surface of a real human face is relatively rough, so the diffuse component dominates the reflection of a real face. By detecting the specular component, real and spoofed faces can therefore be distinguished. The reflection components are difficult to separate, however, so a compromise is adopted: additional fill light is used to highlight the specular component, and the specular component of the image is approximated by detecting the specular reflection of the additional fill light. Since the specular component I_s(x, y) grows with the energy E(x, y), and with it the difference between real and spoofed faces, purple light is chosen as the additional fill light. In addition, to guarantee sufficient illumination, the user is required to stay within a limited distance of the camera; the face is then detected automatically and photographed, giving an RGB image I(x, y). Referring to fig. 2, which shows real and spoofed faces photographed under purple fill light: on the left is a photo of a real face, on the right a photo of a spoofed face.
S2, detecting the specular component of the photograph taken in step S1, and acquiring a color-mode image, a saturation image, and a brightness image;
In the embodiment of the present invention, step S2 specifically comprises converting the photograph taken in step S1 from the RGB color space to the HSV space, separating the color and the illumination of the image, and acquiring the image in the hue space representing color — the color-mode image — together with the saturation image and the brightness image. Referring to fig. 3, which shows the color-mode images of a real face (left) and a spoofed face (right): there is an obvious difference, and liveness detection can exploit it.
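For illustration only, the S2 separation can be sketched in a few lines of OpenCV; this sketch is not part of the patent, and the helper name and BGR input convention are assumptions:

```python
import cv2

def split_hsv(photo_bgr):
    """Step S2 (sketch): convert the captured photo to HSV and split it
    into the hue (color-mode), saturation, and brightness (value) images."""
    hsv = cv2.cvtColor(photo_bgr, cv2.COLOR_BGR2HSV)
    hue, saturation, value = cv2.split(hsv)
    return hue, saturation, value  # color-mode image, S image, V image
```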
S3, performing face detection and secondary cropping on the color-mode image acquired in step S2;
In the embodiment of the present invention, step S3 includes the following sub-steps:
S31, performing face detection on the color-mode image;
Because the fill-light region of the photo covers only the face, the environment around the face is interference information; the face is therefore extracted by running face detection on the captured photo.
S32, extracting key points from the detected face and cropping the face a second time using the key-point coordinates, reducing the face width while keeping the height unchanged.
Because the environment is changeable, the size of the face detection box fluctuates greatly, so the detected face is cropped a second time. Since the positions of the face key points are fixed and do not change with the size of the detection box, after face detection the key points are extracted with the dlib library; the detection result is shown in fig. 4. The coordinates of the 1st and 17th key points are then used to re-crop the face, reducing its width without adjusting its height, which further reduces the influence of the surrounding environment. The original image and the result after the secondary cropping are shown in fig. 5, with the original color-mode image on the left and the color-mode image after face detection and secondary cropping on the right; after the secondary cropping the noise is clearly reduced.
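A minimal sketch of S31–S32 with dlib, which the description names; the 68-landmark model path is an assumption (the file must be obtained separately), and key points 1 and 17 of the description correspond to indices 0 and 16 in dlib's 0-based numbering:

```python
import dlib

detector = dlib.get_frontal_face_detector()
# Assumed path to the standard 68-landmark model, downloaded separately.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def detect_and_recrop(color_mode_img):
    """S31: detect the face; S32: re-crop it using the x-coordinates of
    key points 1 and 17 (jaw contour), narrowing the width while keeping
    the detector box's height unchanged."""
    faces = detector(color_mode_img)
    if not faces:
        return None
    box = faces[0]
    landmarks = predictor(color_mode_img, box)
    x_left = landmarks.part(0).x    # key point 1 (1-based numbering)
    x_right = landmarks.part(16).x  # key point 17 (1-based numbering)
    return color_mode_img[box.top():box.bottom(), x_left:x_right]
```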
S4, binarizing the color-mode image cropped in step S3, and keeping the largest connected region;
In the embodiment of the present invention, step S4 includes the following sub-steps:
S41, binarizing the color-mode image cropped in step S3 with the adaptive-threshold Otsu method, specifically: first computing the gray-level histogram and probability histogram of the original image, then traversing all possible thresholds t, finding the threshold that maximizes the between-class variance, and binarizing the color-mode image with that threshold. Compared with traditional fixed-threshold binarization, dynamically updating the threshold per image adapts better to the varying conditions of different images.
The incident light energy E(x, y) is the sum of the ambient light energy E_a(x, y) and the irradiation light energy E_i(x, y). The irradiation light is the additional fill light, a controllable variable, whereas the ambient light is an uncontrollable quantity affected by a changeable environment. Although the ambient light cannot be controlled, its energy is much smaller than that of the stable irradiation light, so by selecting a threshold and binarizing, the influence of ambient specular reflection can be removed, leaving only the specular component of the irradiation light and further improving the robustness of the color-mode data. The adaptive-threshold Otsu method is adopted so that the threshold is updated dynamically for each image.
S42, extracting the connected regions contained in the color-mode image binarized in step S41, computing the area of every connected region, sorting them, keeping the region with the largest area, and deleting all the others, to obtain a binarized color-mode image containing exactly one connected region.
Besides the illuminated area, other regions of the face may also carry uncontrollable ambient specular components, but these disturbances are much smaller in area than the controllable specular component of the fill light. To remove this interference, the connected regions of the binarized color-mode image are extracted and only the largest one is kept. Referring to fig. 6, which shows the results before and after this processing — the secondarily cropped color-mode image on the left, and the binarized image with only the largest region kept on the right — all ambient specular components have been removed except the specular component caused by the fill light.
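A sketch of steps S41–S42 with OpenCV's Otsu thresholding and connected-component analysis; the function name is an assumption:

```python
import cv2
import numpy as np

def binarize_keep_largest(color_mode_face):
    """S41: Otsu binarization with an adaptive threshold;
    S42: keep only the largest connected white region."""
    _, binary = cv2.threshold(color_mode_face, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    n_labels, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    if n_labels <= 1:                       # nothing but background found
        return binary
    areas = stats[1:, cv2.CC_STAT_AREA]     # label 0 is the background
    largest = 1 + int(np.argmax(areas))
    return np.where(labels == largest, 255, 0).astype(np.uint8)
```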
S5, combining the color-mode image processed in step S4 with the unprocessed saturation and brightness images to obtain a three-channel image;
when clipping and binarization are performed and the maximum connected region is obtained, information loss occurs, and in order to better perform living body detection, the processed color mode image and the unprocessed S-space and V-space images are combined again to obtain a three-channel image. In this way, information lost in the preprocessing can be compensated with information of other channels.
S6, building a CMNet network model, and training it with the three-channel images obtained in step S5;
In the embodiment of the present invention, step S6 includes the following sub-steps:
S61, extracting features with DenseNet as the backbone network, then classifying the extracted features with a fully connected layer to judge whether the image is real or spoofed;
S62, combining a pixel-supervision technique and using BCELoss and Cross Entropy Loss as loss functions to supervise the training of the model, thereby establishing the CMNet network model;
S63, selecting a stochastic gradient descent optimizer for training, with learning rate 0.001 and momentum 0.99. A DenseNet121 model pre-trained on ImageNet is downloaded first; the collected color-mode images are then processed with the preprocessing method described above, the three-channel images obtained in step S5 are input into the CMNet network model, and the model is trained by transfer learning.
In this embodiment, a new network model, CMNet, shown in fig. 7, is proposed for processing color-mode images, extracting features, and detecting attacks; it uses the classical DenseNet as backbone, combined with the recent pixel-supervision technique, with BCELoss as the loss. DenseNet is chosen as the backbone rather than, e.g., ResNet because DenseNet feeds shallow features forward as input to higher layers, so shallow features are well exploited, which is well suited to situations with insufficient data. Second, compared with the conventional treatment of liveness detection as plain binary classification, pixel supervision introduces pixel-level supervision before the binary classifier: a live photo corresponds to a 14x14 supervision matrix of all 1s, and a spoofed photo to a 14x14 matrix of all 0s. BCELoss is a loss function commonly used for neural-network classification and works well here.
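A PyTorch sketch of a CMNet-like model under the stated design (DenseNet121 backbone, a 14x14 pixel-supervision map, a binary real/spoof classifier, BCE plus cross-entropy losses, SGD with learning rate 0.001 and momentum 0.99); the head layer sizes are assumptions, since the patent does not give them:

```python
import torch
import torch.nn as nn
from torchvision import models

class CMNet(nn.Module):
    """Sketch of CMNet: DenseNet121 features (ImageNet-pretrained, used by
    transfer learning), a pixel-supervision head predicting a 14x14 map
    (all 1s for live, all 0s for spoof), and a real/spoof classifier."""
    def __init__(self):
        super().__init__()
        backbone = models.densenet121(weights="IMAGENET1K_V1")
        self.features = backbone.features      # B x 1024 x 7 x 7 for 224x224 input
        self.pixel_head = nn.Sequential(       # 7x7 -> 14x14 supervision map
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(1024, 1, kernel_size=1),
            nn.Sigmoid(),
        )
        self.classifier = nn.Linear(1024, 2)   # real vs. spoof

    def forward(self, x):
        f = nn.functional.relu(self.features(x))
        pixel_map = self.pixel_head(f).squeeze(1)                      # B x 14 x 14
        pooled = torch.flatten(nn.functional.adaptive_avg_pool2d(f, 1), 1)
        return pixel_map, self.classifier(pooled)

# Supervision as described: BCE on the 14x14 map, cross-entropy on the
# binary output, SGD with learning rate 0.001 and momentum 0.99.
model = CMNet()
bce_loss, ce_loss = nn.BCELoss(), nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.99)
```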
S7, deploying the trained CMNet network model on a server and performing silent liveness detection.
The table below compares the performance of the present invention with some of the latest algorithms:

[Table: performance comparison of the proposed method with recent liveness detection algorithms; the original table is an image in the patent publication.]

The models are trained on data of paper-print and screen-replay attacks and then tested on an unseen photo-print attack to compare the robustness of the different algorithms to unknown attacks; the algorithm proposed by the invention is clearly superior to the others. Its processing speed is the second fastest of all the algorithms compared: only 80 ms is needed to process one picture on an Intel Core i5-7500 CPU.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to help the reader understand the principles of the invention, and that the invention is not limited to the specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and these changes and combinations remain within the scope of the invention.

Claims (4)

1. A robust silent liveness detection method based on color-mode images, characterized by comprising the following steps:
S1, photographing a human face under high-energy visible fill light;
S2, detecting the specular component of the photograph taken in step S1, and acquiring a color-mode image, a saturation image, and a brightness image;
S3, performing face detection and secondary cropping on the color-mode image acquired in step S2, comprising the following steps:
S31, performing face detection on the color-mode image;
S32, extracting key points from the detected face, cropping the face a second time using the key-point coordinates, reducing the face width and keeping the height unchanged;
S4, binarizing the color-mode image cropped in step S3, and keeping the largest connected region;
S5, combining the color-mode image processed in step S4 with the unprocessed saturation and brightness images to obtain a three-channel image;
S6, building a CMNet network model, and training it with the three-channel images obtained in step S5;
S7, performing silent liveness detection with the trained CMNet network model.
2. The robust silent liveness detection method based on color-mode images according to claim 1, wherein step S2 specifically comprises:
converting the photograph taken in step S1 from the RGB color space to the HSV color space, separating the color and the illumination of the image, and acquiring a color-mode image, a saturation image, and a brightness image.
3. The robust silent liveness detection method based on color-mode images according to claim 1, wherein step S4 comprises the following sub-steps:
S41, binarizing the color-mode image cropped in step S3 with the adaptive-threshold Otsu method, specifically: first computing the gray-level histogram and probability histogram of the original image, then traversing all possible thresholds t, finding the threshold that maximizes the between-class variance, and binarizing the color-mode image with that threshold;
S42, extracting the connected regions contained in the color-mode image binarized in step S41, computing the area of every connected region, sorting them, keeping the region with the largest area, and deleting all the others, to obtain a binarized color-mode image containing exactly one connected region.
4. The method according to claim 3, wherein step S6 comprises the following sub-steps:
S61, extracting features with DenseNet121 as the backbone network, then classifying the extracted features with a fully connected layer to judge whether the image is real or spoofed;
S62, combining a pixel-supervision technique and using BCELoss and Cross Entropy Loss as loss functions to supervise the training of the model, thereby establishing the CMNet network model;
S63, selecting a stochastic gradient descent optimizer for training, inputting the three-channel images obtained in step S5 into the CMNet network model, and training it by transfer learning.
CN202110116023.9A 2021-01-28 2021-01-28 Robust silent liveness detection method based on color-mode images Expired - Fee Related CN112766205B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110116023.9A CN112766205B (en) Robust silent liveness detection method based on color-mode images

Publications (2)

Publication Number Publication Date
CN112766205A (en) 2021-05-07
CN112766205B (en) 2022-02-11

Family

ID=75706367

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110116023.9A Expired - Fee Related CN112766205B (en) 2021-01-28 2021-01-28 Robust silent liveness detection method based on color-mode images

Country Status (1)

Country Link
CN (1) CN112766205B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101059836A (en) * 2007-06-01 2007-10-24 华南理工大学 Human eye positioning and human eye state recognition method
CN101976352A (en) * 2010-10-29 2011-02-16 上海交通大学 Various illumination face identification method based on small sample emulating and sparse expression
CN108197534A (en) * 2017-12-19 2018-06-22 迈巨(深圳)科技有限公司 A kind of head part's attitude detecting method, electronic equipment and storage medium
CN111160257A (en) * 2019-12-30 2020-05-15 河南中原大数据研究院有限公司 Monocular human face in-vivo detection method stable to illumination transformation
CN111932540A (en) * 2020-10-14 2020-11-13 北京信诺卫康科技有限公司 CT image contrast characteristic learning method for clinical typing of new coronary pneumonia

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107590430A (en) * 2017-07-26 2018-01-16 百度在线网络技术(北京)有限公司 Biopsy method, device, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Automatic Face Detection Using Color Based Segmentation; Yogesh Tayal et al.; International Journal of Scientific and Research Publications; 2012-06-30; Vol. 2, No. 6; pp. 1-7 *
Lightweight face liveness detection method based on multi-modal feature fusion; 皮家甜 et al.; Journal of Computer Applications (《计算机应用》); 2020-12-10; Vol. 40, No. 12; pp. 3658-3665 *

Also Published As

Publication number Publication date
CN112766205A (en) 2021-05-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220211