WO2022247539A1 - Living body detection method, estimation network processing method, device, computer device and computer-readable instruction product


Info

Publication number
WO2022247539A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
training
features
network
variance
Application number
PCT/CN2022/088444
Other languages
English (en)
French (fr)
Inventor
姚太平
张克越
丁守鸿
李季檩
Original Assignee
Tencent Technology (Shenzhen) Company Limited (腾讯科技(深圳)有限公司)
Application filed by Tencent Technology (Shenzhen) Company Limited (腾讯科技(深圳)有限公司)
Priority to US 17/993,246 (published as US20230082906A1)
Publication of WO2022247539A1


Classifications

    • G06V40/40 Spoof detection, e.g. liveness detection
    • G06V10/82 Image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/24155 Bayesian classification
    • G06F18/24323 Tree-organised classifiers
    • G06N3/045 Combinations of networks
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06V10/247 Aligning, centring, orientation detection or correction of the image by affine transforms, e.g. correction due to perspective effects
    • G06V10/32 Normalisation of the pattern dimensions
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/764 Image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/12 Fingerprints or palmprints
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/171 Local features and components; Occluding parts, e.g. glasses; Geometrical relationships
    • G06V40/45 Detection of the body part being alive
    • G06V40/50 Maintenance of biometric data or enrolment thereof

Definitions

  • the present application relates to the technical field of artificial intelligence, in particular to a living body detection method, an estimation network processing method, a device, a computer device, a storage medium and a computer-readable instruction product.
  • living body detection technology has emerged, which is widely used in remote banking services, face payment and access control systems.
  • the detection model trained with sample images is used to perform living body detection on the image to be tested, so as to determine whether the image to be tested is a living body image.
  • these images to be tested and the sample images used in the training process have differences in domain information such as faces, lighting, background, and attack types; that is, the data distributions of the real images to be tested and the sample images are different. Therefore, the generalization ability of the detection model is insufficient, and when the detection model is used for living body detection, the accuracy of the detection result is low.
  • a living body detection method, an estimation network processing method, a device, a computer device, a storage medium, and a computer-readable instruction product are provided.
  • a living body detection method performed by a computer device, said method comprising:
  • extracting image features from images to be tested in different data domains;
  • performing convolution processing on the image features through an estimation network to obtain a predicted mean and a predicted variance of the image features;
  • acquiring network parameters used for standardization in the estimation network;
  • standardizing the image features based on the predicted mean, the predicted variance and the network parameters to obtain standardized features; and
  • determining whether the image to be tested is a living body image according to the living body classification probability obtained by classifying the image to be tested based on the standardized features.
  • a living body detection device comprising:
  • a feature extraction module is used to extract image features from images to be tested in different data domains
  • the convolution processing module is used to perform convolution processing on the image features through the estimation network to obtain the predicted mean value and predicted variance of the image features;
  • An acquisition module configured to acquire network parameters used for standardization in the estimation network
  • a standardization processing module configured to standardize the image features based on the predicted mean value, the predicted variance and the network parameters, to obtain standardized features
  • a determining module configured to determine whether the image to be tested is a living body image according to the living body classification probability obtained by classifying the image to be tested based on the standardized features.
  • a computer device comprising a memory and a processor, the memory stores computer-readable instructions, and the processor implements the following steps when executing the computer-readable instructions:
  • extracting image features from images to be tested in different data domains; performing convolution processing on the image features through an estimation network to obtain a predicted mean and a predicted variance of the image features; acquiring network parameters used for standardization in the estimation network; standardizing the image features based on the predicted mean, the predicted variance and the network parameters to obtain standardized features; and determining whether the image to be tested is a living body image according to the living body classification probability obtained by classifying the image to be tested based on the standardized features.
  • a computer-readable storage medium on which computer-readable instructions are stored, and when the computer-readable instructions are executed by a processor, the following steps are implemented:
  • extracting image features from images to be tested in different data domains; performing convolution processing on the image features through an estimation network to obtain a predicted mean and a predicted variance of the image features; acquiring network parameters used for standardization in the estimation network; standardizing the image features based on the predicted mean, the predicted variance and the network parameters to obtain standardized features; and determining whether the image to be tested is a living body image according to the living body classification probability obtained by classifying the image to be tested based on the standardized features.
  • a computer-readable instruction product including computer-readable instructions, the computer-readable instructions implement the following steps when executed by a processor:
  • extracting image features from images to be tested in different data domains; performing convolution processing on the image features through an estimation network to obtain a predicted mean and a predicted variance of the image features; acquiring network parameters used for standardization in the estimation network; standardizing the image features based on the predicted mean, the predicted variance and the network parameters to obtain standardized features; and determining whether the image to be tested is a living body image according to the living body classification probability obtained by classifying the image to be tested based on the standardized features.
  • An estimation network processing method for liveness detection executed by a computer device, the method comprising:
  • Feature extraction is performed on sample images in different data domains to obtain training image features;
  • convolution processing is performed on the training image features through the estimation network before training to obtain a training prediction mean and a training prediction variance of the training image features;
  • an estimated loss value is determined based on the training prediction mean and the statistical mean of the training image features, as well as the training prediction variance and the statistical variance of the training image features, and the network parameters of the estimation network before training are adjusted based on the estimated loss value;
  • the estimation network after parameter adjustment is used to determine the predicted mean and predicted variance of the image features of an image to be tested, so that the image features are standardized based on the predicted mean, the predicted variance and the network parameters used for standardization in the estimation network, and whether the image to be tested is a living body image is determined according to the obtained standardized features.
  • An estimation network processing device for living body detection comprising:
  • a feature extraction module is used to extract features from sample images in different data domains to obtain training image features
  • the convolution processing module is used to perform convolution processing on the training image features through the estimation network before training to obtain the training prediction mean value and training prediction variance of the training image features;
  • a normalization processing module configured to determine an estimated loss value based on the training prediction mean and the statistical mean of the training image features, as well as the training prediction variance and the statistical variance of the training image features;
  • the determination module is used to adjust the network parameters of the estimation network before training based on the estimated loss value; the estimation network after parameter adjustment is used to determine the predicted mean and predicted variance of the image features of an image to be tested, so that the image features are standardized based on the predicted mean, the predicted variance and the network parameters used for standardization in the estimation network, and whether the image to be tested is a living body image is determined according to the obtained standardized features.
  • a computer device comprising a memory and a processor, the memory stores computer-readable instructions, and the processor implements the following steps when executing the computer-readable instructions:
  • Feature extraction is performed on sample images in different data domains to obtain training image features;
  • convolution processing is performed on the training image features through the estimation network before training to obtain a training prediction mean and a training prediction variance of the training image features;
  • an estimated loss value is determined based on the training prediction mean and the statistical mean of the training image features, as well as the training prediction variance and the statistical variance of the training image features, and the network parameters of the estimation network before training are adjusted based on the estimated loss value;
  • the estimation network after parameter adjustment is used to determine the predicted mean and predicted variance of the image features of an image to be tested, so that the image features are standardized based on the predicted mean, the predicted variance and the network parameters used for standardization in the estimation network, and whether the image to be tested is a living body image is determined according to the obtained standardized features.
  • a computer-readable storage medium on which computer-readable instructions are stored, and when the computer-readable instructions are executed by a processor, the following steps are implemented:
  • Feature extraction is performed on sample images in different data domains to obtain training image features;
  • convolution processing is performed on the training image features through the estimation network before training to obtain a training prediction mean and a training prediction variance of the training image features;
  • an estimated loss value is determined based on the training prediction mean and the statistical mean of the training image features, as well as the training prediction variance and the statistical variance of the training image features, and the network parameters of the estimation network before training are adjusted based on the estimated loss value;
  • the estimation network after parameter adjustment is used to determine the predicted mean and predicted variance of the image features of an image to be tested, so that the image features are standardized based on the predicted mean, the predicted variance and the network parameters used for standardization in the estimation network, and whether the image to be tested is a living body image is determined according to the obtained standardized features.
  • a computer-readable instruction product including computer-readable instructions, the computer-readable instructions implement the following steps when executed by a processor:
  • Feature extraction is performed on sample images in different data domains to obtain training image features;
  • convolution processing is performed on the training image features through the estimation network before training to obtain a training prediction mean and a training prediction variance of the training image features;
  • an estimated loss value is determined based on the training prediction mean and the statistical mean of the training image features, as well as the training prediction variance and the statistical variance of the training image features, and the network parameters of the estimation network before training are adjusted based on the estimated loss value;
  • the estimation network after parameter adjustment is used to determine the predicted mean and predicted variance of the image features of an image to be tested, so that the image features are standardized based on the predicted mean, the predicted variance and the network parameters used for standardization in the estimation network, and whether the image to be tested is a living body image is determined according to the obtained standardized features.
  • Fig. 1 is the application environment diagram of living body detection method in an embodiment
  • Fig. 2 is a schematic flow chart of a living body detection method in an embodiment
  • Figure 3 is a schematic diagram of an estimation network in one embodiment
  • Fig. 4 is a schematic structural diagram of a living body detection method in an embodiment
  • Fig. 5 is a schematic diagram of the principle of standardization processing in an embodiment
  • Figure 6 is a schematic diagram of the application of the living body detection method in an embodiment
  • Figure 7 is a schematic diagram of the application of the living body detection method in another embodiment
  • Fig. 8 is a schematic diagram of the application of the living body detection method in another embodiment
  • FIG. 9 is a timing diagram of a living body detection method in an embodiment
  • Fig. 10 is a schematic flow chart of a living body detection method in another embodiment
  • Figure 11 is a schematic structural diagram of a living body detection method in an embodiment
  • Fig. 12a is a schematic diagram of a target area in a sample image in one embodiment
  • Fig. 12b is a schematic diagram of an enlarged target area in a sample image in an embodiment
  • Fig. 12c is a schematic diagram of a sample image after deleting the background area in an embodiment
  • Fig. 12d is a schematic diagram of a depth map label of a sample image in an embodiment
  • Fig. 13 is a schematic diagram of an estimation network processing method for liveness detection in an embodiment
  • Fig. 14 is a structural block diagram of a living body detection device in an embodiment
  • Fig. 15 is a structural block diagram of a living body detection device in another embodiment
  • Fig. 16 is a structural block diagram of an estimation network processing device for live body detection in an embodiment
  • Fig. 17 is a structural block diagram of an estimation network processing device for liveness detection in another embodiment
  • Figure 18 is a diagram of the internal structure of a computer device in one embodiment.
  • the living body detection method provided in this application can be applied to the application environment shown in FIG. 1 .
  • the terminal 102 communicates with the service device 104 through the network.
  • the service device 104 extracts image features from images to be tested in different data domains.
  • the image features are convoluted through the estimation network to obtain the predicted mean and predicted variance of the image features.
  • the service device 104 acquires the network parameters used for normalization in the estimated network, standardizes the image features based on the predicted mean, predicted variance and network parameters, obtains standardized features, and determines whether the image to be tested is a live image according to the standardized features.
  • the terminal 102 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc., but is not limited thereto.
  • the service device 104 can be an independent physical server, or a server cluster composed of multiple service nodes in a blockchain system, where the service nodes form a peer-to-peer (P2P) network.
  • the P2P protocol is an application-layer protocol that runs on top of the Transmission Control Protocol (TCP).
  • the service device 104 can also be a server cluster composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN), big data, and artificial intelligence platforms.
  • the service device 104 can also be integrated with an access control system, and the service device 104 can perform liveness detection on the image to be tested in combination with the access control system.
  • the terminal 102 and the service device 104 can be connected through communication methods such as Bluetooth, USB (Universal Serial Bus) or a network, which is not limited in this application.
  • a living body detection method is provided.
  • the application of the method to the service device in FIG. 1 is used as an example for illustration, including the following steps:
  • the service device extracts image features from images to be tested in different data domains.
  • different data domains may refer to different image categories.
  • the images to be tested in different data domains can be: different types of images collected in different application scenarios, such as images collected outdoors under strong light, images collected outdoors under weak light, images collected outdoors at night, and images collected indoors.
  • the image to be tested is an image to be detected, which can be a color image or a grayscale image.
  • the service device performs grayscale processing on the image to be tested.
  • the image to be tested may include a face image of the object to be tested, and may also include gestures or expressions or a background area.
  • the images to be tested may be images collected under various lighting conditions and backgrounds, and may also have various resolutions, sizes, etc., which are not limited in this application.
  • Living body detection is a technology for detecting whether the object contained in the image to be tested is a living body. It is usually used to determine whether the object to be tested in the image is a real, live user, and is mostly used in identity verification scenarios. For example, when a virtual account is registered remotely through a mobile terminal, the service device that manages the virtual account needs to verify that the user requesting to register the virtual account is the genuine person.
  • For example, the image should be a living body image obtained by photographing user A in real time, rather than a non-living-body image obtained by photographing another person's photo stolen by user A, a mannequin, or a mask worn by user A.
  • image features are the characteristics of the image itself that distinguish it from other images, including features that can be perceived intuitively and features that can only be obtained through transformation or processing.
  • Features that can be perceived intuitively include, for example, brightness, edges, texture, contours, and color; features obtained only through transformation or processing include, for example, moments, histograms, and principal components.
  • S202 may specifically include: the service device calculates and obtains image features of the image to be tested by using a feature extraction algorithm.
  • the service device can extract the edge features of the image to be tested from the image to be tested through the Sobel operator, and the service device can also calculate the seventh-order characteristic moment of the image to be tested through the seventh-order moment algorithm.
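  • As an illustration of the feature extraction algorithms mentioned above, the following sketch (not part of the patent) uses OpenCV's Sobel operator for edge features and Hu's seven invariant moments as a stand-in for the seventh-order characteristic moment; the library choice, the placeholder file name, and the Hu-moment reading of "seventh-order moment" are assumptions.

```python
import cv2
import numpy as np

# Minimal sketch of the handcrafted feature extraction mentioned above.
# "face.jpg" is a placeholder; cv2.Sobel / cv2.moments / cv2.HuMoments are
# standard OpenCV calls, and reading the "seventh-order characteristic
# moment" as Hu's seven invariant moments is an assumption.
img = cv2.imread("face.jpg", cv2.IMREAD_GRAYSCALE)

# Edge features via the Sobel operator (horizontal and vertical gradients).
gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)
edge_magnitude = np.sqrt(gx ** 2 + gy ** 2)

# Seven invariant moments of the image as moment-based features.
hu_moments = cv2.HuMoments(cv2.moments(img)).flatten()
```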
  • S202 may specifically include: the service device may further extract image features from the image to be tested through a neural network.
  • the neural network may be, for example, a feedforward neural network, a convolutional neural network, a residual convolutional neural network, a recurrent neural network, and the like.
  • Before feature extraction, at least one preprocessing of grayscale processing, image enhancement processing, and denoising processing may be performed on the image to be tested.
  • the service device can capture the object to be tested in the target environment through an integrated camera to obtain the image to be tested.
  • When the service device is a server, it can establish a communication connection with the terminal, and then receive the image to be tested from the terminal side.
  • the image to be tested may be an image collected by the terminal during the process of registering a virtual account, or an image collected during face payment.
  • the service device performs convolution processing on the image features through the estimation network to obtain the predicted mean and predicted variance of the image features.
  • the estimating network may be a neural network including a global pooling layer and at least two convolutional layers, which is used to predict the mean and variance of image features and the like.
  • the convolution processing refers to performing convolution calculations based on the convolutional layers. It should be pointed out that the feature numbers of the at least two convolutional layers may be different.
  • the service device performs convolution processing on the image features through the first convolutional layer in the estimation network to obtain first convolutional features; then, the first convolutional features are input to the second convolutional layer in the estimation network, which performs convolution on them to obtain second convolutional features, and so on, until the last convolutional layer in the estimation network performs convolution on the convolutional features output by the preceding convolutional layer to obtain the predicted mean and predicted variance.
  • the estimation network includes four convolutional layers, and its feature numbers decrease layer by layer from convolutional layer 1 (conv64) to convolutional layer 4 (conv1), that is, the feature number of convolutional layer 1 is 64, the feature number of convolutional layer 2 (conv32) is 32, the feature number of convolutional layer 3 (conv16) is 16, and the feature number of convolutional layer 4 is 1.
  • the image features are convoluted layer by layer through the convolutional layer 1 to the convolutional layer 4, and the convolutional layer 4 outputs the predicted mean value and predicted variance.
  • S204 may specifically include: the service device may first pool the image features through the estimation network, and then perform convolution processing on the obtained pooled image features to obtain the predicted mean and predicted variance of the image features.
  • performing pooling on the image features may be down-sampling the image features to reduce the amount of data of the image features.
  • Pooling includes global pooling and local pooling, and pooling methods include maximum pooling, average pooling, random pooling, etc.
  • the service device first inputs the image features into a global pooling layer in the estimation network, and performs global pooling on the image features to obtain pooled image features. Then, the service device performs multi-layer convolution calculation on the pooled image features through multiple cascaded convolutional layers to obtain the predicted mean and predicted variance of the image features. For example, as shown in Figure 3, the service device can first input the image features into the global average pooling (GAP) layer for global pooling processing, then input the obtained pooled image features into convolutional layer 1, input the output of convolutional layer 1 into convolutional layer 2, and so on, until the last convolutional layer outputs the predicted mean and predicted variance.
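  • A minimal PyTorch sketch of the estimation network as described above (global average pooling followed by conv64 → conv32 → conv16 → conv1). The layer widths come from the text; the 1×1 kernels, the ReLU activations, the two parallel 16→1 heads that emit the predicted mean and predicted variance, and the Softplus that keeps the variance positive are assumptions, since the text leaves these details unspecified.

```python
import torch
import torch.nn as nn

class EstimationNetwork(nn.Module):
    """Sketch of the estimation network: GAP followed by cascaded
    convolutions whose feature numbers decrease layer by layer."""

    def __init__(self, in_channels: int):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)  # global average pooling (GAP)
        self.trunk = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 16, kernel_size=1), nn.ReLU(inplace=True),
        )
        # Two parallel 16 -> 1 heads: one plausible reading of the single
        # final layer that outputs both statistics.
        self.mean_head = nn.Conv2d(16, 1, kernel_size=1)
        self.var_head = nn.Conv2d(16, 1, kernel_size=1)
        self.softplus = nn.Softplus()  # assumption: keep variance positive

    def forward(self, x: torch.Tensor):
        pooled = self.gap(x)                    # (N, C, 1, 1) pooled features
        h = self.trunk(pooled)
        mean = self.mean_head(h)                # predicted mean
        var = self.softplus(self.var_head(h))   # predicted variance
        return mean, var
```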
  • the service device acquires network parameters used for standardization processing in the estimated network.
  • the network parameters are parameters used in the estimation network for standardizing image features, and the network parameters may be learned by training the estimation network based on sample images.
  • through standardization processing, the service device can convert image features into dimensionless pure values in a specific interval, which facilitates unified processing of image features in different units or orders of magnitude.
  • S208 may specifically include: the service device may use the predicted mean value, predicted variance, and network parameters as parameters in the normalization processing algorithm, and use image features as independent variables of the normalization processing algorithm, so as to calculate the normalization features.
  • the normalization processing algorithm includes but not limited to a straight line method, a broken line method, a curve method and the like.
  • the straight-line method may be, for example, the extreme value method, the standard deviation method, etc.; the broken-line method may be, for example, the tri-line method; and the curve-type method may be, for example, the semi-normal distribution method.
  • the service device determines whether the image to be tested is a living body image according to the living body classification probability obtained by classifying the image to be tested based on standardized features.
  • the living body image may refer to an image containing a living body object. If the image to be tested is a living body image, the object to be tested in it belongs to a living body object; for example, an image of a user captured in real time by a camera is a living body image. If the image to be tested is not a living body image (that is, it is a non-living-body image), the object to be tested in it is not a living body object; for example, an image obtained by photographing a photo of the user, or a mask worn by someone, is not a living body image.
  • S210 may include: the service device may calculate the probability that the image to be tested is a living body image according to the standardized features, and determine that the image to be tested is a living body image when that probability is greater than a probability threshold.
  • S210 may include: the service device inputs the standardized features into the classifier, so that the classifier classifies the image to be tested based on the standardized features to obtain the living body classification probability; when the living body classification probability reaches a preset threshold, the service device determines that the image to be tested is a living body image; when it does not reach the preset threshold, the service device determines that the image to be tested is a non-living-body image. Since the standardized features obtained by standardizing the image features according to the predicted mean and predicted variance output by the estimation network are more accurate, the living body classification probability obtained by the service device is more accurate, which improves the accuracy of living body detection.
  • the classifier may include a binary classification classifier, a multi-classification classifier, a multi-task classifier, etc., and is used to classify the standardized features to obtain a classification result of whether the image is a live image.
  • the classifier may be a classifier constructed based on a decision tree algorithm, a logistic regression algorithm, a naive Bayesian algorithm, or a neural network algorithm.
  • the classifier is trained based on the standardized training features of the sample images.
  • the specific training steps may include: the service device extracts training image features from the sample images, and then standardizes the training image features based on the predicted mean, the predicted variance, and the network parameters used for standardization in the estimation network before training, to obtain standardized training features; then, the standardized training features are input into the classifier before training to obtain the living body classification probability of whether the sample image is a living body image.
  • the service device calculates a classification loss value according to the living body classification probability and the label of the sample image, and adjusts the parameters of the classifier before training according to the classification loss value, so as to obtain the final classifier.
  • the living body classification probability may be a value between [0,1].
  • the preset threshold is a threshold set by the service device according to detection requirements, and the service device can adjust the preset threshold according to actual detection requirements.
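  • A hedged sketch of the classifier training step described above: standardized training features go into a binary classifier, a classification loss is computed from the living body classification probability and the sample label, and the classifier parameters are adjusted. The single linear layer, the feature dimension of 512, the BCE loss, and the Adam optimizer are illustrative assumptions; the text only specifies that a classification loss is computed and used to adjust the classifier.

```python
import torch
import torch.nn as nn

# Hypothetical binary liveness classifier trained on standardized training
# features; architecture, loss and optimizer are assumptions.
classifier = nn.Sequential(nn.Flatten(), nn.Linear(512, 1), nn.Sigmoid())
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-4)

def train_step(std_features: torch.Tensor, labels: torch.Tensor) -> float:
    """std_features: standardized training features; labels: 1=live, 0=spoof."""
    prob = classifier(std_features).squeeze(1)  # living body classification probability in [0, 1]
    loss = criterion(prob, labels.float())      # classification loss value
    optimizer.zero_grad()
    loss.backward()                             # adjust the classifier's parameters
    optimizer.step()
    return loss.item()
```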
  • the service device inputs the obtained image to be tested into the feature extraction network, and the feature extraction network includes multiple cascaded convolutional layers and data normalization layers. Among them, the convolutional layer is used to extract the image features of the image to be tested, and the data normalization layer is used to perform data standardization on the features extracted by the convolutional layer.
  • the data normalization layer includes an estimation network, which performs pooling and convolution calculations on the image features to obtain the predicted mean and predicted variance.
  • the service device first inputs the image to be tested into the convolutional layer of the feature extraction network, performs convolution calculation on it through the convolutional layer, and extracts the image features of the image to be tested. Then, the service device inputs the image features into the estimation network in the data normalization layer; the estimation network pools the image features and performs at least two convolutions on the pooled image features to obtain the predicted mean and predicted variance of the image features.
  • the data normalization layer normalizes the image features based on the predicted mean and predicted variance, and performs affine transformation on the normalized features based on the linear transformation parameters and translation parameters to obtain standardized features.
  • the service device inputs the standardized features into the next convolutional layer for feature extraction again, followed by standardization through the data normalization layer; this processing is repeated several times to obtain the final standardized features, which are input into the classifier to obtain the living body classification probability.
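  • The following sketch puts the pieces together as the text describes: cascaded convolutional layers interleaved with data normalization layers built around the estimation network (reusing the EstimationNetwork sketch above), followed by a classifier head. The depths, widths, eps term, and dividing by the square root of the predicted variance (common normalization practice; formula (1) below divides by the variance directly) are assumptions.

```python
import torch
import torch.nn as nn

class EstimationNormLayer(nn.Module):
    """Data normalization layer around the EstimationNetwork sketched earlier:
    normalize with the predicted statistics, then apply the learned affine
    transform (linear transformation parameter gamma, translation parameter beta)."""

    def __init__(self, channels: int, eps: float = 1e-5):
        super().__init__()
        self.estimator = EstimationNetwork(channels)
        self.eps = eps
        self.gamma = nn.Parameter(torch.ones(1))   # linear transformation parameter
        self.beta = nn.Parameter(torch.zeros(1))   # translation parameter

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mean, var = self.estimator(x)                      # predicted mean and variance
        x_norm = (x - mean) / torch.sqrt(var + self.eps)   # normalization
        return self.gamma * x_norm + self.beta             # affine transformation

class LivenessNet(nn.Module):
    """Cascaded convolutional and data normalization layers, then a classifier
    head; depth, widths and input channels are illustrative assumptions."""

    def __init__(self):
        super().__init__()
        self.blocks = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), EstimationNormLayer(64), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, padding=1), EstimationNormLayer(128), nn.ReLU(inplace=True),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, 1), nn.Sigmoid(),   # living body classification probability
        )

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        return self.head(self.blocks(img))
```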
  • In the above living body detection method, the mean and variance of the extracted image features are predicted by the estimation network to obtain the predicted mean and predicted variance of the image features, thereby avoiding the use of the mean and variance learned by the data normalization layer during model training, which benefits images to be tested from different scenarios.
  • The images to be tested can be standardized based on the predicted mean and predicted variance, and living body detection can be performed according to the obtained standardized features, which improves the universality of living body detection and the detection accuracy for images to be tested in different data domains.
  • In addition, the image features are standardized in combination with the network parameters used for standardization in the estimation network together with the predicted mean and predicted variance. Since the network parameters in the estimation network are obtained through model training, the resulting standardized features are more conducive to living body detection, which helps improve the accuracy of living body detection for the image to be tested.
  • the network parameters include linear transformation parameters and translation parameters;
  • S208 includes: the service device normalizes the image features according to the predicted mean and predicted variance to obtain normalized features; the service device performs affine transformation on the normalized features based on the linear transformation parameters and translation parameters to obtain standardized features.
  • the normalization process is to scale the image feature to a preset scale, that is, to make the mean value of the image feature be 0 and the variance be unit variance.
  • the data distribution of image features is different, for example, the brightness of different images is different; or the contrast of different images is different, that is, the brightness level difference between the brightest and darkest areas of the image is relatively large.
  • Normalization processing can reduce the impact of differences in data distribution on the extracted image features.
  • the service device calculates the difference between the image feature before normalization processing and the predicted mean value, and then calculates the quotient between the obtained difference value and the predicted variance to obtain the normalized feature.
  • the service device performs normalization processing on the image features through formula (1): x′ = (x − μ̂) / σ̂² (1), where x is the image feature before normalization processing, μ̂ is the predicted mean, σ̂² is the predicted variance, and x′ is the normalized feature.
  • affine transformation is a mapping method that maps normalized features from one vector space to another through linear transformation and translation.
  • Affine transformation is a linear transformation between two-dimensional coordinate systems that preserves the straightness and parallelism of two-dimensional graphics; that is, the relative positional relationship between straight lines remains unchanged, parallel lines remain parallel after the affine transformation, and the order of points on a line does not change.
  • the linear transformation parameter is a parameter for performing linear transformation on the normalized feature during the affine transformation process.
  • the translation parameter is the parameter that translates the normalized features during the affine transformation process.
  • the service device performs affine transformation on the normalized features through formula (2): y = γ·x′ + β (2), where x′ is the normalized feature, γ is the linear transformation parameter, and β is the translation parameter.
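  • A small worked example of formulas (1) and (2) with illustrative numbers (the values of x, μ̂, σ̂², γ and β are made up for the example):

```python
import numpy as np

x = np.array([3.0, 5.0, 7.0])   # image features before normalization
mu_hat, var_hat = 5.0, 2.0      # predicted mean and predicted variance

x_norm = (x - mu_hat) / var_hat   # formula (1): [-1.0, 0.0, 1.0]
gamma, beta = 1.5, 0.2            # linear transformation / translation parameters
x_std = gamma * x_norm + beta     # formula (2): [-1.3, 0.2, 1.7]
```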
  • the service device inputs the image features into the estimation network and obtains the predicted mean and predicted variance of the image features through the estimation network. Then, the service device normalizes the image features according to the predicted mean and predicted variance to obtain normalized features. After obtaining the normalized features, the service device performs affine transformation on them through the linear transformation parameters and translation parameters to obtain the standardized features.
  • the service device normalizes the image features according to the predicted mean value and the predicted variance, and performs affine transformation on the normalized features to obtain standardized features.
  • the method of standardizing the image features according to the predicted mean and predicted variance reduces the influence caused by the difference in data distribution between the real image to be tested and the sample images, so the obtained standardized features are more accurate, and the living body detection results obtained from them are more accurate.
  • S208 may specifically include: the service device acquires the statistical mean and statistical variance of the training image features, where the training image features are image features extracted from sample images; the service device performs weighted calculation on the predicted mean and the statistical mean to obtain a weighted mean, and performs weighted calculation on the predicted variance and the statistical variance to obtain a weighted variance; the service device standardizes the image features according to the weighted mean, weighted variance, and network parameters to obtain standardized features.
  • the sample images may be images of different data domains extracted from the training image set.
  • the service device extracts multiple live images and multiple non-living images of different data domains from the training image set.
  • the object to be measured in the living body images and the object to be measured in the non-living-body images may be the same object, or may not be the same object.
  • the data distributions of sample images in different data domains can be different; for example, sample images in different data domains differ in brightness, contrast, attack type, and so on.
  • an attack-type sample image may refer to an image collected when a user presents a paper photo or a photo displayed on a display screen, or wears a mask, during real-time image collection.
  • the statistical mean value is the mean value obtained by performing statistics on the training image features of the sample image.
  • the statistical variance is the variance obtained by performing statistics on the training image features of the sample image.
  • The predicted mean and predicted variance are the mean and variance of the image to be tested as estimated by the estimation network. Since the estimation network is trained on sample images, even when the data distributions of the image to be tested and the sample images differ greatly, the predicted mean and predicted variance can still be relatively close to the real mean and variance of the image to be tested.
  • the weighted calculation of the predicted mean value and the statistical mean value by the service device may refer to: respectively weighting the predicted mean value and the statistical mean value by using different weight values, and then summing up the weighted results.
  • the weighted calculation of the predicted variance and the statistical variance by the service device may refer to: weighting the predicted variance and the statistical variance respectively with different weight values, and then summing the weighted results.
  • the service device standardizes the image features using the weighted mean and weighted variance obtained from the weighted calculation together with the network parameters, so that more accurate standardized features can be obtained whether the data distributions of the image to be tested and the sample images differ greatly or only slightly, improving the stability and accuracy of the living body detection results.
  • the respective weight values corresponding to the predicted mean value and the statistical mean value, and the respective weight values corresponding to the predicted variance and the statistical variance may be adjusted as required.
  • the weights corresponding to the predicted mean, statistical mean, predicted variance and statistical variance may be the same or different.
  • the weighted mean obtained by the service device combines the predicted mean and the statistical mean, and the weighted variance combines the predicted variance and the statistical variance, so whether the data distributions of the image to be tested and the sample images differ greatly or only slightly, more accurate standardized features can be obtained, improving the stability and accuracy of the living body detection results.
  • the service device extracts sample images from a plurality of sample image sets in different data domains. The extracted sample images have various data distributions, and the statistical mean and statistical variance obtained from them have stronger generalization performance, so that the standardized features the service device obtains from the weighted mean and weighted variance are more accurate, which improves the accuracy of the living body detection results.
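  • A sketch of the weighted-statistics standardization described above. The single shared weight alpha and its default value are assumptions; the text allows the weights for the mean pair and the variance pair to be set independently and adjusted as required.

```python
import numpy as np

def weighted_standardize(x, mu_pred, var_pred, mu_stat, var_stat,
                         gamma, beta, alpha=0.5):
    """Standardize image features x with a weighted mix of the predicted and
    statistical moments, then apply the affine transform of formula (2)."""
    mu_w = alpha * mu_pred + (1.0 - alpha) * mu_stat      # weighted mean
    var_w = alpha * var_pred + (1.0 - alpha) * var_stat   # weighted variance
    x_norm = (x - mu_w) / var_w                           # formula (1) with weighted stats
    return gamma * x_norm + beta                          # formula (2)

# e.g. weighted_standardize(np.array([3.0, 5.0]), 5.0, 2.0, 4.0, 1.5, 1.0, 0.0)
```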
  • the image to be tested is an image collected by the client in response to an interaction request; the living body detection method further includes: when the interaction request is a resource account registration request and the image to be tested is a living body image, generating resource account information and feeding it back to the client; when the interaction request is a resource transfer request and the image to be tested is a living body image, transferring the resources specified in the resource transfer request; and when the interaction request is an access switch opening request and the image to be tested is a living body image, turning on the access switch.
  • the interaction request is a request triggered by the user through the client to acquire the business service provided by the service device.
  • the resource account registration request is a request for registering a resource account with the service device.
  • a resource account is an account that can interact with virtual resources, including but not limited to virtual currency, virtual items, etc.
  • the resource transfer request is a request to apply to the service device for transferring virtual resources, and the transfer includes transfer-in and transfer-out.
  • the access switch opening request is a request to apply to the service device for opening the access switch.
  • the client is an application program that provides financial services.
  • the registration initiation page 60A of the client includes a registration button 602.
  • when the registration button 602 is triggered, the client generates a resource account registration request to request the service device 62 to register a resource account, and the service device performs living body detection before registering the resource account. The specific process is as follows:
  • the client sends a resource account registration request to the service device 62 .
  • the service device sends a living body detection instruction to the client in response to the resource account registration request, so as to instruct the client to collect a face image of the object to be detected.
  • In response to the living body detection instruction, the client displays an image capture frame 604 on the image capture page 60B, so as to capture the image to be tested of the object to be measured in the image capture frame 604. After the client acquires the image to be tested through the image acquisition device, S606 is executed.
  • the client sends the image to be tested to the service device 62.
  • the service device 62 extracts image features from the image to be tested, pools the image features through the estimation network to obtain pooled image features, and performs at least two convolution processes on the pooled image features to obtain the predicted mean and predicted variance.
  • the service device 62 normalizes the image features according to the predicted mean value and predicted variance to obtain normalized features, and performs affine transformation on the normalized features based on the linear transformation parameters and translation parameters to obtain standardized features.
  • the service device 62 inputs the standardized features to the classifier, so that the classifier can classify the image to be tested based on the standardized features to obtain the living body classification probability.
  • When the living body classification probability reaches the preset threshold, it is determined that the image to be tested is a living body image; when it does not reach the preset threshold, it is determined that the image to be tested is a non-living-body image, so as to obtain the detection result of whether the image to be tested is a living body image.
  • the service device 62 sends the detection result to the client, and at the same time sends the resource account information when the image to be tested is a living body image.
  • After receiving the resource account information, the client completes registration according to the resource account information and displays the registration result on the result display page 60C.
  • the service device performs living body detection on the object to be tested that requests to register a resource account, and generates resource account information and feeds it back to the client for registering the resource account only when it determines that the image to be tested is a living body image, that is, when the object to be tested is determined to be a living body object. This can effectively prevent illegal users from impersonating legitimate users to register resource accounts, ensures the legal use of resource accounts, and, compared with manual resource account registration, improves registration efficiency.
  • the client is a payment application
  • the interaction request is a resource transfer request.
  • the payment request page 70A of the client includes a payment list 702 and a payment control 704 .
  • when the payment control 704 is triggered, the client generates a resource transfer request to request the service device 72 to perform the resource transfer, and the service device performs living body detection before the resource transfer.
  • the specific process is as follows:
  • the client sends a resource transfer request to the service device 72 .
  • the service device 72 sends a living body detection instruction to the client in response to the resource transfer request, so as to instruct the client to collect a face image of the object to be detected.
  • In response to the living body detection instruction, the client displays an image capture frame 706 on the image capture page 70B, so as to capture the image to be tested of the object to be measured in the image capture frame 706. After the client acquires the image to be tested through the image acquisition device, S706 is executed.
  • the client sends the image to be tested to the service device 72.
  • the service device 72 extracts image features from the image to be tested, pools the image features through the estimation network to obtain pooled image features, and performs at least two convolution processes on the pooled image features to obtain the predicted mean and predicted variance.
  • the service device 72 normalizes the image features according to the predicted mean value and predicted variance to obtain normalized features, and performs affine transformation on the normalized features based on linear transformation parameters and translation parameters to obtain standardized features.
  • the service device 72 inputs the standardized features to the classifier, so that the classifier can classify the image to be tested based on the standardized features to obtain the living body classification probability.
  • When the living body classification probability reaches the preset threshold, it is determined that the image to be tested is a living body image; when it does not reach the preset threshold, it is determined that the image to be tested is a non-living-body image.
  • When the service device determines that the image to be tested is a living body image, it transfers the amount of resources specified in the resource transfer request and executes S708.
  • the service device 72 sends payment success information to the client.
  • the client displays payment success information 708 on the result display page 70C.
  • the service device performs living body detection on the object to be measured that requests the resource transfer, and transfers the amount of resources specified in the resource transfer request only when it determines that the image to be tested is a living body image, that is, when the object to be measured is determined to be a living body object, which ensures the security of the resource transfer and improves transfer efficiency.
  • the client is an access control system
  • the interaction request is an access switch opening request.
  • the access control system displays an image capture frame 802 on the access switch control page 80, so as to capture the image to be tested of the object to be measured through the image capture frame 802.
  • When the access control system collects the image to be tested through the camera, it sends the image to be tested to the service device (i.e., the gate 82 on which the access control system is installed).
  • the service device extracts image features from the image to be tested, pools the image features through the estimation network to obtain pooled image features, and performs at least two convolution processes on the pooled image features to obtain the predicted mean and predicted variance of the image features.
  • the service device normalizes the image features according to the predicted mean value and predicted variance to obtain normalized features, and performs affine transformation on the normalized features based on the linear transformation parameters and translation parameters to obtain standardized features.
  • the service device inputs the standardized features into the classifier, so that the classifier classifies the image to be tested based on the standardized features to obtain the living body classification probability.
  • When the living body classification probability reaches the preset threshold, it is determined that the image to be tested is a living body image; when it does not reach the preset threshold, it is determined that the image to be tested is a non-living-body image.
  • the service device sends the result of whether the image to be tested is a living body image or a non-living-body image to the access control system; when the access control system determines that the image to be tested is a living body image, it turns on the access switch and displays the information 804 allowing passage on the access switch control page; when the access control system determines that the image to be tested is a non-living-body image, the access switch is not turned on.
  • living body detection is performed on the object to be tested that requests to turn on the access switch, so as to verify its identity; the access switch is turned on only when the image to be tested is a living body image, which ensures the security of the access control system and, compared with manual verification of the object to be tested, saves costs and improves passage efficiency.
  • the living body detection method includes the following steps:
  • the client sends an interaction request to the service device.
  • the client may be an application program installed on the terminal, such as a social application, a payment application, a shopping application, or other application programs that provide financial services.
  • the client can also be an access control system installed on the service device.
  • the service device sends a detection instruction to the client in response to the interaction request.
  • the client collects images to be tested for the object to be tested according to the detection instruction.
  • the client sends the image to be tested to the service device.
  • the service device extracts image features from the image to be tested, pools the image features through the estimation network, and performs at least two convolution processes on the obtained pooled image features to obtain the predicted mean and predicted variance of the image features.
  • the service device normalizes the image features according to the predicted mean value and the predicted variance to obtain normalized features, and performs affine transformation on the normalized features based on the linear transformation parameters and translation parameters to obtain standardized features.
  • the service device inputs the standardized features into the classifier, so that the classifier classifies the image to be tested based on the standardized features to obtain a living body classification probability, and determines whether the image to be tested is a living body image according to the living body classification probability.
  • the service device sends the liveness detection result of the object to be measured to the client.
  • in one embodiment, as shown in FIG. 10, the estimation network is obtained by training a pre-training estimation network; the service device trains the pre-training estimation network through the following steps:
  • S1002: the service device performs feature extraction on sample images in different data domains to obtain training image features.
  • sample images in different data domains may be sample images of different categories collected in different application scenarios, such as sample images collected outdoors in strong light, sample images collected outdoors in weak light, sample images collected at night, and sample images collected indoors.
  • the sample images may be images of different data domains extracted from the training image set. For example, the service device extracts multiple live sample images and multiple non-live sample images of different data domains from the training image set.
  • the subject in a live sample image and the subject in a non-live sample image may or may not be the same living subject.
  • the training image features are characteristics of a sample image itself that distinguish it from other sample images, including features that can be perceived intuitively and features that must be obtained through transformation or processing.
  • intuitively perceptible features include, for example, brightness, edges, texture, contour, and color; features obtained only through transformation or processing include, for example, moments, histograms, and principal components.
  • S1002 may specifically include: the service device computes the training image features of the sample images through a feature extraction algorithm.
  • for example, the service device can extract the edge features of a sample image through the Sobel operator, or compute the seventh-order characteristic moments of a sample image through a seventh-order moment algorithm.
  • in another embodiment, S1002 may specifically include: the service device may also extract the training image features from the sample images through a neural network.
  • the neural network may be, for example, a feedforward neural network, a convolutional neural network, a residual convolutional neural network, or a recurrent neural network.
  • before extracting the training image features, the service device may perform at least one of grayscale processing, image enhancement, and denoising on the sample images.
  • S1004: the service device performs convolution processing on the training image features through the pre-training estimation network to obtain the training prediction mean and training prediction variance corresponding to the training image features.
  • in one embodiment, the service device performs convolution processing on the training image features through the first convolutional layer in the pre-training estimation network to obtain first convolutional training features; it then inputs the first convolutional training features into the second convolutional layer of the pre-training estimation network, which convolves them to obtain second convolutional training features, and so on, until the last convolutional layer convolves the convolutional training features input to it to produce the training prediction mean and training prediction variance.
  • S1004 may specifically include: the service device may first pool the training image features through the pre-training estimation network, and then perform convolution processing on the resulting training pooled image features to obtain the training prediction mean and training prediction variance of the training image features.
  • pooling the training image features may mean down-sampling them so as to reduce their data volume. Pooling includes global pooling and local pooling, and pooling methods include max pooling, average pooling, stochastic pooling, and so on.
  • in one embodiment, the service device first inputs the training image features into the global pooling layer of the pre-training estimation network to perform global pooling and obtain training pooled image features. The service device then performs multi-layer convolution on the training pooled image features through multiple cascaded convolutional layers to obtain the training prediction mean and training prediction variance of the training image features.
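  • To make this structure concrete, here is a minimal PyTorch sketch of such an estimation network: global average pooling followed by cascaded 1×1 convolutions. The tapering channel widths (64, 32, 16, then 1) follow the conv64 → conv32 → conv16 → conv1 arrangement the patent describes for FIG. 3; folding the final layer into a two-channel head and using softplus to keep the variance positive are assumptions of this sketch, not the patent's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EstimationNetwork(nn.Module):
    """GAP followed by cascaded convolutions that regress mean and variance."""
    def __init__(self, in_channels: int):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)            # global average pooling
        self.convs = nn.Sequential(                   # conv64 -> conv32 -> conv16
            nn.Conv2d(in_channels, 64, 1), nn.ReLU(),
            nn.Conv2d(64, 32, 1), nn.ReLU(),
            nn.Conv2d(32, 16, 1), nn.ReLU(),
        )
        self.head = nn.Conv2d(16, 2, 1)               # one value each: mean, variance

    def forward(self, feats: torch.Tensor):
        out = self.head(self.convs(self.gap(feats))).flatten(1)  # (N, 2)
        mean, var = out[:, 0], F.softplus(out[:, 1])  # keep the variance positive
        return mean, var

mu, var = EstimationNetwork(128)(torch.randn(4, 128, 14, 14))
print(mu.shape, var.shape)  # torch.Size([4]) torch.Size([4])
```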
  • S1006: the service device determines an estimated loss value based on the training prediction mean and the statistical mean of the training image features, and on the training prediction variance and the statistical variance of the training image features.
  • the statistical mean is the mean of the training image features obtained through mathematical statistics, and the statistical variance is the variance of the training image features obtained likewise.
  • S1006 may specifically include: the service device calculates the difference or squared difference between the training prediction mean and the statistical mean of the training image features, and the difference or squared difference between the training prediction variance and the statistical variance of the training image features, and uses the sum of the two differences, or the sum of the two squared differences, as the estimated loss value.
  • in another embodiment, S1006 may specifically include: calculating the difference between the training prediction mean and the statistical mean of the training image features, and the difference between the training prediction variance and the statistical variance of the training image features, then squaring the two differences and using the sum of the two squares as the estimated loss value.
  • for example, the service device calculates the estimated loss value according to formula (3):

L_mve = (μ − μ̄)² + (σ² − σ̄²)²    (3)

where L_mve is the estimated loss value, μ is the training prediction mean, μ̄ is the statistical mean, σ² is the training prediction variance, and σ̄² is the statistical variance.
  • S1008: the service device adjusts the network parameters of the pre-training estimation network based on the estimated loss value.
  • in one embodiment, after obtaining the estimated loss value, the service device can backpropagate it through the estimation network to obtain the gradients of the network parameters in each network layer, and adjust the network parameters based on the obtained gradients so that the estimated loss value produced by the adjusted estimation network is minimized.
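  • A minimal PyTorch sketch of formula (3) and the backward pass that drives this parameter adjustment (the batch statistics here are toy stand-ins; a real run would take the predicted values from the estimation network above):

```python
import torch

def estimation_loss(pred_mean, pred_var, stat_mean, stat_var):
    """Formula (3): L_mve = (mu - mu_bar)^2 + (sigma^2 - sigma_bar^2)^2."""
    return (pred_mean - stat_mean) ** 2 + (pred_var - stat_var) ** 2

feats = torch.randn(4, 128)                       # toy training image features
stat_mean, stat_var = feats.mean(dim=1), feats.var(dim=1)

# Pretend network outputs, slightly off the statistics.
pred_mean = (stat_mean + 0.1 * torch.randn(4)).requires_grad_()
pred_var = (stat_var + 0.1 * torch.randn(4)).requires_grad_()

loss = estimation_loss(pred_mean, pred_var, stat_mean, stat_var).mean()
loss.backward()                                   # gradients for the optimizer step
print(float(loss))
```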
  • in the above embodiment, the service device adjusts the network parameters of the estimation network with the estimated loss value, which improves the estimation network's ability to predict the mean and variance as well as its generalization ability, so that the predicted mean and predicted variance it produces are as close as possible to the true mean and variance of the image to be tested.
  • when the service device later acquires an image to be tested, it can directly use the predicted mean and predicted variance produced by the estimation network to normalize the image features of that image, and the predicted mean and predicted variance it obtains are more accurate.
  • in addition, the service device performs global pooling on the training image features through the pre-training estimation network, which reduces the data volume of the training image features and avoids overfitting while the parameters of the pre-training estimation network are being adjusted, so that the predicted mean and predicted variance produced by the parameter-adjusted estimation network are more accurate, improving the accuracy of liveness detection.
  • in one embodiment, the service device standardizes the training image features based on the predicted mean, the predicted variance, and the network parameters used for standardization in the pre-training estimation network to obtain standardized training features; the service device inputs the standardized training features into the pre-training classifier, so that the pre-training classifier classifies the sample images based on the standardized training features to obtain sample classification probabilities; the service device calculates a classification loss value from the sample classification probabilities and the labels of the sample images; and the service device adjusts the network parameters of the pre-training classifier and estimation network based on the classification loss value and the estimated loss value.
  • through standardization, the service device converts the training image features into dimensionless pure values within a specific interval, which facilitates unified processing of image features of different units or orders of magnitude.
  • in one embodiment, the service device may use the predicted mean, the predicted variance, and the network parameters used for standardization in the pre-training estimation network as parameters of the standardization algorithm, and use the image features as its independent variable, to compute the standardized features.
  • in one embodiment, the service device may apply a logarithmic loss function to the labels of the sample images and the classifier's classification results to obtain the classification loss value.
  • in one embodiment, the service device calculates the cross entropy between the classifier's classification result for a sample image and the label of that sample image, and calculates the classification loss value as the expectation of the obtained cross entropy.
  • specifically, the service device first extracts features from the sample image through the feature extraction network, then inputs the resulting standardized features into the classifier, and the classifier classifies the sample image according to the standardized features.
  • the service device takes the logarithm of the classification result, calculates the cross entropy between the classification result and the label of the sample image, and then calculates the classification loss value as the expectation of the obtained cross entropy. For example, the service device calculates the classification loss value through formula (4):

L_Cls = −E_{(x,y)∈(X,Y)} y log C(G(x))    (4)

where X is the set of sample images, Y is the set of sample image labels, x is a sample image, and y is the label of sample image x; G(x) denotes the standardized features obtained by extracting the features of x through the feature extraction network; and C(G(x)) is the classification result obtained by the classifier classifying x according to those standardized features. The expectation traverses all sample images, calculates the loss between each sample label and the classifier's result, and takes the average.
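  • A minimal sketch of this classification loss, assuming a two-class live/spoof classifier whose logits stand in for C(G(x)) (the shapes and class convention are assumptions of the illustration):

```python
import torch
import torch.nn.functional as F

def classification_loss(logits, labels):
    """Cross entropy averaged over the sample set, i.e. the expectation of
    -log C(G(x))[y] over (x, y) pairs, in the spirit of formula (4)."""
    return F.cross_entropy(logits, labels)   # mean reduction = expectation

logits = torch.randn(8, 2)                   # classifier outputs for C(G(x))
labels = torch.randint(0, 2, (8,))           # 1 = live, 0 = spoof
print(float(classification_loss(logits, labels)))
```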
  • in the above embodiment, the service device adjusts the network parameters of the pre-training classifier and estimation network according to the classification loss value, so that the predicted mean and predicted variance produced by the parameter-adjusted estimation network, and the liveness classification probabilities produced by the parameter-adjusted classifier, are more accurate, which improves the accuracy of liveness detection.
  • moreover, after the service device extracts the training image features of the sample images, standardizing the training image features with the predicted mean and predicted variance helps avoid the exploding and vanishing gradient problems.
  • in one embodiment, the service device obtains the depth map label of the sample image, performs depth feature extraction on the standardized training features to obtain a depth feature map, calculates a depth map loss value from the depth feature map and the depth map label, and adjusts the network parameters of the pre-training estimation network based on the depth map loss value and the estimated loss value.
  • the depth map label is the depth map used as the supervision target for the sample image.
  • the depth feature map is an image describing the depth characteristics of the sample image, obtained by extracting depth features from the standardized training features through the depth estimation network.
  • in one embodiment, the service device calculates the difference between the depth feature map and the depth map label of the sample image, and then takes the square of the difference as the depth map loss value.
  • in another embodiment, the service device calculates the squares of the depth feature map and of the depth map label of the sample image respectively, and uses the difference between the two squares as the depth map loss value.
  • in another embodiment, the service device calculates the difference between the depth feature map and the depth map label of the sample image, and then takes the norm of the difference as the depth map loss value. For example, the service device calculates the depth map loss value according to formula (5):

L_dep(X; D) = E_{x∈X} ‖D(G(x)) − I‖²    (5)

where L_dep(X; D) is the depth map loss value, X is the set of sample images, I is the depth map label of a sample image, and D(G(x)) is the depth feature map obtained by extracting depth features from the standardized training features of x.
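  • A minimal sketch of this depth map loss, reading the norm as a squared L2 norm averaged over the batch (the squared form and the map size are assumptions of the illustration):

```python
import torch

def depth_map_loss(pred_depth, depth_label):
    """Squared L2 norm of (depth feature map - depth map label), averaged
    over the batch, in the spirit of formula (5)."""
    return ((pred_depth - depth_label) ** 2).flatten(1).sum(dim=1).mean()

pred = torch.rand(4, 1, 32, 32)    # depth feature maps from the depth network
label = torch.rand(4, 1, 32, 32)   # precomputed depth map labels
print(float(depth_map_loss(pred, label)))
```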
  • in one embodiment, the service device obtains the SIFT (Scale-Invariant Feature Transform) label of the sample image and performs SIFT feature estimation on the standardized training features to obtain SIFT estimated features; the service device then calculates a SIFT loss value from the SIFT estimated features and the SIFT label, and adjusts the network parameters of the pre-training estimation network based on the SIFT loss value and the estimated loss value.
  • in one embodiment, the depth map loss value can be backpropagated through the depth estimation network, and the classification loss value through the feature extraction network, to obtain the gradients of the network parameters in each network layer of the feature extraction network and the depth estimation network; based on the obtained gradients, the service device adjusts the estimation networks in the feature extraction network and the network parameters of the depth estimation network so that the sum of the depth map loss value and the classification loss value obtained from the adjusted networks is minimized. This improves the accuracy of the predicted mean and predicted variance produced by the estimation network and of the liveness classification probability of the image to be tested calculated by the classifier, improving the accuracy of the liveness detection results.
  • in another embodiment, the service device calculates the estimated loss value, the depth map loss value, and the classification loss value, and then adjusts the network parameters of the feature extraction network, the classifier, and the depth estimation network so that the sum of the three loss values is minimized (a one-step sketch follows below), which likewise improves the accuracy of the predicted mean and predicted variance produced by the estimation network and of the liveness classification probability calculated by the classifier, improving the accuracy of the liveness detection results.
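  • The following PyTorch sketch shows one joint update step over the sum of the three losses. The linear modules are stubs standing in for the real networks, and the zero-valued estimated loss is a placeholder for formula (3), which would be computed inside the standardization layers:

```python
import torch
import torch.nn.functional as F

feature_net = torch.nn.Linear(16, 8)   # stub: feature extraction network
classifier = torch.nn.Linear(8, 2)     # stub: live/spoof classifier
depth_net = torch.nn.Linear(8, 4)      # stub: depth estimation network

params = (list(feature_net.parameters()) + list(classifier.parameters())
          + list(depth_net.parameters()))
optimizer = torch.optim.SGD(params, lr=1e-3)

x = torch.randn(8, 16)                 # stand-ins for sample images
y = torch.randint(0, 2, (8,))          # live/spoof labels
depth_label = torch.rand(8, 4)         # flattened depth map labels

g = feature_net(x)                                            # "standardized features"
loss_cls = F.cross_entropy(classifier(g), y)                  # classification loss
loss_dep = ((depth_net(g) - depth_label) ** 2).sum(1).mean()  # depth map loss
loss_mve = torch.zeros(())                                    # placeholder, formula (3)

optimizer.zero_grad()
(loss_cls + loss_dep + loss_mve).backward()   # minimize the sum of the losses
optimizer.step()
```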
  • the service device performs depth feature extraction on standardized training features through a depth estimation network to obtain a depth feature map.
  • the service device obtains standardized training features through the feature extraction network, and then calculates estimated loss values, depth map loss values, and classification loss values.
  • the service device may adjust the network parameters of the depth estimation network at the same time as it adjusts those of the feature extraction network and the classifier according to the estimated loss value, the depth map loss value, and the classification loss value; alternatively, the service device may train the depth estimation network separately on training samples to tune its network parameters.
  • in one embodiment, the service device acquires sample images, which include live sample images and non-live sample images.
  • when a sample image is a live sample image, depth map calculation is performed on it to obtain its depth map label; when a sample image is a non-live sample image, a black base image of the same size as the sample image is used as its depth map label (a sketch of this labeling rule follows below).
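  • A minimal sketch of the labeling rule, where compute_depth is a hypothetical stand-in for any depth map calculation (e.g., a disparity-based or learned estimator):

```python
import numpy as np

def depth_map_label(sample_image: np.ndarray, is_live: bool,
                    compute_depth=None) -> np.ndarray:
    """Supervision target for the depth estimation network: a computed depth
    map for live samples, an all-black base image for non-live samples."""
    if is_live:
        return compute_depth(sample_image)          # hypothetical estimator
    h, w = sample_image.shape[:2]
    return np.zeros((h, w), dtype=np.float32)       # black base image

print(depth_map_label(np.zeros((64, 64, 3)), is_live=False).shape)  # (64, 64)
```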
  • in one embodiment, the service device inputs the sample image into the feature extraction network, extracts the training sample features of the sample image through a convolutional layer in the feature extraction network, then inputs the features output by the convolutional layer into the data standardization layer connected to it, which standardizes those features.
  • the service device feeds the standardized features into the next convolutional layer, and so on, until the last data standardization layer.
  • the service device inputs the standardized features output by the last data standardization layer of the feature extraction network into the classifier, and obtains the liveness detection result through the classifier.
  • the service device inputs the standardized features into the depth estimation network, and performs deep feature extraction on the standardized features through the depth estimation network to obtain a depth feature map.
  • the service device calculates the classification loss value according to the liveness detection result output by the classifier and the label of the sample image, and calculates the depth map loss value according to the depth feature map and the depth map label of the sample image.
  • the service device adjusts the network parameters of the feature extraction network and the classifier to minimize the sum of the depth map loss value and the classification loss value.
  • the service device determines the estimated loss value based on the training prediction mean and the statistical mean of the training image features, as well as the training prediction variance and the statistical variance of the training image features.
  • the service device adjusts the network parameters of the feature extraction network and the classifier to minimize the sum of the depth map loss value, the classification loss value and the estimated loss value.
  • in one embodiment, the sample images include live sample images and non-live sample images; obtaining the depth map label of a sample image includes: when the sample image is a live sample image, performing depth map calculation on the sample image to obtain the depth map label; when the sample image is a non-live sample image, generating a black base image of the same size as the sample image and using the black base image as the depth map label.
  • a live sample image may refer to a sample image containing a living subject. If a sample image is a live sample image, the subject in it is a living subject; for example, a sample image obtained by capturing a user with a camera is a live sample image. If a sample image is a non-live sample image, the subject in it is not a living subject; for example, a sample image obtained by photographing a user's photo or a worn mask is a non-live sample image.
  • the service device calculates the disparity of each pixel in the sample image, and obtains the depth map label of the sample image according to the disparity of each pixel. In another embodiment, the service device performs depth calculation on the sample image through the trained neural network to obtain a depth map label of the sample image.
  • in one embodiment, the sample images include live sample images and non-live sample images, and the service device obtains depth map labels for the live sample images and the non-live sample images respectively. The service device can then calculate the depth map loss value from the depth map label of a sample image and the depth feature map of that sample image estimated by the depth estimation network, and use the depth map loss value to train the estimation networks in the feature extraction network and the depth estimation network, so as to improve the accuracy of the predicted mean and predicted variance estimated by the estimation network, thereby improving the accuracy of the liveness detection results.
  • in one embodiment, the service device performs face recognition on the sample image to obtain a target area containing facial features, deletes the image area outside the target area in the sample image, and obtains the depth map label of the sample image after the image area is deleted.
  • face recognition is a technology for detecting a face area in an image based on human facial feature information.
  • the service device performs face recognition on the sample image, and obtains the target area containing the face features as shown in the box in Fig. 12a.
  • the service device expands the target area in order to obtain more background content. The enlarged target area is shown in FIG. 12b.
  • the service device deletes the image area outside the target area, or the service device may also delete the image area outside the expanded target area.
  • the service device deletes the image area outside the target area in the white frame in Fig. 12b to obtain the image as shown in Fig. 12c.
  • when the sample image in FIG. 12a is a live sample image, the service device performs depth map calculation on the image shown in FIG. 12c to obtain the depth map label shown in FIG. 12d.
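  • A minimal sketch of the crop-then-label preparation (OpenCV's bundled Haar cascade is used here only as a convenient detector, and the 20% margin is an assumption standing in for the target-area expansion of FIG. 12b):

```python
import cv2
import numpy as np

def crop_face_region(image: np.ndarray, margin: float = 0.2) -> np.ndarray:
    """Detect the face, enlarge the box to keep some background, and drop
    everything outside it (cf. FIGs. 12a-12c)."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return image                     # no face found: keep the image as-is
    x, y, w, h = faces[0]
    dx, dy = int(w * margin), int(h * margin)
    y0, y1 = max(0, y - dy), min(image.shape[0], y + h + dy)
    x0, x1 = max(0, x - dx), min(image.shape[1], x + w + dx)
    return image[y0:y1, x0:x1]
```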
  • in one embodiment, before the service device performs face recognition on the sample image, if the sample image is a color image, the service device performs grayscale processing on it; grayscale processing algorithms include the component method, the maximum value method, the average method, the weighted average method, and so on.
  • in one embodiment, the service device performs image enhancement on the sample image before performing face recognition on it; image enhancement algorithms include spatial domain methods and frequency domain methods.
  • in one embodiment, as shown in FIG. 13, an estimation network processing method for liveness detection is provided. The method is described by using its application to the service device in FIG. 1 as an example, and includes the following steps:
  • S1302: the service device performs feature extraction on sample images in different data domains to obtain training image features.
  • S1304: the service device performs convolution processing on the training image features through the pre-training estimation network to obtain the training prediction mean and training prediction variance of the training image features.
  • S1306: the service device determines an estimated loss value based on the training prediction mean and the statistical mean of the training image features, and on the training prediction variance and the statistical variance of the training image features.
  • S1308: the service device adjusts the network parameters of the pre-training estimation network based on the estimated loss value; the parameter-adjusted estimation network is used to determine the predicted mean and predicted variance of the image features in an image to be tested, standardize the image features based on the predicted mean, the predicted variance, and the network parameters used for standardization in the estimation network, and determine whether the image to be tested is a live image according to the standardized features obtained from the standardization.
  • in one embodiment, the service device standardizes the training image features based on the predicted mean, the predicted variance, and the network parameters used for standardization in the pre-training estimation network to obtain standardized training features; inputs the standardized training features into the pre-training classifier, so that the pre-training classifier classifies the sample images based on the standardized training features to obtain sample classification probabilities; and calculates a classification loss value from the sample classification probabilities and the labels of the sample images. S1308 may then specifically include: the service device adjusts the network parameters of the pre-training classifier and estimation network based on the classification loss value and the estimated loss value.
  • in one embodiment, the service device obtains the depth map label of the sample image, performs depth feature extraction on the standardized training features to obtain a depth feature map, and calculates a depth map loss value from the depth feature map and the depth map label. S1308 may then specifically include: the service device adjusts the network parameters of the pre-training estimation network based on the depth map loss value and the estimated loss value.
  • in one embodiment, the sample images include live sample images and non-live sample images; obtaining the depth map label of a sample image includes: when the sample image is a live sample image, performing depth map calculation on the sample image to obtain the depth map label; when the sample image is a non-live sample image, generating a black base image of the same size as the sample image and using the black base image as the depth map label.
  • in one embodiment, the service device performs face recognition on the sample image to obtain a target area containing facial features, deletes the image area outside the target area in the sample image, and obtains the depth map label of the sample image after the image area is deleted.
  • in one embodiment, a liveness detection apparatus is provided. The apparatus may be implemented as software modules, hardware modules, or a combination of both, as part of a computer device, and specifically includes: a feature extraction module 1402, a convolution processing module 1404, an acquisition module 1406, a standardization processing module 1408, and a determination module 1410, wherein:
  • a feature extraction module 1402 configured to extract image features from images to be tested in different data domains
  • the convolution processing module 1404 is used to perform convolution processing on the image features through the estimation network to obtain the predicted mean value and predicted variance of the image features;
  • An acquisition module 1406, configured to acquire network parameters used for standardization in the estimated network
  • the standardization processing module 1408 is used to standardize the image features based on the predicted mean value, predicted variance and network parameters to obtain standardized features;
  • the determining module 1410 is configured to determine whether the image to be tested is a living body image according to the living body classification probability obtained by classifying the image to be tested based on standardized features.
  • in the above embodiment, the estimation network predicts the mean and variance of the extracted image features to obtain the predicted mean and predicted variance, avoiding the use of the mean and variance fixed in the data standardization layer during model training. This allows images collected in different scenarios to be standardized with the predicted mean and predicted variance, and liveness detection to be performed on the resulting standardized features, improving the universality of liveness detection and its accuracy for images to be tested in different data domains.
  • in addition, during standardization the image features are also standardized using the network parameters for standardization in the estimation network together with the predicted mean and predicted variance; since the network parameters in the estimation network are obtained through model training, the resulting standardized features are more conducive to liveness detection, which helps improve the accuracy of liveness detection of the image to be tested.
  • in one embodiment, the network parameters include a linear transformation parameter and a translation parameter; the standardization processing module 1408 is further configured to: normalize the image features according to the predicted mean and predicted variance to obtain normalized features, and perform an affine transformation on the normalized features based on the linear transformation parameter and the translation parameter to obtain the standardized features.
  • in one embodiment, the standardization processing module 1408 is further configured to: obtain the statistical mean and statistical variance of the training image features, the training image features being the image features extracted from the sample images; perform a weighted computation of the predicted mean and the statistical mean to obtain a weighted mean, and of the predicted variance and the statistical variance to obtain a weighted variance; and standardize the image features according to the weighted mean, the weighted variance, and the network parameters to obtain the standardized features (a sketch follows below).
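  • A minimal sketch of this weighted standardization (NumPy; the equal weighting alpha = 0.5 and the epsilon/square-root convention are assumptions of the illustration):

```python
import numpy as np

def weighted_standardize(feats, pred_mean, pred_var, stat_mean, stat_var,
                         gamma, beta, alpha=0.5, eps=1e-5):
    """Fuse predicted and statistical moments with weight alpha, then
    standardize the features as before."""
    mean = alpha * pred_mean + (1 - alpha) * stat_mean
    var = alpha * pred_var + (1 - alpha) * stat_var
    return gamma * (feats - mean) / np.sqrt(var + eps) + beta

x = np.random.randn(8)
print(weighted_standardize(x, x.mean(), x.var(), 0.0, 1.0, gamma=1.0, beta=0.0))
```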
  • in one embodiment, the determination module 1410 is further configured to: when the liveness classification probability reaches the preset threshold, determine that the image to be tested is a live image; when the liveness classification probability does not reach the preset threshold, determine that the image to be tested is a non-live image.
  • the image to be tested is an image collected by the client in response to an interaction request; as shown in FIG. 15 , the device further includes:
  • a generating module 1412, configured to generate resource account information and feed the resource account information back to the client when the interaction request is a resource account registration request and the image to be tested is a live image;
  • a resource transfer module 1414, configured to transfer the amount of resources specified in the resource transfer request when the interaction request is a resource transfer request and the image to be tested is a live image;
  • an access switch opening module 1416, configured to open the access switch when the interaction request is a request to open the access switch and the image to be tested is a live image.
  • in one embodiment, the estimation network is obtained by training a pre-training estimation network; the apparatus further includes:
  • the feature extraction module 1402 is also used to perform feature extraction on sample images in different data domains to obtain training image features
  • the convolution processing module 1404 is also used to perform convolution processing on the training image features through the pre-training estimation network to obtain the training prediction mean and training prediction variance corresponding to the training image features;
  • a calculation module 1418, configured to determine an estimated loss value based on the training prediction mean and the statistical mean of the training image features, and on the training prediction variance and the statistical variance of the training image features;
  • the network parameter adjustment module 1420 is configured to adjust the network parameters of the estimated network before training based on the estimated loss value.
  • the device also includes:
  • the standardization processing module 1408 is also used to standardize the training image features based on the predicted mean value, predicted variance and network parameters used for standardization processing in the estimated network before training, to obtain standardized training features;
  • the classification module 1422 is used to input the standardized training features to the pre-training classifier, so that the pre-training classifier classifies the sample images based on the standardized training features to obtain the sample classification probability;
  • the calculation module 1418 is also used to calculate the classification loss value according to the classification probability of the sample and the label of the sample image;
  • the network parameter adjustment module 1420 is further configured to adjust the network parameters of the pre-training classifier and estimation network based on the classification loss value.
  • the device also includes:
  • the obtaining module 1406 is also used to obtain the depth map label of the sample image
  • the depth feature extraction module 1424 is used to perform depth feature extraction on standardized training features to obtain a depth feature map
  • the calculation module 1418 is also used to calculate the depth map loss value according to the depth feature map and the depth map label;
  • the network parameter adjustment module 1420 is further configured to adjust the network parameters of the estimated network before training based on the depth map loss value and the estimated loss value.
  • in one embodiment, the sample images include live sample images and non-live sample images; the acquisition module 1406 is further configured to: when the sample image is a live sample image, perform depth map calculation on the sample image to obtain the depth map label; when the sample image is a non-live sample image, generate a black base image of the same size as the sample image and use the black base image as the depth map label.
  • the device also includes:
  • a face recognition module 1426 configured to perform face recognition on the sample image to obtain a target area containing facial features
  • a deletion module 1428 configured to delete image areas other than the target area in the sample image
  • the obtaining module 1406 is further configured to obtain the depth map label of the sample image after the image region is deleted.
  • each module in the above liveness detection apparatus may be implemented in whole or in part by software, hardware, or a combination thereof.
  • the above-mentioned modules can be embedded in or independent of the processor in the computer device in the form of hardware, and can also be stored in the memory of the computer device in the form of software, so that the processor can invoke and execute the corresponding operations of the above-mentioned modules.
  • in one embodiment, an estimation network processing apparatus for liveness detection is provided. The apparatus may be implemented as software modules, hardware modules, or a combination of both, as part of a computer device, and specifically includes: a feature extraction module 1602, a convolution processing module 1604, a standardization processing module 1606, and a determination module 1608, wherein:
  • a feature extraction module 1602 configured to extract features from sample images in different data domains to obtain training image features
  • the convolution processing module 1604 is used to perform convolution processing on the training image features through the estimated network before training to obtain the training prediction mean value and the training prediction variance of the training image features;
  • a standardization processing module 1606, configured to determine an estimated loss value based on the training prediction mean and the statistical mean of the training image features, as well as the training prediction variance and the statistical variance of the training image features;
  • a determination module 1608, configured to adjust the network parameters of the pre-training estimation network based on the estimated loss value; the parameter-adjusted estimation network is used to determine the predicted mean and predicted variance of the image features in an image to be tested, standardize the image features based on the predicted mean, the predicted variance, and the network parameters used for standardization in the estimation network, and determine whether the image to be tested is a live image according to the standardized features obtained from the standardization.
  • the above liveness detection method, estimation network processing method, apparatus, computer device, storage medium, and computer-readable instruction product predict the mean and variance of the extracted image features through the pre-training estimation network to obtain the training prediction mean and training prediction variance of the image features, calculate the estimated loss value based on the training prediction mean and the statistical mean (and on the training prediction variance and the statistical variance), and adjust the network parameters of the estimation network based on the estimated loss value, so that the estimation network can estimate the corresponding mean and variance for sample images in different application scenarios. This improves the generalization ability of the network, which helps improve the accuracy of liveness detection for images to be tested in different data domains.
  • in addition, during standardization the image features are also standardized using the network parameters for standardization in the estimation network together with the predicted mean and predicted variance; since the network parameters in the estimation network are obtained through model training, the resulting standardized features are more conducive to liveness detection, which helps improve the accuracy of liveness detection of the image to be tested.
  • the device further includes:
  • the standardization processing module 1606 is also used to standardize the training image features based on the predicted mean value, predicted variance and network parameters used for standardization processing in the estimated network before training, to obtain standardized training features;
  • a classification module 1610, configured to input the standardized training features into the pre-training classifier, so that the pre-training classifier classifies the sample images based on the standardized training features to obtain the sample classification probability;
  • the calculation module 1612 is also used to calculate the classification loss value according to the classification probability of the sample and the label of the sample image;
  • the network parameter adjustment module 1614 is also used to adjust the network parameters of the pre-training classifier and estimation network based on the classification loss value.
  • the device also includes:
  • the obtaining module 1616 is also used to obtain the depth map label of the sample image
  • the depth feature extraction module 1618 is used to perform depth feature extraction on standardized training features to obtain a depth feature map
  • the calculation module 1612 is also used to calculate the depth map loss value according to the depth feature map and the depth map label;
  • the network parameter adjustment module 1614 is further configured to adjust the network parameters of the estimated network before training based on the depth map loss value and the estimated loss value.
  • in one embodiment, the sample images include live sample images and non-live sample images; the acquisition module 1616 is further configured to: when the sample image is a live sample image, perform depth map calculation on the sample image to obtain the depth map label; when the sample image is a non-live sample image, generate a black base image of the same size as the sample image and use the black base image as the depth map label.
  • the device also includes:
  • a face recognition module 1620 configured to perform face recognition on the sample image to obtain a target area containing facial features
  • a deletion module 1622 configured to delete image areas other than the target area in the sample image
  • the acquiring module 1616 is further configured to acquire the depth map label of the sample image after the image region is deleted.
  • each module in the above estimation network processing apparatus for liveness detection may be implemented in whole or in part by software, hardware, or a combination thereof.
  • the above-mentioned modules can be embedded in or independent of the processor in the computer device in the form of hardware, and can also be stored in the memory of the computer device in the form of software, so that the processor can invoke and execute the corresponding operations of the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a service device, and its internal structure may be as shown in FIG. 18 .
  • the computer device includes a processor, a memory, and a network interface connected through a system bus, where the processor of the computer device provides computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions and a database.
  • the internal memory provides an environment for the execution of the operating system and computer readable instructions in the non-volatile storage medium.
  • the database of the computer device is used to store the living body detection data.
  • the network interface of the computer device is used to communicate with an external terminal via a network connection. When the computer readable instructions are executed by the processor, a living body detection method is realized.
  • FIG. 18 is only a block diagram of part of the structure related to the solution of this application and does not constitute a limitation on the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • a computer device including a memory and a processor, where computer-readable instructions are stored in the memory, and the steps in the above-mentioned method embodiments are implemented when the processor executes the computer-readable instructions.
  • a computer-readable storage medium which stores computer-readable instructions, and when the computer-readable instructions are executed by a processor, the steps in the foregoing method embodiments are implemented.
  • in one embodiment, a computer-readable instruction product is provided, including computer-readable instructions stored on a computer-readable storage medium.
  • the processor of the computer device reads the computer-readable instructions from the computer-readable storage medium and executes them, so that the computer device performs the steps in the foregoing method embodiments.
  • Non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory or optical memory, etc.
  • Volatile memory can include Random Access Memory (RAM) or external cache memory.
  • RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).

Abstract

This application relates to a liveness detection method, an estimation network processing method, an apparatus, a computer device, a storage medium, and a computer-readable instruction product. The method includes: extracting image features from an image to be tested in different data domains; performing convolution processing on the image features through an estimation network to obtain a predicted mean and a predicted variance of the image features; obtaining network parameters used for standardization in the estimation network, and standardizing the image features based on the predicted mean, the predicted variance, and the network parameters to obtain standardized features; and determining whether the image to be tested is a live image according to the liveness classification probability obtained by classifying the image to be tested based on the standardized features.

In one embodiment, the client is an access-control system and the interaction request is an entry-switch opening request. As shown in FIG. 8, when a subject approaches gate 82 on which the access-control system is installed, the access-control system is triggered to perform liveness detection on the subject; it displays an image-capture frame 802 on entry-switch control page 80 and captures the subject's test image through the frame.
When the access-control system captures the test image through its camera, it sends the image to the service device (i.e., gate 82 on which the access-control system is installed). The service device extracts image features from the test image, pools them through the estimation network to obtain pooled image features, and performs at least two convolution operations on the pooled features to obtain the predicted mean and predicted variance. It normalizes the image features with these statistics to obtain normalized features, and performs an affine transformation on them based on the linear transformation and translation parameters to obtain standardized features. Finally, the service device inputs the standardized features into the classifier, which classifies the test image based on them to obtain a liveness classification probability; when the probability reaches the preset threshold, the test image is determined to be a live image, and otherwise a non-live image. The service device then sends the result to the access-control system: when the test image is a live image, the system opens the entry switch and displays the pass-permitted message 804 on the entry-switch control page; when it is a non-live image, the system does not open the entry switch.
In the embodiment above, liveness detection is performed on a subject requesting that the entry switch be opened, so as to verify the subject's identity, and the switch is opened only when the test image is a live image. This secures the access-control system and, compared with manual verification of the subject, saves cost and speeds up passage.
In one embodiment, as shown in FIG. 9, the liveness detection method includes the following steps:
S902: the client sends an interaction request to the service device.
The client may be an application installed on a terminal, such as a social application, a payment application, a shopping application or another application providing financial services. The client may also be an access-control system installed on the service device.
S904: the service device sends a detection instruction to the client in response to the interaction request.
S906: the client captures a test image of the subject under test according to the detection instruction.
S908: the client sends the test image to the service device.
S910: upon obtaining the test image, the service device extracts image features from it.
S912: the service device pools the image features through the estimation network and performs at least two convolution operations on the resulting pooled image features to obtain the predicted mean and predicted variance of the image features.
S914: the service device normalizes the image features according to the predicted mean and predicted variance to obtain normalized features, and performs an affine transformation on them based on the linear transformation and translation parameters to obtain standardized features.
S916: the service device inputs the standardized features into the classifier so that it classifies the test image based on them to obtain a liveness classification probability, and determines from the standardized features whether the test image is a live image.
S918: the service device sends the subject's liveness detection result to the client.
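Steps S910 to S916 amount to a single forward pass. Below is a sketch under assumed interfaces: extractor, estimator, norm and classifier stand in for the feature extraction network, estimation network, standardization layer and classifier, and the 0.5 threshold is illustrative.

```python
def detect_liveness(image, extractor, estimator, norm, classifier,
                    threshold=0.5):
    """End-to-end sketch of S910-S916 (interfaces and threshold assumed)."""
    feats = extractor(image)            # S910: extract image features
    mean, var = estimator(feats)        # S912: predicted mean/variance
    std_feats = norm(feats, mean, var)  # S914: standardized features
    prob = classifier(std_feats)        # S916: liveness probability
    return prob >= threshold            # live image if threshold reached
```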
In one embodiment, as shown in FIG. 10, the estimation network is obtained by training a pre-training estimation network. The service device trains the pre-training estimation network through the following steps:
S1002: the service device performs feature extraction on sample images of different data domains to obtain training image features.
Sample images of different data domains may be sample images of different categories captured in different application scenarios, such as images captured outdoors in strong light, outdoors in weak light, at night, and indoors.
The sample images may be images of different data domains drawn from a training image set. For example, the service device draws multiple live sample images and multiple non-live sample images of different data domains from the training image set. Note that the subject in a live sample image and the subject in a non-live sample image may or may not be the same live subject.
Training image features are characteristics inherent to a sample image that distinguish it from other sample images, including features that can be perceived directly and features obtained only through transformation or processing. Directly perceivable features include, for example, brightness, edges, texture, contour and color; features obtained only through transformation or processing include, for example, moments, histograms and principal components.
In one embodiment, S1002 may include: the service device computes the training image features of the sample images with a feature extraction algorithm. For example, the service device may extract edge features from a sample image with the Sobel operator, or compute the sample image's seventh-order feature moments with a seventh-order moment algorithm.
In another embodiment, S1002 may include: the service device extracts the training image features from the sample images with a neural network, such as a feed-forward neural network, a convolutional neural network, a residual convolutional neural network or a recurrent neural network.
In one embodiment, before extracting the training image features from the sample images, the service device may apply at least one of grayscale conversion, image enhancement and denoising as preprocessing.
S1004: the service device performs convolution processing on the training image features through the pre-training estimation network to obtain the training predicted mean and training predicted variance corresponding to the training image features.
In one embodiment, the service device convolves the training image features through the first convolution layer of the pre-training estimation network to obtain first convolution training features, inputs these into the second convolution layer of the pre-training estimation network to obtain second convolution training features, and so on through the remaining convolution layers until the training predicted mean and training predicted variance are obtained.
In one embodiment, S1004 may include: the service device first pools the training image features through the pre-training estimation network and then convolves the resulting pooled training image features to obtain the training predicted mean and training predicted variance of the training image features.
Pooling the training image features may mean downsampling them to reduce their data volume. Pooling includes global pooling and local pooling, and pooling methods include max pooling, average pooling, stochastic pooling, and so on.
In one embodiment, the service device first inputs the training image features into the global pooling layer of the pre-training estimation network to obtain pooled image features, and then performs multi-layer convolution on the pooled training features through several cascaded convolution layers to obtain the training predicted mean and training predicted variance of the training image features.
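A sketch of such an estimation head, assuming global average pooling and two 1×1 convolutions; the layer sizes, the activation and the softplus used to keep the variance positive are assumptions, since the disclosure requires only global pooling followed by at least two convolutions.

```python
import torch
import torch.nn as nn

class EstimationNetwork(nn.Module):
    """Predict a per-sample mean and variance from image features.
    Hypothetical layer sizes; the embodiment requires only global
    pooling followed by cascaded convolution layers."""

    def __init__(self, channels: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)                 # global pooling
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=1)
        self.conv2 = nn.Conv2d(channels, 2 * channels, kernel_size=1)

    def forward(self, feats):
        h = self.pool(feats)                                # (N, C, 1, 1)
        h = torch.relu(self.conv1(h))                       # 1st convolution
        h = self.conv2(h)                                   # 2nd convolution
        pred_mean, raw_var = h.chunk(2, dim=1)
        pred_var = nn.functional.softplus(raw_var)          # keep variance > 0
        return pred_mean, pred_var
```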
S1006: the service device determines the estimation loss value based on the training predicted mean and the statistical mean of the training image features, and the training predicted variance and the statistical variance of the training image features.
The statistical mean is the mean of the training image features obtained by mathematical calculation, and the statistical variance is their variance obtained by mathematical calculation.
In one embodiment, S1006 may include: the service device computes the difference or squared difference between the training predicted mean and the statistical mean of the training image features, and between the training predicted variance and the statistical variance of the training image features, and takes the sum of the two differences, or the sum of the two squared differences, as the estimation loss value.
In another embodiment, S1006 may include: computing the difference between the training predicted mean and the statistical mean of the training image features, and the difference between the training predicted variance and the statistical variance of the training image features, then squaring each difference and taking the sum of the two squares as the estimation loss value. For example, the service device computes the estimation loss value with formula (3), where $L_{mve}$ is the estimation loss value, $\mu$ is the predicted mean, $\bar{\mu}$ is the statistical mean, $\sigma^2$ is the predicted variance, and $\bar{\sigma}^2$ is the statistical variance:

$$L_{mve} = (\mu - \bar{\mu})^2 + (\sigma^2 - \bar{\sigma}^2)^2 \tag{3}$$
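A sketch of formula (3), assuming the statistical moments are taken over the batch and spatial axes of a feature tensor; the reduction axes are an assumption.

```python
def estimation_loss(pred_mean, pred_var, feats):
    """Formula (3): squared error between the predicted moments and the
    statistical moments of the training image features.
    Reduction axes (batch + spatial) are an assumption."""
    stat_mean = feats.mean(dim=(0, 2, 3), keepdim=True)
    stat_var = feats.var(dim=(0, 2, 3), keepdim=True, unbiased=False)
    return ((pred_mean - stat_mean) ** 2
            + (pred_var - stat_var) ** 2).sum()
```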
S1008: the service device adjusts the network parameters of the pre-training estimation network based on the estimation loss value.
In one embodiment, after obtaining the estimation loss value, the service device back-propagates it through the estimation network to obtain the gradients of the network parameters in each network layer of the estimation network, and adjusts the network parameters based on these gradients so that the estimation loss value produced by the adjusted estimation network is minimized.
In the embodiment above, adjusting the network parameters with the estimation loss value improves the estimation network's ability to predict means and variances as well as its generalization, so that the predicted mean and variance produced by the network approach the true mean and variance of the test image as closely as possible. When the service device obtains a test image, it can directly standardize the test image's features with the predicted mean and variance produced by the estimation network. Moreover, when the test image and the sample images follow different data distributions, the estimation network's strong generalization means that the predicted mean and variance yield more accurate standardized features than plugging in the mean and variance computed statistically from the sample images, improving detection accuracy. Global pooling of the training image features through the pre-training estimation network also reduces their data volume, avoiding overfitting while the network's parameters are adjusted and making the predicted mean and variance of the parameter-adjusted estimation network more accurate, which further improves the accuracy of liveness detection.
In one embodiment, the service device standardizes the training image features based on the predicted mean, the predicted variance and the network parameters used for standardization in the pre-training estimation network to obtain standardized training features; inputs the standardized training features into the pre-training classifier so that it classifies the sample images based on them to obtain sample classification probabilities; computes the classification loss value from the sample classification probabilities and the labels of the sample images; and adjusts the network parameters of the pre-training classifier and estimation network based on the classification loss value and the estimation loss value.
Through standardization, the service device converts the training image features into dimensionless values within a specific interval, making it convenient to process image features of different units or magnitudes uniformly.
In one embodiment, the service device may use the predicted mean, the predicted variance and the pre-training estimation network's standardization parameters as the parameters of the standardization algorithm, with the image features as its argument, thereby computing the standardized features.
In one embodiment, the service device may compute the classification loss value by applying a logarithmic loss function to the labels of the sample images and the classifier's classification results for them.
In another embodiment, the service device computes the cross-entropy between the classifier's classification results for the sample images and their labels, and derives the classification loss value from the expectation of the cross-entropy.
In another embodiment, the service device first performs feature extraction on a sample image through the feature extraction network, then inputs the resulting standardized features into the classifier, which classifies the sample image based on them. The service device takes the logarithm of the classification result, computes the cross-entropy between the classification result and the sample image's label, and derives the classification loss value from the expectation of the cross-entropy. For example, the service device computes the classification loss value with formula (4), where X is the set of sample images, Y the set of sample-image labels, x a sample image and y the label of sample image x; G(x) is the standardized feature obtained by extracting features from x through the feature extraction network, and C(G(x)) is the classification result obtained by the classifier from that standardized feature.
Here $\mathbb{E}_{(x,y)\in(X,Y)}$ denotes traversing all sample images, computing the loss between each sample image's label and the classifier's result, and averaging, i.e., taking the expectation of the classification loss:

$$L_{cls}(X,Y) = -\,\mathbb{E}_{(x,y)\in(X,Y)}\, y \log C(G(x)) \tag{4}$$
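A sketch of formula (4), under the common assumptions that C outputs class probabilities and the labels are one-hot:

```python
import torch

def classification_loss(probs, one_hot_labels, eps=1e-8):
    """Formula (4): negative expected log-probability of the true class
    (probability outputs and one-hot labels are assumptions)."""
    log_likelihood = (one_hot_labels * torch.log(probs + eps)).sum(dim=1)
    return -log_likelihood.mean()  # expectation over all sample images
```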
In the embodiment above, the service device adjusts the network parameters of the pre-training classifier and estimation network according to the classification loss value, making the predicted mean and variance of the parameter-adjusted estimation network and the liveness classification probabilities of the parameter-adjusted classifier more accurate, which improves the accuracy of liveness detection. In addition, standardizing the training image features with the predicted mean and variance after extracting them from the sample images helps avoid exploding- and vanishing-gradient problems while the network parameters of the estimation network and the classifier are adjusted.
In one embodiment, the service device obtains the depth-map labels of the sample images; performs depth feature extraction on the standardized training features to obtain a depth feature map; computes the depth-map loss value from the depth feature map and the depth-map labels; and adjusts the network parameters of the pre-training estimation network based on the depth-map loss value and the estimation loss value.
A depth-map label is a label used to annotate a sample image. A depth feature map is an image describing the sample image's depth features, obtained by performing depth feature extraction on the standardized training features through the depth estimation network.
In one embodiment, the service device computes the difference between the depth feature map and the sample image's depth-map label and takes the square of that difference as the depth-map loss value.
In another embodiment, the service device squares the depth feature map and the sample image's depth-map label separately and takes the difference between the two squares as the depth-map loss value.
In another embodiment, the service device computes the difference between the depth feature map and the sample image's depth-map label and then takes a norm of that difference as the depth-map loss value. For example, the service device computes the depth-map loss value with formula (5), where $L_{dep}(X;\mathrm{dep})$ is the depth-map loss value, X the set of sample images, I the depth-map label of a sample image, and D the depth feature map obtained by depth feature extraction on the standardized training features of X:

$$L_{dep}(X;\mathrm{dep}) = \lVert D - I \rVert \tag{5}$$
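A sketch of formula (5); the disclosure says only "a norm of the difference", so the L2 norm below is an assumption.

```python
import torch

def depth_map_loss(depth_pred, depth_label):
    """Formula (5): norm of the difference between the estimated depth
    feature map D and the depth-map label I (L2 norm assumed)."""
    return torch.norm(depth_pred - depth_label)
```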
In one embodiment, the service device obtains a SIFT (Scale-Invariant Feature Transform) label for the sample image and performs SIFT feature estimation on the standardized training features to obtain estimated SIFT features. The service device computes a SIFT loss value from the estimated SIFT features and the SIFT label, and adjusts the network parameters of the pre-training estimation network based on the SIFT loss value and the estimation loss value.
In one embodiment, after obtaining the depth-map loss value and the classification loss value, the service device back-propagates the depth-map loss value through the depth estimation network and the classification loss value through the feature extraction network to obtain the gradients of the network parameters in each layer of both networks, and adjusts the parameters of the estimation network within the feature extraction network and of the depth estimation network based on these gradients, so that the sum of the depth-map loss value and the classification loss value produced by the adjusted networks is minimized. This improves the accuracy of the predicted mean and variance produced by the estimation network and of the liveness classification probability computed by the classifier for the test image, and thus the accuracy of the liveness detection result.
In one embodiment, the service device computes the estimation, depth-map and classification loss values and then adjusts the network parameters of the feature extraction network, the classifier and the depth estimation network so that the sum of the three loss values is minimized, improving the accuracy of the predicted mean and variance produced by the estimation network, of the liveness classification probability computed by the classifier for the test image, and of the liveness detection result.
In one embodiment, the service device performs depth feature extraction on the standardized training features through the depth estimation network to obtain the depth feature map, obtains the standardized training features through the feature extraction network, and then computes the estimation, depth-map and classification loss values. The service device may adjust the depth estimation network's parameters at the same time as it adjusts the feature extraction network and the classifier according to the three loss values, or it may train the depth estimation network separately on training samples to adjust its network parameters.
In one embodiment, as shown in FIG. 11, the service device obtains sample images, the sample images including live sample images and non-live sample images. When a sample image is a live sample image, the service device performs depth-map calculation on it to obtain its depth-map label; when a sample image is a non-live sample image, it uses a black base map of the same size as the sample image as the depth-map label. The service device inputs a sample image into the feature extraction network, extracts its training sample features through the network's convolution layers, and inputs each convolution layer's output into the data standardization layer connected to it, which standardizes that output. The service device then feeds the standardized features into the next convolution layer, and so on, up to the last data standardization layer. The service device inputs the standardized features output by the feature extraction network's last data standardization layer into the classifier to obtain the liveness detection result, and also inputs them into the depth estimation network, which performs depth feature extraction on them to obtain the depth feature map. The service device computes the classification loss value from the classifier's output and the sample images' labels, and the depth-map loss value from the depth feature map and the sample images' depth-map labels, and adjusts the network parameters of the feature extraction network and the classifier so that the sum of the depth-map loss value and the classification loss value is minimized. In one embodiment, the service device also determines the estimation loss value based on the training predicted mean and the statistical mean of the training image features, and the training predicted variance and their statistical variance, and adjusts the network parameters of the feature extraction network and the classifier so that the sum of the depth-map, classification and estimation loss values is minimized.
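A minimal training-step sketch combining the three losses, reusing the estimation_loss, classification_loss and depth_map_loss sketches above; the extractor interface (returning raw features, predicted moments and standardized features) and the equal loss weighting are assumptions.

```python
def train_step(batch, labels, depth_labels,
               extractor, classifier, depth_net, optimizer):
    """One optimization step minimizing the sum of the classification,
    depth-map and estimation losses (module interfaces are hypothetical)."""
    feats, pred_mean, pred_var, std_feats = extractor(batch)
    loss = (classification_loss(classifier(std_feats), labels)
            + depth_map_loss(depth_net(std_feats), depth_labels)
            + estimation_loss(pred_mean, pred_var, feats))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```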
In one embodiment, the sample images include live sample images and non-live sample images, and obtaining the depth-map label of a sample image includes: when the sample image is a live sample image, performing depth-map calculation on the sample image to obtain the depth-map label; when the sample image is a non-live sample image, generating a black base map of the same size as the sample image and using the black base map as the depth-map label.
A live sample image is a sample image containing a live subject: if a sample image is a live sample image, the subject in it is a live subject, e.g., a sample image obtained by photographing the user with a camera; if a sample image is a non-live sample image, the subject in it is not a live subject, e.g., a sample image obtained by photographing the user's photograph or a worn mask with a camera.
In one embodiment, the service device computes the disparity of each pixel in the sample image and derives the sample image's depth-map label from these disparities. In another embodiment, the service device performs depth calculation on the sample image with a trained neural network to obtain the depth-map label.
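A sketch of the labeling rule, assuming single-channel depth maps stored as arrays; depth_from_image stands in for either the disparity-based or the network-based depth computation.

```python
import numpy as np

def make_depth_label(sample_image, is_live, depth_from_image):
    """Live sample images get a computed depth map; non-live sample
    images get an all-black base map of the same size."""
    if is_live:
        return depth_from_image(sample_image)  # hypothetical helper
    h, w = sample_image.shape[:2]
    return np.zeros((h, w), dtype=np.float32)  # black base map
```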
In the embodiment above, the sample images include live sample images and non-live sample images, and the service device obtains a depth-map label for each kind of sample image. The service device can then compute the depth loss value from a sample image's depth-map label and the depth feature map estimated for it by the depth estimation network, and use the depth loss value to train the estimation network within the feature extraction network as well as the depth estimation network, improving the accuracy of the predicted mean and variance estimated by the estimation network and, in turn, the accuracy of the liveness detection result.
In one embodiment, the service device performs face recognition on the sample image to obtain a target region containing facial features; deletes, in the sample image, the image region outside the target region; and obtains the depth-map label of the sample image after the image region is deleted.
Face recognition is a technique for detecting the face region in an image based on facial feature information. The service device performs face recognition on the sample image to obtain the target region containing facial features, shown in the box in FIG. 12a. In one embodiment, after obtaining the target region, the service device enlarges it to capture more background content; the enlarged target region is shown in FIG. 12b. The service device then deletes the image region outside the target region, or alternatively outside the enlarged target region. Deleting the image region outside the target region in the white box of FIG. 12b yields the image shown in FIG. 12c. If the sample image of FIG. 12a is a live sample image, the service device performs depth-map calculation on the image of FIG. 12c to obtain the depth-map label shown in FIG. 12d.
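A sketch of the crop, with detect_face standing in for any face detector returning a bounding box and margin controlling the enlargement for extra background; both are assumptions.

```python
def crop_face_region(image, detect_face, margin=0.2):
    """Keep only the (enlarged) target region containing the face and
    delete everything outside it, as in FIGS. 12a-12c.
    Detector interface and margin are hypothetical."""
    x, y, w, h = detect_face(image)            # target region (face box)
    dx, dy = int(w * margin), int(h * margin)  # enlarge for background
    top, left = max(0, y - dy), max(0, x - dx)
    return image[top:y + h + dy, left:x + w + dx]
```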
In one embodiment, if the sample image is a color image, the service device converts it to grayscale before performing face recognition; grayscale algorithms include the component method, the maximum method, the average method, the weighted-average method, and so on.
In one embodiment, the service device applies image enhancement to the sample image before performing face recognition; image enhancement algorithms include spatial-domain methods and frequency-domain methods.
In one embodiment, as shown in FIG. 13, an estimation network processing method for liveness detection is provided. Taking the method as applied to the service device of FIG. 1 as an example, it includes the following steps:
S1302: the service device performs feature extraction on sample images of different data domains to obtain training image features.
S1304: the service device performs convolution processing on the training image features through the pre-training estimation network to obtain the training predicted mean and training predicted variance of the training image features.
S1306: the service device determines the estimation loss value based on the training predicted mean and the statistical mean of the training image features, and the training predicted variance and the statistical variance of the training image features.
S1308: the service device adjusts the network parameters of the pre-training estimation network based on the estimation loss value. The parameter-adjusted estimation network is used to determine the predicted mean and predicted variance of the image features in a test image, so that the image features are standardized based on the predicted mean, the predicted variance and the estimation network's standardization parameters, and whether the test image is a live image is determined from the standardized features obtained by the standardization.
In one embodiment, the service device standardizes the training image features based on the predicted mean, the predicted variance and the pre-training estimation network's standardization parameters to obtain standardized training features; inputs them into the pre-training classifier so that it classifies the sample images based on them to obtain sample classification probabilities; and computes the classification loss value from the probabilities and the sample images' labels. S1308 may then include: the service device adjusts the network parameters of the pre-training classifier and estimation network based on the classification loss value and the estimation loss value.
In one embodiment, the service device obtains the depth-map labels of the sample images, performs depth feature extraction on the standardized training features to obtain a depth feature map, and computes the depth-map loss value from the depth feature map and the depth-map labels. S1308 may then include: the service device adjusts the network parameters of the pre-training estimation network based on the depth-map loss value and the estimation loss value.
In one embodiment, the sample images include live sample images and non-live sample images, and obtaining the depth-map label of a sample image includes: when the sample image is a live sample image, performing depth-map calculation on it to obtain the depth-map label; when the sample image is a non-live sample image, generating a black base map of the same size as the sample image and using it as the depth-map label.
In one embodiment, the service device performs face recognition on the sample image to obtain a target region containing facial features; deletes, in the sample image, the image region outside the target region; and obtains the depth-map label of the sample image after the image region is deleted.
For the specific steps of S1302 to S1308, refer to the embodiments of FIG. 2 and FIG. 10.
It should be understood that, although the steps in the flowcharts of FIGS. 2, 6, 10 and 13 are displayed sequentially in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, the execution of these steps is not strictly ordered and they may be executed in other orders. Moreover, at least some of the steps in FIGS. 2, 6, 10 and 13 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments, and whose execution order is not necessarily sequential; they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 14, a liveness detection apparatus is provided. The apparatus may be implemented as software modules, hardware modules, or a combination of both forming part of a computer device, and specifically includes a feature extraction module 1402, a convolution processing module 1404, an obtaining module 1406, a standardization processing module 1408 and a determination module 1410, wherein:
the feature extraction module 1402 is configured to extract image features from test images of different data domains;
the convolution processing module 1404 is configured to perform convolution processing on the image features through the estimation network to obtain the predicted mean and predicted variance of the image features;
the obtaining module 1406 is configured to obtain the network parameters used for standardization in the estimation network;
the standardization processing module 1408 is configured to standardize the image features based on the predicted mean, the predicted variance and the network parameters to obtain standardized features;
the determination module 1410 is configured to determine whether the test image is a live image according to the liveness classification probability obtained by classifying the test image based on the standardized features.
In the liveness detection method, estimation network processing method, apparatus, computer device, storage medium and computer-readable instruction product above, the estimation network predicts the mean and variance of the extracted image features, which avoids relying on the mean and variance of the data standardization layer fixed at model-training time. Test images obtained in any scenario can therefore be standardized with the predicted mean and predicted variance, and liveness detection performed on the resulting standardized features, improving both the general applicability of liveness detection and its accuracy on test images from different data domains. Furthermore, the standardization combines the estimation network's standardization parameters with the predicted mean and predicted variance; because these network parameters are learned through model training, the resulting standardized features are better suited to liveness detection, which helps improve the accuracy of liveness detection on the test image.
In one embodiment, the network parameters include a linear transformation parameter and a translation parameter, and the standardization processing module 1408 is further configured to:
normalize the image features according to the predicted mean and the predicted variance to obtain normalized features; and
perform an affine transformation on the normalized features based on the linear transformation parameter and the translation parameter to obtain the standardized features.
In one embodiment, the standardization processing module 1408 is further configured to:
obtain the statistical mean and statistical variance of the training image features, the training image features being image features extracted from the sample images;
perform weighted calculation on the predicted mean and the statistical mean to obtain a weighted mean, and on the predicted variance and the statistical variance to obtain a weighted variance; and
standardize the image features according to the weighted mean, the weighted variance and the network parameters to obtain the standardized features.
In one embodiment, the determination module 1410 is further configured to:
input the standardized features into the classifier so that the classifier classifies the test image based on them to obtain the liveness classification probability;
determine that the test image is a live image when the liveness classification probability reaches the preset threshold; and
determine that the test image is a non-live image when the liveness classification probability does not reach the preset threshold.
In one embodiment, the test image is an image captured by the client in response to an interaction request; as shown in FIG. 15, the apparatus further includes:
a generation module 1412 configured to generate resource-account information and return it to the client when the interaction request is a resource-account registration request and the test image is a live image;
a resource transfer module 1414 configured to transfer the amount of resources specified in the resource-transfer request when the interaction request is a resource-transfer request and the test image is a live image; and
an entry-switch opening module 1416 configured to open the entry switch when the interaction request is an entry-switch opening request and the test image is a live image.
In one embodiment, the estimation network is obtained by training a pre-training estimation network, and the apparatus further includes:
the feature extraction module 1402, further configured to perform feature extraction on sample images of different data domains to obtain training image features;
the convolution processing module 1404, further configured to perform convolution processing on the training image features through the pre-training estimation network to obtain the corresponding training predicted mean and training predicted variance;
a calculation module 1418 configured to determine the estimation loss value based on the training predicted mean and the statistical mean of the training image features, and the training predicted variance and the statistical variance of the training image features; and
a network parameter adjustment module 1420 configured to adjust the network parameters of the pre-training estimation network based on the estimation loss value.
In one embodiment, the apparatus further includes:
the standardization processing module 1408, further configured to standardize the training image features based on the predicted mean, the predicted variance and the pre-training estimation network's standardization parameters to obtain standardized training features;
a classification module 1422 configured to input the standardized training features into the pre-training classifier so that it classifies the sample images based on them to obtain sample classification probabilities;
the calculation module 1418, further configured to compute the classification loss value from the sample classification probabilities and the sample images' labels; and
the network parameter adjustment module 1420, further configured to adjust the network parameters of the pre-training classifier and estimation network based on the classification loss value.
In one embodiment, the apparatus further includes:
the obtaining module 1406, further configured to obtain the depth-map labels of the sample images;
a depth feature extraction module 1424 configured to perform depth feature extraction on the standardized training features to obtain a depth feature map;
the calculation module 1418, further configured to compute the depth-map loss value from the depth feature map and the depth-map labels; and
the network parameter adjustment module 1420, further configured to adjust the network parameters of the pre-training estimation network based on the depth-map loss value and the estimation loss value.
In one embodiment, the sample images include live sample images and non-live sample images, and the obtaining module 1406 is further configured to:
perform depth-map calculation on a sample image that is a live sample image to obtain its depth-map label; and
generate, for a sample image that is a non-live sample image, a black base map of the same size as the sample image and use the black base map as the depth-map label.
In one embodiment, the apparatus further includes:
a face recognition module 1426 configured to perform face recognition on the sample image to obtain a target region containing facial features;
a deletion module 1428 configured to delete, in the sample image, the image region outside the target region; and
the obtaining module 1406, further configured to obtain the depth-map label of the sample image after the image region is deleted.
For specific limitations of the liveness detection apparatus, see the limitations of the liveness detection method above, which are not repeated here. Each module of the liveness detection apparatus may be implemented wholly or partly in software, hardware or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, as shown in FIG. 16, an estimation network processing apparatus for liveness detection is provided. The apparatus may be implemented as software modules, hardware modules, or a combination of both forming part of a computer device, and specifically includes a feature extraction module 1602, a convolution processing module 1604, a standardization processing module 1606 and a determination module 1608, wherein:
the feature extraction module 1602 is configured to perform feature extraction on sample images of different data domains to obtain training image features;
the convolution processing module 1604 is configured to perform convolution processing on the training image features through the pre-training estimation network to obtain the training predicted mean and training predicted variance of the training image features;
the standardization processing module 1606 is configured to determine the estimation loss value based on the training predicted mean and the statistical mean of the training image features, and the training predicted variance and the statistical variance of the training image features;
the determination module 1608 is configured to adjust the network parameters of the pre-training estimation network based on the estimation loss value; the parameter-adjusted estimation network is used to determine the mean and variance of the image features in a test image, so that the image features are standardized based on the mean, the variance and the estimation network's standardization parameters, and whether the test image is a live image is determined from the standardized features obtained by the standardization.
In the liveness detection method, estimation network processing method, apparatus, computer device, storage medium and computer-readable instruction product above, the pre-training estimation network predicts the mean and variance of the extracted image features to obtain the training predicted mean and training predicted variance; the estimation loss value is then computed from the training predicted mean and the statistical mean and from the training predicted variance and the statistical variance, and the estimation network's parameters are adjusted based on it. The estimation network can thus estimate the corresponding mean and variance for sample images from any application scenario, improving the network's generalization and, in turn, the accuracy of liveness detection on test images from different data domains. Furthermore, the standardization combines the estimation network's standardization parameters with the predicted mean and predicted variance; because these network parameters are learned through model training, the resulting standardized features are better suited to liveness detection, which helps improve the accuracy of liveness detection on the test image.
In one embodiment, as shown in FIG. 17, the apparatus further includes:
the standardization processing module 1606, further configured to standardize the training image features based on the predicted mean, the predicted variance and the pre-training estimation network's standardization parameters to obtain standardized training features;
a classification module 1610 configured to input the standardized training features into the pre-training classifier so that it classifies the sample images based on them to obtain sample classification probabilities;
a calculation module 1612 configured to compute the classification loss value from the sample classification probabilities and the sample images' labels; and
a network parameter adjustment module 1614 configured to adjust the network parameters of the pre-training classifier and estimation network based on the classification loss value.
In one embodiment, the apparatus further includes:
an obtaining module 1616 configured to obtain the depth-map labels of the sample images;
a depth feature extraction module 1618 configured to perform depth feature extraction on the standardized training features to obtain a depth feature map;
the calculation module 1612, further configured to compute the depth-map loss value from the depth feature map and the depth-map labels; and
the network parameter adjustment module 1614, further configured to adjust the network parameters of the pre-training estimation network based on the depth-map loss value and the estimation loss value.
In one embodiment, the sample images include live sample images and non-live sample images, and the obtaining module 1616 is further configured to:
perform depth-map calculation on a sample image that is a live sample image to obtain its depth-map label; and
generate, for a sample image that is a non-live sample image, a black base map of the same size as the sample image and use the black base map as the depth-map label.
In one embodiment, the apparatus further includes:
a face recognition module 1620 configured to perform face recognition on the sample image to obtain a target region containing facial features;
a deletion module 1622 configured to delete, in the sample image, the image region outside the target region; and
the obtaining module 1616, further configured to obtain the depth-map label of the sample image after the image region is deleted.
For specific limitations of the estimation network processing apparatus for liveness detection, see the limitations of the estimation network processing method for liveness detection above, which are not repeated here. Each module of the apparatus may be implemented wholly or partly in software, hardware or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a service device, and its internal structure may be as shown in FIG. 18. The computer device includes a processor, a memory and a network interface connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions and a database, and the internal memory provides the runtime environment for the operating system and the computer-readable instructions in the non-volatile storage medium. The database of the computer device stores liveness detection data. The network interface of the computer device communicates with external terminals over a network connection. The computer-readable instructions, when executed by the processor, implement a liveness detection method.
Those skilled in the art will understand that the structure shown in FIG. 18 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown, combine certain components, or arrange the components differently.
In one embodiment, a computer device is also provided, including a memory and a processor; the memory stores computer-readable instructions, and the processor implements the steps of the method embodiments above when executing the computer-readable instructions.
In one embodiment, a computer-readable storage medium is provided, storing computer-readable instructions that, when executed by a processor, implement the steps of the method embodiments above.
In one embodiment, a computer-readable instruction product is provided, including computer-readable instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer-readable instructions from the storage medium and executes them, causing the computer device to perform the steps of the method embodiments above.
A person of ordinary skill in the art will understand that all or part of the flows of the method embodiments above may be completed by computer-readable instructions instructing the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the flows of the method embodiments above. Any reference to memory, storage, database or other media used in the embodiments provided in this application may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, and the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the embodiments above may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the embodiments above are described; however, as long as a combination of these technical features is not contradictory, it should be considered within the scope of this specification.
The embodiments above express only several implementations of this application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the concept of this application, all of which fall within the scope of protection of this application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (20)

  1. A liveness detection method, performed by a computer device, the method comprising:
    extracting image features from test images of different data domains;
    performing convolution processing on the image features through an estimation network to obtain a predicted mean and a predicted variance of the image features;
    obtaining network parameters used for standardization in the estimation network;
    standardizing the image features based on the predicted mean, the predicted variance and the network parameters to obtain standardized features; and
    determining, according to a liveness classification probability obtained by classifying the test image based on the standardized features, whether the test image is a live image.
  2. The method according to claim 1, wherein the network parameters comprise a linear transformation parameter and a translation parameter, and standardizing the image features based on the predicted mean, the predicted variance and the network parameters to obtain the standardized features comprises:
    normalizing the image features according to the predicted mean and the predicted variance to obtain normalized features; and
    performing an affine transformation on the normalized features based on the linear transformation parameter and the translation parameter to obtain the standardized features.
  3. The method according to claim 1, wherein standardizing the image features based on the predicted mean, the predicted variance and the network parameters to obtain the standardized features comprises:
    obtaining a statistical mean and a statistical variance of training image features, the training image features being image features extracted from sample images;
    performing weighted calculation on the predicted mean and the statistical mean to obtain a weighted mean, and performing weighted calculation on the predicted variance and the statistical variance to obtain a weighted variance; and
    standardizing the image features according to the weighted mean, the weighted variance and the network parameters to obtain the standardized features.
  4. The method according to claim 1, wherein determining, according to the liveness classification probability obtained by classifying the test image based on the standardized features, whether the test image is a live image comprises:
    inputting the standardized features into a classifier so that the classifier classifies the test image based on the standardized features to obtain the liveness classification probability;
    determining that the test image is a live image when the liveness classification probability reaches a preset threshold; and
    determining that the test image is a non-live image when the liveness classification probability does not reach the preset threshold.
  5. The method according to claim 1, wherein the test image is an image captured by a client in response to an interaction request, and the method further comprises:
    when the interaction request is a resource-account registration request and the test image is the live image, generating resource-account information and returning the resource-account information to the client;
    when the interaction request is a resource-transfer request and the test image is the live image, transferring an amount of resources specified in the resource-transfer request; and
    when the interaction request is an entry-switch opening request and the test image is the live image, opening an entry switch.
  6. The method according to any one of claims 1 to 5, wherein the estimation network is obtained by training a pre-training estimation network, and training the pre-training estimation network comprises:
    performing feature extraction on sample images of different data domains to obtain training image features;
    performing convolution processing on the training image features through the pre-training estimation network to obtain a training predicted mean and a training predicted variance corresponding to the training image features;
    determining an estimation loss value based on the training predicted mean and a statistical mean of the training image features, and the training predicted variance and a statistical variance of the training image features; and
    adjusting network parameters of the pre-training estimation network based on the estimation loss value.
  7. The method according to claim 6, further comprising:
    standardizing the training image features based on the predicted mean, the predicted variance and network parameters used for standardization in the pre-training estimation network to obtain standardized training features;
    inputting the standardized training features into a pre-training classifier so that the pre-training classifier classifies the sample images based on the standardized training features to obtain sample classification probabilities; and
    calculating a classification loss value according to the sample classification probabilities and labels of the sample images;
    wherein adjusting the network parameters of the pre-training estimation network based on the estimation loss value comprises:
    adjusting network parameters of the pre-training classifier and the estimation network based on the classification loss value and the estimation loss value.
  8. The method according to claim 6, further comprising:
    obtaining depth-map labels of the sample images;
    performing depth feature extraction on the standardized training features to obtain a depth feature map; and
    calculating a depth-map loss value according to the depth feature map and the depth-map labels;
    wherein adjusting the network parameters of the pre-training estimation network based on the estimation loss value comprises:
    adjusting the network parameters of the pre-training estimation network based on the depth-map loss value and the estimation loss value.
  9. The method according to claim 8, wherein the sample images comprise live sample images and non-live sample images, and obtaining the depth-map labels of the sample images comprises:
    when a sample image is a live sample image, performing depth-map calculation on the sample image to obtain the depth-map label; and
    when a sample image is a non-live sample image, generating a black base map of the same size as the sample image and using the black base map as the depth-map label.
  10. The method according to claim 8, further comprising:
    performing face recognition on the sample image to obtain a target region containing facial features; and
    deleting, in the sample image, the image region outside the target region;
    wherein obtaining the depth-map label of the sample image comprises:
    obtaining the depth-map label of the sample image after the image region is deleted.
  11. An estimation network processing method for liveness detection, performed by a computer device, the method comprising:
    performing feature extraction on sample images of different data domains to obtain training image features;
    performing convolution processing on the training image features through a pre-training estimation network to obtain a training predicted mean and a training predicted variance of the training image features;
    determining an estimation loss value based on the training predicted mean and a statistical mean of the training image features, and the training predicted variance and a statistical variance of the training image features; and
    adjusting network parameters of the pre-training estimation network based on the estimation loss value, wherein the parameter-adjusted estimation network is configured to determine a predicted mean and a predicted variance of image features in a test image, so that the image features are standardized based on the predicted mean, the predicted variance and network parameters used for standardization in the estimation network, and whether the test image is a live image is determined according to the standardized features obtained by the standardization.
  12. The method according to claim 11, further comprising:
    standardizing the training image features based on the predicted mean, the predicted variance and network parameters used for standardization in the pre-training estimation network to obtain standardized training features;
    inputting the standardized training features into a pre-training classifier so that the pre-training classifier classifies the sample images based on the standardized training features to obtain sample classification probabilities; and
    calculating a classification loss value according to the sample classification probabilities and labels of the sample images;
    wherein adjusting the network parameters of the pre-training estimation network based on the estimation loss value comprises:
    adjusting network parameters of the pre-training classifier and the estimation network based on the classification loss value and the estimation loss value.
  13. The method according to claim 11, further comprising:
    obtaining depth-map labels of the sample images;
    performing depth feature extraction on the standardized training features to obtain a depth feature map; and
    calculating a depth-map loss value according to the depth feature map and the depth-map labels;
    wherein adjusting the network parameters of the pre-training estimation network based on the estimation loss value comprises:
    adjusting the network parameters of the pre-training estimation network based on the depth-map loss value and the estimation loss value.
  14. The method according to claim 13, wherein the sample images comprise live sample images and non-live sample images, and obtaining the depth-map labels of the sample images comprises:
    when a sample image is a live sample image, performing depth-map calculation on the sample image to obtain the depth-map label; and
    when a sample image is a non-live sample image, generating a black base map of the same size as the sample image and using the black base map as the depth-map label.
  15. The method according to claim 13, further comprising:
    performing face recognition on the sample image to obtain a target region containing facial features; and
    deleting, in the sample image, the image region outside the target region;
    wherein obtaining the depth-map label of the sample image comprises:
    obtaining the depth-map label of the sample image after the image region is deleted.
  16. A liveness detection apparatus, comprising:
    a feature extraction module configured to extract image features from test images of different data domains;
    a convolution processing module configured to perform convolution processing on the image features through an estimation network to obtain a predicted mean and a predicted variance of the image features;
    an obtaining module configured to obtain network parameters used for standardization in the estimation network;
    a standardization processing module configured to standardize the image features based on the predicted mean, the predicted variance and the network parameters to obtain standardized features; and
    a determination module configured to determine, according to a liveness classification probability obtained by classifying the test image based on the standardized features, whether the test image is a live image.
  17. An estimation network processing apparatus for liveness detection, comprising:
    a feature extraction module configured to perform feature extraction on sample images of different data domains to obtain training image features;
    a convolution processing module configured to perform convolution processing on the training image features through a pre-training estimation network to obtain a training predicted mean and a training predicted variance of the training image features;
    a standardization processing module configured to determine an estimation loss value based on the training predicted mean and a statistical mean of the training image features, and the training predicted variance and a statistical variance of the training image features; and
    a determination module configured to adjust network parameters of the pre-training estimation network based on the estimation loss value, wherein the parameter-adjusted estimation network is configured to determine a predicted mean and a predicted variance of image features in a test image, so that the image features are standardized based on the predicted mean, the predicted variance and network parameters used for standardization in the estimation network, and whether the test image is a live image is determined according to the standardized features obtained by the standardization.
  18. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions, wherein the processor, when executing the computer-readable instructions, implements the steps of the method according to any one of claims 1 to 15.
  19. A computer-readable storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 15.
  20. A computer-readable instruction product, comprising computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 15.
PCT/CN2022/088444 2021-05-25 2022-04-22 Liveness detection method, estimation network processing method and apparatus, computer device, and computer-readable instruction product WO2022247539A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/993,246 US20230082906A1 (en) 2021-05-25 2022-11-23 Liveness detection method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110569464.4A CN113033519B (zh) 2021-05-25 2021-05-25 Liveness detection method, estimation network processing method and apparatus, and computer device
CN202110569464.4 2021-05-25

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/993,246 Continuation US20230082906A1 (en) 2021-05-25 2022-11-23 Liveness detection method

Publications (1)

Publication Number Publication Date
WO2022247539A1 true WO2022247539A1 (zh) 2022-12-01

Family

ID=76455842

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/088444 WO2022247539A1 (zh) 2021-05-25 2022-04-22 Liveness detection method, estimation network processing method and apparatus, computer device, and computer-readable instruction product

Country Status (3)

Country Link
US (1) US20230082906A1 (zh)
CN (1) CN113033519B (zh)
WO (1) WO2022247539A1 (zh)


Also Published As

Publication number Publication date
CN113033519A (zh) 2021-06-25
CN113033519B (zh) 2021-08-31
US20230082906A1 (en) 2023-03-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22810272

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE