WO2021068322A1 - Training method and apparatus for living body detection model, computer device, and storage medium - Google Patents

Training method and apparatus for living body detection model, computer device, and storage medium Download PDF

Info

Publication number
WO2021068322A1
Authority
WO
WIPO (PCT)
Prior art keywords
living body
target
candidate region
detected
position information
Prior art date
Application number
PCT/CN2019/116269
Other languages
French (fr)
Chinese (zh)
Inventor
赵娅琳
陆进
陈斌
宋晨
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2021068322A1 publication Critical patent/WO2021068322A1/en

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40: Spoof detection, e.g. liveness detection
    • G06V40/45: Detection of the body part being alive
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/084: Backpropagation, e.g. using gradient descent
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172: Classification, e.g. identification
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/70: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in livestock or poultry

Definitions

  • This application relates to a training method and apparatus for a living body detection model, a computer device, and a storage medium.
  • Near-infrared living body detection uses infrared light in a spectral band different from that of visible light. It can be performed on near-infrared images without the user's cooperation (a "blind" measurement), which makes the living body detection algorithm less cumbersome and more accurate; while reducing production costs, it can also better protect the interests of the relevant users and enterprises.
  • The traditional near-infrared living body detection method is usually divided into two steps. First, a face detector detects the human face in the color picture formed under visible light; then the LBP features of the face are extracted at the corresponding position of the near-infrared image and input into a living body discriminator for living body judgment.
  • The inventors realized that, in this approach, each step is an independent task: the face detector and the living body discriminator must be trained separately, the fit between the models is low, and the accuracy of the living body discriminator is easily affected by the face detector, resulting in low accuracy of the trained model.
  • a training method, device, computer device, and storage medium of a living body detection model are provided.
  • a method for training a living body detection model comprising:
  • the initial living body detection model includes an initial candidate region generation network and an initial living body classification network;
  • the training samples corresponding to the second training sample set include a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information;
  • the color image is input into the first candidate region generation network to obtain current face candidate region position information, and the current face candidate region position information and the near-infrared image are input into the first living body classification network to obtain current living body position information;
  • a training device for a living body detection model comprising:
  • An initial model acquisition module for acquiring an initial living body detection model, the initial living body detection model including an initial candidate region generation network and an initial living body classification network;
  • the training sample acquisition module is used to acquire a first training sample set and a second training sample set; the training samples corresponding to the second training sample set include a color image, a near-infrared image corresponding to the color image, and a corresponding target Living body position information;
  • the first training module is configured to train the initial candidate region generation network according to the first training sample set until it converges to obtain the first candidate region generation network;
  • the second training module is configured to train the initial living body classification network according to the first candidate region generation network and the second training sample set until it converges to obtain a first living body classification network;
  • the input module is used to input the color image into the first candidate region generation network to obtain the current face candidate region position information, and to input the current face candidate region position information and the near-infrared image into the first living body classification network to obtain the current living body position information;
  • the parameter adjustment module is configured to adjust the parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information, and to return to the step of inputting the color image into the first candidate region generation network until convergence, to obtain a target candidate region generation network;
  • the living body detection model obtaining module trains the first living body classification network according to the target candidate region generation network and the second training sample set until convergence to obtain a target living body classification network, and obtains a trained target living body detection model according to the target candidate region generation network and the target living body classification network.
  • a computer device includes a memory and one or more processors.
  • the memory stores computer-readable instructions.
  • When the computer-readable instructions are executed by the one or more processors, the one or more processors perform the following steps:
  • the initial living body detection model includes an initial candidate region generation network and an initial living body classification network;
  • the training samples corresponding to the second training sample set include a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information;
  • the color image is input into the first candidate region generation network to obtain current face candidate region position information, and the current face candidate region position information and the near-infrared image are input into the first living body classification network to obtain current living body position information;
  • One or more non-volatile computer-readable storage media storing computer-readable instructions.
  • When the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps:
  • the initial living body detection model includes an initial candidate region generation network and an initial living body classification network;
  • the training samples corresponding to the second training sample set include a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information;
  • the color image is input into the first candidate region generation network to obtain current face candidate region position information, and the current face candidate region position information and the near-infrared image are input into the first living body classification network to obtain current living body position information;
  • Fig. 1 is an application scenario diagram of a method for training a living body detection model according to one or more embodiments.
  • Fig. 2 is a schematic flowchart of a method for training a living body detection model according to one or more embodiments.
  • Fig. 3 is a schematic flowchart of steps for obtaining location information of a target face candidate area according to one or more embodiments.
  • Fig. 4 is a block diagram of a training device for a living body detection model according to one or more embodiments.
  • Fig. 5 is a block diagram of a computer device according to one or more embodiments.
  • the training method of the living body detection model provided in this application can be applied to the application environment as shown in FIG. 1.
  • The computer device 102 first obtains the initial living body detection model, which includes the initial candidate region generation network and the initial living body classification network, and trains the initial candidate region generation network according to the first training sample set until convergence to obtain the first candidate region generation network. It then trains the initial living body classification network according to the first candidate region generation network and the second training sample set to obtain the first living body classification network, and adjusts the parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information to obtain the target candidate region generation network.
  • Next, the target candidate region generation network and the second training sample set are used to train the first living body classification network until convergence to obtain the target living body classification network, and finally the trained target living body detection model is obtained according to the target candidate region generation network and the target living body classification network. Further, after the computer device 102 obtains the target living body detection model through training, the model can be stored locally or sent to the computer device 104.
  • the computer device 102 and the computer device 104 may be, but are not limited to, various personal computers and notebook computers.
  • a training method of a living body detection model is provided. Taking the method applied to the above-mentioned computer device 102 as an example for description, the method includes the following steps:
  • Step 202 Obtain an initial living body detection model.
  • the initial living body detection model includes an initial candidate region generation network and an initial living body classification network.
  • The initial living body detection model may be a predetermined model used for living body detection that is to be trained; it may be a living body detection model that has not been trained at all or one whose training has not yet been completed.
  • the initial living body detection model includes the initial candidate region generation network and the initial living body classification network.
  • The initial candidate region generation network is used for training to obtain the target candidate region generation network, which extracts candidate regions from the input image; the initial living body classification network is used for training to obtain the target living body classification network, which performs living body classification according to the input image to obtain the living body detection result.
  • In some embodiments, step 202 further includes the following steps:
  • the network structure information of the initial live detection model can be determined. Specifically, since the initial living body detection model includes the initial candidate region generating network and the initial living body classification network, the network structure information of the initial candidate region generating network and the network structure information of the initial living body classification network can be determined respectively.
  • the initial candidate region generation network and the initial living body classification network can be various neural networks.
  • Specifically, it can be determined which kind of neural network each of the initial candidate region generation network and the initial living body classification network is, how many layers of neurons each includes, how many neurons are in each layer, the connection sequence relationships between the layers of neurons, what parameters each layer of neurons includes, the type of activation function corresponding to each layer of neurons, and so on. It is understandable that different types of neural network require different network structure information to be determined.
  • each network parameter of the initial candidate region generation network and the initial living body classification network in the initial living body detection model can be initialized.
  • Specifically, each network parameter of the initial candidate region generation network and the initial living body classification network may be initialized with different small random numbers. "Small" ensures that the network will not enter a saturated state due to excessive weights, which would cause training to fail, and "different" ensures that the network can learn normally.
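  • As a concrete illustration of this initialization strategy, the following minimal sketch uses PyTorch; this is an assumption, since the application does not name a framework, and the layer shapes here are hypothetical stand-ins rather than the networks described in this application.

```python
import torch.nn as nn

def init_small_random(module: nn.Module, std: float = 0.01) -> None:
    """Initialize weights with small, mutually different random numbers.

    Small values keep the network out of activation saturation at the start
    of training (avoiding training failure from excessive weights); drawing
    them randomly makes them differ so the network can learn normally.
    """
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.normal_(module.weight, mean=0.0, std=std)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Hypothetical stand-ins for the two sub-networks of the initial model.
initial_candidate_region_net = nn.Sequential(
    nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 2, 1))
initial_liveness_net = nn.Sequential(
    nn.Conv2d(1, 16, 3), nn.ReLU(), nn.Conv2d(16, 2, 1))

initial_candidate_region_net.apply(init_small_random)
initial_liveness_net.apply(init_small_random)
```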
  • Step 204 Obtain a first training sample set and a second training sample set; the training samples corresponding to the second training sample set include a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information.
  • the first training sample set and the second training sample set are both labeled image sample sets containing human faces.
  • the training samples in the first training sample set include color images, target face images and corresponding target face candidate area location information.
  • the color images refer to the RGB images collected by the camera under natural light.
  • the target face image refers to the image corresponding to the face area in the color image
  • The target face candidate area location information refers to the position coordinates corresponding to the face area in the color image. It can be understood that the color image is the input data corresponding to the first training sample, and the target face image and the corresponding target face candidate area location information are the training labels corresponding to the first training sample.
  • The training sample corresponding to the second training sample set (hereinafter referred to as the second training sample) includes the color image, the near-infrared image corresponding to the color image, the target living body detection result, and the corresponding target living body position information. It can be understood that the target living body detection result and the corresponding target living body position information are the training labels corresponding to the second training sample.
  • The target living body detection result is used to characterize whether the face in the face image to be detected is a living face; the target living body position information refers to the position coordinates of the face image corresponding to the target living body detection result.
  • In some embodiments, the living body detection result may be a detection result identifier used to indicate that the face in the face image is a living face (for example, the number 1 or the vector (1,0)), or a detection result identifier used to indicate that the face in the face image is not a living face (for example, the number 0 or the vector (0,1)). In other embodiments, the living body detection result may also include the probability that the face in the face image is a living face and/or the probability that it is a non-living face.
  • For example, the living body detection result may be a vector including a first probability and a second probability, where the first probability characterizes the probability that the face in the face image is a living face, and the second probability characterizes the probability that it is a non-living face.
  • Step 206 Train the initial candidate region generation network according to the first training sample set until convergence, and obtain the first candidate region generation network.
  • Specifically, the color image in the first training sample is input into the initial candidate region generation network, and the target face image corresponding to the color image and the corresponding target face candidate region position information are used as the desired output to train the initial candidate region generation network. During training, the parameters of the initial candidate region generation network are continuously adjusted until the convergence condition is met; training then stops, and the currently trained candidate region generation network, namely the first candidate region generation network, is obtained.
  • The convergence condition may be that the training time exceeds a preset duration, that the number of training iterations exceeds a preset number, or that the difference between the actual output and the expected output is less than a difference threshold. During training, the error can be propagated with the back-propagation (BP) algorithm and the parameters updated with stochastic gradient descent (SGD).
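  • A minimal sketch of this training stage, assuming PyTorch and a data loader that yields (color image, target box coordinates) pairs from the first training sample set; the loss choice, learning rate, and convergence thresholds are illustrative assumptions, not values given in the application.

```python
import torch
import torch.nn as nn

def train_until_convergence(net, loader, max_epochs=50, loss_threshold=1e-3):
    """Step 206 sketch: fit the initial candidate region generation network.

    `net` maps a color image to predicted face-box coordinates; `loader`
    yields (color_image, target_box) pairs. Both are hypothetical stand-ins.
    """
    criterion = nn.MSELoss()  # difference between actual and expected output
    optimizer = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
    for epoch in range(max_epochs):           # convergence: iteration budget
        epoch_loss = 0.0
        for color_image, target_box in loader:
            optimizer.zero_grad()
            pred_box = net(color_image)
            loss = criterion(pred_box, target_box)
            loss.backward()                   # back-propagation (BP)
            optimizer.step()                  # SGD parameter update
            epoch_loss += loss.item()
        if epoch_loss / len(loader) < loss_threshold:  # convergence: small error
            break
    return net
```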
  • Step 208 Train the initial living body classification network according to the first candidate region generation network and the second training sample set until it converges to obtain the first living body classification network.
  • Specifically, the parameters of the currently trained candidate region generation network are fixed; that is, the color image in the second training sample is first input into the first candidate region generation network to obtain the first target face image and its corresponding first face candidate region location information, and the initial living body classification network is then trained on them until the convergence condition is met. Training stops, and the currently trained living body classification network, namely the first living body classification network, is obtained.
  • Specifically, according to the first face candidate region location information, the image at the corresponding position is cropped from the near-infrared image to obtain the region-of-interest image; the region-of-interest image is input into the initial living body classification network, and the target living body detection result and the corresponding target living body position information are used as the expected output to adjust the parameters of the initial living body classification network until the convergence condition is met and training ends.
  • Step 210 Input the color image into the first candidate area generation network to obtain the current position information of the face candidate area, and input the current position information of the face candidate area and the near-infrared image into the first living body classification network to obtain the current living body position information .
  • Specifically, after the color image in the second training sample is input into the first candidate region generation network, the current face image corresponding to the color image and the current face candidate region location information corresponding to the current face image can be obtained. Further, the current face candidate region location information and the near-infrared image corresponding to the color image are input into the first living body classification network.
  • The first living body classification network first crops, from the near-infrared image, the image region corresponding to the current face candidate region location information to obtain the region-of-interest image, and then performs living body classification on the region-of-interest image to obtain the current living body detection result and the corresponding current living body position information.
  • The current living body position information is the position coordinates obtained by performing position regression on the region-of-interest image.
  • Step 212 Adjust the parameters of the first candidate region generating network according to the difference between the current living body position information and the target living body position information, and return to the step of inputting the color image into the first candidate region generating network until convergence, to obtain the target candidate region generating network .
  • the difference can be an error, and the error can be a mean absolute error (MAE), a mean squared error (MSE), or a root mean squared error (RMSE), etc.
  • Specifically, a cost function, usually also called a loss function, can be constructed from the error between the current living body position information and the target living body position information. It should be understood that the cost function reflects the difference between the current living body position information and the target living body position information, and may include a regularization term for preventing overfitting.
  • Because the candidate region generation network and the living body classification network share the same cost function and gradients can be back-propagated between them, the parameters of the candidate region generation network can be adjusted by minimizing the cost function of the living body classification network.
  • In some embodiments, the parameters of the first candidate region generation network can be adjusted by the gradient descent method. Specifically, the gradient determined from the error between the current living body position information and the target living body position information (for example, the partial derivative of the cost function with respect to the model parameters) is propagated back to the first candidate region generation network to adjust its parameters.
  • Steps 210 to 212 are repeated to train the first candidate region generation network multiple times until the convergence condition is met; training then stops, and the trained target candidate region generation network is obtained.
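  • The fine-tuning loop of steps 210 to 212 might look like the following sketch (PyTorch assumed; all names are hypothetical). The point the passage makes is that the position-regression loss can be back-propagated through the fixed first living body classification network into the candidate region generation network, so only the latter's parameters are updated here.

```python
import torch
import torch.nn as nn

def finetune_candidate_net(candidate_net, liveness_net, loader, epochs=10):
    """Steps 210-212 sketch: adjust only the candidate region generation
    network using the living body position loss, back-propagated end to end."""
    criterion = nn.MSELoss()  # the 'difference' could also be MAE or RMSE
    for p in liveness_net.parameters():   # freeze the first classification net
        p.requires_grad = False
    optimizer = torch.optim.SGD(candidate_net.parameters(), lr=0.001)
    for _ in range(epochs):
        for color_img, nir_img, target_pos in loader:  # second sample set
            optimizer.zero_grad()
            region_pos = candidate_net(color_img)      # face candidate region
            # Hypothetical two-input forward: classify the NIR region of
            # interest and regress the current living body position.
            current_pos = liveness_net(nir_img, region_pos)
            loss = criterion(current_pos, target_pos)
            loss.backward()   # gradient flows back into candidate_net
            optimizer.step()
    return candidate_net
```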
  • Step 214 Train the first living body classification network according to the target candidate region generation network and the second training sample set until convergence to obtain the target living body classification network, and obtain the trained target living body detection model according to the target candidate region generation network and the target living body classification network.
  • the parameters of the target candidate region generation network are fixed, and the first living body classification network is trained through the second training sample set.
  • Specifically, the color image in the second training sample is input into the target candidate region generation network to obtain the second target face image and its corresponding second face candidate region location information; then, according to the second face candidate region location information, the near-infrared image corresponding to the color image, the target living body detection result, and the corresponding target living body position information, the first living body classification network is trained until the convergence condition is met. Training stops, and the currently trained living body classification network, namely the target living body classification network, is obtained.
  • Specifically, according to the second face candidate region location information, the image at the corresponding position is cropped from the near-infrared image to obtain the region-of-interest image; the region-of-interest image is input into the first living body classification network, and the target living body detection result and the corresponding target living body position information are used as the desired output to adjust the parameters of the first living body classification network until the convergence condition is met and training ends.
  • the output terminal of the target candidate region generation network is connected with the input terminal of the target living body classification network to obtain a trained target living body detection model.
  • In the above embodiment, the initial candidate region generation network is first trained to obtain the first candidate region generation network; then the parameters of the first candidate region generation network are fixed, the initial living body classification network is trained, and the first living body classification network is obtained.
  • Next, the current living body position information is obtained, and the parameters of the first candidate region generation network are adjusted according to the difference between the current living body position information and the target living body position information to obtain the target candidate region generation network.
  • Then, with the target candidate region generation network fixed, the first living body classification network continues to be trained to obtain the target living body classification network; finally, the trained target living body detection model is obtained according to the target candidate region generation network and the target living body classification network.
  • In this way, face detection and living body classification are integrated into one model, and an end-to-end model training method is adopted.
  • As a result, the loss of the living body classification network can be back-propagated to the candidate region generation network, and the fit between the networks is higher; compared with the two separately trained models of the traditional technique, the accuracy of the obtained living body detection model is significantly improved.
  • In some embodiments, the above method further includes: obtaining the target living body detection model; obtaining a to-be-detected color image and a to-be-detected near-infrared image corresponding to the face to be detected; inputting the to-be-detected color image into the target candidate region generation network corresponding to the target living body detection model to obtain target face candidate region position information; and inputting the target face candidate region position information and the to-be-detected near-infrared image into the target living body classification network corresponding to the target living body detection model to obtain the living body detection result.
  • The color image to be detected refers to the color image used in living body detection to determine whether the face to be detected is a living face; the near-infrared image to be detected refers to the near-infrared image used for the same purpose.
  • Specifically, after the to-be-detected color image is input into the target candidate region generation network, the target face image and the corresponding target face candidate region position information can be obtained, and the target face candidate region position information and the to-be-detected near-infrared image are input into the target living body classification network.
  • The target living body classification network can first crop the image at the corresponding position from the to-be-detected near-infrared image according to the target face candidate region position information to obtain the region-of-interest image, and then perform living body classification on the region-of-interest image to obtain the living body detection result corresponding to the face to be detected.
  • In some embodiments, the target candidate region generation network includes a first convolutional layer, a second convolutional layer, and a first pooling layer, and inputting the to-be-detected color image into the target candidate region generation network corresponding to the target living body detection model to obtain the target face candidate region position information includes:
  • Step 302 Input the color image to be detected into a first convolution layer, and perform a convolution operation on the color image to be detected through the first convolution layer to obtain a first feature matrix.
  • the target candidate region generation network includes at least one convolution layer, and the convolution layer performs a convolution operation on the color image to be detected to obtain the first feature matrix.
  • Convolution refers to the multiply-accumulate operation performed with a convolution kernel. Convolution with a kernel can reduce the feature dimension and express local features of the image; different convolution windows have different expressive capabilities.
  • In some embodiments, the size of the convolution window is determined according to the dimension of the feature vector corresponding to the image (the embedding size) and the filter width. The filter width is tuned experimentally; in some embodiments, filter widths of 3 and 4 are selected.
  • For example, the convolution window can be selected as 128*3, 128*4, 128*5, 128*6, 128*7, or 128*8, respectively.
  • One convolution kernel corresponds to one output. For example, if there are 10 convolution kernels in the convolution layer, 10 outputs are obtained after the 10 kernels are applied; that is, a 10-dimensional first feature matrix is obtained.
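  • To make the kernel-to-output correspondence concrete, here is a small shape check (PyTorch assumed; the input size is illustrative):

```python
import torch
import torch.nn as nn

# A convolutional layer with 10 kernels: each kernel produces one output
# channel, so one image yields a 10-channel first feature matrix.
first_conv = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=3)

color_image = torch.randn(1, 3, 128, 128)   # hypothetical RGB input
first_feature_matrix = first_conv(color_image)
print(first_feature_matrix.shape)           # torch.Size([1, 10, 126, 126])
```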
  • Step 304 Input the first feature matrix into the first pooling layer, and project the largest weight in each vector in the first feature matrix through the first pooling layer to obtain a normalized second feature matrix.
  • the target candidate region generation network includes at least one pooling layer.
  • The pooling layer adopts max-pooling, which projects the element with the largest energy in each vector produced by the convolution layer (i.e., the element with the largest weight) to the input of the next layer. The purpose is to ensure that the outputs of different feature vectors and different convolution kernels are normalized without losing the most salient information.
  • the first feature matrix is composed of multiple vectors, and the largest weight in each vector is projected to obtain a normalized second feature matrix.
  • Step 306 Input the second feature matrix into the second convolutional layer, and perform convolution calculation on the second feature matrix through the second convolutional layer to obtain position information of the target face candidate region.
  • The candidate region generation network in this embodiment adopts a fully convolutional network. After the image passes through the pooling layer, it is directly input into the second convolutional layer, which is used in place of a fully connected layer: the convolution calculation is performed on the second feature matrix to obtain the target face image corresponding to the to-be-detected color image and the corresponding target face candidate region position information.
  • By using a convolutional layer instead of a fully connected layer, the kernel computations are parallel and the whole feature map does not need to be read into memory at once, which saves storage overhead and improves the efficiency of face classification and position regression in the candidate region generation network.
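  • Taken together, steps 302 to 306 can be sketched as the fully convolutional head below (PyTorch assumed; the channel counts and the 1x1 kernel standing in for the fully connected layer are illustrative assumptions):

```python
import torch
import torch.nn as nn

class CandidateRegionHead(nn.Module):
    """Sketch of steps 302-306: convolution, max pooling, then a second
    convolution in place of a fully connected layer (fully convolutional)."""

    def __init__(self):
        super().__init__()
        self.first_conv = nn.Conv2d(3, 10, kernel_size=3)   # step 302
        self.first_pool = nn.MaxPool2d(kernel_size=2)       # step 304
        # Step 306: convolution replaces the fully connected layer; the 5
        # output channels could encode e.g. a face score plus box coordinates.
        self.second_conv = nn.Conv2d(10, 5, kernel_size=1)

    def forward(self, color_image: torch.Tensor) -> torch.Tensor:
        x = self.first_conv(color_image)   # first feature matrix
        x = self.first_pool(x)             # normalized second feature matrix
        return self.second_conv(x)         # candidate region position map

out = CandidateRegionHead()(torch.randn(1, 3, 128, 128))
print(out.shape)  # torch.Size([1, 5, 63, 63])
```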
  • In some embodiments, inputting the target face candidate region position information and the to-be-detected near-infrared image into the target living body classification network corresponding to the target living body detection model to obtain the living body detection result includes: cropping the corresponding region-of-interest image from the to-be-detected near-infrared image according to the target face candidate region position information; inputting the region-of-interest image into the third convolutional layer and performing a convolution operation on it through the third convolutional layer to obtain a third feature matrix; inputting the third feature matrix into the second pooling layer and projecting the largest weight in each vector of the third feature matrix through the second pooling layer to obtain a normalized fourth feature matrix; and inputting the fourth feature matrix into the fourth convolutional layer and performing convolution calculation on it through the fourth convolutional layer to obtain the living body detection result.
  • the living body classification network adopts a full convolutional network, which includes at least one third convolutional layer, at least one fourth convolutional layer, and at least one second pooling layer.
  • Specifically, after the corresponding region-of-interest image is cropped from the to-be-detected near-infrared image according to the target face candidate region position information, the region-of-interest image is first input into the third convolutional layer, and a convolution operation is performed through the third convolutional layer to obtain the third feature matrix.
  • The third feature matrix is then input into the second pooling layer connected to the third convolutional layer: the largest weight in each vector of the feature matrix is projected, which significantly reduces the number of parameters and thus the feature dimension, and yields the fourth feature matrix.
  • The fourth feature matrix is input into the fourth convolutional layer connected to the second pooling layer, and convolution calculation is performed through the fourth convolutional layer to obtain the living body detection result and the corresponding living body position information.
  • the living body position information here refers to the position information obtained by performing position regression on the image of the region of interest, and may be the position information corresponding to the living body face or the position information corresponding to the non-living body face.
  • Because a fully convolutional network is adopted, not only is storage overhead saved, but living body detection efficiency is also improved.
  • In some embodiments, cropping the corresponding region-of-interest image from the to-be-detected near-infrared image according to the target face candidate region position information includes: mapping the target face candidate region position information onto the to-be-detected near-infrared image according to a pre-calibrated camera parameter matrix, locating the face position in the to-be-detected near-infrared image, and cropping the corresponding region-of-interest image according to the located face position.
  • Specifically, dual camera modules are used to collect the color image and the near-infrared image, and the camera parameter matrix between the camera module corresponding to the color image and the camera module corresponding to the near-infrared image is calibrated in advance.
  • After the target candidate region generation network performs position regression to obtain the target face candidate region position information corresponding to the face to be detected, the position information can be matrix-transformed according to the camera parameter matrix to obtain the corresponding position information in the near-infrared image.
  • The face position is thereby located in the near-infrared image, and the image region corresponding to the face position is cropped to obtain the region-of-interest image.
  • In this way, the region-of-interest image can be accurately cropped from the near-infrared image, thereby improving the efficiency and accuracy of living body detection.
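  • A sketch of this coordinate mapping, under the simplifying assumption that the pre-calibrated camera parameter matrix can be applied as a 3x3 planar homography H (the application does not specify the exact form of the matrix):

```python
import numpy as np

def map_box_to_nir(box_rgb, H):
    """Map a face box (x1, y1, x2, y2) from the color image onto the
    near-infrared image with an assumed 3x3 homography H."""
    x1, y1, x2, y2 = box_rgb
    corners = np.array([[x1, y1, 1.0], [x2, y2, 1.0]]).T  # homogeneous coords
    mapped = H @ corners
    mapped /= mapped[2]                                    # dehomogenize
    return np.array([mapped[0, 0], mapped[1, 0], mapped[0, 1], mapped[1, 1]])

def crop_roi(nir_image, box_nir):
    """Cut the region-of-interest image out of the near-infrared frame."""
    x1, y1, x2, y2 = np.round(box_nir).astype(int)
    return nir_image[max(y1, 0):y2, max(x1, 0):x2]
```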
  • In some embodiments, before the to-be-detected color image and the to-be-detected near-infrared image corresponding to the face to be detected are acquired, the above method further includes: using dual camera modules to collect the color image and the near-infrared image corresponding to the face to be detected, and performing face detection on the collected color image; when it is determined from the face detection result that a face is detected, determining the collected color image and near-infrared image as the to-be-detected color image and the to-be-detected near-infrared image, respectively; and when it is determined from the face detection result that no face is detected, returning to the step of using the dual camera modules to collect the color image and the near-infrared image corresponding to the face to be detected.
  • Specifically, after the color image and the near-infrared image are collected by the dual camera modules, face detection is performed on the color image.
  • If a face is detected in the color image, the color image and near-infrared image collected at this time can be determined as the to-be-detected color image and the to-be-detected near-infrared image. Conversely, if no face is detected in the color image, the near-infrared image cannot contain a face region either; in that case, the color image and the near-infrared image corresponding to the face to be detected must continue to be collected until a face image usable for living body detection is obtained.
  • In the above embodiment, because face detection is performed on the collected color image, it can be accurately determined whether the collected images can be used for living body detection; this improves the efficiency of image collection and thereby the efficiency of living body detection.
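  • As a sketch of this capture-and-gate loop, the snippet below uses OpenCV's Haar cascade face detector on the color stream; this is an assumption, since the application specifies neither the face detector nor the camera API, and the device indices are hypothetical.

```python
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
rgb_cam = cv2.VideoCapture(0)   # hypothetical color camera index
nir_cam = cv2.VideoCapture(1)   # hypothetical near-infrared camera index

def capture_pair_for_liveness():
    """Keep capturing until the color frame contains a face."""
    while True:
        ok_rgb, color_image = rgb_cam.read()
        ok_nir, nir_image = nir_cam.read()
        if not (ok_rgb and ok_nir):
            continue
        gray = cv2.cvtColor(color_image, cv2.COLOR_BGR2GRAY)
        faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        if len(faces) > 0:  # face detected: use this pair for detection
            return color_image, nir_image
        # no face: return to the capture step, as in the embodiment above
```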
  • a training device 400 for a living body detection model including:
  • the initial model acquisition module 402 is used to acquire an initial living body detection model, and the initial living body detection model includes an initial candidate region generation network and an initial living body classification network;
  • the training sample acquisition module 404 is used to acquire the first training sample set and the second training sample set; the training samples corresponding to the second training sample set include color images, near-infrared images corresponding to the color images, and corresponding target living body position information ;
  • the first training module 406 is configured to train the initial candidate region generation network according to the first training sample set until it converges to obtain the first candidate region generation network;
  • the second training module 408 is configured to train the initial living body classification network according to the first candidate region generation network and the second training sample set until it converges to obtain the first living body classification network;
  • the input module 410 is used to input the color image into the first candidate region generation network to obtain the current face candidate region position information, and to input the current face candidate region position information and the near-infrared image into the first living body classification network to obtain the current living body position information;
  • the parameter adjustment module 412 is used to adjust the parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information, and to return to the step of inputting the color image into the first candidate region generation network until convergence, to obtain the target candidate region generation network;
  • the living body detection model obtaining module 414 trains the first living body classification network according to the target candidate region generation network and the second training sample set until convergence to obtain the target living body classification network, and obtains the trained target living body detection model according to the target candidate region generation network and the target living body classification network.
  • In some embodiments, the above-mentioned device further includes a living body detection module for: obtaining the target living body detection model; obtaining the color image to be detected and the near-infrared image to be detected corresponding to the face to be detected; inputting the color image to be detected into the target candidate region generation network corresponding to the target living body detection model to obtain the target face candidate region position information; and inputting the target face candidate region position information and the near-infrared image to be detected into the target living body classification network corresponding to the target living body detection model to obtain the living body detection result.
  • the target candidate region generation network includes a first convolutional layer, a second convolutional layer, and a first pooling layer.
  • The living body detection module is also used to: input the color image to be detected into the first convolutional layer and perform a convolution operation on it through the first convolutional layer to obtain the first feature matrix; input the first feature matrix into the first pooling layer and project the largest weight in each vector of the first feature matrix through the first pooling layer to obtain the normalized second feature matrix; and input the second feature matrix into the second convolutional layer and convolve it through the second convolutional layer to obtain the target face candidate region location information.
  • the target living body classification network includes a third convolutional layer, a fourth convolutional layer, and a second pooling layer.
  • The living body detection module is also used to: crop the corresponding region-of-interest image from the near-infrared image to be detected based on the target face candidate region position information, input the region-of-interest image into the third convolutional layer, and perform a convolution operation on it through the third convolutional layer to obtain the third feature matrix;
  • input the third feature matrix into the second pooling layer and project the largest weight in each vector of the third feature matrix through the second pooling layer to obtain a normalized fourth feature matrix;
  • and input the fourth feature matrix into the fourth convolutional layer and convolve it through the fourth convolutional layer to obtain the living body detection result.
  • In some embodiments, the living body detection module is also used to map the target face candidate region position information onto the near-infrared image to be detected according to the pre-calibrated camera parameter matrix, locate the face position in the near-infrared image to be detected, and crop the corresponding region-of-interest image according to the located face position.
  • In some embodiments, the above-mentioned device further includes an image acquisition module, which is used to collect the color image and the near-infrared image corresponding to the face to be detected using dual camera modules and to perform face detection on the collected color image; when it is determined from the face detection result that a face is detected, the collected color image and near-infrared image are determined as the to-be-detected color image and the to-be-detected near-infrared image, respectively; when it is determined from the face detection result that no face is detected, the module returns to the step of using the dual camera modules to collect the color image and the near-infrared image corresponding to the face to be detected.
  • the various modules in the training device for the above-mentioned living body detection model can be implemented in whole or in part by software, hardware, and a combination thereof.
  • Each of the above-mentioned modules may be embedded in the processor of the computer device in the form of hardware, may be independent of the processor, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to each module.
  • a computer device is provided, and its internal structure diagram may be as shown in FIG. 5.
  • the computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus.
  • the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the database of the computer equipment is used to store training sample data.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instruction is executed by the processor to realize a training method of a living body detection model.
  • FIG. 5 is only a block diagram of part of the structure related to the solution of this application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • A computer device includes a memory and one or more processors, the memory storing computer-readable instructions; when the computer-readable instructions are executed by the one or more processors, the steps of the training method for a living body detection model provided in any one of the embodiments of this application are implemented.
  • One or more non-volatile computer-readable storage media store computer-readable instructions; when the computer-readable instructions are executed by one or more processors, the one or more processors perform the steps of the training method for a living body detection model provided in any one of the embodiments of this application.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Abstract

A training method for a living body detection model comprises: acquiring an initial living body detection model comprising an initial candidate region generation network and an initial living body classification network; training the initial candidate region generation network according to a first training sample set, to obtain a first candidate region generation network; training the initial living body classification network according to the first candidate region generation network and a second training sample set, to obtain a first living body classification network; according to the first candidate region generation network, the first living body classification network and the second training sample set, obtaining current living body position information; according to a difference between the current living body position information and target living body position information, adjusting parameters of the first candidate region generation network and continuing training same, to obtain a target candidate region generation network; and training the first living body classification network according to the target candidate region generation network and the second training sample set, to obtain a target living body classification network.

Description

活体检测模型的训练方法、装置、计算机设备和存储介质Training method, device, computer equipment and storage medium of living body detection model
相关申请的交叉引用Cross-references to related applications
本申请要求于2019年10月10日提交中国专利局,申请号为2019109581915,申请名称为“活体检测模型的训练方法、装置、计算机设备和存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims the priority of the Chinese patent application filed with the Chinese Patent Office on October 10, 2019, the application number is 2019109581915, and the application title is "Training method, device, computer equipment and storage medium of living body detection model", and its entire content Incorporated in this application by reference.
技术领域Technical field
本申请涉及一种活体检测模型的训练方法、装置、计算机设备和存储介质。This application relates to a training method, device, computer equipment and storage medium of a living body detection model.
背景技术Background technique
随着人工智能技术的发展,出现了近红外活体检测技术。近红外活体检测,作为一种身份见证方法,利用红外光的光谱波段禹可见光不同,无需用户配合,可在近红外图像上进行盲测,降低了活体检测算法的繁琐度与提高其精度,并且降低生产成本的同时,可以更好地保证相关用户与企业的利益。With the development of artificial intelligence technology, near-infrared live detection technology has emerged. Near-infrared live detection, as an identity witness method, uses infrared light with different spectral bands than visible light. It can be blindly measured on near-infrared images without the user’s cooperation, which reduces the cumbersomeness of the live detection algorithm and improves its accuracy, and While reducing production costs, it can better guarantee the interests of related users and enterprises.
传统的近红外活体检测方法,大多分两步。首先,利用检脸器在可见光所成的彩色图片上检测人脸;然后在近红外图像对应位置提取人脸的LBP特征输入至活体判别器进行活体判断。然而,发明人意识到,这种方式下,每一步骤都是一个独立的任务,所使用到的检脸器和活体判别器都需要单独分开训练,模型之间的契合度不高,活体判别器的准确性容易受到检脸器的影响,导致训练得到的模型的准确性低。。The traditional near-infrared living body detection method is mostly divided into two steps. First, the face detector is used to detect the human face on the color picture formed by visible light; then the LBP feature of the human face is extracted at the corresponding position of the near-infrared image and input to the living body discriminator for living body judgment. However, the inventor realizes that in this way, each step is an independent task, and the face detector and the living body discriminator used need to be trained separately. The fit between the models is not high, and the living body discrimination The accuracy of the detector is easily affected by the face detector, resulting in low accuracy of the trained model. .
发明内容Summary of the invention
根据本申请公开的各种实施例,提供一种活体检测模型的训练方法、装置、计算机设备和存储介质。According to various embodiments disclosed in the present application, a training method, device, computer device, and storage medium of a living body detection model are provided.
一种活体检测模型的训练方法,所述方法包括:A method for training a living body detection model, the method comprising:
获取初始活体检测模型,所述初始活体检测模型包括初始候选区域生成网络及初始活体分类网络;Acquiring an initial living body detection model, where the initial living body detection model includes an initial candidate region generation network and an initial living body classification network;
获取第一训练样本集及第二训练样本集;所述第二训练样本集对应的训练样本中包括彩色图像、与所述彩色图像对应的近红外图像及对应的目标活体位置信息;Acquiring a first training sample set and a second training sample set; the training samples corresponding to the second training sample set include a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information;
根据所述第一训练样本集训练所述初始候选区域生成网络直至收敛,得到第一候选区域生成网络;Training the initial candidate region generation network according to the first training sample set until convergence, to obtain a first candidate region generation network;
根据所述第一候选区域生成网络及所述第二训练样本集训练所述初始活体分类网络直至收敛,得到第一活体分类网络;Training the initial living body classification network according to the first candidate region generation network and the second training sample set until convergence, to obtain a first living body classification network;
将所述彩色图像输入到所述第一候选区域生成网络中,得到当前人脸候选区域位置信 息,将所述当前人脸候选区域位置信息及所述近红外图像输入所述第一活体分类网络中,得到当前活体位置信息;The color image is input into the first candidate region generation network to obtain current face candidate region position information, and the current face candidate region position information and the near-infrared image are input into the first living body classification network In, get the current living body position information;
根据所述当前活体位置信息及所述目标活体位置信息的差异调整所述第一候选区域生成网络的参数,并返回将所述彩色图像输入到所述第一候选区域生成网络中的步骤直至收敛,得到目标候选区域生成网络;及Adjust the parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information, and return to the step of inputting the color image into the first candidate region generation network until convergence , Get the target candidate region generation network; and
根据所述目标候选区域生成网络及所述第二训练样本集训练所述第一活体分类网络直至收敛,得到目标活体分类网络,根据所述目标候选区域生成网络及所述目标活体分类网络得到训练好的目标活体检测模型。Train the first living body classification network according to the target candidate region generation network and the second training sample set until convergence to obtain a target living body classification network, and obtain training based on the target candidate region generating network and the target living body classification network Good target live detection model.
A training apparatus for a living body detection model, the apparatus comprising:
an initial model acquisition module, configured to acquire an initial living body detection model, the initial living body detection model including an initial candidate region generation network and an initial living body classification network;
a training sample acquisition module, configured to acquire a first training sample set and a second training sample set, the training samples in the second training sample set including a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information;
a first training module, configured to train the initial candidate region generation network with the first training sample set until convergence, to obtain a first candidate region generation network;
a second training module, configured to train the initial living body classification network with the first candidate region generation network and the second training sample set until convergence, to obtain a first living body classification network;
an input module, configured to input the color image into the first candidate region generation network to obtain current face candidate region position information, and to input the current face candidate region position information and the near-infrared image into the first living body classification network to obtain current living body position information;
a parameter adjustment module, configured to adjust the parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information, and to return to the step of inputting the color image into the first candidate region generation network until convergence, to obtain a target candidate region generation network; and
a living body detection model obtaining module, configured to train the first living body classification network with the target candidate region generation network and the second training sample set until convergence to obtain a target living body classification network, and to obtain a trained target living body detection model from the target candidate region generation network and the target living body classification network.
A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
Acquiring an initial living body detection model, the initial living body detection model including an initial candidate region generation network and an initial living body classification network;
Acquiring a first training sample set and a second training sample set, the training samples in the second training sample set including a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information;
Training the initial candidate region generation network with the first training sample set until convergence, to obtain a first candidate region generation network;
Training the initial living body classification network with the first candidate region generation network and the second training sample set until convergence, to obtain a first living body classification network;
Inputting the color image into the first candidate region generation network to obtain current face candidate region position information, and inputting the current face candidate region position information and the near-infrared image into the first living body classification network to obtain current living body position information;
Adjusting the parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information, and returning to the step of inputting the color image into the first candidate region generation network until convergence, to obtain a target candidate region generation network; and
Training the first living body classification network with the target candidate region generation network and the second training sample set until convergence to obtain a target living body classification network, and obtaining a trained target living body detection model from the target candidate region generation network and the target living body classification network.
One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
Acquiring an initial living body detection model, the initial living body detection model including an initial candidate region generation network and an initial living body classification network;
Acquiring a first training sample set and a second training sample set, the training samples in the second training sample set including a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information;
Training the initial candidate region generation network with the first training sample set until convergence, to obtain a first candidate region generation network;
Training the initial living body classification network with the first candidate region generation network and the second training sample set until convergence, to obtain a first living body classification network;
Inputting the color image into the first candidate region generation network to obtain current face candidate region position information, and inputting the current face candidate region position information and the near-infrared image into the first living body classification network to obtain current living body position information;
Adjusting the parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information, and returning to the step of inputting the color image into the first candidate region generation network until convergence, to obtain a target candidate region generation network; and
Training the first living body classification network with the target candidate region generation network and the second training sample set until convergence to obtain a target living body classification network, and obtaining a trained target living body detection model from the target candidate region generation network and the target living body classification network.
The details of one or more embodiments of the present application are set forth in the following drawings and description. Other features and advantages of the present application will become apparent from the description, the drawings, and the claims.
Description of the Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the drawings required by the embodiments are briefly introduced below. Evidently, the drawings described below are merely some embodiments of the present application; for a person of ordinary skill in the art, other drawings can be derived from these drawings without creative effort.
Fig. 1 is an application scenario diagram of a training method for a living body detection model according to one or more embodiments.
Fig. 2 is a schematic flowchart of a training method for a living body detection model according to one or more embodiments.
Fig. 3 is a schematic flowchart of the steps of obtaining position information of a target face candidate region according to one or more embodiments.
Fig. 4 is a block diagram of a training apparatus for a living body detection model according to one or more embodiments.
Fig. 5 is a block diagram of a computer device according to one or more embodiments.
Detailed Description of the Embodiments
To make the technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present application, not to limit it.
The training method for a living body detection model provided by the present application can be applied in the application environment shown in Fig. 1. In this environment, the computer device 102 first acquires an initial living body detection model including an initial candidate region generation network and an initial living body classification network, and trains the initial candidate region generation network with the first training sample set until convergence to obtain a first candidate region generation network. It then trains the initial living body classification network with the first candidate region generation network and the second training sample set until convergence to obtain a first living body classification network; inputs the color image into the first candidate region generation network to obtain current face candidate region position information; and inputs the current face candidate region position information and the near-infrared image into the first living body classification network to obtain current living body position information. It further adjusts the parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information, returning to the step of inputting the color image into the first candidate region generation network until convergence, to obtain a target candidate region generation network; trains the first living body classification network with the target candidate region generation network and the second training sample set until convergence to obtain a target living body classification network; and finally obtains a trained target living body detection model from the target candidate region generation network and the target living body classification network. After obtaining the target living body detection model through training, the computer device 102 can store it locally or send it to the computer device 104.
The computer device 102 and the computer device 104 may be, but are not limited to, various personal computers and notebook computers.
In some embodiments, as shown in Fig. 2, a training method for a living body detection model is provided. Taking the application of the method to the above computer device 102 as an example, the method includes the following steps:
Step 202: acquire an initial living body detection model, the initial living body detection model including an initial candidate region generation network and an initial living body classification network.
The initial living body detection model may be a model predetermined for living body detection in order to train the living body detection model; it may be an untrained living body detection model or one whose training is unfinished. The initial living body detection model includes an initial candidate region generation network and an initial living body classification network. The initial candidate region generation network is trained to obtain the target candidate region generation network, which extracts candidate regions from an input image; the initial living body classification network is trained to obtain the target living body classification network, which performs living body classification on an input image to obtain a living body detection result.
In some embodiments, the following steps are further included before step 202:
First, the network structure information of the initial living body detection model may be determined. Specifically, since the initial living body detection model includes the initial candidate region generation network and the initial living body classification network, the network structure information of the initial candidate region generation network and that of the initial living body classification network may be determined separately.
It can be understood that the initial candidate region generation network and the initial living body classification network may be various kinds of neural networks. One may therefore determine, for each of them, which kind of neural network it is, including how many layers of neurons it has, how many neurons each layer contains, the connection order between the layers of neurons, which parameters each layer of neurons includes, the type of activation function of each layer of neurons, and so on. Understandably, different types of neural networks require different network structure information to be determined.
Then, the parameter values of the network parameters of the initial candidate region generation network and the initial living body classification network in the initial living body detection model may be initialized. In some embodiments, each network parameter of the two networks may be initialized with different small random numbers. "Small random numbers" ensure that the network does not enter saturation because of excessively large weights, which would make training fail; "different" ensures that the network can learn normally.
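As one possible illustration in PyTorch (a minimal sketch; the normal distribution and the scale 0.01 are assumptions of this example, not values given by the present application):

```python
import torch.nn as nn

def init_small_random(module: nn.Module) -> None:
    """Initialize weights with different small random numbers, biases with zero."""
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.normal_(module.weight, mean=0.0, std=0.01)  # small values avoid saturation
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# usage: model.apply(init_small_random) applies this to every layer of a model
```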
Step 204: acquire a first training sample set and a second training sample set; the training samples in the second training sample set include a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information.
Both the first training sample set and the second training sample set are labeled sets of image samples containing human faces. A training sample in the first training sample set (hereinafter, a first training sample) includes a color image, a target face image, and corresponding target face candidate region position information. The color image is an RGB image captured by a camera under natural light; the target face image is the image corresponding to the face region in the color image; and the target face candidate region position information is the position coordinates of the face region in the color image. It can be understood that the color image is the input data of the first training sample, while the target face image and the corresponding target face candidate region position information are the training labels of that sample.
A training sample in the second training sample set (hereinafter, a second training sample) includes a color image, a near-infrared image corresponding to the color image, a target living body detection result, and corresponding target living body position information. It can be understood that the target living body detection result and the corresponding target living body position information are the training labels of the second training sample; the target living body detection result indicates whether the face in the face image to be detected is a live face, and the target living body position information is the position coordinates of the face image corresponding to the target living body detection result.
In some embodiments, the living body detection result may be a positive identifier indicating that the face in the face image is a live face (for example, the number 1 or the vector (1, 0)) or a negative identifier indicating that the face is not a live face (for example, the number 0 or the vector (0, 1)). In other embodiments, the living body detection result may further include the probability that the face in the face image is a live face and/or the probability that it is not. For example, the living body detection result may be a vector containing a first probability and a second probability, where the first probability represents the probability that the face in the face image is a live face and the second probability represents the probability that it is not.
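For example, such labels might be encoded as follows (an illustrative sketch only; the present application does not fix a concrete tensor format):

```python
import torch

live_label  = torch.tensor([1.0, 0.0])    # identifier: the face is a live face
spoof_label = torch.tensor([0.0, 1.0])    # identifier: the face is not a live face
soft_result = torch.tensor([0.93, 0.07])  # (first probability, second probability)
```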
Step 206: train the initial candidate region generation network with the first training sample set until convergence, to obtain a first candidate region generation network.
Specifically, the color images in the first training samples are input into the initial candidate region generation network, and the corresponding target face images and target face candidate region position information are used as the desired outputs for training. The parameters of the initial candidate region generation network are adjusted continuously during training; when a convergence condition is met, training stops and the currently trained candidate region generation network, namely the first candidate region generation network, is obtained. In some embodiments, the convergence condition may be that the training time exceeds a preset duration, that the number of training iterations exceeds a preset number, or that the difference between the actual output and the desired output falls below a difference threshold.
It can be understood that the initial candidate region generation network may be trained in various ways in this embodiment, for example with the BP (Back Propagation) algorithm or the SGD (Stochastic Gradient Descent) algorithm.
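For instance, such a stage-one training loop might be sketched as follows in PyTorch; rpn, rpn_loss, loader1, and max_epochs are hypothetical placeholder names, not identifiers defined by the present application:

```python
import torch

optimizer = torch.optim.SGD(rpn.parameters(), lr=1e-3, momentum=0.9)

for epoch in range(max_epochs):              # convergence check simplified to an epoch cap
    for color_img, target_box in loader1:    # first training samples
        pred_cls, pred_box = rpn(color_img)  # face/background score + regressed box
        loss = rpn_loss(pred_cls, pred_box, target_box)
        optimizer.zero_grad()
        loss.backward()                      # back propagation of the error
        optimizer.step()                     # stochastic gradient descent update
```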
Step 208: train the initial living body classification network with the first candidate region generation network and the second training sample set until convergence, to obtain a first living body classification network.
Specifically, when the initial living body classification network is trained, the parameters of the currently trained candidate region generation network are fixed. The color images in the second training samples are first input into the first candidate region generation network to obtain first target face images and their corresponding first face candidate region position information; the initial living body classification network is then trained with the first face candidate region position information, the near-infrared images corresponding to the color images in the second training samples, the target living body detection results, and the corresponding target living body position information, until a convergence condition is met, at which point training stops and the currently trained living body classification network, namely the first living body classification network, is obtained.
During this training process, an image at the corresponding position is first cropped from the near-infrared image according to the first face candidate region position information to obtain a region-of-interest image. The region-of-interest image is input into the initial living body classification network, and the target living body detection result and the corresponding target living body position information are used as the desired outputs to adjust the parameters of the initial living body classification network until the convergence condition is met, at which point training ends.
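Continuing the same hypothetical names, the parameter freezing described here might look like the following; classifier, crop_roi, cls_loss, box_loss, and loader2 are likewise assumed stand-ins:

```python
import torch

for p in rpn.parameters():
    p.requires_grad = False                       # fix the stage-one parameters

optimizer = torch.optim.SGD(classifier.parameters(), lr=1e-3, momentum=0.9)

for color_img, nir_img, target_cls, target_box in loader2:
    with torch.no_grad():
        _, face_box = rpn(color_img)              # candidate region from the RGB image
    roi = crop_roi(nir_img, face_box)             # region of interest on the NIR image
    pred_cls, pred_box = classifier(roi)
    loss = cls_loss(pred_cls, target_cls) + box_loss(pred_box, target_box)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```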
Step 210: input the color image into the first candidate region generation network to obtain current face candidate region position information, and input the current face candidate region position information and the near-infrared image into the first living body classification network to obtain current living body position information.
Specifically, inputting the color image of a second training sample into the first candidate region generation network yields the current face image corresponding to that color image and the current face candidate region position information corresponding to that face image. The current face candidate region position information and the near-infrared image corresponding to the color image in the second training sample are then input into the first living body classification network, which first crops from the near-infrared image the image region corresponding to the current face candidate region position information to obtain a region-of-interest image, and then performs living body classification on the region-of-interest image to obtain the current living body detection result and the corresponding current living body position information. The current living body position information is the position coordinates obtained by position regression on the region-of-interest image.
Step 212: adjust the parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information, and return to the step of inputting the color image into the first candidate region generation network until convergence, to obtain a target candidate region generation network.
The difference may be an error, and the error may be the mean absolute error (MAE), the mean squared error (MSE), the root mean squared error (RMSE), or the like.
Specifically, a cost function, usually also called a loss function, may be constructed from the error between the current living body position information and the target living body position information. It should be understood that the cost function reflects the difference between the current living body position information and the target living body position information and may include a regularization term to prevent overfitting. In this embodiment, because the position information of the face region in the candidate region generation network corresponds to that in the living body classification network, the two networks share a consistent cost function and gradients can be propagated back between them; the parameters of the candidate region generation network can therefore be tuned by minimizing the cost function of the living body classification network.
In some embodiments, the parameters of the first candidate region generation network may be adjusted by gradient descent. Specifically, the gradient determined from the error between the current living body position information and the target living body position information (for example, the partial derivatives of the cost function with respect to the model parameters) is propagated back to the first candidate region generation network to adjust its parameters.
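A hedged sketch of this step with the same assumed names follows. Note that for the gradient to actually reach the candidate region generation network, crop_roi must be implemented so that it is differentiable with respect to the box coordinates (for example, an ROI-Align-style sampling operation); that differentiability is an assumption of this sketch:

```python
import torch

for p in rpn.parameters():
    p.requires_grad = True                 # unfreeze the region network
for p in classifier.parameters():
    p.requires_grad = False                # hold the classification network fixed

optimizer = torch.optim.SGD(rpn.parameters(), lr=1e-4)
mse = torch.nn.MSELoss()                   # MSE as one possible error measure

for color_img, nir_img, _, target_box in loader2:
    _, face_box = rpn(color_img)           # no torch.no_grad(): keep the graph
    roi = crop_roi(nir_img, face_box)      # assumed differentiable w.r.t. face_box
    _, current_box = classifier(roi)       # current living body position
    loss = mse(current_box, target_box)    # difference from the target position
    optimizer.zero_grad()
    loss.backward()                        # gradient flows back into the region network
    optimizer.step()
```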
Steps 210 and 212 are repeated so that the first candidate region generation network is trained multiple times; when the convergence condition is met, training stops and the trained target candidate region generation network is obtained.
Step 214: train the first living body classification network with the target candidate region generation network and the second training sample set until convergence to obtain a target living body classification network, and obtain a trained target living body detection model from the target candidate region generation network and the target living body classification network.
Specifically, the parameters of the target candidate region generation network are fixed and the first living body classification network is trained on the second training sample set. The color images in the second training samples are first input into the target candidate region generation network to obtain second target face images and their corresponding second face candidate region position information; the first living body classification network is then trained with the second face candidate region position information, the near-infrared images corresponding to the color images in the second training samples, the target living body detection results, and the corresponding target living body position information, until the convergence condition is met, at which point training stops and the currently trained living body classification network, namely the target living body classification network, is obtained.
During this training, an image at the corresponding position is first cropped from the near-infrared image according to the second face candidate region position information to obtain a region-of-interest image. The region-of-interest image is input into the first living body classification network, and the target living body detection result and the corresponding target living body position information are used as the desired outputs to adjust the parameters of the first living body classification network until the convergence condition is met, at which point training ends.
After the target candidate region generation network and the target living body classification network are obtained, the output of the target candidate region generation network is connected to the input of the target living body classification network to obtain the trained target living body detection model.
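One possible way to wire the two trained networks together, again using the hypothetical names from the sketches above:

```python
import torch.nn as nn

class LivenessDetector(nn.Module):
    """Target candidate region generation network + target living body classification network."""
    def __init__(self, rpn: nn.Module, classifier: nn.Module):
        super().__init__()
        self.rpn = rpn
        self.classifier = classifier

    def forward(self, color_img, nir_img):
        _, face_box = self.rpn(color_img)   # face candidate region from the RGB image
        roi = crop_roi(nir_img, face_box)   # corresponding NIR region of interest
        cls, box = self.classifier(roi)     # liveness result + regressed position
        return cls, box
```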
In the above training method for a living body detection model, the initial candidate region generation network is trained first to obtain the first candidate region generation network; its parameters are then fixed and the initial living body classification network is trained to obtain the first living body classification network. Next, the current living body position information is obtained through the first candidate region generation network and the first living body classification network, and the parameters of the first candidate region generation network are adjusted by back-propagating the difference between the current living body position information and the target living body position information, yielding the target candidate region generation network. With the target candidate region generation network fixed, the first living body classification network is trained further to obtain the target living body classification network, and the trained target living body detection model is finally obtained from the target candidate region generation network and the target living body classification network. The present application integrates face detection and living body classification into one model and adopts an end-to-end training method; because the loss of the living body classification network can be back-propagated to the candidate region generation network during training, the two networks fit each other closely, and the resulting living body detection model is markedly more accurate than the two separate models of the traditional technique.
In some embodiments, the above method further includes: acquiring the target living body detection model; acquiring a to-be-detected color image and a to-be-detected near-infrared image corresponding to a face to be detected; inputting the to-be-detected color image into the target candidate region generation network of the target living body detection model to obtain target face candidate region position information; and inputting the target face candidate region position information and the to-be-detected near-infrared image into the target living body classification network of the target living body detection model to obtain a living body detection result.
The to-be-detected color image is a color image used for living body detection to judge whether the face to be detected is a live face; the to-be-detected near-infrared image is a near-infrared image used for the same purpose.
In this embodiment, inputting the to-be-detected color image into the target candidate region generation network yields the target face image and the corresponding target face candidate region position information. The target face candidate region position information and the to-be-detected near-infrared image are then input into the target living body classification network, which may first crop the image at the corresponding position from the to-be-detected near-infrared image according to the target face candidate region position information to obtain a region-of-interest image, and then perform living body classification on the region-of-interest image to obtain the living body detection result for the face to be detected.
In the above embodiment, because a comparatively accurate end-to-end target living body detection model is used for living body detection, the accuracy of living body detection is improved.
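A usage sketch at inference time, assuming the hypothetical LivenessDetector wrapper above and assuming that index 0 of the classification output encodes "live face":

```python
import torch

detector = LivenessDetector(rpn, classifier).eval()

with torch.no_grad():
    cls, box = detector(color_to_detect, nir_to_detect)

is_live = cls.argmax(dim=-1) == 0   # assumed encoding: index 0 = live face
```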
In some embodiments, as shown in Fig. 3, the target candidate region generation network includes a first convolutional layer, a second convolutional layer, and a first pooling layer, and inputting the to-be-detected color image into the target candidate region generation network of the target living body detection model to obtain the target face candidate region position information includes:
Step 302: input the to-be-detected color image into the first convolutional layer, and perform a convolution operation on the to-be-detected color image through the first convolutional layer to obtain a first feature matrix.
Specifically, the target candidate region generation network includes at least one convolutional layer, and the convolutional layer performs a convolution operation on the to-be-detected color image to obtain the first feature matrix. A convolution operation multiplies the input with a convolution kernel; kernel convolution can reduce the feature dimension while expressing local features of the image, and different convolution windows have different expressive power. The size of the convolution window is determined by the dimension (embedding size) of the feature vector corresponding to the image and by the filter width, which is tuned experimentally. In some embodiments, the filter width takes the values 3, 4, 5, 6, 7, and 8; assuming the feature vector is 128-dimensional, the convolution window can accordingly be 128*3, 128*4, 128*5, 128*6, 128*7, or 128*8. One convolution kernel corresponds to one output; for example, if the convolutional layer has 10 convolution kernels, applying them yields 10 outputs, i.e., a 10-dimensional first feature matrix.
Step 304: input the first feature matrix into the first pooling layer, and project the largest weight of each vector in the first feature matrix through the first pooling layer to obtain a normalized second feature matrix.
Specifically, the target candidate region generation network includes at least one pooling layer. In some embodiments, the pooling layer is a max-pooling layer, which projects the element with the largest energy (i.e., the largest weight) in each vector produced by the convolutional layer onto the input of the next layer. This normalizes the outputs of different feature vectors and different convolution kernels while ensuring that the most salient information is not lost. The first feature matrix is composed of multiple vectors; projecting the largest weight of each vector yields the normalized second feature matrix.
Step 306: input the second feature matrix into the second convolutional layer, and perform a convolution calculation on the second feature matrix through the second convolutional layer to obtain the target face candidate region position information.
Specifically, the candidate region generation network in this embodiment is a fully convolutional network: after the pooling layer, the image is fed directly into the second convolutional layer, which takes the place of a fully connected layer and performs a convolution calculation on the second feature matrix to obtain the target face image corresponding to the to-be-detected color image and the corresponding target face candidate region position information.
In the above embodiment, replacing the fully connected layer with a convolutional layer means the kernel computations run in parallel and do not all need to be read into memory at once; this saves storage overhead and improves the efficiency of face classification and position regression in the candidate region generation network.
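For illustration only, a conv, max-pool, conv stack of this shape could be sketched as follows; every channel count and kernel size here is an assumption, since the present application does not fix them:

```python
import torch.nn as nn

rpn_head = nn.Sequential(
    nn.Conv2d(3, 10, kernel_size=3, padding=1),  # first conv: 10 kernels -> 10 feature maps
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2),                 # first pooling: keep the largest weights
    nn.Conv2d(10, 5, kernel_size=1),             # second conv in place of a fully connected
)                                                # layer: e.g. 1 face score + 4 box coordinates
```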
In some embodiments, inputting the target face candidate region position information and the to-be-detected near-infrared image into the target living body classification network of the target living body detection model to obtain the living body detection result includes: cropping the corresponding region-of-interest image from the to-be-detected near-infrared image according to the target face candidate region position information; inputting the region-of-interest image into a third convolutional layer and performing a convolution operation on it through the third convolutional layer to obtain a third feature matrix; inputting the third feature matrix into a second pooling layer and projecting the largest weight of each vector in the third feature matrix through the second pooling layer to obtain a normalized fourth feature matrix; and inputting the fourth feature matrix into a fourth convolutional layer and performing a convolution calculation on it through the fourth convolutional layer to obtain the living body detection result.
In this embodiment, the living body classification network is a fully convolutional network including at least one third convolutional layer, at least one fourth convolutional layer, and at least one second pooling layer. After the corresponding region-of-interest image is cropped from the to-be-detected near-infrared image according to the target face candidate region position information, the region-of-interest image is first input into the third convolutional layer, where a convolution operation expresses its local characteristics and yields the third feature matrix. The third feature matrix is then input into the second pooling layer connected to the third convolutional layer, yielding the fourth feature matrix; because the fourth feature matrix is obtained by projecting the largest weight of each vector in the third feature matrix, the number of parameters is markedly reduced, which lowers the feature dimension. Finally, the fourth feature matrix is input into the fourth convolutional layer connected to the second pooling layer, where a convolution calculation yields the living body detection result and the corresponding living body position information. It can be understood that the living body position information here is the position information obtained by position regression on the region-of-interest image, which may correspond to a live face or to a non-live face. Because a fully convolutional network is used in this embodiment, storage overhead is saved and the efficiency of living body detection is also improved.
In some embodiments, cropping the corresponding region-of-interest image from the to-be-detected near-infrared image according to the target face candidate region position information includes: mapping the target face candidate region position information onto the to-be-detected near-infrared image according to a pre-calibrated camera parameter matrix, locating the face position in the to-be-detected near-infrared image, and cropping the corresponding region-of-interest image according to the located face position.
In this embodiment, a dual-camera module captures the color image and the near-infrared image separately, and the camera parameter matrix between the camera module for the color image and the camera module for the near-infrared image is calibrated in advance. After the target candidate region generation network performs position regression to obtain the target face candidate region position information for the face to be detected, a matrix transformation can be applied to this position information according to the camera parameter matrix to obtain the corresponding position information in the near-infrared image; from that position information the face position can be located on the near-infrared image, and the image region at that face position is cropped to obtain the region-of-interest image.
In the above embodiment, calibrating the camera parameter matrix in advance allows the region-of-interest image to be cropped from the near-infrared image accurately and quickly, improving the efficiency and accuracy of living body detection.
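As a sketch, under the assumption that the pre-calibrated camera parameter matrix can be treated as a 3x3 planar homography H between the two views (the application itself only speaks of a camera parameter matrix):

```python
import numpy as np

def map_box_to_nir(box_rgb: np.ndarray, H: np.ndarray) -> np.ndarray:
    """Map a face box (x1, y1, x2, y2) from RGB coordinates to NIR coordinates."""
    x1, y1, x2, y2 = box_rgb
    corners = np.array([[x1, y1, 1.0], [x2, y2, 1.0]]).T  # homogeneous corner points
    mapped = H @ corners                                  # matrix transformation
    mapped = mapped[:2] / mapped[2]                       # back to pixel coordinates
    return mapped.T.reshape(-1)                           # (x1', y1', x2', y2')

def crop_nir_roi(nir_img: np.ndarray, box_nir: np.ndarray) -> np.ndarray:
    """Cut the region-of-interest image out of the NIR frame."""
    x1, y1, x2, y2 = box_nir.round().astype(int)
    return nir_img[y1:y2, x1:x2]
```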
In some embodiments, before the to-be-detected color image and the to-be-detected near-infrared image corresponding to the face to be detected are acquired, the above method further includes: capturing a color image and a near-infrared image corresponding to the face to be detected with a dual-camera module, and performing face detection on the captured color image; when it is determined from the face detection result that a face is detected, determining the captured color image and near-infrared image as the to-be-detected color image and the to-be-detected near-infrared image, respectively; and when it is determined from the face detection result that no face is detected, returning to the step of capturing a color image and a near-infrared image corresponding to the face to be detected with the dual-camera module.
In this embodiment, after the color image and the near-infrared image are captured by the dual-camera module, face detection is performed on the color image. When a face is detected in the color image, the near-infrared image necessarily contains the face region as well, because the two images are captured simultaneously; the color image and near-infrared image captured at that moment can therefore be determined as the to-be-detected color image and the to-be-detected near-infrared image, respectively. Conversely, if no face is detected in the color image, the near-infrared image cannot contain the face region either; in that case, capture of the color image and near-infrared image corresponding to the face to be detected must continue until images containing a face and usable for living body detection are obtained.
In the above embodiment, the dual-camera module captures the color image and near-infrared image corresponding to the face to be detected, and face detection on the color image alone suffices to judge accurately whether an image containing a face and usable for living body detection has been captured; this improves the efficiency of image capture and thereby the efficiency of living body detection.
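A minimal sketch of this capture-and-gate loop; grab_rgb_nir_pair and detect_face are hypothetical stand-ins for the dual-camera driver and an off-the-shelf face detector:

```python
def acquire_images_to_detect():
    """Loop until a simultaneously captured RGB/NIR pair contains a face."""
    while True:
        color_img, nir_img = grab_rgb_nir_pair()  # simultaneous dual-camera capture
        if detect_face(color_img):                # gate on the RGB frame only;
            return color_img, nir_img             # the NIR frame then also has the face
```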
It should be understood that although the steps in the flowcharts of Figs. 2 and 3 are displayed in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, the execution of these steps is not strictly ordered, and they may be executed in other orders. Moreover, at least some of the steps in Figs. 2 and 3 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; their execution order is also not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In some embodiments, as shown in Fig. 4, a training apparatus 400 for a living body detection model is provided, including:
an initial model acquisition module 402, configured to acquire an initial living body detection model, the initial living body detection model including an initial candidate region generation network and an initial living body classification network;
a training sample acquisition module 404, configured to acquire a first training sample set and a second training sample set, the training samples in the second training sample set including a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information;
a first training module 406, configured to train the initial candidate region generation network with the first training sample set until convergence, to obtain a first candidate region generation network;
a second training module 408, configured to train the initial living body classification network with the first candidate region generation network and the second training sample set until convergence, to obtain a first living body classification network;
an input module 410, configured to input the color image into the first candidate region generation network to obtain current face candidate region position information, and to input the current face candidate region position information and the near-infrared image into the first living body classification network to obtain current living body position information;
a parameter adjustment module 412, configured to adjust the parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information, and to return to the step of inputting the color image into the first candidate region generation network until convergence, to obtain a target candidate region generation network; and
a living body detection model obtaining module 414, configured to train the first living body classification network with the target candidate region generation network and the second training sample set until convergence to obtain a target living body classification network, and to obtain a trained target living body detection model from the target candidate region generation network and the target living body classification network.
In some embodiments, the above apparatus further includes a living body detection module configured to acquire the target living body detection model; acquire a to-be-detected color image and a to-be-detected near-infrared image corresponding to a face to be detected; input the to-be-detected color image into the target candidate region generation network of the target living body detection model to obtain target face candidate region position information; and input the target face candidate region position information and the to-be-detected near-infrared image into the target living body classification network of the target living body detection model to obtain a living body detection result.
In some embodiments, the target candidate region generation network includes a first convolutional layer, a second convolutional layer, and a first pooling layer, and the living body detection module is further configured to input the to-be-detected color image into the first convolutional layer and perform a convolution operation on it through the first convolutional layer to obtain a first feature matrix; input the first feature matrix into the first pooling layer and project the largest weight of each vector in the first feature matrix through the first pooling layer to obtain a normalized second feature matrix; and input the second feature matrix into the second convolutional layer and perform a convolution calculation on it through the second convolutional layer to obtain the target face candidate region position information.
In some embodiments, the target living body classification network includes a third convolutional layer, a fourth convolutional layer, and a second pooling layer, and the living body detection module is further configured to crop the corresponding region-of-interest image from the to-be-detected near-infrared image according to the target face candidate region position information; input the region-of-interest image into the third convolutional layer and perform a convolution operation on it through the third convolutional layer to obtain a third feature matrix; input the third feature matrix into the second pooling layer and project the largest weight of each vector in the third feature matrix through the second pooling layer to obtain a normalized fourth feature matrix; and input the fourth feature matrix into the fourth convolutional layer and perform a convolution calculation on it through the fourth convolutional layer to obtain the living body detection result.
In some embodiments, the living body detection module is further configured to map the target face candidate region position information onto the to-be-detected near-infrared image according to a pre-calibrated camera parameter matrix, locate the face position in the to-be-detected near-infrared image, and crop the corresponding region-of-interest image according to the located face position.
In some embodiments, the above apparatus further includes an image capture module configured to capture a color image and a near-infrared image corresponding to the face to be detected with a dual-camera module and perform face detection on the captured color image; when it is determined from the face detection result that a face is detected, determine the captured color image and near-infrared image as the to-be-detected color image and the to-be-detected near-infrared image, respectively; and when it is determined from the face detection result that no face is detected, return to the step of capturing a color image and a near-infrared image corresponding to the face to be detected with the dual-camera module.
For the specific limitations of the training apparatus for the living body detection model, reference may be made to the limitations of the training method for the living body detection model above, which are not repeated here. Each module in the above training apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of a computer device in the form of hardware, or stored in a memory of the computer device in the form of software, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, whose internal structure may be as shown in FIG. 5. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for the operation of the operating system and the computer-readable instructions in the non-volatile storage medium. The database is used to store training sample data. The network interface is used to communicate with an external terminal through a network connection. When executed by the processor, the computer-readable instructions implement a training method for a living body detection model.
Those skilled in the art will understand that the structure shown in FIG. 5 is merely a block diagram of a partial structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
A computer device includes a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the processor, implement the steps of the training method for a living body detection model provided in any one of the embodiments of the present application.
One or more non-volatile computer-readable storage media store computer-readable instructions that, when executed by one or more processors, cause the one or more processors to implement the steps of the training method for a living body detection model provided in any one of the embodiments of the present application.
A person of ordinary skill in the art will understand that all or part of the processes in the above method embodiments may be completed by instructing relevant hardware through computer-readable instructions, which may be stored in a non-volatile computer-readable storage medium; when executed, the computer-readable instructions may include the processes of the above method embodiments. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as no contradiction exists in a combination, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and although their descriptions are relatively specific and detailed, they should not be construed as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of the patent of this application shall be subject to the appended claims.
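As recited in claim 1 below, training alternates between the two networks: pretrain the candidate region generation network, pretrain the living body classification network on its proposals, refine the proposal network against the liveness position error, then retrain the classifier. The following is a minimal sketch of that schedule; it assumes PyTorch, fixed epoch counts in place of convergence tests, a smooth-L1 position loss, and a classifier whose region-of-interest step is differentiable with respect to the boxes — none of which are fixed by the disclosure.

```python
import torch
import torch.nn.functional as F

def train_liveness_model(rpn, classifier, set1_loader, set2_loader, epochs=10):
    """Illustrative alternating schedule; fixed epoch counts stand in for
    the claimed 'until convergence', and smooth-L1 stands in for the
    unspecified losses (both are assumptions)."""
    rpn_opt = torch.optim.SGD(rpn.parameters(), lr=1e-3)
    cls_opt = torch.optim.SGD(classifier.parameters(), lr=1e-3)

    # Step 1: train the initial candidate region generation network (sample set 1).
    for _ in range(epochs):
        for color, target_boxes in set1_loader:
            loss = F.smooth_l1_loss(rpn(color), target_boxes)
            rpn_opt.zero_grad(); loss.backward(); rpn_opt.step()

    # Step 2: train the classifier on proposals from the frozen RPN (sample set 2).
    for _ in range(epochs):
        for color, nir, target_pos in set2_loader:
            with torch.no_grad():
                boxes = rpn(color)
            loss = F.smooth_l1_loss(classifier(nir, boxes), target_pos)
            cls_opt.zero_grad(); loss.backward(); cls_opt.step()

    # Steps 3-4: refine the RPN against the liveness position error,
    # back-propagating through the classifier (assumes the classifier's
    # ROI step is differentiable with respect to the boxes).
    for _ in range(epochs):
        for color, nir, target_pos in set2_loader:
            current_pos = classifier(nir, rpn(color))
            loss = F.smooth_l1_loss(current_pos, target_pos)
            rpn_opt.zero_grad(); loss.backward(); rpn_opt.step()

    # Step 5: retrain the classifier against the now-fixed target RPN.
    for _ in range(epochs):
        for color, nir, target_pos in set2_loader:
            with torch.no_grad():
                boxes = rpn(color)
            loss = F.smooth_l1_loss(classifier(nir, boxes), target_pos)
            cls_opt.zero_grad(); loss.backward(); cls_opt.step()

    return rpn, classifier
```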

Claims (20)

1. A training method for a living body detection model, comprising:
    obtaining an initial living body detection model, the initial living body detection model comprising an initial candidate region generation network and an initial living body classification network;
    obtaining a first training sample set and a second training sample set, wherein the training samples corresponding to the second training sample set comprise a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information;
    training the initial candidate region generation network according to the first training sample set until convergence, to obtain a first candidate region generation network;
    training the initial living body classification network according to the first candidate region generation network and the second training sample set until convergence, to obtain a first living body classification network;
    inputting the color image into the first candidate region generation network to obtain current face candidate region position information, and inputting the current face candidate region position information and the near-infrared image into the first living body classification network to obtain current living body position information;
    adjusting parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information, and returning to the step of inputting the color image into the first candidate region generation network until convergence, to obtain a target candidate region generation network; and
    training the first living body classification network according to the target candidate region generation network and the second training sample set until convergence, to obtain a target living body classification network, and obtaining a trained target living body detection model according to the target candidate region generation network and the target living body classification network.
2. The method according to claim 1, further comprising:
    obtaining the target living body detection model;
    obtaining a color image to be detected and a near-infrared image to be detected corresponding to a face to be detected;
    inputting the color image to be detected into the target candidate region generation network corresponding to the target living body detection model to obtain position information of a target face candidate region; and
    inputting the position information of the target face candidate region and the near-infrared image to be detected into the target living body classification network corresponding to the target living body detection model to obtain a living body detection result.
3. The method according to claim 2, wherein the target candidate region generation network comprises a first convolutional layer, a second convolutional layer, and a first pooling layer, and inputting the color image to be detected into the target candidate region generation network corresponding to the target living body detection model to obtain the position information of the target face candidate region comprises:
    inputting the color image to be detected into the first convolutional layer, and performing a convolution operation on the color image to be detected through the first convolutional layer to obtain a first feature matrix;
    inputting the first feature matrix into the first pooling layer, and projecting the largest weight in each vector of the first feature matrix through the first pooling layer to obtain a normalized second feature matrix; and
    inputting the second feature matrix into the second convolutional layer, and performing a convolution computation on the second feature matrix through the second convolutional layer to obtain the position information of the target face candidate region.
4. The method according to claim 2, wherein the target living body classification network comprises a third convolutional layer, a fourth convolutional layer, and a second pooling layer, and inputting the position information of the target face candidate region and the near-infrared image to be detected into the target living body classification network corresponding to the target living body detection model to obtain the living body detection result comprises:
    cropping a corresponding region-of-interest image from the near-infrared image to be detected according to the position information of the target face candidate region, inputting the region-of-interest image into the third convolutional layer, and performing a convolution operation on the region-of-interest image through the third convolutional layer to obtain a third feature matrix;
    inputting the third feature matrix into the second pooling layer, and projecting the largest weight in each vector of the third feature matrix through the second pooling layer to obtain a normalized fourth feature matrix; and
    inputting the fourth feature matrix into the fourth convolutional layer, and performing a convolution computation on the fourth feature matrix through the fourth convolutional layer to obtain the living body detection result.
5. The method according to claim 4, wherein cropping the corresponding region-of-interest image from the near-infrared image to be detected according to the position information of the target face candidate region comprises:
    mapping the position information of the target face candidate region onto the near-infrared image to be detected according to a pre-calibrated camera parameter matrix, locating the face position in the near-infrared image to be detected, and cropping the corresponding region-of-interest image according to the located face position.
6. The method according to any one of claims 2 to 5, wherein before obtaining the color image to be detected and the near-infrared image to be detected corresponding to the face to be detected, the method further comprises:
    collecting, through a dual-camera module, a color image and a near-infrared image corresponding to the face to be detected, and performing face detection on the collected color image;
    when it is determined according to the face detection result that a face is detected, determining the collected color image and near-infrared image as the color image to be detected and the near-infrared image to be detected, respectively; and
    when it is determined according to the face detection result that no face is detected, returning to the step of collecting, through the dual-camera module, the color image and the near-infrared image corresponding to the face to be detected.
7. The method according to claim 1, wherein before obtaining the initial living body detection model, the method further comprises:
    determining network structure information of the initial living body detection model; and
    initializing parameter values of network parameters of the initial candidate region generation network and the initial living body classification network in the initial living body detection model.
8. A training apparatus for a living body detection model, comprising:
    an initial model obtaining module, configured to obtain an initial living body detection model, the initial living body detection model comprising an initial candidate region generation network and an initial living body classification network;
    a training sample obtaining module, configured to obtain a first training sample set and a second training sample set, wherein the training samples corresponding to the second training sample set comprise a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information;
    a first training module, configured to train the initial candidate region generation network according to the first training sample set until convergence, to obtain a first candidate region generation network;
    a second training module, configured to train the initial living body classification network according to the first candidate region generation network and the second training sample set until convergence, to obtain a first living body classification network;
    an input module, configured to input the color image into the first candidate region generation network to obtain current face candidate region position information, and input the current face candidate region position information and the near-infrared image into the first living body classification network to obtain current living body position information;
    a parameter adjustment module, configured to adjust parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information, and return to the step of inputting the color image into the first candidate region generation network until convergence, to obtain a target candidate region generation network; and
    a living body detection model obtaining module, configured to train the first living body classification network according to the target candidate region generation network and the second training sample set until convergence, to obtain a target living body classification network, and obtain a trained target living body detection model according to the target candidate region generation network and the target living body classification network.
9. The apparatus according to claim 8, further comprising a living body detection module configured to: obtain the target living body detection model; obtain a color image to be detected and a near-infrared image to be detected corresponding to a face to be detected; input the color image to be detected into the target candidate region generation network corresponding to the target living body detection model to obtain position information of a target face candidate region; and input the position information of the target face candidate region and the near-infrared image to be detected into the target living body classification network corresponding to the target living body detection model to obtain a living body detection result.
10. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    obtaining an initial living body detection model, the initial living body detection model comprising an initial candidate region generation network and an initial living body classification network;
    obtaining a first training sample set and a second training sample set, wherein the training samples corresponding to the second training sample set comprise a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information;
    training the initial candidate region generation network according to the first training sample set until convergence, to obtain a first candidate region generation network;
    training the initial living body classification network according to the first candidate region generation network and the second training sample set until convergence, to obtain a first living body classification network;
    inputting the color image into the first candidate region generation network to obtain current face candidate region position information, and inputting the current face candidate region position information and the near-infrared image into the first living body classification network to obtain current living body position information;
    adjusting parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information, and returning to the step of inputting the color image into the first candidate region generation network until convergence, to obtain a target candidate region generation network; and
    training the first living body classification network according to the target candidate region generation network and the second training sample set until convergence, to obtain a target living body classification network, and obtaining a trained target living body detection model according to the target candidate region generation network and the target living body classification network.
11. The computer device according to claim 10, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    obtaining the target living body detection model;
    obtaining a color image to be detected and a near-infrared image to be detected corresponding to a face to be detected;
    inputting the color image to be detected into the target candidate region generation network corresponding to the target living body detection model to obtain position information of a target face candidate region; and
    inputting the position information of the target face candidate region and the near-infrared image to be detected into the target living body classification network corresponding to the target living body detection model to obtain a living body detection result.
12. The computer device according to claim 11, wherein the target candidate region generation network comprises a first convolutional layer, a second convolutional layer, and a first pooling layer, and the processor, when executing the computer-readable instructions, further performs the following steps:
    inputting the color image to be detected into the first convolutional layer, and performing a convolution operation on the color image to be detected through the first convolutional layer to obtain a first feature matrix;
    inputting the first feature matrix into the first pooling layer, and projecting the largest weight in each vector of the first feature matrix through the first pooling layer to obtain a normalized second feature matrix; and
    inputting the second feature matrix into the second convolutional layer, and performing a convolution computation on the second feature matrix through the second convolutional layer to obtain the position information of the target face candidate region.
13. The computer device according to claim 11, wherein the target living body classification network comprises a third convolutional layer, a fourth convolutional layer, and a second pooling layer, and the processor, when executing the computer-readable instructions, further performs the following steps:
    cropping a corresponding region-of-interest image from the near-infrared image to be detected according to the position information of the target face candidate region, inputting the region-of-interest image into the third convolutional layer, and performing a convolution operation on the region-of-interest image through the third convolutional layer to obtain a third feature matrix;
    inputting the third feature matrix into the second pooling layer, and projecting the largest weight in each vector of the third feature matrix through the second pooling layer to obtain a normalized fourth feature matrix; and
    inputting the fourth feature matrix into the fourth convolutional layer, and performing a convolution computation on the fourth feature matrix through the fourth convolutional layer to obtain the living body detection result.
14. The computer device according to claim 13, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    mapping the position information of the target face candidate region onto the near-infrared image to be detected according to a pre-calibrated camera parameter matrix, locating the face position in the near-infrared image to be detected, and cropping the corresponding region-of-interest image according to the located face position.
15. One or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
    obtaining an initial living body detection model, the initial living body detection model comprising an initial candidate region generation network and an initial living body classification network;
    obtaining a first training sample set and a second training sample set, wherein the training samples corresponding to the second training sample set comprise a color image, a near-infrared image corresponding to the color image, and corresponding target living body position information;
    training the initial candidate region generation network according to the first training sample set until convergence, to obtain a first candidate region generation network;
    training the initial living body classification network according to the first candidate region generation network and the second training sample set until convergence, to obtain a first living body classification network;
    inputting the color image into the first candidate region generation network to obtain current face candidate region position information, and inputting the current face candidate region position information and the near-infrared image into the first living body classification network to obtain current living body position information;
    adjusting parameters of the first candidate region generation network according to the difference between the current living body position information and the target living body position information, and returning to the step of inputting the color image into the first candidate region generation network until convergence, to obtain a target candidate region generation network; and
    training the first living body classification network according to the target candidate region generation network and the second training sample set until convergence, to obtain a target living body classification network, and obtaining a trained target living body detection model according to the target candidate region generation network and the target living body classification network.
16. The storage medium according to claim 15, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    obtaining the target living body detection model;
    obtaining a color image to be detected and a near-infrared image to be detected corresponding to a face to be detected;
    inputting the color image to be detected into the target candidate region generation network corresponding to the target living body detection model to obtain position information of a target face candidate region; and
    inputting the position information of the target face candidate region and the near-infrared image to be detected into the target living body classification network corresponding to the target living body detection model to obtain a living body detection result.
17. The storage medium according to claim 16, wherein the target candidate region generation network comprises a first convolutional layer, a second convolutional layer, and a first pooling layer, and the computer-readable instructions, when executed by the processor, further perform the following steps:
    inputting the color image to be detected into the first convolutional layer, and performing a convolution operation on the color image to be detected through the first convolutional layer to obtain a first feature matrix;
    inputting the first feature matrix into the first pooling layer, and projecting the largest weight in each vector of the first feature matrix through the first pooling layer to obtain a normalized second feature matrix; and
    inputting the second feature matrix into the second convolutional layer, and performing a convolution computation on the second feature matrix through the second convolutional layer to obtain the position information of the target face candidate region.
18. The storage medium according to claim 16, wherein the target living body classification network comprises a third convolutional layer, a fourth convolutional layer, and a second pooling layer, and the computer-readable instructions, when executed by the processor, further perform the following steps:
    cropping a corresponding region-of-interest image from the near-infrared image to be detected according to the position information of the target face candidate region, inputting the region-of-interest image into the third convolutional layer, and performing a convolution operation on the region-of-interest image through the third convolutional layer to obtain a third feature matrix;
    inputting the third feature matrix into the second pooling layer, and projecting the largest weight in each vector of the third feature matrix through the second pooling layer to obtain a normalized fourth feature matrix; and
    inputting the fourth feature matrix into the fourth convolutional layer, and performing a convolution computation on the fourth feature matrix through the fourth convolutional layer to obtain the living body detection result.
19. The storage medium according to claim 18, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    mapping the position information of the target face candidate region onto the near-infrared image to be detected according to a pre-calibrated camera parameter matrix, locating the face position in the near-infrared image to be detected, and cropping the corresponding region-of-interest image according to the located face position.
20. The storage medium according to any one of claims 16 to 19, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    collecting, through a dual-camera module, a color image and a near-infrared image corresponding to the face to be detected, and performing face detection on the collected color image;
    when it is determined according to the face detection result that a face is detected, determining the collected color image and near-infrared image as the color image to be detected and the near-infrared image to be detected, respectively; and
    when it is determined according to the face detection result that no face is detected, returning to the step of collecting, through the dual-camera module, the color image and the near-infrared image corresponding to the face to be detected.
PCT/CN2019/116269 2019-10-10 2019-11-07 Training method and apparatus for living body detection model, computer device, and storage medium WO2021068322A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910958191.5A CN110941986B (en) 2019-10-10 2019-10-10 Living body detection model training method, living body detection model training device, computer equipment and storage medium
CN201910958191.5 2019-10-10

Publications (1)

Publication Number Publication Date
WO2021068322A1 (en)

Family

ID=69906043

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/116269 WO2021068322A1 (en) 2019-10-10 2019-11-07 Training method and apparatus for living body detection model, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN110941986B (en)
WO (1) WO2021068322A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111582155B (en) * 2020-05-07 2024-02-09 腾讯科技(深圳)有限公司 Living body detection method, living body detection device, computer equipment and storage medium
CN113822302A (en) * 2020-06-18 2021-12-21 北京金山数字娱乐科技有限公司 Training method and device for target detection model

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108038474B (en) * 2017-12-28 2020-04-14 深圳励飞科技有限公司 Face detection method, convolutional neural network parameter training method, device and medium
CN108537152B (en) * 2018-03-27 2022-01-25 百度在线网络技术(北京)有限公司 Method and apparatus for detecting living body
CN108875833B (en) * 2018-06-22 2021-07-16 北京儒博科技有限公司 Neural network training method, face recognition method and device
CN108898112A (en) * 2018-07-03 2018-11-27 东北大学 A kind of near-infrared human face in-vivo detection method and system
CN109255322B (en) * 2018-09-03 2019-11-19 北京诚志重科海图科技有限公司 A kind of human face in-vivo detection method and device

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190034702A1 (en) * 2017-07-26 2019-01-31 Baidu Online Network Technology (Beijing) Co., Ltd. Living body detecting method and apparatus, device and storage medium
US20190095701A1 (en) * 2017-09-27 2019-03-28 Lenovo (Beijing) Co., Ltd. Living-body detection method, device and storage medium
CN108830188A (en) * 2018-05-30 2018-11-16 西安理工大学 Vehicle checking method based on deep learning
CN108921071A (en) * 2018-06-24 2018-11-30 深圳市中悦科技有限公司 Human face in-vivo detection method, device, storage medium and processor
CN109034059A (en) * 2018-07-25 2018-12-18 深圳市中悦科技有限公司 Silent formula human face in-vivo detection method, device, storage medium and processor
CN109446981A (en) * 2018-10-25 2019-03-08 腾讯科技(深圳)有限公司 A kind of face's In vivo detection, identity identifying method and device

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113139460A (en) * 2021-04-22 2021-07-20 广州织点智能科技有限公司 Face detection model training method, face detection method and related device thereof
CN113239762A (en) * 2021-04-29 2021-08-10 中国农业大学 Vision and infrared signal-based living body detection method and device
CN113343826A (en) * 2021-05-31 2021-09-03 北京百度网讯科技有限公司 Training method of human face living body detection model, human face living body detection method and device
CN113343826B (en) * 2021-05-31 2024-02-13 北京百度网讯科技有限公司 Training method of human face living body detection model, human face living body detection method and human face living body detection device
CN113378715A (en) * 2021-06-10 2021-09-10 北京华捷艾米科技有限公司 Living body detection method based on color face image and related equipment
CN113378715B (en) * 2021-06-10 2024-01-05 北京华捷艾米科技有限公司 Living body detection method based on color face image and related equipment
CN113283388A (en) * 2021-06-24 2021-08-20 中国平安人寿保险股份有限公司 Training method, device and equipment of living human face detection model and storage medium
CN113379772B (en) * 2021-07-06 2022-10-11 新疆爱华盈通信息技术有限公司 Mobile temperature measurement method based on background elimination and tracking algorithm in complex environment
CN113379772A (en) * 2021-07-06 2021-09-10 新疆爱华盈通信息技术有限公司 Mobile temperature measurement method based on background elimination and tracking algorithm in complex environment
CN113658113A (en) * 2021-07-28 2021-11-16 武汉联影医疗科技有限公司 Medical image detection method and training method of medical image detection model
CN113658113B (en) * 2021-07-28 2024-02-27 武汉联影医疗科技有限公司 Medical image detection method and training method of medical image detection model
CN113807407B (en) * 2021-08-25 2023-04-18 西安电子科技大学广州研究院 Target detection model training method, model performance detection method and device
CN113807407A (en) * 2021-08-25 2021-12-17 西安电子科技大学广州研究院 Target detection model training method, model performance detection method and device
CN114049289A (en) * 2021-11-10 2022-02-15 合肥工业大学 Near infrared-visible light face image synthesis method based on contrast learning and StyleGAN2
CN114049289B (en) * 2021-11-10 2024-03-05 合肥工业大学 Near infrared-visible light face image synthesis method based on contrast learning and StyleGAN2
CN114067445A (en) * 2021-11-26 2022-02-18 中科海微(北京)科技有限公司 Data processing method, device and equipment for face authenticity identification and storage medium
WO2023124869A1 (en) * 2021-12-30 2023-07-06 杭州萤石软件有限公司 Liveness detection method, device and apparatus, and storage medium
CN115147902A (en) * 2022-06-30 2022-10-04 北京百度网讯科技有限公司 Training method and device for human face living body detection model and computer program product
CN115147902B (en) * 2022-06-30 2023-11-07 北京百度网讯科技有限公司 Training method, training device and training computer program product for human face living body detection model
CN114965441A (en) * 2022-07-28 2022-08-30 中国科学院国家天文台 Training method of element probabilistic prediction model and element probabilistic prediction method
CN115512427A (en) * 2022-11-04 2022-12-23 北京城建设计发展集团股份有限公司 User face registration method and system combined with matched biopsy
CN115601818A (en) * 2022-11-29 2023-01-13 海豚乐智科技(成都)有限责任公司(Cn) Lightweight visible light living body detection method and device

Also Published As

Publication number Publication date
CN110941986B (en) 2023-08-01
CN110941986A (en) 2020-03-31

Similar Documents

Publication Publication Date Title
WO2021068322A1 (en) Training method and apparatus for living body detection model, computer device, and storage medium
US11403876B2 (en) Image processing method and apparatus, facial recognition method and apparatus, and computer device
US10325181B2 (en) Image classification method, electronic device, and storage medium
CN109034078B (en) Training method of age identification model, age identification method and related equipment
JP7058373B2 (en) Lesion detection and positioning methods, devices, devices, and storage media for medical images
WO2019096029A1 (en) Living body identification method, storage medium and computer device
WO2020215557A1 (en) Medical image interpretation method and apparatus, computer device and storage medium
WO2021017261A1 (en) Recognition model training method and apparatus, image recognition method and apparatus, and device and medium
WO2019184124A1 (en) Risk-control model training method, risk identification method and apparatus, and device and medium
WO2018188453A1 (en) Method for determining human face area, storage medium, and computer device
CN109344742B (en) Feature point positioning method and device, storage medium and computer equipment
WO2016124103A1 (en) Picture detection method and device
WO2022033220A1 (en) Face liveness detection method, system and apparatus, computer device, and storage medium
WO2020024395A1 (en) Fatigue driving detection method and apparatus, computer device, and storage medium
US10635894B1 (en) Systems and methods for passive-subject liveness verification in digital media
CN111368672A (en) Construction method and device for genetic disease facial recognition model
WO2021114612A1 (en) Target re-identification method and apparatus, computer device, and storage medium
WO2022057309A1 (en) Lung feature recognition method and apparatus, computer device, and storage medium
KR20150128510A (en) Apparatus and method for liveness test, and apparatus and method for image processing
WO2022033219A1 (en) Face liveness detection method, system and apparatus, computer device, and storage medium
CN112884782B (en) Biological object segmentation method, apparatus, computer device, and storage medium
CN110598638A (en) Model training method, face gender prediction method, device and storage medium
CN110956628B (en) Picture grade classification method, device, computer equipment and storage medium
CN111183455A (en) Image data processing system and method
WO2022134354A1 (en) Vehicle loss detection model training method and apparatus, vehicle loss detection method and apparatus, and device and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19948216

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19948216

Country of ref document: EP

Kind code of ref document: A1