CN113505682A - Living body detection method and device - Google Patents

Living body detection method and device

Info

Publication number
CN113505682A
Authority
CN
China
Prior art keywords: image, living body, detection result, target, result information
Prior art date
Legal status
Pending
Application number
CN202110753609.6A
Other languages
Chinese (zh)
Inventor
王晟
Current Assignee
Hangzhou Ezviz Software Co Ltd
Original Assignee
Hangzhou Ezviz Software Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Ezviz Software Co Ltd
Priority to CN202110753609.6A
Publication of CN113505682A
Legal status: Pending (current)

Abstract

Embodiments of the present application provide a living body detection method and device. The method comprises: acquiring an infrared image, a depth image and an RGB image of an object to be detected, wherein the three images are captured at the same moment and each contains a human face; preprocessing the acquired infrared image, depth image and RGB image respectively according to a preset mode to obtain corresponding target images; and performing living body detection processing based on mutual feature verification on each target image through a pre-trained living body detection model, to obtain detection result information indicating whether the object to be detected is a living body. The embodiments realize mutual verification of features between the images, improving both the accuracy of the living body detection result and the efficiency of living body detection.

Description

Living body detection method and device
Technical Field
The present application relates to the technical field of identity verification, and in particular to a living body detection method and device.
Background
With the continuous development of technology, face recognition is applied ever more widely in security, finance, e-commerce and other scenarios that require identity verification. To prevent malicious attackers from mounting attacks with recorded videos, captured images, 3D face models and the like, living body (liveness) detection has become an important link in the face recognition process. Current living body detection mainly takes three forms. The first performs detection on a single acquired image (such as an infrared image). The second performs discrete detection on multiple types of acquired images: for example, an infrared image and a depth image are collected, one living body detection is performed on the infrared image and another on the depth image, and the two detection results are weighted or otherwise combined to decide whether the object is a living body. The third requires the user to cooperate by blinking, opening the mouth, shaking the head, and so on.
The first approach yields poor accuracy; the second requires multiple detection passes and is therefore inefficient; the third requires user participation, which is detrimental to user experience.
Disclosure of Invention
Embodiments of the present application aim to provide a living body detection method and device, so as to solve the problems of poor accuracy, low efficiency and poor user experience in existing living body detection approaches.
In order to solve the above technical problem, the embodiment of the present application is implemented as follows:
in a first aspect, an embodiment of the present application provides a method for detecting a living body, including:
acquiring an infrared image, a depth image and an RGB image of an object to be detected, wherein the infrared image, the depth image and the RGB image are captured at the same moment and each contains a human face;
respectively preprocessing the infrared image, the depth image and the RGB image according to a preset mode to obtain corresponding target images;
and performing living body detection processing based on mutual feature verification on each target image through a pre-trained living body detection model, to obtain detection result information indicating whether the object to be detected is a living body.
In a second aspect, an embodiment of the present application provides a living body detection apparatus, including:
a memory for storing a pre-trained living body detection model;
a processor for acquiring an infrared image, a depth image and an RGB image of an object to be detected; preprocessing the infrared image, the depth image and the RGB image respectively according to a preset mode to obtain corresponding target images; and performing living body detection processing based on mutual feature verification on each target image through the living body detection model, to obtain a detection result indicating whether the object to be detected is a living body; wherein the infrared image, the depth image and the RGB image are captured at the same moment and each contains a human face.
In a third aspect, an embodiment of the present application provides an electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another via the bus; the memory is used for storing a computer program; and the processor is used for executing the program stored in the memory to implement the steps of the above living body detection method.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps of the above-described living body detection method.
In the embodiments of the present application, the acquired infrared image, depth image and RGB image of the object to be detected, all captured at the same moment, are preprocessed to obtain corresponding target images; living body detection processing based on mutual feature verification is then performed on each target image through a pre-trained living body detection model, to obtain a detection result indicating whether the object to be detected is a living body. Because the pre-trained model performs living body detection on the infrared image, the depth image and the RGB image simultaneously, there is no need to train a model and run detection separately for each image type; this improves detection efficiency, reduces the time required for model training, and reduces the resources required for model deployment. Moreover, the model detects on the basis of mutual feature verification, realizing cross-verification of features among the infrared image, the depth image and the RGB image, which greatly improves the accuracy of the detection result. In addition, the detection process needs no user participation; it is completed without the user noticing, which improves user experience.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments described in the present application; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a first schematic flow chart of a living body detection method according to an embodiment of the present disclosure;
FIG. 2 is a second schematic flow chart of a living body detection method according to an embodiment of the present disclosure;
FIG. 3 is a third schematic flow chart of a living body detection method according to an embodiment of the present disclosure;
FIG. 4 is a fourth schematic flow chart of a living body detection method according to an embodiment of the present disclosure;
FIG. 5 is a fifth schematic flow chart of a living body detection method according to an embodiment of the present disclosure;
FIG. 6 is a sixth schematic flow chart of a living body detection method according to an embodiment of the present disclosure;
FIG. 7 is a schematic block diagram of a living body detection apparatus according to an embodiment of the present disclosure;
FIG. 8 is a schematic composition diagram of an electronic device provided in an embodiment of the present specification.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in those embodiments. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
FIG. 1 is a schematic flow chart of a living body detection method provided by one or more embodiments of the present disclosure. Referring to fig. 1, the method may specifically include the following steps:
Step 102, acquiring an infrared image, a depth image and an RGB image of an object to be detected, wherein the infrared image, the depth image and the RGB image are captured at the same moment and each contains a human face;
the in-vivo detection method provided by the embodiment of the application can be executed by an in-vivo detection device, the in-vivo detection device can be provided with a near infrared camera, a depth camera and an RGB camera, and when in-vivo detection processing is required to be carried out on an object to be detected, the near infrared camera, the depth camera and the RGB camera are used for carrying out image acquisition processing on the object to be detected at the same time to obtain corresponding infrared images, depth images and RGB images; the living body detection device acquires an infrared image acquired by the near-infrared camera, a depth image acquired by the depth camera and an RGB image acquired by the RGB camera. Alternatively, the near-infrared camera, the depth camera, and the RGB camera may be separate and independent from the living body detecting device, and may be capable of communicating with the living body detecting device. Correspondingly, when the living body detection needs to be carried out on the object to be detected, the near infrared camera, the depth camera and the RGB camera are triggered to simultaneously carry out image acquisition processing on the object to be detected at the same time, so that corresponding infrared images, depth images and RGB images are obtained, and the images acquired respectively are sent to the living body detection device. The near-infrared camera, the depth camera and the RGB camera can be triggered by the living body detection device to perform image acquisition operation, and can also be triggered by other equipment to perform image acquisition operation.
Further, the acquired infrared image, depth image and RGB image may be images of the head of the object to be detected, or images of its upper body or whole body.
Step 104, preprocessing the infrared image, the depth image and the RGB image respectively according to a preset mode to obtain corresponding target images;
In order to improve the accuracy of the living body detection result, in one or more embodiments of the present application, after the infrared image, the depth image and the RGB image of the object to be detected are acquired, each of them is preprocessed according to the preset mode to obtain its corresponding target image.
Step 106, performing living body detection processing based on mutual feature verification on each target image through a pre-trained living body detection model, to obtain detection result information indicating whether the object to be detected is a living body.
In order to improve both the accuracy and the efficiency of living body detection, embodiments of the present application carry out a model training process in advance to obtain the living body detection model. The model is a three-modality living body detection model: it can simultaneously perform living body detection processing based on mutual feature verification on the target image corresponding to the infrared image, the target image corresponding to the depth image, and the target image corresponding to the RGB image. The specific training procedure is described in detail later. The form of the detection result information may be set as needed in practical applications; for example, it may be the probability that the object to be detected is a living body.
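For illustration only, the following sketch shows how such a three-modality model might be invoked at inference time. It assumes a PyTorch-style model whose forward pass accepts the three target images at once; the framework, the function name and the 0.5 threshold are assumptions of this sketch, not requirements of the application.

```python
# Illustrative sketch only; the application does not prescribe a framework.
import torch

def detect_liveness(model: torch.nn.Module,
                    ir_target: torch.Tensor,     # preprocessed infrared face, shape (1, 1, H, W)
                    depth_target: torch.Tensor,  # preprocessed depth face,    shape (1, 1, H, W)
                    rgb_target: torch.Tensor,    # preprocessed RGB face,      shape (1, 3, H, W)
                    threshold: float = 0.5):
    """Feed all three target images into the single tri-modal model at once and
    read its output as the probability that the object to be detected is live."""
    model.eval()
    with torch.no_grad():
        prob = torch.sigmoid(model(ir_target, depth_target, rgb_target)).item()
    return prob, prob >= threshold
```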
In one or more embodiments of the application, the acquired infrared image, depth image and RGB image of the object to be detected, all captured at the same moment, are preprocessed to obtain corresponding target images; living body detection processing based on mutual feature verification is then performed on each target image through a pre-trained living body detection model, to obtain a detection result indicating whether the object to be detected is a living body. Performing detection on the three images simultaneously with one pre-trained model removes the need to train a model and run detection separately for each image type, which improves detection efficiency, reduces model training time, and reduces the resources required for model deployment. Mutual feature verification among the infrared image, the depth image and the RGB image greatly improves the accuracy of the detection result. And because no user participation is needed, detection completes without the user noticing, which improves user experience.
In practical applications, different objects to be detected may be positioned differently relative to the cameras during image capture, so the face in an acquired image may be tilted. In order to improve detection accuracy, in one or more embodiments of the present application, each acquired image is corrected before living body detection is performed. Specifically, as shown in fig. 2, step 104 may include the following steps 104-2 to 104-6:
Step 104-2, performing correction processing on the infrared image, the depth image and the RGB image according to a preset correction mode;
specifically, as shown in fig. 3, step 104-2 may include the following steps 104-22 to 104-28:
Step 104-22, performing face detection processing on the infrared image and the RGB image respectively according to a preset face detection mode, to obtain corresponding first face detection result information and second face detection result information;
the first face detection result information comprises first coordinate information of a face frame in the infrared image and second coordinate information of key points of the face, and the second coordinate information comprises left eye coordinate information, right eye coordinate information, nose coordinate information, left mouth angle coordinate information and right mouth angle coordinate information in the infrared image. The second face detection result comprises third coordinate information of a face frame in the RGB image and fourth coordinate information of a face key point; the fourth coordinate information includes coordinate information of a left eye, coordinate information of a right eye, coordinate information of a nose, coordinate information of a left mouth angle, and coordinate information of a right mouth angle in the RGB image. The face frame is taken as a rectangular frame for illustration, and the rectangular frame can be defined according to the upper left vertex and the lower right vertex, or the upper right vertex and the lower left vertex, so that the coordinate information of the upper left vertex and the lower right vertex can be determined as the coordinate information of the face frame, or the coordinate information of the upper right vertex and the lower left vertex can be determined as the coordinate information of the face frame; accordingly, each piece of coordinate information in the first detection result information may be denoted as P1{ Rect (x1, x2, y1, y2), Point [5] (x, y) }, where Rect (x1, x2, y1, y2) is coordinate information of a face frame, and Point [5] (x, y) represents an array formed by coordinate information of the above five face key points. Similarly, each piece of coordinate information in the second detection result information may be denoted as P2{ Rect (x1, x2, y1, y2), Point [5] (x, y) }.
Furthermore, the face detection mode can be chosen as needed in practical applications; for example, face detection may be performed on the acquired infrared image and RGB image through a pre-trained face detection model. Since the training of such a face detection model is prior art, it is not described here again.
Step 104-26, converting the acquired depth image according to the first detection result information to obtain a target depth image;
Because the pixels of the infrared image correspond one-to-one to the pixels of the depth image, in order to facilitate processing of the image features of the depth image, in one or more embodiments of the present application the depth image is converted according to the first detection result information. Specifically, step 104-26 may include: acquiring, from the depth image, the depth value of the pixel corresponding to the nose coordinate information included in the first detection result information; performing difference processing between the depth value of each pixel of the depth image and the acquired nose depth value, to obtain a target depth value for each pixel; and determining the depth image formed by the target depth values as the target depth image.
Further, after the target depth value of each pixel is obtained, the method may also include: if a target depth value is smaller than a first preset value, updating it to the first preset value; and if a target depth value is larger than a second preset value, updating it to the second preset value. The first preset value may be 0 and the second preset value may be 255. Converting the acquired depth image in this way turns the original pixel values, represented in 16-bit binary, into 8-bit values between 0 and 255, which facilitates data management as well as the feature fusion and separation processing in the subsequent living body detection.
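A sketch of this conversion in NumPy follows. The array handling is an assumption of the sketch; the application itself only fixes the arithmetic (difference against the nose depth, then clamping to the two preset values 0 and 255):

```python
import numpy as np

def convert_depth_image(depth: np.ndarray,          # 16-bit depth map, shape (H, W)
                        nose_xy: tuple[int, int],   # nose key point from the first detection result
                        lo: int = 0, hi: int = 255) -> np.ndarray:
    """Re-reference the depth map to the nose depth (step 104-26) and clamp the
    target depth values to [lo, hi] so the result fits in 8 bits."""
    x, y = nose_xy
    nose_depth = int(depth[y, x])                    # depth value at the nose pixel
    target = depth.astype(np.int32) - nose_depth     # difference for every pixel
    target = np.clip(target, lo, hi)                 # first / second preset values
    return target.astype(np.uint8)                   # 16-bit values become 8-bit
```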
Step 104-28, correcting the infrared image and the target depth image according to the first detection result information, and correcting the RGB image according to the second detection result information;
Specifically, correction processing is performed on the infrared image and the target depth image respectively according to the eye coordinate information included in the first detection result information, so that the vertical coordinates of the left eye and the right eye are the same in the corrected infrared image and the corrected target depth image; and correction processing is performed on the RGB image according to the eye coordinate information included in the second detection result information, so that the vertical coordinates of the left eye and the right eye are the same in the corrected RGB image. Correcting the infrared image according to the eye coordinate information in the first detection result information may include: taking the vertical coordinate of the left eye as a reference, rotating the whole infrared image to the left (when the vertical coordinate of the left eye is smaller than that of the right eye) or to the right (when the vertical coordinate of the left eye is larger than that of the right eye) until the vertical coordinate of the right eye equals that of the left eye; or, taking the vertical coordinate of the right eye as a reference, rotating the whole infrared image to the left or right until the vertical coordinate of the left eye equals that of the right eye; or, taking a preset vertical coordinate as a reference, translating the whole infrared image up or down while rotating it to the left or right until the vertical coordinates of both eyes equal the preset vertical coordinate. The processes of correcting the depth image according to the eye coordinate information in the first detection result information and correcting the RGB image according to the eye coordinate information in the second detection result information are the same as the process of correcting the infrared image, and repeated points are not described here again. In this way the infrared image, the target depth image and the RGB image are all corrected, ensuring that the face in each image is upright rather than tilted, which in turn ensures the accuracy of the subsequent living body detection.
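As a sketch of the first variant described above (rotation about the left eye), the following uses OpenCV; the library choice and the image representation are assumptions of the sketch:

```python
import math
import cv2
import numpy as np

def correct_roll(image: np.ndarray,
                 left_eye: tuple[float, float],
                 right_eye: tuple[float, float]) -> np.ndarray:
    """Rotate the whole image about the left eye until the left and right eyes
    share the same vertical coordinate."""
    (lx, ly), (rx, ry) = left_eye, right_eye
    if ly == ry:                                          # eyes already level; nothing to correct
        return image
    angle = math.degrees(math.atan2(ry - ly, rx - lx))    # tilt of the inter-eye line
    h, w = image.shape[:2]
    rot = cv2.getRotationMatrix2D((lx, ly), angle, 1.0)   # rotate about the left eye
    return cv2.warpAffine(image, rot, (w, h))
```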
Further, in order to improve the efficiency of the correction processing, in one or more embodiments of the present application, step 104-2 may include: if it is determined according to the first detection result information that a preset correction condition is satisfied, performing correction processing on the infrared image and the target depth image according to the first detection result information; and if it is determined according to the second detection result information that the preset correction condition is satisfied, performing correction processing on the RGB image according to the second detection result information. Specifically, the vertical coordinates of the left eye and the right eye are obtained from the first detection result information; if the two are the same, it is determined that the preset correction condition is not satisfied, and if they differ, it is determined that the condition is satisfied. The process of judging the condition against the second detection result information is the same and is not repeated here. Judging against the correction condition means that only the images that need correction are processed, rather than every image, which improves the efficiency of the correction processing.
Step 104-4, performing face extraction processing on the corrected infrared image, the corrected target depth image and the corrected RGB image respectively, to obtain corresponding face images;
specifically, determining first coordinate information of a face frame and second coordinate information of a face key point in the infrared image after correction processing; determining third coordinate information of a face frame and fourth coordinate information of a face key point in the RGB image after correction processing; according to the determined first coordinate information and second coordinate information, respectively carrying out face extraction processing on the corrected infrared image and the corrected target depth image to obtain corresponding face images; and according to the determined third coordinate information and the fourth coordinate information, performing face extraction processing on the RGB image after correction processing to obtain a corresponding face image.
Determining the first coordinate information of the face frame and the second coordinate information of the face key points in the corrected infrared image may include: correcting the first and second coordinate information in the first detection result according to how the infrared image moved during the correction processing, and taking the corrected values as the first coordinate information of the face frame and the second coordinate information of the face key points in the corrected infrared image; or performing face detection on the corrected infrared image according to the preset face detection mode to obtain new first detection result information, and taking the first and second coordinate information in that new result as the coordinate information of the face frame and the face key points in the corrected infrared image. The third coordinate information of the face frame and the fourth coordinate information of the face key points in the corrected RGB image are determined in the same way, and repeated details are not described here again.
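If the stored coordinates are corrected rather than re-detected, the same 2x3 affine matrix used to rotate the image can be applied to them. A sketch under the OpenCV assumption of the previous example:

```python
import numpy as np

def transform_points(points: np.ndarray, rot: np.ndarray) -> np.ndarray:
    """Apply the 2x3 affine matrix used for image correction to an (N, 2) array
    of face-frame vertices and face key points."""
    ones = np.ones((points.shape[0], 1))
    return np.hstack([points, ones]) @ rot.T    # (N, 3) @ (3, 2) -> (N, 2)
```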
Further, the determined first coordinate information of the face frame and second coordinate information of the face key points in the corrected infrared image are denoted as P1′{Rect(x1′, x2′, y1′, y2′), Point[5](x′, y′)}, and the determined third coordinate information of the face frame and fourth coordinate information of the face key points in the corrected RGB image are denoted as P2′{Rect(x1′, x2′, y1′, y2′), Point[5](x′, y′)}. Face extraction processing is performed on the corrected infrared image and the corrected target depth image according to P1′ to obtain the corresponding face images, and on the corrected RGB image according to P2′ to obtain the corresponding face image.
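The extraction itself can then be a simple crop by the corrected face frame. A sketch (the crop policy, e.g. whether a margin is added around the frame, is not specified by the application):

```python
import numpy as np

def extract_face(image: np.ndarray,
                 rect: tuple[int, int, int, int]) -> np.ndarray:
    """Crop the face region given a corrected face frame Rect(x1, x2, y1, y2),
    clipped to the image bounds."""
    x1, x2, y1, y2 = rect
    h, w = image.shape[:2]
    x1, x2 = max(0, int(x1)), min(w, int(x2))
    y1, y2 = max(0, int(y1)), min(h, int(y2))
    return image[y1:y2, x1:x2]
```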
Step 104-6, determining each face image as a target image.
Specifically, a face image extracted from the corrected infrared image, a face image extracted from the corrected target depth image, and a face image extracted from the corrected RGB image are determined as the target images.
Taking the depth image and its face frame as an example, fig. 4 shows the depth image before and after preprocessing: (a) is the depth image obtained before preprocessing, and (b) is the target image obtained after preprocessing the depth image in (a), that is, after correction and face extraction. It should be noted that, since the acquired depth image is a 3D image that can be rotated, the image shown in (a) has been rotated to the left to better show its depth features.
Thus, preprocessing the acquired images, that is, first performing correction and then extracting the face image from the corrected image so that living body detection is performed on the face image, avoids the risk that a tilted face in an image degrades the accuracy of living body detection.
In order to realize living body detection based on mutual feature verification and thereby improve its accuracy, in one or more embodiments of the present application the living body detection model performs multiple rounds of feature fusion and separation between the image features of the target images. Specifically, as shown in fig. 5, step 106 may include the following steps 106-2 and 106-4:
Step 106-2, inputting each target image into the pre-trained living body detection model simultaneously;
Step 106-4, performing multiple rounds of feature fusion and separation processing on the image features of each target image through the living body detection model, and outputting, based on the result information of the feature fusion and separation processing, detection result information indicating whether the object to be detected is a living body.
The number of rounds of feature fusion and separation and their specific processes can be set as needed in practical applications; the present application is not specifically limited in this respect. Because each target image is input into the same living body detection model at the same time and the model processes all of them together, compared with the prior art of detecting each target image separately and training and deploying a separate model for each image type, the detection efficiency is improved and the model training time and deployment resources are reduced. Moreover, fusing and separating the image features of different target images multiple times inside the model realizes mutual verification between the features; compared with the discrete detection mode of the prior art, that is, performing living body detection on each target image separately, the detection accuracy is greatly improved.
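The application does not disclose a concrete network architecture. As one possible reading of "multiple rounds of feature fusion and separation", the sketch below (PyTorch; every layer choice is an assumption of this sketch) keeps one convolutional branch per modality and, at each stage, concatenates the three feature maps (fusion) and splits the mixed result back into per-branch features (separation):

```python
import torch
import torch.nn as nn

class TriModalLivenessNet(nn.Module):
    """Hypothetical tri-modal network: one convolutional branch per modality,
    with repeated fuse-then-separate blocks between the branches."""
    def __init__(self, stages: int = 3, ch: int = 32):
        super().__init__()
        self.stems = nn.ModuleList([
            nn.Conv2d(c_in, ch, 3, stride=2, padding=1) for c_in in (1, 1, 3)])
        # each stage fuses the 3 branches (concat) and separates them again
        self.fuse = nn.ModuleList([nn.Conv2d(3 * ch, 3 * ch, 1) for _ in range(stages)])
        self.branch = nn.ModuleList([
            nn.ModuleList([nn.Conv2d(ch, ch, 3, stride=2, padding=1) for _ in range(3)])
            for _ in range(stages)])
        self.head = nn.Linear(3 * ch, 1)   # logit: probability of "live" after sigmoid

    def forward(self, ir, depth, rgb):
        feats = [torch.relu(stem(x)) for stem, x in zip(self.stems, (ir, depth, rgb))]
        for fuse, convs in zip(self.fuse, self.branch):
            mixed = fuse(torch.cat(feats, dim=1))        # feature fusion across modalities
            feats = list(torch.chunk(mixed, 3, dim=1))   # separation back into 3 branches
            feats = [torch.relu(c(f)) for c, f in zip(convs, feats)]
        pooled = torch.cat([f.mean(dim=(2, 3)) for f in feats], dim=1)
        return self.head(pooled)
```

Because the fused features are redistributed to all three branches at every stage, evidence from any one modality can confirm or contradict the other two, which is one way the mutual verification described above could be realized.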
Further, in order to perform living body detection on multiple target images simultaneously with the same living body detection model, in one or more embodiments of the present application, as shown in fig. 6, step 102 may be preceded by the following steps 100-2 and 100-4:
Step 100-2, acquiring a plurality of image sample combinations, wherein each image sample combination comprises an infrared image, a depth image and an RGB image obtained by capturing a target object at the same moment, and every image in the combination contains a human face;
The target objects include both living objects and prosthesis (spoof) objects.
Step 100-4, performing training processing based on the image sample combinations according to a preset training mode, to obtain the living body detection model.
Specifically: labeling each acquired image sample combination, and preprocessing each image in the combination according to the preset mode to obtain image sample combinations to be trained; dividing the combinations to be trained into a training set and a test set; performing training based on the training set according to the preset training mode to obtain an initial model; and testing the initial model with the test set. If the test result information of the testing satisfies a preset condition, the current initial model is determined to be the living body detection model.
Labeling attaches a label to each acquired image sample combination so that the label indicates whether the target object of that combination is a living object or a prosthesis. For the preprocessing of each image in a combination, refer to the preprocessing of the acquired infrared image, depth image and RGB image described above; repeated points are not described here again. Determining that the test result information satisfies the preset condition may include: computing the accuracy of the living body detection model from the test results and the labels of the image sample combinations in the test set, and determining that the condition is satisfied if the accuracy is greater than a preset accuracy.
Further, when the test results do not satisfy the preset condition, the current initial model is tuned according to a preset tuning mode and then retrained on the training set, until the final living body detection model is obtained.
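A sketch of this train/test/tune loop, reusing the TriModalLivenessNet sketch above and the same PyTorch assumption, follows; the 80/20 split, the preset accuracy and the learning-rate halving used as the "tuning" step are placeholders, since the application leaves them to practical needs:

```python
import torch
from torch.utils.data import DataLoader, random_split

def train_liveness_model(dataset, preset_accuracy: float = 0.99,
                         epochs: int = 20, max_rounds: int = 5):
    """dataset yields ((ir, depth, rgb), label) with already-preprocessed target
    images and label 1 for a living object, 0 for a prosthesis."""
    n_train = int(0.8 * len(dataset))                       # split into training and test sets
    train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])
    model = TriModalLivenessNet()
    loss_fn = torch.nn.BCEWithLogitsLoss()
    lr = 1e-3
    for _ in range(max_rounds):
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        model.train()
        for _ in range(epochs):                             # training on the training set
            for (ir, depth, rgb), y in DataLoader(train_set, batch_size=32, shuffle=True):
                opt.zero_grad()
                loss = loss_fn(model(ir, depth, rgb).squeeze(1), y.float())
                loss.backward()
                opt.step()
        model.eval()
        correct = total = 0
        with torch.no_grad():                               # testing on the test set
            for (ir, depth, rgb), y in DataLoader(test_set, batch_size=32):
                pred = torch.sigmoid(model(ir, depth, rgb)).squeeze(1) > 0.5
                correct += (pred == y.bool()).sum().item()
                total += y.numel()
        if correct / total > preset_accuracy:               # preset condition satisfied
            return model
        lr *= 0.5                                           # placeholder "tuning" before retraining
    return model
```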
In the embodiments of this specification, the acquired infrared image, depth image and RGB image of the object to be detected, all captured at the same moment, are preprocessed to obtain corresponding target images; living body detection processing based on mutual feature verification is then performed on each target image through a pre-trained living body detection model, to obtain a detection result indicating whether the object to be detected is a living body. Detecting on the three images simultaneously with one pre-trained model removes the need to train a model and run detection separately for each image type, improving detection efficiency and reducing model training time and deployment resources; mutual feature verification among the infrared image, the depth image and the RGB image greatly improves the accuracy of the detection result; and because no user participation is needed, detection completes without the user noticing, improving user experience.
Based on the same technical concept, one or more embodiments of the present specification further provide a living body detection apparatus. Fig. 7 is a schematic block diagram of its composition; as shown in fig. 7, the apparatus includes:
a memory 201 for storing a pre-trained in-vivo detection model;
the processor 202 is configured to acquire an infrared image, a depth image and an RGB image of an object to be detected; respectively preprocessing the infrared image, the depth image and the RGB image according to a preset mode to obtain corresponding target images; performing living body detection processing based on mutual characteristic verification on each target image through the living body detection model to obtain a detection result representing whether the object to be detected is a living body; the infrared image, the depth image and the RGB image are obtained by image acquisition at the same time and all comprise human faces.
Optionally, the processor 202 is specifically configured to:
correcting the infrared image, the depth image and the RGB image according to a preset correction mode;
respectively carrying out face extraction processing on the infrared image, the depth image and the RGB image after correction processing to obtain corresponding face images;
and determining each face image as the target image.
Optionally, the processor 202 is further specifically configured to:
respectively carrying out face detection processing on the infrared image and the RGB image according to a preset face detection mode to obtain corresponding first detection result information and second detection result information;
converting the depth image according to the first detection result information to obtain a target depth image;
and correcting the infrared image and the target depth image according to the first detection result information, and correcting the RGB image according to the second detection result information.
Optionally, the first detection result information includes coordinate information of a nose; the processor 202 is further specifically configured to:
acquiring a first depth value of a pixel point corresponding to the coordinate information of the nose from the depth image;
performing difference processing between the second depth value of each pixel point of the depth image and the first depth value, respectively, to obtain the target depth value of each pixel point;
and determining the depth image corresponding to the target depth value as the target depth image.
Optionally, the processor 202 is further specifically configured to:
according to the coordinate information of the eyes included in the first detection result information, correction processing is respectively carried out on the infrared image and the target depth image, so that the vertical coordinates of the left eye and the right eye in the corrected infrared image and the corrected target depth image are the same;
and according to the coordinate information of the eyes included in the second detection result information, performing correction processing on the RGB image so as to enable the vertical coordinates of the left eye and the right eye in the corrected RGB image to be the same.
Optionally, the processor 202 is further specifically configured to:
if it is determined according to the first detection result information that a preset correction condition is satisfied, performing correction processing on the infrared image and the target depth image according to the first detection result information;
the correcting the RGB image according to the second detection result information includes:
and if it is determined according to the second detection result information that the correction condition is satisfied, performing correction processing on the RGB image according to the second detection result information.
Optionally, the processor 202 is further specifically configured to:
determining first coordinate information of a face frame and second coordinate information of face key points in the corrected infrared image;
determining third coordinate information of a face frame and fourth coordinate information of a face key point in the RGB image after correction processing;
according to the first coordinate information and the second coordinate information, respectively carrying out face extraction processing on the infrared image after correction processing and the target depth image after correction processing to obtain corresponding face images;
and according to the third coordinate information and the fourth coordinate information, performing face extraction processing on the RGB image after correction processing to obtain a corresponding face image.
Optionally, the processor 202 is further specifically configured to:
simultaneously inputting each target image into a pre-trained living body detection model;
and performing multiple rounds of feature fusion and separation processing on the image features of each target image through the living body detection model, and outputting detection result information indicating whether the object to be detected is a living body based on the result information of the feature fusion and separation processing.
Optionally, the processor 202 is further configured to:
acquiring a plurality of image sample combinations, wherein each image sample combination comprises an infrared image, a depth image and an RGB image obtained by capturing a target object at the same moment, and every image in the combination contains a human face;
and performing training processing based on the image sample combinations according to a preset training mode to obtain the living body detection model.
Optionally, the processor 202 is specifically configured to:
labeling the image sample combinations, and preprocessing each image in each combination according to the preset mode to obtain image sample combinations to be trained;
dividing the image sample combination to be trained into a training set and a testing set;
according to a preset training mode, carrying out training processing based on the training set to obtain an initial model;
and testing the initial model by using the test set, and if the test result information of the test processing meets the preset condition, determining the current initial model as the living body detection model.
The living body detection device provided by the embodiments of this specification preprocesses the acquired infrared image, depth image and RGB image of the object to be detected, all captured at the same moment, to obtain corresponding target images, and performs living body detection processing based on mutual feature verification on each target image through a pre-trained living body detection model, to obtain a detection result indicating whether the object to be detected is a living body. Detecting on the three images simultaneously with one pre-trained model removes the need to train a model and run detection separately for each image type, improving detection efficiency and reducing model training time and deployment resources; mutual feature verification among the infrared image, the depth image and the RGB image greatly improves the accuracy of the detection result; and because no user participation is needed, detection completes without the user noticing, improving user experience.
In addition, since the above device embodiment is basically similar to the method embodiment, its description is relatively simple; for relevant points, refer to the corresponding parts of the method embodiment. It should further be noted that the components of the device of the present invention are divided logically according to the functions to be realized, but the present invention is not limited to this; the components may be re-divided or combined as needed.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. Referring to fig. 8, the electronic device includes a processor, an internal bus, a network interface, a memory and a non-volatile memory, and may also include hardware required by other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it, forming the living body detection device at the logical level. Of course, besides a software implementation, the present application does not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the execution subject of the following processing flow is not limited to logic units and may also be hardware or logic devices.
The network interface, the processor and the memory may be interconnected by a bus system. The bus may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 8, but that does not indicate only one bus or one type of bus.
The memory is used for storing programs. Specifically, a program may include program code comprising computer operating instructions. The memory may include read-only memory and random access memory, and provides instructions and data to the processor. It may include random-access memory (RAM) and may also include non-volatile memory, such as at least one disk memory.
The processor is used for executing the program stored in the memory and specifically executing:
acquiring an infrared image, a depth image and an RGB image of an object to be detected, wherein the infrared image, the depth image and the RGB image are captured at the same moment and each contains a human face;
respectively preprocessing the infrared image, the depth image and the RGB image according to a preset mode to obtain corresponding target images;
and performing living body detection processing based on mutual feature verification on each target image through a pre-trained living body detection model, to obtain detection result information indicating whether the object to be detected is a living body.
The method performed by the living body detection device disclosed in the embodiment of fig. 7 may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP) and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The methods, steps and logic blocks disclosed in the embodiments of the present application may be implemented or performed accordingly. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM or EPROM, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
Based on the same technical concept, embodiments of the present application also provide a computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to perform the living body detection method provided by any one of the corresponding embodiments of fig. 1 to 6.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article or apparatus. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (13)

1. A living body detection method, comprising:
acquiring an infrared image, a depth image and an RGB image of an object to be detected, wherein the infrared image, the depth image and the RGB image are captured at the same moment and each contains a human face;
respectively preprocessing the infrared image, the depth image and the RGB image according to a preset mode to obtain corresponding target images;
and performing living body detection processing based on mutual feature verification on each target image through a pre-trained living body detection model, to obtain detection result information indicating whether the object to be detected is a living body.
2. The method according to claim 1, wherein the preprocessing the infrared image, the depth image and the RGB image according to a preset manner to obtain corresponding target images comprises:
correcting the infrared image, the depth image and the RGB image according to a preset correction mode;
respectively carrying out face extraction processing on the infrared image, the depth image and the RGB image after correction processing to obtain corresponding face images;
and determining each face image as the target image.
3. The method according to claim 2, wherein the performing correction processing on the infrared image, the depth image and the RGB image according to the preset correction mode comprises:
respectively carrying out face detection processing on the infrared image and the RGB image according to a preset face detection mode to obtain corresponding first detection result information and second detection result information;
converting the depth image according to the first detection result information to obtain a target depth image;
and correcting the infrared image and the target depth image according to the first detection result information, and correcting the RGB image according to the second detection result information.
4. The method according to claim 3, wherein the first detection result information includes coordinate information of a nose;
the converting the depth image according to the first detection result information to obtain a target depth image includes:
acquiring the depth value of a pixel point corresponding to the coordinate information of the nose from the depth image;
performing difference processing on the depth value of each pixel point of the depth image and the depth value of the pixel point corresponding to the coordinate information of the nose respectively to obtain a target depth value corresponding to each pixel point;
and determining the depth image corresponding to the target depth value as the target depth image.
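A minimal numpy sketch of the claim-4 conversion, assuming the depth image is a 2-D array and the nose coordinates come from the first detection result information; the array layout and dtypes are illustrative.

import numpy as np

def normalize_depth_to_nose(depth_img, nose_xy):
    # Depth value at the pixel corresponding to the nose coordinates.
    x, y = nose_xy
    nose_depth = np.int32(depth_img[y, x])
    # Difference of every pixel's depth value against the nose depth
    # yields the target depth value per pixel; int32 keeps the negative
    # differences that a uint16 depth map would otherwise wrap around.
    return depth_img.astype(np.int32) - nose_depth

The resulting map encodes facial relief relative to the nose tip rather than absolute camera distance, making it insensitive to how far the subject stands from the sensor.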
5. The method according to claim 3, wherein the performing correction processing on the infrared image and the target depth image according to the first detection result information and performing correction processing on the RGB image according to the second detection result information comprises:
according to the coordinate information of the eyes included in the first detection result information, correction processing is respectively carried out on the infrared image and the target depth image, so that the vertical coordinates of the left eye and the right eye in the corrected infrared image and the corrected target depth image are the same;
and according to the coordinate information of the eyes included in the second detection result information, performing correction processing on the RGB image so as to enable the vertical coordinates of the left eye and the right eye in the corrected RGB image to be the same.
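The correction in claim 5 amounts to an in-plane rotation that levels the eye line. Below is a sketch with OpenCV, assuming (x, y) eye coordinates from the detection result information; per claim 5, the matrix computed from the infrared eye coordinates would be applied to both the infrared image and the target depth image so the two stay pixel-aligned.

import cv2
import numpy as np

def align_eyes_horizontal(img, left_eye, right_eye):
    (lx, ly), (rx, ry) = left_eye, right_eye
    # Tilt of the line joining the eyes; zero when the ordinates match.
    angle = np.degrees(np.arctan2(ry - ly, rx - lx))
    # Rotate about the midpoint between the eyes so they stay in frame.
    center = ((lx + rx) / 2.0, (ly + ry) / 2.0)
    rot = cv2.getRotationMatrix2D(center, angle, 1.0)
    h, w = img.shape[:2]
    return cv2.warpAffine(img, rot, (w, h))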
6. The method according to claim 3, wherein the performing correction processing on the infrared image and the target depth image according to the first detection result information includes:
if it is determined according to the first detection result information that the infrared image and the target depth image meet a preset correction condition, performing correction processing on the infrared image and the target depth image according to the first detection result information;
the correcting the RGB image according to the second detection result information includes:
and if it is determined according to the second detection result information that the correction condition is met, performing correction processing on the RGB image according to the second detection result information.
7. The method according to claim 3, wherein the performing face extraction processing on the infrared image, the depth image, and the RGB image after the correction processing to obtain corresponding face images respectively comprises:
determining first coordinate information of a face frame and second coordinate information of face key points in the corrected infrared image;
determining third coordinate information of a face frame and fourth coordinate information of a face key point in the RGB image after correction processing;
according to the first coordinate information and the second coordinate information, respectively carrying out face extraction processing on the infrared image after correction processing and the target depth image after correction processing to obtain corresponding face images;
and according to the third coordinate information and the fourth coordinate information, performing face extraction processing on the RGB image after correction processing to obtain a corresponding face image.
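A small sketch of the claim-7 crop, assuming the face frame arrives as (x1, y1, x2, y2) pixel coordinates; the infrared-derived box is reused for the corrected target depth image, while the RGB image is cropped with its own box.

def extract_face(img, face_box):
    x1, y1, x2, y2 = face_box
    h, w = img.shape[:2]
    # Clamp to the image bounds so a box touching the border stays valid.
    x1, y1 = max(0, x1), max(0, y1)
    x2, y2 = min(w, x2), min(h, y2)
    return img[y1:y2, x1:x2]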
8. The method according to claim 1, wherein performing the living body detection processing based on mutual feature verification on each target image through the pre-trained living body detection model to obtain the detection result information representing whether the object to be detected is a living body comprises:
inputting each target image into the pre-trained living body detection model simultaneously;
and performing feature fusion and separation processing multiple times on the image features of each target image through the living body detection model, and outputting the detection result information representing whether the object to be detected is a living body based on the result information of the feature fusion and separation processing.
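Claim 8 does not disclose a concrete architecture; the following PyTorch toy model only illustrates the fuse-then-separate idea: per-modality stems, a 1x1 convolution that fuses the three feature stacks, a split back into branches, and a joint live/spoof head. All layer sizes are invented for the sketch.

import torch
import torch.nn as nn

class FusionLivenessNet(nn.Module):
    def __init__(self):
        super().__init__()
        # One stem per modality: IR (1 ch), depth (1 ch), RGB (3 ch).
        self.stems = nn.ModuleList(
            nn.Sequential(nn.Conv2d(c, 16, 3, 2, 1), nn.ReLU())
            for c in (1, 1, 3))
        self.mix = nn.Conv2d(48, 48, 1)  # feature fusion across modalities
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Conv2d(16, 16, 3, 2, 1), nn.ReLU())
            for _ in range(3))
        self.head = nn.Linear(48, 2)     # living body vs. attack

    def forward(self, ir, depth, rgb):
        feats = [stem(x) for stem, x in zip(self.stems, (ir, depth, rgb))]
        fused = self.mix(torch.cat(feats, dim=1))   # feature fusion
        feats = torch.chunk(fused, 3, dim=1)        # separation into branches
        feats = [blk(f) for blk, f in zip(self.blocks, feats)]
        pooled = torch.cat([f.mean(dim=(2, 3)) for f in feats], dim=1)
        return self.head(pooled)

# Usage: logits = FusionLivenessNet()(ir, depth, rgb), where ir and depth
# are (B, 1, H, W) tensors and rgb is (B, 3, H, W).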
9. The method of claim 1, further comprising:
acquiring a plurality of image sample combinations; each image sample combination comprises an infrared image, a depth image and an RGB image obtained by performing image acquisition on a target object at the same time; and each image in the image sample combination comprises a human face;
and training based on the image sample combinations according to a preset training mode to obtain the living body detection model.
10. The method according to claim 9, wherein the obtaining the living body detection model by performing training processing based on the image sample combinations according to the preset training mode comprises:
labeling the image sample combinations, and preprocessing each image in the image sample combinations according to the preset mode to obtain image sample combinations to be trained;
dividing the image sample combination to be trained into a training set and a testing set;
according to a preset training mode, carrying out training processing based on the training set to obtain an initial model;
and testing the initial model by using the test set, and if the test result information of the test processing meets the preset condition, determining the current initial model as the living body detection model.
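A rough PyTorch sketch of the claim-10 training flow, assuming a dataset that yields ((ir, depth, rgb), label) pairs; the split ratio, optimizer, epoch count, and the 95% accuracy bar standing in for the "preset condition" are all hypothetical.

import torch
from torch.utils.data import DataLoader, random_split

def train_liveness_model(dataset, model, epochs=10, test_ratio=0.2):
    # Divide the labelled image sample combinations into a training set
    # and a test set.
    n_test = int(len(dataset) * test_ratio)
    train_set, test_set = random_split(dataset, [len(dataset) - n_test, n_test])
    loader = DataLoader(train_set, batch_size=32, shuffle=True)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):                      # train the initial model
        for (ir, depth, rgb), y in loader:
            opt.zero_grad()
            loss = loss_fn(model(ir, depth, rgb), y)
            loss.backward()
            opt.step()
    # Test the initial model; accept it as the living body detection
    # model only if it clears the (hypothetical) accuracy threshold.
    correct = total = 0
    with torch.no_grad():
        for (ir, depth, rgb), y in DataLoader(test_set, batch_size=32):
            correct += (model(ir, depth, rgb).argmax(1) == y).sum().item()
            total += y.numel()
    return model if correct / total >= 0.95 else None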
11. A living body detection device, comprising:
a memory for storing a pre-trained living body detection model;
and a processor for acquiring an infrared image, a depth image and an RGB image of an object to be detected; respectively preprocessing the infrared image, the depth image and the RGB image according to a preset mode to obtain corresponding target images; and performing living body detection processing based on mutual feature verification on each target image through the living body detection model to obtain detection result information representing whether the object to be detected is a living body; wherein the infrared image, the depth image and the RGB image are obtained by image acquisition at the same time and each comprise a human face.
12. An electronic device, comprising: a processor, a communication interface, a memory, and a communication bus; wherein the processor, the communication interface and the memory communicate with one another via the communication bus; the memory is used for storing a computer program; and the processor is used for executing the program stored in the memory to perform the steps of the method of any one of claims 1 to 10.
13. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 10.
CN202110753609.6A 2021-07-02 2021-07-02 Living body detection method and device Pending CN113505682A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110753609.6A CN113505682A (en) 2021-07-02 2021-07-02 Living body detection method and device


Publications (1)

Publication Number Publication Date
CN113505682A (en) 2021-10-15

Family

ID=78011131

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110753609.6A Pending CN113505682A (en) 2021-07-02 2021-07-02 Living body detection method and device

Country Status (1)

Country Link
CN (1) CN113505682A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114387658A (en) * 2022-03-24 2022-04-22 浪潮云信息技术股份公司 Image target attribute detection method, device, equipment and storage medium
WO2023098128A1 (en) * 2021-12-01 2023-06-08 马上消费金融股份有限公司 Living body detection method and apparatus, and training method and apparatus for living body detection system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107590430A (en) * 2017-07-26 2018-01-16 百度在线网络技术(北京)有限公司 Biopsy method, device, equipment and storage medium
US20190347823A1 (en) * 2018-05-10 2019-11-14 Beijing Sensetime Technology Development Co., Ltd Method and apparatus for detecting living body, system, electronic device, and storage medium
CN111104917A (en) * 2019-12-24 2020-05-05 杭州魔点科技有限公司 Face-based living body detection method and device, electronic equipment and medium
CN112434546A (en) * 2019-08-26 2021-03-02 杭州魔点科技有限公司 Face living body detection method and device, equipment and storage medium
CN112487922A (en) * 2020-11-25 2021-03-12 奥比中光科技集团股份有限公司 Multi-mode face in-vivo detection method and system
CN112699811A (en) * 2020-12-31 2021-04-23 中国联合网络通信集团有限公司 Living body detection method, apparatus, device, storage medium, and program product
CN113052142A (en) * 2021-04-26 2021-06-29 的卢技术有限公司 Silence in-vivo detection method based on multi-modal data


Similar Documents

Publication Publication Date Title
CN109711243B (en) Static three-dimensional face in-vivo detection method based on deep learning
TWI766201B (en) Methods and devices for biological testing and storage medium thereof
CN106650662B (en) Target object shielding detection method and device
TW202006602A (en) Three-dimensional living-body face detection method, face authentication recognition method, and apparatuses
CN111950424B (en) Video data processing method and device, computer and readable storage medium
CN109086734B (en) Method and device for positioning pupil image in human eye image
CN110428399B (en) Method, apparatus, device and storage medium for detecting image
CN111862035B (en) Training method of light spot detection model, light spot detection method, device and medium
CN111626163B (en) Human face living body detection method and device and computer equipment
CN113505682A (en) Living body detection method and device
CN110781770B (en) Living body detection method, device and equipment based on face recognition
CN111079816A (en) Image auditing method and device and server
CN112417970A (en) Target object identification method, device and electronic system
CN113160231A (en) Sample generation method, sample generation device and electronic equipment
CN112052907A (en) Target detection method and device based on image edge information and storage medium
CN110688878B (en) Living body identification detection method, living body identification detection device, living body identification detection medium, and electronic device
CN107563257B (en) Video understanding method and device
CN104598900A (en) Human body recognition method and device
CN111091089B (en) Face image processing method and device, electronic equipment and storage medium
US8538142B2 (en) Face-detection processing methods, image processing devices, and articles of manufacture
CN109376585B (en) Face recognition auxiliary method, face recognition method and terminal equipment
CN113516089B (en) Face image recognition method, device, equipment and readable storage medium
CN108694347B (en) Image processing method and device
JP7298709B2 (en) Parameter determination device, parameter determination method and recording medium
JP2018120320A (en) Image processing system, image processing method and image processing program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination