CN111401348A - Living body detection method and system for target object - Google Patents


Publication number: CN111401348A (application CN202010504692.9A; granted publication CN111401348B)
Authority
CN
China
Prior art keywords: attack type, image, target object, model, training
Prior art date
Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number: CN202010504692.9A
Other languages: Chinese (zh)
Other versions: CN111401348B (en)
Inventors: 曹佳炯, 李亮
Current Assignee: Alipay Hangzhou Information Technology Co Ltd
Original Assignee: Alipay Hangzhou Information Technology Co Ltd
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010504692.9A
Publication of CN111401348A
Application granted
Publication of CN111401348B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/40: Spoof detection, e.g. liveness detection
    • G06V 40/45: Detection of the body part being alive
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415: Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Abstract

Embodiments of this specification disclose a living body detection method and system for a target object. The method includes: imaging a target object to be detected to generate an image of at least one modality; recognizing the image of the at least one modality based on an attack type recognition model, and determining a prejudged attack type of the target object to be detected corresponding to the image of the at least one modality, where the prejudged attack type is an element of an attack type set; determining a living body detection model associated with the prejudged attack type; and inputting the image of at least one modality of the target object to be detected into the associated living body detection model for detection, and determining whether the target object to be detected is a living body.

Description

Living body detection method and system for target object
Technical Field
The embodiments of this specification relate to the technical field of data processing, and in particular to a living body detection method and system for a target object.
Background
With the wide application of identity recognition technologies such as face recognition, palm print recognition, and fingerprint recognition in various fields (for example, unlocking doors, unlocking mobile phones, and making online payments), their security is receiving more and more attention.
At present, many lawbreakers forge living bodies to pass identity recognition, and after the identity is successfully recognized they carry out behaviors that endanger property, persons, and the public. Here, a living body refers to a living object, such as a living animal, plant, or human body and its tissues. However, detecting living bodies and defending against such forgeries can be costly.
Therefore, in order to ensure the security of identity recognition while reducing cost and improving recognition efficiency, this specification provides a method and system for living body detection of a target object.
Disclosure of Invention
An aspect of the embodiments of this specification provides a living body detection method of a target object, the method including: imaging a target object to be detected to generate an image of at least one modality; recognizing the image of the at least one modality based on an attack type recognition model, and determining a prejudged attack type of the target object to be detected corresponding to the image of the at least one modality, where the prejudged attack type is an element of an attack type set; determining a living body detection model associated with the prejudged attack type; and inputting the image of at least one modality of the target object to be detected into the associated living body detection model for detection, and determining whether the target object to be detected is a living body.
An aspect of the embodiments of this specification provides a living body detection system of a target object, the system including: an imaging module for imaging a target object to be detected and generating an image of at least one modality; a prejudged attack type identification module for recognizing the image of the at least one modality based on an attack type recognition model and determining a prejudged attack type of the target object to be detected corresponding to the image of the at least one modality, where the prejudged attack type is an element of an attack type set; a living body detection model determination module for determining a living body detection model associated with the prejudged attack type; and a living body judgment module for inputting the image of at least one modality of the target object to be detected into the associated living body detection model for detection and determining whether the target object to be detected is a living body.
One aspect of embodiments of the present specification provides a liveness detection device of a target object, comprising at least one storage medium for storing computer instructions and at least one processor; the at least one processor is configured to execute the computer instructions to implement a liveness detection method of a target object.
Drawings
The present description is further illustrated by exemplary embodiments, which are described in detail with reference to the accompanying drawings. These embodiments are not intended to be limiting, and in these embodiments like numerals indicate like structures, wherein:
FIG. 1 is a block diagram of a living body detection system of a target object shown in accordance with some embodiments of the present description;
FIG. 2 is a flow diagram of a living body detection method of a target object, shown in accordance with some embodiments of the present description;
FIG. 3 is a flow diagram of a method of training an attack type recognition model according to some embodiments shown in the present description;
FIG. 4 is a flow diagram of a method of training a liveness detection model in accordance with certain embodiments of the present description;
FIG. 5 is a flow chart illustrating step 204 according to some embodiments of the present description.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings used in the description of the embodiments will be briefly described below. It is obvious that the drawings in the following description are only examples or embodiments of the present description, and that for a person skilled in the art, the present description can also be applied to other similar scenarios on the basis of these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "system", "device", "unit" and/or "module" as used in this specification is a method for distinguishing different components, elements, parts or assemblies at different levels. However, other words may be substituted by other expressions if they accomplish the same purpose.
As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; the steps and elements do not form an exclusive list, and a method or apparatus may include other steps or elements.
Flow charts are used in this description to illustrate operations performed by a system according to embodiments of the present description. It should be understood that the operations are not necessarily performed in the exact order shown. Rather, the various steps may be processed in reverse order or simultaneously. Meanwhile, other operations may be added to the processes, or one or more steps may be removed from them.
FIG. 1 is a block diagram of a living body detection system of a target object shown in accordance with some embodiments of the present description.
The living body detection system 100 of the target object may be used in an Internet service platform. In some embodiments, the system 100 may be used in an online service platform that includes an online identity authentication system, for example, a face-scan payment platform, a face-unlock platform, a face-recognition vending platform, and the like.
As shown in fig. 1, the living body detection system 100 of the target object may include an imaging module 110, a prejudged attack type recognition module 120, a living body detection model determination module 130, a living body judgment module 140, and a training module 150.
The imaging module 110 may be configured to image a target object to be detected, and generate an image of at least one modality.
The prejudged attack type identification module 120 may be configured to recognize the image of the at least one modality based on an attack type recognition model, and determine a prejudged attack type of the target object to be detected corresponding to the image of the at least one modality, where the prejudged attack type is an element of an attack type set. The attack type set may consist at least of the labels of the training samples used for training the attack type recognition model; for the training samples, see fig. 3 and its associated description. In some embodiments, elements of the attack type set may include, but are not limited to: screen attack, printing paper attack, 3D paper mask attack, 3D silicone mask attack, and undefined attack type. Correspondingly, the prejudged attack type may be a screen attack, a printing paper attack, a 3D paper mask attack, a 3D silicone mask attack, an undefined attack type, or the like.
In some embodiments, the prejudged attack type identification module 120 may be further configured to: obtain an image of at least one target modality from the image of the at least one modality, where the at least one target modality matches the modality of the sample images used for training the attack type recognition model; input the image of the at least one target modality into the corresponding attack type recognition model and obtain at least one probability distribution, where each value in a probability distribution represents the probability that the target object to be detected belongs to an element of the attack type set; and determine the prejudged attack type of the target object to be detected based on the at least one probability distribution.
In some embodiments, the prejudged attack type identification module 120 may be further configured to: perform a fusion operation and/or voting on the at least one probability distribution, and determine the prejudged attack type of the target object to be detected based on the result of the fusion operation and/or the result of the voting.
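The fusion and voting operations on per-modality probability distributions can be sketched as follows. This is a minimal illustration, not the specification's implementation: the attack type names, the two-modality example, and the tie-breaking rule are assumptions for the sketch.

```python
import numpy as np

# Attack type set assumed for illustration; the element names in a real system may differ.
ATTACK_TYPES = ["screen", "printed_paper", "3d_paper_mask", "3d_silicone_mask", "unclear"]

def fuse_by_average(distributions):
    """Fusion: average the per-modality probability distributions, then take the argmax class."""
    avg = np.mean(np.asarray(distributions, dtype=float), axis=0)
    return ATTACK_TYPES[int(np.argmax(avg))]

def fuse_by_voting(distributions):
    """Voting: each modality votes for its argmax class; the majority class wins
    (ties broken by list order via np.argmax)."""
    votes = [int(np.argmax(d)) for d in distributions]
    counts = np.bincount(votes, minlength=len(ATTACK_TYPES))
    return ATTACK_TYPES[int(np.argmax(counts))]

# Example: probability distributions output by a hypothetical RGB model and IR model.
rgb = [0.70, 0.10, 0.10, 0.05, 0.05]
ir = [0.40, 0.35, 0.10, 0.10, 0.05]
print(fuse_by_average([rgb, ir]))  # both modalities lean toward "screen"
print(fuse_by_voting([rgb, ir]))
```

Averaging uses the full distributions, so a confident model can outweigh an uncertain one, while voting only counts each modality's top class; which combination suits a deployment depends on how well the per-modality models are calibrated.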
The liveness detection model determination module 130 may be configured to determine a liveness detection model associated with the type of prejudged attack.
The living body judgment module 140 may be configured to input an image of at least one modality of the target object to be detected into the associated living body detection model for detection, and determine whether the target object to be detected is a living body.
In some embodiments, the in-vivo detection system 100 of the target object further includes a training module 150. In some embodiments, the training module 150 may be configured to: obtaining a plurality of first training samples, wherein the first training samples comprise sample images and first labels, the first labels represent attack types of sample target objects corresponding to the sample images, and the attack type set at least comprises the first labels; sample images in the plurality of first training samples correspond to the same modality; and training to obtain the models in the attack type recognition model set based on the plurality of first training samples.
In some embodiments, the training module 150 may also be configured to: acquiring a plurality of second training samples, wherein the second training samples comprise sample images and second labels, the second labels represent whether sample target objects corresponding to the sample images are living bodies, and the sample images of the plurality of second training samples correspond to the same modality; and training to obtain a model in the living body detection model set based on the plurality of second training samples.
In some embodiments, the training module 150 may also be configured to preprocess the sample images before training, the preprocessing at least including: cropping the sample image; and/or adjusting the sample image or the cropped sample image to a preset resolution.
It should be understood that the system and its modules shown in FIG. 1 may be implemented in a variety of ways. For example, in some embodiments, the system and its modules may be implemented in hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portion may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer-executable instructions and/or processor control code, such code being provided, for example, on a carrier medium such as a diskette, CD- or DVD-ROM, a programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The system and its modules in this specification may be implemented not only by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices, but also by software executed by various types of processors, or by a combination of the above hardware circuits and software (e.g., firmware).
It should be noted that the above description of the living body detection system 100 of the target object and its modules is only for convenience of description and does not limit the present specification to the scope of the illustrated embodiments. It will be appreciated by those skilled in the art that, given an understanding of the principle of the system, the modules may be combined arbitrarily or connected to other modules as sub-systems without departing from this principle. For example, the imaging module 110, the prejudged attack type identification module 120, the living body detection model determination module 130, the living body judgment module 140, and the training module 150 disclosed in fig. 1 may be different modules in one system, or one module may implement the functions of two or more of them. For another example, the modules of the living body detection system 100 may share one storage module, or each module may have its own storage module. Such variations are within the scope of the present disclosure.
Fig. 2 is a flow diagram of a living body detection method of a target object, shown in accordance with some embodiments of the present description. As shown in fig. 2, the method 200 may include:
step 202, imaging a target object to be detected, and generating an image of at least one modality. In some embodiments, this step 202 may be performed by imaging module 110.
A living body is an object having life. By way of example, living bodies may include, but are not limited to: a face, a palm print, a fingerprint, and the like.
An attack refers to using an inanimate object to forge an object with life; attacks can be divided into various types. By way of example, the attack types may include, but are not limited to, screen attacks, printing paper attacks, 3D paper mask attacks, 3D silicone mask attacks, and the like.
The target object to be detected may be any object for which it needs to be determined whether it is a living body. It can be understood that the target object to be detected may be a living body or an attack. For example, the target object to be detected may be a human face, a photograph of a human face, a mask, or the like.
In some embodiments, the image of at least one modality may be an image generated by imaging the target object to be detected under a corresponding imaging mode. In some embodiments, the imaging module 110 may image the target object to be detected with different imaging devices (i.e., devices that image based on different imaging modes) to generate images of the corresponding modalities. The modality of an image corresponds to the imaging mode of the imaging device; imaging modes include, but are not limited to, RGB (Red, Green, Blue) imaging, IR (Infrared) imaging, 3D (three-dimensional) imaging, and the like, and the image of at least one modality includes, but is not limited to, an RGB image, an IR image, a 3D image, and the like.
Step 204, recognizing the image of the at least one modality based on an attack type recognition model, and determining a prejudged attack type of the target object to be detected corresponding to the image of the at least one modality, where the prejudged attack type is an element of an attack type set. In some embodiments, this step 204 may be performed by the prejudged attack type identification module 120.
The attack type set contains all possible attack types. In some embodiments, elements of the attack type set may include, but are not limited to: screen attack, printing paper attack, 3D paper mask attack, 3D silicone mask attack, and undefined attack type.
In some embodiments, the attack type set may consist at least of the labels of the training samples used for training the attack type recognition model; for the training samples, see fig. 3 and its associated description. For example, if the training samples include three kinds of labels, namely attack type 1, attack type 2, and attack type 3, then the attack type set consists of at least these three labels, which may be part or all of the elements in the set. The attack type set may also contain customized content other than labels, e.g., an undefined attack type.
In some embodiments, the attack type recognition model may be a machine learning model, e.g., a machine learning model trained in advance. For example, the attack type recognition model may be a model supporting multi-label classification built on a lightweight network structure (e.g., ShuffleNet or MobileNet).
In some embodiments, images of different modalities collected in advance may be used as training samples to train attack type recognition models of the corresponding modalities, so that attack type recognition models corresponding to different modalities are obtained and together form an attack type recognition model set. Continuing the example above, the attack type recognition model set may include: an RGB attack type recognition model obtained by RGB image training, an IR attack type recognition model obtained by IR image training, and a 3D attack type recognition model obtained by 3D image training. For the training process of any model in the attack type recognition model set, refer to fig. 3 and its related description, which are not repeated here.
In some embodiments, the attack type recognition model set may further include a model corresponding to all modalities, trained based on sample images of all modalities. For the training process, see fig. 3 and its related description, which are not repeated here.
In some embodiments, the attack type recognition model may be one model in the attack type recognition model set. When the model corresponds to a specific modality, the image of the corresponding modality among the images of at least one modality is input into the model to determine the prejudged attack type of the target object to be detected. When the model corresponds to all modalities, any one of the images of at least one modality is input into the model to determine the prejudged attack type of the target object to be detected. Refer to fig. 5 and its related description, which are not repeated here.
In some embodiments, the attack type recognition model may be multiple models in the attack type recognition model set, corresponding to different modalities. For example, the multiple models may be the IR and RGB attack type recognition models. When determining the prejudged attack type, each image is input into the model of its corresponding modality. For example, the IR image among the images of at least one modality is input into the IR attack type recognition model and the RGB image into the RGB attack type recognition model, and the prejudged attack type is determined based on the output results of the two models. Refer to fig. 5 and its related description, which are not repeated here.
In order to reduce the hardware requirements of the system 100, reduce cost, and reduce the overall detection time, the attack type recognition model is preferably one model in the attack type recognition model set; i.e., the system 100 determines the prejudged attack type of the target object based on only one model.
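A minimal sketch of this single-model variant follows; the class names are assumptions carried over from earlier examples, and `rgb_model` is a dummy stand-in for one trained model in the set (a real model would be, e.g., a lightweight ShuffleNet/MobileNet classifier).

```python
import numpy as np

# Hypothetical attack type set; names are illustrative.
ATTACK_TYPES = ["screen", "printed_paper", "3d_paper_mask", "3d_silicone_mask", "unclear"]

def rgb_model(image):
    # Dummy stand-in for a trained RGB attack type recognition model:
    # returns a fixed probability distribution over the attack type set.
    return np.array([0.1, 0.6, 0.1, 0.1, 0.1])

MODEL_SET = {"rgb": rgb_model}  # one entry per modality in a real system

def prejudge(images_by_modality, chosen_modality="rgb"):
    """Run only the single chosen model from the set, then take the argmax class."""
    probs = MODEL_SET[chosen_modality](images_by_modality[chosen_modality])
    return ATTACK_TYPES[int(np.argmax(probs))]

print(prejudge({"rgb": np.zeros((128, 128, 3))}))  # the dummy model prejudges "printed_paper"
```

Only the chosen modality's image is ever passed through a network, which is what saves hardware and inference time relative to running every model in the set.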
Step 206, determining a live body detection model associated with the prejudged attack type. In some embodiments, this step 206 may be performed by the liveness detection model determination module 130.
In some embodiments, the liveness detection model may be a machine learning model, e.g., a pre-trained machine learning model. The liveness detection model may be a binary classification model, for example a CNN model such as VGG or ResNet18.
In some embodiments, the associated living body detection model may be one or more models in a living body detection model set. In some embodiments, living body detection models of the corresponding modalities may be trained based on pre-acquired images of different modalities, so that living body detection models corresponding to different modalities are obtained and together form a living body detection model set. Continuing the example above, the living body detection model set may include: an RGB living body detection model obtained by RGB image training, an IR living body detection model obtained by IR image training, and a 3D living body detection model obtained by 3D image training. For the training process of any model in the living body detection model set, refer to fig. 4 and its related description, which are not repeated here.
Because the imaging principles of different modalities differ, living body detection models corresponding to different modalities have different detection effects on different attack types, i.e., their accuracy differs under actual attack; in other words, their defense effects against different attack types differ. For example, the IR living body detection model is very effective against attacks such as printed and 3D masks; for another example, the 3D living body detection model is very effective against screen attacks and planar photographs.
In some embodiments, the living body detection model determination module 130 may determine the living body detection model associated with the prejudged attack type based on a preset rule. In some embodiments, the preset rule may be set according to actual requirements; that is, living body detection models of corresponding modalities are set for different attack types to meet different requirements. It can be understood that "associated" refers to the correspondence, determined based on the requirements, between the attack type and the living body detection models of different modalities.
For example, if the actual requirement is low time consumption with relaxed precision: the living body detection model corresponding to (or associated with) a screen attack is the IR living body detection model; that of a printing paper attack is the 3D living body detection model; that of a 3D paper mask attack is the IR living body detection model; that of a 3D silicone mask attack is the RGB living body detection model; and that of an undefined attack type is the 3D living body detection model.
For another example, if the actual requirement is high precision at the cost of more time: the living body detection models corresponding to (or associated with) a screen attack are the IR and 3D living body detection models; those of a printing paper attack are the 3D and IR living body detection models; those of a 3D paper mask attack are the RGB and IR living body detection models; those of a 3D silicone mask attack are the RGB and IR living body detection models; and those of an undefined attack type are the 3D and IR living body detection models.
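The preset rules in the two examples above can be written as simple lookup tables from prejudged attack type to the modalities of the associated living body detection models. The dictionaries below merely transcribe those examples; the key and modality names are illustrative, not normative.

```python
# Rule for the low-time-consumption example: one associated model per attack type.
LOW_LATENCY_RULE = {
    "screen": ["ir"],
    "printed_paper": ["3d"],
    "3d_paper_mask": ["ir"],
    "3d_silicone_mask": ["rgb"],
    "unclear": ["3d"],
}

# Rule for the high-precision example: two associated models per attack type.
HIGH_ACCURACY_RULE = {
    "screen": ["ir", "3d"],
    "printed_paper": ["3d", "ir"],
    "3d_paper_mask": ["rgb", "ir"],
    "3d_silicone_mask": ["rgb", "ir"],
    "unclear": ["3d", "ir"],
}

def associated_models(prejudged_type, rule):
    """Return the modalities of the living body detection models to run."""
    return rule[prejudged_type]

print(associated_models("screen", HIGH_ACCURACY_RULE))
```

A deployment could select between such tables at configuration time, which keeps the association logic a data change rather than a code change.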
Step 208, inputting the image of at least one modality of the target object to be detected into the associated living body detection model for detection, and determining whether the target object to be detected is a living body. In some embodiments, this step 208 may be performed by the living body judgment module 140.
In some embodiments, the living body judgment module 140 may determine whether the target object to be detected is a living body based on the output results of the associated living body detection models. When there are multiple associated living body detection models, the image of each modality is input into the associated living body detection model corresponding to that modality, which outputs whether the input image is of a living body.
For example, if the prejudged attack type is a screen attack, and the associated living body detection models determined according to the preset rules and the actual requirement are the IR and 3D living body detection models, then the IR image obtained by imaging the target object to be detected is input into the IR living body detection model, which outputs that it is a living body; the 3D image obtained by imaging the target object to be detected is input into the 3D living body detection model, which outputs that it is not a living body. It can thus be determined that the target object to be detected is not a living body.
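The decision logic in the example above, where the target is judged a living body only if every associated model outputs "living body", can be sketched as a conjunction over the per-model results. The all-must-agree rule is one plausible combination strategy inferred from that example, not the only possible one.

```python
def is_live(per_model_results):
    """Combine the outputs of the associated living body detection models.
    per_model_results maps a modality name to that model's boolean verdict.
    If any associated model judges the input not to be a living body,
    the overall result is 'not a living body'."""
    return all(per_model_results.values())

# IR model says living body, 3D model says not a living body -> overall not a living body.
print(is_live({"ir": True, "3d": False}))
print(is_live({"ir": True, "3d": True}))
```

A stricter deployment might instead require a weighted score over model confidences, but the conjunction matches the screen-attack example's outcome.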
As can be seen from the above description, in the embodiments of the present specification the attack type of the target object to be detected is prejudged first, and then only the living body detection model associated with the prejudged attack type is used to detect the target object. On the one hand, this avoids the inaccurate detection results that can arise from using only a single living body detection model; on the other hand, not all models in the living body detection model set need to be started for detection, so the accuracy of the detection result is still ensured while hardware usage, extra computing power, and time cost are reduced, improving the user experience. Moreover, by setting the correspondence between living body detection models and prejudged attack types in advance, the associated living body detection model can be determined as soon as the prejudged attack type is determined, so as to meet different actual requirements.
FIG. 3 is a flow diagram of a method of training an attack type recognition model, shown in accordance with some embodiments of the present description. As shown in fig. 3, the method 300 may include:
step 302, obtaining a plurality of first training samples, where the first training samples include sample images and first labels, where the first labels represent attack types of sample target objects corresponding to the sample images, and the attack type set at least includes the first labels; the sample images in the plurality of first training samples correspond to the same modality. In some embodiments, this step 302 may be performed by training module 150.
The first training samples may be samples for training an attack type recognition model. In some embodiments, the first training sample includes a sample image and a first label.
In some embodiments, the sample images may correspond to the same modality, e.g., both RGB images, etc., and also, e.g., both IR images.
In some embodiments, the sample image may be preprocessed before training, the preprocessing including at least cutting the sample image. For example, the sample image is cut based on the target of the living body detection, i.e., the image region corresponding to the living body in the sample image. For instance, if the detection target is to determine whether the target object to be detected is a living human face, the region corresponding to the human face is cut out of the sample image so that only the face image is retained. In some embodiments, the preprocessing may further include adjusting the sample image or the cut sample image (e.g., a face image) to a preset resolution. In some embodiments, the preset resolution may be set according to actual requirements, for example, 128 × 128.
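The preprocessing described above can be sketched as follows. This is a minimal sketch with assumed inputs: the face box coordinates stand in for the output of an unspecified face detector, and nearest-neighbor resizing stands in for whatever interpolation an actual implementation would use.

```python
import numpy as np

def preprocess(sample_image, face_box, preset_resolution=(128, 128)):
    """Cut the face region out of the sample image, then adjust it
    to the preset resolution via nearest-neighbor resampling."""
    x0, y0, x1, y1 = face_box
    face = sample_image[y0:y1, x0:x1]                      # cut the sample image
    h, w = preset_resolution
    rows = (np.arange(h) * face.shape[0] / h).astype(int)  # nearest source rows
    cols = (np.arange(w) * face.shape[1] / w).astype(int)  # nearest source cols
    return face[rows][:, cols]

# Example: a 480x640 RGB sample image with a hypothetical face box.
image = np.zeros((480, 640, 3), dtype=np.uint8)
face_image = preprocess(image, face_box=(200, 100, 440, 340))
print(face_image.shape)   # (128, 128, 3)
```

In practice the face box would come from a detector and the resizing from an image library; only the cut-then-resize order matters here.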
In some embodiments, the training sample images may include images of all or part of the attack types, and accordingly, the first label includes all or part of the attack types.
In some embodiments, the first label may label an attack type of a sample target object corresponding to a sample image in the first training sample. In some embodiments, the first label may be obtained by manual labeling or automatic labeling.
In some embodiments, training module 150 may obtain the first training sample by reading stored data, calling an associated interface, or otherwise.
And step 304, training to obtain a model in the attack type identification model set based on the plurality of first training samples. In some embodiments, this step 304 may be performed by training module 150.
In some embodiments, the training module 150 may train the models in the attack type identification model set based on the plurality of first training samples. As previously described, sample images of different modalities may be used to train different attack type recognition models; for example, RGB images are used to train the RGB attack type recognition model. Specifically, for each model in the attack type identification model set, the parameters of an initial attack type identification model may be updated based on the plurality of first training samples until the loss function of the model satisfies a preset condition, for example, the loss function converges or its value is smaller than a preset value. When the loss function satisfies the preset condition, training ends and a trained attack type recognition model is obtained.
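The update-until-the-loss-meets-a-preset-condition loop above can be sketched as follows. The patent fixes no architecture, so a plain softmax classifier over synthetic features stands in for the attack type recognition model; the feature dimensions, learning rate, and loss threshold are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, n_types = 200, 16, 4      # 4 attack types (first labels)

X = rng.normal(size=(n_samples, n_features))     # stand-in for sample images
W_true = rng.normal(size=(n_features, n_types))
y = (X @ W_true).argmax(axis=1)                  # synthetic first labels

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

W = np.zeros((n_features, n_types))              # initial model parameters
preset_value = 0.3                               # preset loss threshold
for step in range(20000):
    probs = softmax(X @ W)
    loss = -np.log(probs[np.arange(n_samples), y]).mean()
    if loss < preset_value:                      # loss meets the preset condition
        break                                    # -> model training finishes
    grad = X.T @ (probs - np.eye(n_types)[y]) / n_samples
    W -= 1.0 * grad                              # update the model parameters
```

A real implementation would use a deep network and mini-batch optimization, but the stopping rule is the same: iterate parameter updates until the loss function satisfies the preset condition.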
It is understood that by performing the operations of steps 302 and 304 for any one modality, an attack type identification model corresponding to that modality can be obtained: for example, the RGB attack type identification model corresponding to the RGB modality, the IR attack type identification model corresponding to the IR modality, and the 3D attack type identification model corresponding to the 3D modality.
As described in step 204, the attack type identification model may also be a single model corresponding to all modalities, trained based on sample images of all the modalities. Compared with an attack type identification model corresponding to a specific modality, the attack type identification model corresponding to all modalities has higher accuracy when the pre-judged attack type of the target object to be detected is determined based on an image of a specific modality.
FIG. 4 is a flow diagram of a method of training a liveness detection model in accordance with some embodiments presented herein. As shown in fig. 4, the method 400 may include:
step 402, obtaining a plurality of second training samples, where the second training samples include sample images and second labels, where the second labels represent whether sample target objects corresponding to the sample images are living bodies, and the sample images of the plurality of second training samples correspond to the same modality. In some embodiments, this step 402 may be performed by training module 150.
The second training sample may be a sample for training a living body detection model. In some embodiments, the second training sample includes a sample image and a second label.
In some embodiments, similar to the training attack type recognition model described in fig. 3, the sample images may correspond to the same modality, e.g., both are IR images, etc.
In some embodiments, similar to the training attack type recognition model described in fig. 3, the sample image may be preprocessed before training, and further details regarding preprocessing are referred to in step 302, which are not described herein again.
In some embodiments, the resolution of the sample images used to train the living body detection model may be higher than that of the sample images used to train the attack type recognition model. For example, if the resolution of the sample images in the first training samples is set to 128 × 128 as described above, the resolution of the sample images in the second training samples may be set to 256 × 256. This is because the higher the resolution of the sample image, the better the model can learn the characteristics of the image and the better its detection effect, but the slower its processing speed. The attack type recognition model serves only as an intermediate decision step; the final result of whether the target object to be detected is a living body must still be determined by the living body detection model. Therefore, the resolution of the sample images in the first training samples is set lower, which improves speed.
In some embodiments, the second label may represent whether a sample target object corresponding to a sample image in the second training sample is a living body. In some embodiments, the second label may be obtained by manual labeling or automatic labeling.
In some embodiments, training module 150 may obtain the second training sample by reading stored data, calling an associated interface, or otherwise.
In some embodiments, when training the living body detection models corresponding to different modalities, the proportion of sample images corresponding to certain attack types may be increased based on the main defense objects of each model (i.e., as described in step 206, the living body detection models corresponding to different modalities have different defense effects against different attack types), so as to improve the accuracy of model detection. For example, when the RGB living body detection model is trained, the proportion of sample images corresponding to 3D paper mask attacks and 3D silica gel mask attacks is increased. For another example, when the IR living body detection model is trained, the proportion of sample images corresponding to printing paper attacks and 3D paper mask attacks is increased. For another example, when the 3D living body detection model is trained, the proportion of sample images corresponding to screen attacks and printing paper attacks is increased.
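The proportion increase described above can be sketched as simple oversampling. The per-modality weights and attack type names below are illustrative assumptions, not values fixed by the embodiments.

```python
# Hypothetical main defense objects per modality, following the examples above:
# samples of the emphasized attack types are duplicated with a weight > 1.
DEFENSE_WEIGHTS = {
    "RGB": {"3d_paper_mask": 3, "3d_silica_gel_mask": 3},
    "IR":  {"printing_paper": 3, "3d_paper_mask": 3},
    "3D":  {"screen": 3, "printing_paper": 3},
}

def rebalance(samples, modality):
    """Repeat samples of the emphasized attack types; others appear once."""
    weights = DEFENSE_WEIGHTS[modality]
    out = []
    for image, attack_type in samples:
        out.extend([(image, attack_type)] * weights.get(attack_type, 1))
    return out

samples = [("img0", "screen"), ("img1", "3d_paper_mask"), ("img2", "live")]
rebalanced = rebalance(samples, "RGB")
print(len(rebalanced))   # 1 + 3 + 1 = 5
```

Weighted random sampling during batching would achieve the same effect without duplicating data on disk.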
And 404, training to obtain a model in the living body detection model set based on the plurality of second training samples. In some embodiments, this step 404 may be performed by training module 150.
In some embodiments, the training module may train the models in the living body detection model set based on the plurality of second training samples. As previously described, sample images of different modalities may be used to train different living body detection models; for example, RGB images are used to train the RGB living body detection model. Specifically, for each model in the living body detection model set, the parameters of an initial living body detection model may be updated based on the plurality of second training samples until the loss function of the model satisfies a preset condition, for example, the loss function converges or its value is smaller than a preset value. When the loss function satisfies the preset condition, training ends and a trained living body detection model is obtained.
Fig. 5 is a flow chart illustrating step 204 according to some embodiments of the present description. As shown in fig. 5, the flow 500 of step 204 may include:
step 2041, obtaining an image of at least one target modality from the image of at least one modality, where the at least one target modality is matched with a modality of a sample image used for training the attack type recognition model. In some embodiments, this step 2041 may be performed by look-ahead attack type identification module 120.
In some embodiments, the image of the target modality may be an image, selected from the images of the at least one modality, whose modality matches the modality of the sample images used to train the attack type identification model. As previously mentioned, the attack type identification model may be one or more models of the attack type identification model set. If there is one model, there is one image of the target modality; if there are multiple models, there are multiple images of target modalities, and each image is input into its corresponding model. For example, if the attack type identification model is only the 3D attack type identification model in the attack type identification model set, the image of the target modality is the 3D image obtained by imaging the target object to be detected with the 3D imaging device. For another example, if the attack type recognition models are the 3D attack type recognition model and the IR attack type recognition model in the set, there are two images of target modalities: the 3D image obtained by the 3D imaging device and the IR image obtained by the IR imaging device; the 3D image is input into the 3D attack type identification model, and the IR image is input into the IR attack type identification model.
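The matching of captured images to the models in use can be sketched as a simple lookup. The names below are illustrative assumptions, not identifiers from the embodiments.

```python
# Images of at least one modality captured in step 202 (hypothetical names).
captured_images = {"RGB": "rgb_image", "IR": "ir_image", "3D": "3d_image"}

# Attack type identification models currently in use (hypothetical names).
models_in_use = {"3D": "3d_attack_model", "IR": "ir_attack_model"}

# Each image of a target modality is paired with its corresponding model;
# modalities without a model in use are simply not selected.
pairs = {m: (captured_images[m], models_in_use[m]) for m in models_in_use}
print(sorted(pairs))   # ['3D', 'IR']
```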
Step 2042, inputting the image of the at least one target modality into the corresponding attack type identification model, and obtaining at least one probability distribution, where values in the probability distribution represent the probability that the target object to be detected belongs to elements in the attack type set. In some embodiments, this step 2042 may be performed by look-ahead attack type identification module 120.
In some embodiments, the probability distribution may be the output obtained when the attack type recognition model processes the input image of the target modality. In some embodiments, the values in the probability distribution may represent the probabilities that the target object to be detected belongs to the elements in the attack type set. As mentioned above, the attack type set may be formed at least by the labels of the training samples used to train the attack type recognition model, and the attack type set includes at least the first labels. For example, if the attack type recognition model outputs a probability distribution of the form (X, Y, Z, W) and the attack type set includes 4 elements, namely screen attack, printing paper attack, 3D paper mask attack, and 3D silica gel mask attack, then X may represent the probability that the target object to be detected belongs to a screen attack; Y, the probability that it belongs to a printing paper attack; Z, the probability that it belongs to a 3D paper mask attack; and W, the probability that it belongs to a 3D silica gel mask attack. For another example, the attack type recognition model may still output a probability distribution of the form (X, Y, Z, W) while the attack type set also includes an unclear attack type in addition to the aforementioned 4 attack types; X, Y, Z, and W have the same meanings as before, and whether the pre-judged attack type is the unclear attack type can be determined from the values of X, Y, Z, and W, for example, by judging whether the maximum of X, Y, Z, and W is greater than a preset threshold; if not, the pre-judged attack type of the target object to be detected is the unclear attack type.
As mentioned above, the attack type identification model may be one or more models of the attack type identification model set; if there is one model, only one probability distribution is obtained, and if there are multiple models, multiple probability distributions are obtained. Continuing the above example, if the attack type identification models are two models in the set, the 3D attack type identification model and the IR attack type identification model, two probability distributions are obtained: the one obtained by inputting the 3D image into the 3D attack type identification model, and the one obtained by inputting the IR image into the IR attack type identification model.
Step 2043, determining the pre-judging attack type of the target object to be detected based on the at least one probability distribution. In some embodiments, this step 2043 may be performed by look-ahead attack type identification module 120.
In some embodiments, if the at least one probability distribution is a single probability distribution, the prejudgment attack type identification module 120 may determine the pre-judged attack type of the target object to be detected based on that distribution. For example, the attack type corresponding to the highest probability may be used as the pre-judged attack type. For another example, it can be judged whether the maximum probability value in the distribution is greater than a preset threshold; if so, the attack type corresponding to the maximum probability value is determined as the pre-judged attack type of the target object to be detected; if not, the pre-judged attack type of the target object to be detected is the unclear attack type. In some embodiments, the preset threshold may be set according to actual requirements.
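The single-distribution decision rule above can be sketched as follows; the threshold value and attack type names are illustrative assumptions.

```python
ATTACK_TYPES = ["screen", "printing_paper", "3d_paper_mask", "3d_silica_gel_mask"]

def prejudge(distribution, threshold=0.5):
    """Pick the attack type with the highest probability, falling back to
    the unclear attack type when that probability is not above the threshold."""
    i = max(range(len(distribution)), key=distribution.__getitem__)
    return ATTACK_TYPES[i] if distribution[i] > threshold else "unclear"

print(prejudge([0.7, 0.1, 0.1, 0.1]))   # screen
print(prejudge([0.3, 0.3, 0.2, 0.2]))   # unclear
```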
In some embodiments, if at least one probability distribution is a plurality of probability distributions, the predictive attack type identification module 120 may determine the predictive attack type of the target object to be detected based on the plurality of probability distributions. Specifically, fusion operation or/and voting may be performed on the plurality of probability distributions, and the pre-determined attack type of the target object to be detected is determined based on a result of the fusion operation or/and a result of the voting.
In some embodiments, the fusion operation may include, but is not limited to, any operation that fuses the probability values in different probability distributions, such as summing or averaging. For example, if the probability distribution output by the 3D attack type recognition model is (X1, Y1, Z1, W1) and the probability distribution output by the IR attack type recognition model is (X2, Y2, Z2, W2), averaging the two distributions yields a new probability distribution: [(X1+X2)/2, (Y1+Y2)/2, (Z1+Z2)/2, (W1+W2)/2].
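The averaging fusion above can be sketched directly; the numeric values are illustrative.

```python
def fuse(dists):
    """Element-wise mean of several probability distributions."""
    return [sum(vals) / len(dists) for vals in zip(*dists)]

d_3d = [0.5, 0.25, 0.125, 0.125]   # (X1, Y1, Z1, W1) from the 3D model
d_ir = [0.25, 0.5, 0.125, 0.125]   # (X2, Y2, Z2, W2) from the IR model
print(fuse([d_3d, d_ir]))          # [0.375, 0.375, 0.125, 0.125]
```

Summing instead of averaging changes only the scale, not which attack type has the maximum value.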
In some embodiments, the prejudgment attack type identification module 120 may determine the pre-judged attack type according to the obtained fusion result, for example, by judging whether the maximum probability value in the new probability distribution is greater than a preset threshold; if so, the attack type corresponding to the maximum probability value is determined as the pre-judged attack type of the target object to be detected; if not, the pre-judged attack type of the target object to be detected is the unclear attack type.
In some embodiments, voting on the at least one probability distribution may refer to voting among the pre-judged attack types determined from each probability distribution separately. For example, if the pre-judged attack type determined from the probability distribution output by the 3D attack type identification model is a screen attack, the one determined from the output of the IR attack type identification model is also a screen attack, and the one determined from the output of the RGB attack type identification model is a mask attack, then the screen attack receives 2 votes, and the pre-judged attack type is therefore determined to be a screen attack.
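The majority vote described above can be sketched as follows; the three votes mirror the example in the text.

```python
from collections import Counter

# Pre-judged attack types determined from the 3D, IR, and RGB models' outputs.
votes = ["screen", "screen", "mask"]

# The attack type with the most votes becomes the pre-judged attack type.
winner, count = Counter(votes).most_common(1)[0]
print(winner, count)   # screen 2
```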
Embodiments of the present specification also provide a living body detection device of a target object, comprising at least one storage medium and at least one processor, the at least one storage medium storing computer instructions; the at least one processor is configured to execute the computer instructions to implement the aforementioned living body detection method of a target object, the method comprising: imaging a target object to be detected to generate an image of at least one modality; identifying the image of the at least one modality based on an attack type identification model, and determining a pre-judged attack type of the target object to be detected corresponding to the image of the at least one modality, wherein the pre-judged attack type is an element in an attack type set; determining a living body detection model associated with the pre-judged attack type; and inputting the image of at least one modality of the target object to be detected into the associated living body detection model for detection, and determining whether the target object to be detected is a living body.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be regarded as illustrative only and not as limiting the present specification. Various modifications, improvements and adaptations to the present description may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present specification and thus fall within the spirit and scope of the exemplary embodiments of the present specification.
Also, the description uses specific words to describe embodiments of the description. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the specification is included. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the specification may be combined as appropriate.
Moreover, those skilled in the art will appreciate that aspects of the present description may be illustrated and described in terms of several patentable species or situations, including any new and useful combination of processes, machines, manufacture, or materials, or any new and useful improvement thereof. Accordingly, aspects of this description may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.), or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present description may be represented as a computer product, including computer readable program code, embodied in one or more computer readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for operation of portions of the present description may be written in any one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, or Python; a conventional procedural programming language such as C, Visual Basic, Fortran 2003, Perl, COBOL, PHP, or ABAP; a dynamic programming language such as Python, Ruby, or Groovy; or other programming languages.
Additionally, the order in which the elements and sequences of the process are recited in the specification, the use of alphanumeric characters, or other designations, is not intended to limit the order in which the processes and methods of the specification occur, unless otherwise specified in the claims. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing processing device or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the present specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not intended to imply that the embodiments require more features than are expressly recited in each claim. Indeed, an embodiment may have fewer than all of the features of a single embodiment disclosed above.
Numerals describing the number of components, attributes, etc. are used in some embodiments, it being understood that such numerals used in the description of the embodiments are modified in some instances by the use of the modifier "about", "approximately" or "substantially". Unless otherwise indicated, "about", "approximately" or "substantially" indicates that the number allows a variation of ± 20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, the numerical parameter should take into account the specified significant digits and employ a general digit preserving approach. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the range are approximations, in the specific examples, such numerical values are set forth as precisely as possible within the scope of the application.
For each patent, patent application publication, and other material, such as articles, books, specifications, publications, documents, etc., cited in this specification, the entire contents of each are hereby incorporated by reference into this specification. Except where the application history document does not conform to or conflict with the contents of the present specification, it is to be understood that the application history document, as used herein in the present specification or appended claims, is intended to define the broadest scope of the present specification (whether presently or later in the specification) rather than the broadest scope of the present specification. It is to be understood that the descriptions, definitions and/or uses of terms in the accompanying materials of this specification shall control if they are inconsistent or contrary to the descriptions and/or uses of terms in this specification.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present disclosure. Other variations are also possible within the scope of the present description. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the specification can be considered consistent with the teachings of the specification. Accordingly, the embodiments of the present description are not limited to only those embodiments explicitly described and depicted herein.

Claims (13)

1. A living body detection method of a target object, comprising:
imaging a target object to be detected to generate an image of at least one modality;
identifying the image of the at least one mode based on an attack type identification model, and determining a prejudgment attack type of the target object to be detected corresponding to the image of the at least one mode, wherein the prejudgment attack type is an element in an attack type set;
determining a live body detection model associated with the prejudged attack type;
and inputting the image of at least one mode of the target object to be detected into the associated living body detection model for detection, and determining whether the target object to be detected is a living body.
2. The method of claim 1, wherein the attack type recognition model is one or more models in a set of attack type recognition models, and training any one model in the set of attack type recognition models comprises:
obtaining a plurality of first training samples, wherein the first training samples comprise sample images and first labels, the first labels represent attack types of sample target objects corresponding to the sample images, and the attack type set at least comprises the first labels; sample images in the plurality of first training samples correspond to the same modality;
and training to obtain the models in the attack type recognition model set based on the plurality of first training samples.
3. The method of claim 1, the associated in-vivo detection model being one or more models of a set of in-vivo detection models, training any one of the set of in-vivo detection models comprising:
acquiring a plurality of second training samples, wherein the second training samples comprise sample images and second labels, the second labels represent whether sample target objects corresponding to the sample images are living bodies, and the sample images of the plurality of second training samples correspond to the same modality;
and training to obtain a model in the living body detection model set based on the plurality of second training samples.
4. The method according to claim 2, wherein the identifying the image of the at least one modality based on the attack type identification model, and determining the prejudged attack type of the target object to be detected corresponding to the image of the at least one modality comprises:
obtaining an image of at least one target modality from the image of the at least one modality, wherein the at least one target modality is matched with a modality of a sample image used for training the attack type recognition model;
inputting the image of the at least one target modality into the corresponding attack type identification model, and acquiring at least one probability distribution, wherein a value in the probability distribution represents the probability that the target object to be detected belongs to an element in the attack type set;
and determining the pre-judging attack type of the target object to be detected based on the at least one probability distribution.
5. The method of claim 4, wherein the attack type recognition models are a plurality of models in an attack type recognition model set, and the determining the prejudged attack type of the target object to be detected based on the at least one probability distribution comprises:
and performing fusion operation or/and voting on the at least one probability distribution, and determining the pre-judging attack type of the target object to be detected based on the result of the fusion operation or/and the result of the voting.
6. The method of claim 2 or 3, the sample images being pre-processed prior to training, the pre-processing comprising at least:
cutting the sample image; or/and
and adjusting the sample image or the cut sample image to a preset resolution.
7. A living body detecting system of a target object, comprising:
the imaging module is used for imaging a target object to be detected and generating an image of at least one modality;
the prejudgment attack type identification module is used for identifying the image of the at least one mode based on an attack type identification model and determining a prejudgment attack type of the target object to be detected corresponding to the image of the at least one mode, wherein the prejudgment attack type is an element in an attack type set;
a live detection model determination module for determining a live detection model associated with the prejudged attack type;
and the living body judgment module is used for inputting the image of at least one mode of the target object to be detected into the associated living body detection model for detection and determining whether the target object to be detected is a living body.
8. The system of claim 7, the attack type recognition model being one or more models of a set of attack type recognition models, the system further comprising a training module to:
obtaining a plurality of first training samples, wherein the first training samples comprise sample images and first labels, the first labels represent attack types of sample target objects corresponding to the sample images, and the attack type set at least comprises the first labels; sample images in the plurality of first training samples correspond to the same modality;
and training to obtain the models in the attack type recognition model set based on the plurality of first training samples.
9. The system of claim 7, the associated liveness detection model being one or more models of a set of liveness detection models, the training module further to:
acquiring a plurality of second training samples, wherein the second training samples comprise sample images and second labels, the second labels represent whether sample target objects corresponding to the sample images are living bodies, and the sample images of the plurality of second training samples correspond to the same modality;
and training to obtain a model in the living body detection model set based on the plurality of second training samples.
10. The system of claim 8, the look-ahead attack type identification module further to:
obtaining an image of at least one target modality from the image of the at least one modality, wherein the at least one target modality is matched with a modality of a sample image used for training the attack type recognition model;
inputting the image of the at least one target modality into the corresponding attack type identification model, and acquiring at least one probability distribution, wherein a value in the probability distribution represents the probability that the target object to be detected belongs to an element in the attack type set;
and determining the pre-judging attack type of the target object to be detected based on the at least one probability distribution.
11. The system of claim 10, wherein the pre-judging attack type identification module is further configured to:
perform a fusion operation and/or voting on the at least one probability distribution, and determine the pre-judged attack type of the target object to be detected based on the result of the fusion operation and/or the result of the voting.
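The fusion operation and the voting of claim 11 each admit a few-line sketch. Using the element-wise mean as the fusion operation, and fused probabilities as the tie-breaker, are assumptions; the claim does not fix either operator:

```python
import numpy as np
from collections import Counter

def fuse(distributions):
    """Fusion operation: element-wise mean of the probability distributions."""
    return np.mean(distributions, axis=0)

def vote(distributions):
    """Voting: each distribution casts one vote for its most probable class;
    ties between classes are broken by the fused probabilities."""
    counts = Counter(int(np.argmax(d)) for d in distributions)
    ranked = counts.most_common()
    best, top_count = ranked[0]
    tied = [c for c, n in ranked if n == top_count]
    if len(tied) > 1:
        fused = fuse(distributions)
        best = max(tied, key=lambda c: fused[c])
    return best

# Three per-modality distributions over a three-element attack type set.
dists = [np.array([0.1, 0.6, 0.3]),
         np.array([0.2, 0.5, 0.3]),
         np.array([0.4, 0.3, 0.3])]
```

Here two of the three models vote for class 1, so voting and fusion agree; the tie-break path only matters when the per-model argmaxes split evenly.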
12. The system of claim 8 or 9, wherein the training module is further configured to preprocess the sample images before training, the preprocessing at least comprising:
cropping the sample image; and/or
adjusting the sample image or the cropped sample image to a preset resolution.
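The two preprocessing steps of claim 12 — cropping, then adjusting to a preset resolution — can be sketched without image libraries. Center cropping and nearest-neighbour resizing are illustrative choices; the claim specifies neither the crop region nor the interpolation:

```python
import numpy as np

def center_crop(image, crop_h, crop_w):
    """Cut a centered crop_h x crop_w region out of the sample image."""
    h, w = image.shape[:2]
    top, left = (h - crop_h) // 2, (w - crop_w) // 2
    return image[top:top + crop_h, left:left + crop_w]

def resize_nearest(image, out_h, out_w):
    """Adjust the (possibly cropped) image to a preset resolution by
    nearest-neighbour sampling."""
    h, w = image.shape[:2]
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return image[rows][:, cols]

def preprocess(image, crop=(96, 96), resolution=(64, 64)):
    return resize_nearest(center_crop(image, *crop), *resolution)

sample = np.arange(128 * 128).reshape(128, 128)  # stand-in sample image
out = preprocess(sample)
```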
13. A liveness detection device for a target object, comprising at least one storage medium storing computer instructions and at least one processor, wherein the at least one processor is configured to execute the computer instructions to implement the method of any one of claims 1-6.
CN202010504692.9A 2020-06-05 2020-06-05 Living body detection method and system for target object Active CN111401348B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010504692.9A CN111401348B (en) 2020-06-05 2020-06-05 Living body detection method and system for target object

Publications (2)

Publication Number Publication Date
CN111401348A true CN111401348A (en) 2020-07-10
CN111401348B CN111401348B (en) 2020-09-04

Family

ID=71431919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010504692.9A Active CN111401348B (en) 2020-06-05 2020-06-05 Living body detection method and system for target object

Country Status (1)

Country Link
CN (1) CN111401348B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115118464A (en) * 2022-06-10 2022-09-27 深信服科技股份有限公司 Method and device for detecting defect host, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102306304A (en) * 2011-03-25 2012-01-04 杜利利 Face occluder identification method and device
CN105069448A (en) * 2015-09-29 2015-11-18 厦门中控生物识别信息技术有限公司 True and false face identification method and device
CN110738103A (en) * 2019-09-04 2020-01-31 北京奇艺世纪科技有限公司 Living body detection method, living body detection device, computer equipment and storage medium
CN111340014A (en) * 2020-05-22 2020-06-26 支付宝(杭州)信息技术有限公司 Living body detection method, living body detection device, living body detection apparatus, and storage medium

Also Published As

Publication number Publication date
CN111401348B (en) 2020-09-04

Similar Documents

Publication Publication Date Title
CN109948408B (en) Activity test method and apparatus
CN110287971B (en) Data verification method, device, computer equipment and storage medium
CN113159147B (en) Image recognition method and device based on neural network and electronic equipment
CN111027628B (en) Model determination method and system
US10275677B2 (en) Image processing apparatus, image processing method and program
CN106663157A (en) User authentication method, device for executing same, and recording medium for storing same
CN111222588B (en) Back door sample detection method, system and device
CN109740040B (en) Verification code identification method, device, storage medium and computer equipment
CN111415336B (en) Image tampering identification method, device, server and storage medium
CN112862024B (en) Text recognition method and system
CN109376717A (en) Personal identification method, device, electronic equipment and the storage medium of face comparison
CN111046394A (en) Method and system for enhancing anti-attack capability of model based on confrontation sample
CN108389053B (en) Payment method, payment device, electronic equipment and readable storage medium
CN111401348B (en) Living body detection method and system for target object
CN108875549A (en) Image-recognizing method, device, system and computer storage medium
CN112861743A (en) Palm vein image anti-counterfeiting method, device and equipment
CN108764121B (en) Method for detecting living object, computing device and readable storage medium
CN114373218B (en) Method for generating convolution network for detecting living body object
CN110941824A (en) Method and system for enhancing anti-attack capability of model based on confrontation sample
CN113033305B (en) Living body detection method, living body detection device, terminal equipment and storage medium
CN111753723B (en) Fingerprint identification method and device based on density calibration
CN115019364A (en) Identity authentication method and device based on face recognition, electronic equipment and medium
CN108875467B (en) Living body detection method, living body detection device and computer storage medium
CN111428670A (en) Face detection method, face detection device, storage medium and equipment
Amirgaliyev et al. Automating the customer verification process in a car sharing system based on machine learning methods.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant