CN113792701A - Living body detection method and device, computer equipment and storage medium
- Publication number: CN113792701A
- Application number: CN202111123905.4A
- Authority: CN (China)
- Prior art keywords: detected, image, detection, determining, target
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/045: Combinations of networks (G: Physics; G06: Computing, calculating or counting; G06N: Computing arrangements based on specific computational models; G06N3/00: Computing arrangements based on biological models; G06N3/02: Neural networks; G06N3/04: Architecture, e.g. interconnection topology)
- G06N3/08: Learning methods
Abstract
The present disclosure provides a living body detection method, apparatus, computer device and storage medium, wherein the method comprises: acquiring a first detection video corresponding to an object to be detected; acquiring a plurality of first images to be detected from the first detection video, and determining an initial image feature of each first image to be detected, the initial image features comprising image features of the first images to be detected in a depth dimension; combining the initial image features of the first images to be detected according to the order in which they appear in the first detection video, to obtain a target image feature; and determining a living body detection result of the object to be detected in the first detection video based on the target image feature. Because the object to be detected is detected based on a target image feature obtained by combining a plurality of initial image features, the accuracy of the determined living body detection result is improved.
Description
Technical Field
The present disclosure relates to the field of living body (liveness) detection and image processing technologies, and in particular to a living body detection method, an apparatus, a computer device, and a storage medium.
Background
The wide application of face recognition technology has greatly facilitated everyday life, but at the same time various attacks against face recognition have appeared. These attacks influence the recognition result through face-forgery techniques, for example simulating a real face with a copied photograph, a 3D mask or similar means in order to pass face recognition.

In the prior art, such attack modes are difficult to counter effectively, so the accuracy of face recognition is low.
Disclosure of Invention
The disclosed embodiments provide at least a living body detection method, an apparatus, a computer device and a storage medium.
In a first aspect, an embodiment of the present disclosure provides a method for detecting a living body, including:
acquiring a first detection video corresponding to an object to be detected;
acquiring a plurality of first images to be detected from a first detection video, and determining the initial image characteristics of each first image to be detected; the initial image features comprise image features of the first image to be detected in a depth dimension;
combining the initial image features of each first image to be detected according to the image sequence of the plurality of first images to be detected in the first detection video to obtain target image features;
and determining the living body detection result of the object to be detected in the first detection video based on the target image characteristics.
In this embodiment, the acquired initial image features of a first image to be detected provide not only basic image features, for example features representing the visual appearance of the first image and/or the semantics of each of its pixel points, but also image features in a depth dimension, which improves the richness of the acquired image features. Combining the plural initial image features according to the order of the first images to be detected then yields a target image feature that contains time-sequence information. Because the object to be detected is detected with this combined target image feature, the difference information between the target image feature and the target image feature a real object would produce can be determined; this difference information can include differences in time-sequence information and differences between image features. The authenticity of the object to be detected can then be determined accurately from the determined difference information, giving an accurate living body detection result.
In a possible embodiment, the first detection video comprises images of the object to be detected taken under a plurality of color light sources;
the acquiring a plurality of first images to be detected from a first detection video includes:
screening, from the first detection video, at least one frame of image corresponding to each of the multiple color light sources, to obtain the plurality of first images to be detected.
Light sources of different colors are reflected differently when projected onto the object to be detected, so the image features of the first images to be detected captured for the same object differ between color light sources. From first images obtained under different color light sources, the differences in action information between the objects in the images and the differences between image features can therefore be determined accurately, and the living body detection result can then be determined accurately by comparing these differences with the differences a real object should exhibit.
In a possible implementation manner, before the determining, based on the target image feature, a living body detection result of the object to be detected in the first detection video, the method further includes:
acquiring at least one frame of second image to be detected from the first detection video, respectively detecting key points of each frame of second image to be detected, and respectively determining the key points corresponding to the first preset part of the object to be detected in each frame of second image to be detected;
selecting a preset number of target key points from the key points;
determining a first detection probability corresponding to the object to be detected in each frame of second image to be detected based on each frame of second image to be detected and the target key point corresponding to each frame of second image to be detected;
the determining a living body detection result of the object to be detected in the first detection video based on the target image feature includes:
and determining the living body detection result of the object to be detected in the first detection video based on the first detection probability corresponding to each frame of second image to be detected and the target image characteristics.
The feature information of the pixel points corresponding to the target key points prominently reflects the image features of the object to be detected. Performing key point detection on a single-frame second image to be detected determines the first preset part of the object in the image and its corresponding key points, from which accurate target key points can be screened. Detecting the object based on both the target key points and the second image to be detected lets the detection attend to the global image features of the second image and to the local image features around the target key points, so an accurate first detection probability can be obtained.
In a possible implementation manner, determining, based on the target key point and the second image to be detected, a first detection probability corresponding to an object to be detected in the second image to be detected includes:
determining a preset coordinate corresponding to the first preset part; determining actual coordinates corresponding to the first preset part based on coordinates of pixel points corresponding to the target key points;
determining a target conversion relation based on the preset coordinates and the actual coordinates;
based on the target conversion relation, performing coordinate conversion on the pixel point corresponding to each key point;
and determining a first detection probability of the object to be detected in the second image to be detected based on the second image to be detected and each key point after coordinate conversion.
The preset coordinates are the coordinates that the target key points of the first preset part of the object in the second image to be detected are expected to have; the actual coordinates are the real coordinates of those target key points. The determined target conversion relation converts the pixel-point coordinates of every key point into coordinates that meet the expectation, and the object is then detected using the converted coordinates. Because the converted coordinates match the expected layout, the learning difficulty of the deep learning network is reduced, which improves the accuracy of the determined first detection probability.
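As an illustration of this conversion step, the following is a minimal sketch assuming five facial keypoints and OpenCV; the preset template coordinates and the use of a similarity transform as the target conversion relation are assumptions for illustration, not values fixed by the disclosure.

```python
# Minimal sketch of estimating the "target conversion relation", assuming
# five facial keypoints and OpenCV. PRESET_COORDS is a hypothetical template.
import numpy as np
import cv2

# Hypothetical preset (expected) coordinates for the first preset part,
# e.g. five facial landmarks in a 112x112 aligned face template.
PRESET_COORDS = np.array([[38.3, 51.7], [73.5, 51.5], [56.0, 71.7],
                          [41.5, 92.4], [70.7, 92.2]], dtype=np.float32)

def estimate_target_conversion(actual_coords: np.ndarray) -> np.ndarray:
    """Solve a similarity transform mapping the actual keypoint coordinates
    onto the preset coordinates (the target conversion relation)."""
    matrix, _ = cv2.estimateAffinePartial2D(actual_coords, PRESET_COORDS)
    return matrix  # 2x3 affine matrix

def convert_keypoints(keypoints: np.ndarray, matrix: np.ndarray) -> np.ndarray:
    """Apply the conversion relation to every keypoint's pixel coordinates."""
    return cv2.transform(keypoints.reshape(-1, 1, 2), matrix).reshape(-1, 2)
```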
In a possible implementation manner, the determining, based on the second image to be detected and each of the key points after coordinate transformation, a first detection probability of the object to be detected in the second image to be detected includes:
based on the target conversion relation, performing coordinate conversion on the second image to be detected to obtain a coordinate conversion image;
based on the coordinate conversion image and each key point after coordinate conversion, intercepting a target image corresponding to a first preset part of the object to be detected from the coordinate conversion image;
and determining a first detection probability of the object to be detected in the second image to be detected based on the target image and the second image to be detected.
From the coordinate-converted key points and the coordinate conversion image, the expected target image corresponding to the first preset part of the object can be obtained. The target image yields the expected local image features, and the second image to be detected yields the global image features before conversion; together, the image features of the two images make it easier to determine the difference between the object's features and real features, giving an accurate first detection probability.
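Continuing the sketch above under the same assumptions, a hypothetical crop of the target image might look as follows; the 112x112 output size and the bounding-box crop rule are illustrative choices.

```python
# Sketch: warp the second image to be detected with the same conversion
# matrix, then crop around the converted keypoints as the target image.
import numpy as np
import cv2

def crop_target_image(image_bgr, keypoints, matrix, out_size=(112, 112)):
    warped = cv2.warpAffine(image_bgr, matrix, out_size)  # coordinate conversion image
    pts = cv2.transform(np.asarray(keypoints, np.float32).reshape(-1, 1, 2),
                        matrix).reshape(-1, 2)
    x0, y0 = pts.min(axis=0).astype(int)
    x1, y1 = pts.max(axis=0).astype(int)
    return warped[max(y0, 0):y1, max(x0, 0):x1]  # target image of the first preset part
```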
In a possible embodiment, the determining a first detection probability of the object to be detected in the second image to be detected based on the target image and the second image to be detected includes:
determining first channel characteristic information of the target image and second channel characteristic information of the second image to be detected;
splicing the first channel characteristic information and the second channel characteristic information to obtain third channel characteristic information;
and determining a first detection probability of the object to be detected in the second image to be detected based on the third channel characteristic information.
Splicing the channel feature information yields third channel feature information that contains both the local channel features of the target image and the global channel features of the second image to be detected, which improves the comprehensiveness of the channel features used to detect the object and thus favors a more accurate first detection probability. Processing the spliced third channel feature information also processes the first and second channel feature information in one pass, which improves detection efficiency.
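A minimal sketch of the splicing step, assuming PyTorch tensors in NCHW layout produced by an arbitrary backbone; the shapes are illustrative only.

```python
# Sketch: splice local and global channel features along the channel axis
# so one network pass scores both together.
import torch

first_channel_feat = torch.randn(1, 64, 28, 28)   # from the target image (local)
second_channel_feat = torch.randn(1, 64, 28, 28)  # from the second image to be detected (global)
third_channel_feat = torch.cat([first_channel_feat, second_channel_feat], dim=1)  # (1, 128, 28, 28)
```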
In a possible implementation manner, the determining a living body detection result of the object to be detected in the first detection video based on the first detection probability corresponding to each frame of the second image to be detected and the target image feature includes:
determining a second detection probability corresponding to the object to be detected based on the target image characteristics;
and determining the living body detection result of the object to be detected based on the first detection probability and the second detection probability.
Based on the target image feature, a fused second detection probability covering the plurality of first images to be detected can be determined. The living body detection result is then determined from both the first detection probability of the single-frame second image to be detected and the fused second detection probability, so the result reflects image features in multiple dimensions, such as the depth dimension and the channel dimension, improving the reasonableness and accuracy of the determined living body detection result.
In a possible implementation manner, the determining a living body detection result of the object to be detected based on the first detection probability and the second detection probability includes:
acquiring a first preset weight corresponding to a first detection probability and a second preset weight corresponding to a second detection probability;
determining a target probability based on the first detection probability, the first preset weight, the second detection probability and the second preset weight;
and under the condition that the target probability is greater than a preset threshold value, determining that the living body detection result comprises that the object to be detected is a living body object.
Fusing the detection probabilities with the preset weights before deciding the living body detection result effectively improves the accuracy and reasonableness of the determined result.
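A sketch of the weighted fusion follows; the weight values and the threshold are illustrative assumptions, as the disclosure does not fix them.

```python
# Sketch: fuse the two detection probabilities with preset weights and
# compare the target probability against a preset threshold.
def fuse(first_prob: float, second_prob: float,
         w1: float = 0.4, w2: float = 0.6, threshold: float = 0.5) -> bool:
    target_prob = w1 * first_prob + w2 * second_prob
    return target_prob > threshold  # True: the object is judged a living object
```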
In a possible implementation manner, the acquiring a first detection video corresponding to an object to be detected includes:
acquiring a second detection video corresponding to the object to be detected;
determining an initial detection result of the object to be detected based on the second detection video and a preset action sequence;
and acquiring the first detection video corresponding to the object to be detected under the condition that the initial detection result indicates that the object to be detected is a living object.
The second detection video allows the object to be pre-detected: if the object does not correctly perform any preset action of the preset action sequence, its inauthenticity can be determined directly, which improves both the efficiency and the accuracy of determining the living body detection result.
In a possible implementation manner, the determining an initial detection result of the object to be detected based on the second detection video and a preset action sequence includes:
performing action identification on the to-be-detected object in each frame of third to-be-detected image in the second detection video to obtain the to-be-detected action corresponding to the to-be-detected object in the third to-be-detected image;
and determining an initial detection result of the object to be detected based on the action to be detected corresponding to each frame of the third image to be detected and the preset action sequence.
Action recognition accurately determines the action performed by the object to be detected; comparing that action against the preset action sequence then accurately determines whether the object performed the preset actions of the sequence, so an accurate initial detection result is obtained.
In one possible embodiment, the preset action sequence comprises a first target action sequence; the action to be detected comprises a first action to be detected;
the determining the initial detection result of the object to be detected based on the action to be detected corresponding to each frame of the third image to be detected and the preset action sequence comprises the following steps:
screening a fourth image to be detected, of which the first action to be detected is matched with the first target action sequence, from the third image to be detected based on the first action to be detected and the first target action sequence of the object to be detected in each frame of the third image to be detected;
and determining the initial detection result of the object to be detected in the second detection video based on the number of the fourth images to be detected and the number of the third images to be detected.
Performing each preset action of the preset action sequence takes a certain execution time; if the object in the second detection video performs a preset action, that action therefore corresponds to a number of fourth images to be detected that matches the execution time. The initial detection result of the object can thus be determined accurately from the determined number of fourth images and the number of third images.
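A sketch of this counting check under assumed inputs: a list of per-frame recognized actions and an assumed matching-ratio threshold, both illustrative.

```python
# Sketch: count the third images whose recognized action matches the first
# target action, then compare the matched fraction against a ratio threshold.
def initial_result_by_ratio(actions: list, target_action: str,
                            min_ratio: float = 0.3) -> bool:
    fourth_count = sum(1 for a in actions if a == target_action)  # fourth images
    return fourth_count / max(len(actions), 1) >= min_ratio
```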
In a possible embodiment, the preset action sequence comprises a second target action sequence; the action to be detected comprises a second action to be detected;
the determining the initial detection result of the object to be detected based on the action to be detected corresponding to each frame of the third image to be detected and the preset action sequence comprises the following steps:
determining an angle numerical value of a second preset part of the object to be detected corresponding to a preset angle based on a second action to be detected of the object to be detected in each frame of the third image to be detected;
determining a maximum angle numerical value and a minimum angle numerical value corresponding to the object to be detected based on the angle numerical value corresponding to the object to be detected in each frame of the third image to be detected;
and determining an initial detection result of the object to be detected in the second detection video based on the maximum angle value, the minimum angle value and the angle difference value corresponding to the second target action sequence.
From the determined maximum and minimum angle values, the numerical difference between the two can be computed; comparing this difference against the angle difference required by the second target action sequence accurately determines their size relation, so an accurate initial detection result is obtained.
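A sketch of the angle check, assuming per-frame yaw angles of the second preset part (e.g. the head during shaking) and an assumed required angle difference.

```python
# Sketch: the span between the largest and smallest per-frame angles must
# reach the angle difference required by the second target action sequence.
def initial_result_by_angle(yaw_angles: list, required_difference: float = 30.0) -> bool:
    return (max(yaw_angles) - min(yaw_angles)) >= required_difference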
In a possible implementation manner, after the determining the living body detection result of the object to be detected in the first detection video, the method further includes:
and under the condition that the object to be detected is a living object, determining that the object to be detected has a preset authority and triggering a function matched with the preset authority, wherein the preset authority comprises at least one of: access to an access control system, access to first target content, editing of second target content, and execution of a transaction operation.
In a second aspect, an embodiment of the present disclosure further provides a living body detection apparatus, including:
the acquisition module is used for acquiring a first detection video corresponding to an object to be detected;
the first determining module is used for acquiring a plurality of first images to be detected from the first detection video and determining the initial image characteristics of each first image to be detected; the initial image features comprise image features of the first image to be detected in a depth dimension;
the combination module is used for combining the initial image characteristics of each first image to be detected according to the image sequence of the first images to be detected in the first detection video to obtain the target image characteristics;
and the second determination module is used for determining the living body detection result of the object to be detected in the first detection video based on the target image characteristics.
In a possible embodiment, the first detection video comprises images of the object to be detected taken under a plurality of color light sources;
the first determining module is configured to screen at least one frame of image corresponding to each color light source of the multiple color light sources from the first detection video to obtain multiple first images to be detected.
In one possible implementation, the apparatus further includes a third determining module:
the third determining module is configured to, before determining a living body detection result of the object to be detected in the first detection video based on the target image feature, acquire at least one frame of second image to be detected from the first detection video, perform key point detection on each frame of the second image to be detected, and determine key points corresponding to a first preset portion of the object to be detected in each frame of the second image to be detected;
selecting a preset number of target key points from the key points;
determining a first detection probability corresponding to the object to be detected in each frame of second image to be detected based on each frame of second image to be detected and the target key point corresponding to each frame of second image to be detected;
the second determining module is configured to determine a living body detection result of the object to be detected in the first detection video based on the first detection probability corresponding to each frame of the second image to be detected and the target image feature.
In a possible implementation manner, the third determining module is configured to determine a preset coordinate corresponding to the first preset portion; determining actual coordinates corresponding to the first preset part based on coordinates of pixel points corresponding to the target key points;
determining a target conversion relation based on the preset coordinates and the actual coordinates;
based on the target conversion relation, performing coordinate conversion on the pixel point corresponding to each key point;
and determining a first detection probability of the object to be detected in the second image to be detected based on the second image to be detected and each key point after coordinate conversion.
In a possible implementation manner, the third determining module is configured to perform coordinate transformation on the second image to be detected based on the target transformation relationship to obtain a coordinate transformation image;
based on the coordinate conversion image and each key point after coordinate conversion, intercepting a target image corresponding to a first preset part of the object to be detected from the coordinate conversion image;
and determining a first detection probability of the object to be detected in the second image to be detected based on the target image and the second image to be detected.
In a possible implementation manner, the third determining module is configured to determine first channel feature information of the target image and second channel feature information of the second image to be detected;
splicing the first channel characteristic information and the second channel characteristic information to obtain third channel characteristic information;
and determining a first detection probability of the object to be detected in the second image to be detected based on the third channel characteristic information.
In a possible implementation manner, the second determining module is configured to determine, based on the target image feature, a second detection probability corresponding to the object to be detected;
and determining the living body detection result of the object to be detected based on the first detection probability and the second detection probability.
In a possible implementation manner, the second determining module is configured to obtain a first preset weight corresponding to a first detection probability and a second preset weight corresponding to a second detection probability;
determining a target probability based on the first detection probability, the first preset weight, the second detection probability and the second preset weight;
and under the condition that the target probability is greater than a preset threshold value, determining that the living body detection result comprises that the object to be detected is a living body object.
In a possible implementation manner, the obtaining module is configured to obtain a second detection video corresponding to the object to be detected;
determining an initial detection result of the object to be detected based on the second detection video and a preset action sequence;
and acquiring the first detection video corresponding to the object to be detected under the condition that the initial detection result indicates that the object to be detected is a living object.
In a possible implementation manner, the obtaining module is configured to perform motion recognition on an object to be detected in each frame of a third image to be detected in the second detection video, so as to obtain a motion to be detected corresponding to the object to be detected in the third image to be detected;
and determining an initial detection result of the object to be detected based on the action to be detected corresponding to each frame of the third image to be detected and the preset action sequence.
In one possible embodiment, the preset action sequence comprises a first target action sequence; the action to be detected comprises a first action to be detected;
the acquisition module is configured to screen a fourth image to be detected, where the first motion to be detected matches the first target motion sequence, from the third image to be detected based on the first motion to be detected and the first target motion sequence of the object to be detected in each frame of the third image to be detected;
and determining the initial detection result of the object to be detected in the second detection video based on the number of the fourth images to be detected and the number of the third images to be detected.
In a possible embodiment, the preset action sequence comprises a second target action sequence; the action to be detected comprises a second action to be detected;
the acquisition module is used for determining an angle numerical value of a second preset part of the object to be detected, which corresponds to a preset angle, based on a second action to be detected of the object to be detected in each frame of the third image to be detected;
determining a maximum angle numerical value and a minimum angle numerical value corresponding to the object to be detected based on the angle numerical value corresponding to the object to be detected in each frame of the third image to be detected;
and determining an initial detection result of the object to be detected in the second detection video based on the maximum angle value, the minimum angle value and the angle difference value corresponding to the second target action sequence.
In a possible embodiment, the apparatus further comprises:
and a processing module configured to, after the living body detection result of the object to be detected in the first detection video is determined, and under the condition that the object to be detected is a living object, determine that the object to be detected has a preset authority and trigger a function matched with the preset authority, wherein the preset authority comprises at least one of: access to an access control system, access to first target content, editing of second target content, and execution of a transaction operation.
In a third aspect, this disclosure also provides a computer device comprising a processor and a memory, the memory storing machine-readable instructions executable by the processor; the processor is configured to execute the machine-readable instructions stored in the memory, and when they are executed the processor performs the steps of the first aspect or of any one of its possible implementations.
In a fourth aspect, this disclosure also provides a computer-readable storage medium having a computer program stored thereon, the computer program, when run, performing the steps of the first aspect or of any one of its possible implementations.
For a description of the effects of the above living body detection apparatus, computer device, and computer-readable storage medium, reference is made to the description of the living body detection method above, which is not repeated here.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To illustrate the technical solutions of the embodiments of the present disclosure more clearly, the drawings required by the embodiments are briefly described below. The drawings, which are incorporated in and form a part of the specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain its technical solutions. The drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope; those skilled in the art can derive further related drawings from them without inventive effort.
FIG. 1 illustrates a flow chart of a method of live detection provided by an embodiment of the present disclosure;
fig. 2 shows a flowchart of a method for acquiring a first detection video according to an embodiment of the present disclosure;
fig. 3 is a flowchart illustrating a method for determining a first detection probability corresponding to an object to be detected in a second image to be detected according to an embodiment of the present disclosure;
fig. 4 is a schematic flowchart illustrating a specific implementation of determining a live detection result of an object to be detected according to an embodiment of the present disclosure;
FIG. 5 illustrates a schematic view of a living body detection apparatus provided by an embodiment of the present disclosure;
fig. 6 shows a schematic structural diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions of the embodiments are described below clearly and completely with reference to the drawings; obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments, as generally described and illustrated herein, may be arranged and designed in a wide variety of configurations. The following detailed description is therefore not intended to limit the claimed scope of the disclosure, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art from the embodiments of the disclosure without creative effort fall within the protection scope of the disclosure.
Furthermore, the terms "first", "second", and the like in the description, the claims and the drawings of the embodiments of the present disclosure are used to distinguish similar elements and do not necessarily describe a particular sequential or chronological order. Data so termed may be interchanged under appropriate circumstances, so that the embodiments described herein can be practiced in orders other than those illustrated or described.
Reference herein to "a plurality" or "a number" means two or more. "And/or" describes an association relationship between associated objects and indicates three possible relationships; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates an "or" relationship between the preceding and following objects.
Research shows that the wide application of face recognition technology has greatly facilitated everyday life, but at the same time various attacks against face recognition have appeared; these attacks influence the recognition result through face-forgery techniques, for example simulating a real face with a copied photograph or a 3D mask in order to pass face recognition.

In the prior art, such attack modes are difficult to counter effectively, so the accuracy of face recognition is low.

Based on the above research, the present disclosure provides a living body detection method, an apparatus, a computer device, and a storage medium. The acquired initial image features of a first image to be detected provide not only basic image features, for example features representing the visual appearance of the first image and/or the semantics of each of its pixel points, but also image features in a depth dimension, which improves the richness of the acquired image features. Combining the plural initial image features according to the image order of the plurality of first images to be detected then yields a target image feature that contains time-sequence information. Because the object to be detected is detected with this combined target image feature, the difference information between the target image feature and the target image feature a real object would produce can be determined, including differences in time-sequence information and differences between image features; the authenticity of the object to be detected can then be determined accurately from the determined difference information, giving an accurate living body detection result.

The defects of current applications of face recognition technology were identified by the inventors only after careful practice and research; the discovery of the above problems, and the solutions proposed below for them, should therefore be regarded as the inventors' contribution to the present disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
It should be noted that specific terms mentioned in the embodiments of the present disclosure include:
MobileNet network: a lightweight convolutional network aimed at mobile terminals or embedded devices. Compared with a traditional convolutional neural network, it greatly reduces the number of model parameters and the amount of computation with only a small loss of accuracy.
SDK: Software Development Kit, a collection of development tools used by software engineers to build application software for a particular software package, software framework, hardware platform, operating system, etc.
To facilitate understanding of the present embodiment, the living body detection method disclosed in the embodiments of the present disclosure is first described in detail. The execution subject of the method is generally a computer device with certain computing power; in some possible implementations, the method may be implemented by a processor calling computer-readable instructions stored in a memory.
The living body detection method provided by the embodiment of the present disclosure is described below by taking an execution subject as a computer device as an example.
As shown in fig. 1, a flowchart of a method for detecting a living body provided by an embodiment of the present disclosure may include the following steps:
s101: and acquiring a first detection video corresponding to the object to be detected.
Here, the object to be detected may be any object that needs face recognition authentication, for example a resident who needs to pass an access control system, or a customer paying by face. The first detection video may be a video containing the object to be detected, captured by the camera device of the face recognition device. The shooting duration of the video may be set according to the recognition requirement, which may include the recognition security level, the current recognition environment (night or daytime recognition), and the like. For example, when the security level is high and/or the environment is night, the shooting duration may be longer, such as 3 or 5 seconds; when the security level is low and/or the environment is daytime, the duration may be relatively short, such as 1.5 or 2 seconds.
In a specific implementation, when the object to be detected needs face identification authentication, it performs the authentication in front of the face recognition device, whose camera device then shoots a video containing the object, that is, the first detection video corresponding to the object to be detected.
S102: acquiring a plurality of first images to be detected from the first detection video, and determining the initial image characteristics of each first image to be detected.
The initial image features comprise image features of the first image to be detected in the depth dimension.
Here, the first detection video may include a plurality of frames of images to be detected, and the number of the images to be detected may be determined by the duration and the size corresponding to the first detection video.
The initial image feature is the image feature corresponding to the first image to be detected and may specifically include its image features in a depth dimension and in base dimensions. The depth-dimension features may include depth image feature information of the first image to be detected; the base-dimension features may include the image features of the first image in the four NCHW dimensions, where N is the number of images, C the image channels, H the image height and W the image width. The features in the image-channel dimension may include the color values of the pixel points (such as RGB values), their position information, structural information, and so on, where the structural information represents the relative positional relationship between pixel points. For example, pixel point A and pixel point B may be two adjacent points in the same row, with A's column to the left of B's column.
In a specific implementation, after the first detection video is acquired, a plurality of first images to be detected can be taken from its frames. For example, based on the order in which the images appear in the first detection video and a preset screening order, the images matching the preset screening order are screened out; the preset screening order specifies the positions of the first images to be screened. Specifically, the order in which each image appears in the first detection video can be used as its sorting position, and the images whose sorting positions match the preset order are then selected from the frames of the video as the first images to be detected.
Alternatively, the screening time period to which each image belongs is determined from the shooting time of each image and a plurality of preset screening time periods, and at least one image is then screened from the images of each period, yielding the plurality of screened first images to be detected.
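A minimal sketch of this time-period screening follows, assuming per-frame timestamps; the period count and the take-the-first-frame-per-period rule are illustrative assumptions.

```python
# Sketch: divide the video timeline into preset periods and take one frame
# from each period as a first image to be detected.
def screen_by_time_periods(frames: list, timestamps: list, num_periods: int = 5) -> list:
    t0, t1 = timestamps[0], timestamps[-1]
    step = (t1 - t0) / num_periods
    picked, next_boundary = [], t0
    for frame, t in zip(frames, timestamps):
        if t >= next_boundary:
            picked.append(frame)          # first frame falling in this period
            next_boundary += step
    return picked[:num_periods]
```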
Furthermore, for each first image to be detected, image feature information of the first image to be detected may be extracted to obtain image features of the first image to be detected corresponding to different dimensions, and the image features are used as initial image features of the first image to be detected.
S103: and combining the initial image characteristics of each first image to be detected according to the image sequence of the plurality of first images to be detected in the first detection video to obtain the target image characteristics.
Here, the target image feature may be an image feature including initial image features corresponding to the plurality of first images to be detected and higher-dimensional features corresponding to the plurality of initial image features, respectively.
In this step, based on the first to-be-detected video, an image sequence corresponding to each first to-be-detected image in the plurality of first to-be-detected images can be determined, wherein the image sequence is the sequence of the first to-be-detected images appearing in the first detection video. Specifically, the image order of each first image to be detected in the first detection video may be determined based on the time when each first image to be detected appears in the first detection video.
Then, the initial image features of the first images to be detected may be combined in the depth dimension according to their image order; the combination may simply arrange the plural initial image features together, yielding the target image feature corresponding to the plurality of first images. In addition, during the combination, higher-dimensional image features may be extracted from the plural initial image features, and the resulting higher-dimensional features together with the combined initial features may be used as the target image feature.
In another embodiment, the plurality of first images to be detected may be fused directly in the depth dimension according to their image order, to obtain a fused target detection image. Specifically, features with the same attribute in each first image may be fused in the depth dimension in image order; for example, the depth feature information corresponding to the color attribute of pixel points at the same position may be fused, yielding the fused target detection image. The target detection image can reflect the depth change information of the object to be detected across the first images.
In addition, in one embodiment, the plurality of first images to be detected may be fused in the time-series dimension. Specifically, the time-series feature of the plural initial image features may be determined from each initial image feature and its time-series information: for example, the feature difference between each pair of time-adjacent initial image features is determined, and all the obtained feature differences are used as the final time-series feature. The determined time-series feature, together with the image features of each initial image feature, may then be used as the target image feature.
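A sketch, assuming PyTorch, of combining per-image features along the depth dimension and of the adjacent-difference time-series feature described above; the feature shapes are illustrative.

```python
# Sketch: stack per-image initial features in video order to form a depth
# (temporal) axis, then take adjacent differences as a time-series feature.
import torch

initial_feats = [torch.randn(64, 14, 14) for _ in range(5)]  # one per first image, in video order
target_feat = torch.stack(initial_feats, dim=1)              # (64, 5, 14, 14): channels x depth x H x W
timing_feat = target_feat[:, 1:] - target_feat[:, :-1]       # differences between time-adjacent features
```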
S104: and determining the living body detection result of the object to be detected in the first detection video based on the target image characteristics.
Here, the living body detection result is used to represent whether the object to be detected in the first detection video can pass the face authentication, that is, to represent whether the object to be detected is a legal living body object. Specifically, the living body detection result may include both a result that the object to be detected is a legitimate living body object and a result that the object to be detected is an illegitimate living body object. In addition, in an embodiment, the living body detection result may further include that it is not possible to determine whether the object to be detected is a legitimate living body object.
In a specific implementation, the target image feature may be input into a first neural network trained in advance; the network processes the target image feature and outputs a predicted second detection probability that the target image feature is the standard target image feature of a real object to be detected. The pre-trained first neural network may be a lightweight 3D convolutional network, such as a Multi-Fiber network. The second detection probability is the probability that the object to be detected is a legal object.
A legal object is a real object to be detected that passes living body verification; an illegal object is one that fails it, for example an object for which no authentication information used for living body authentication is stored, an object in a photograph, an object wearing a mask, and so on.
In addition, the first neural network may further output a third detection probability corresponding to the object to be detected, where the third detection probability is a probability that the object to be detected does not belong to a legal object, and a sum of the second detection probability and the third detection probability is equal to 1.
In another embodiment, if the target detection image is obtained, the target detection image may also be directly input to a first neural network trained in advance, and the first neural network is used to process the target detection image, so as to obtain a second detection probability corresponding to the object to be detected in the target detection image.
Or the target detection image and the target image features can be input into a first neural network trained in advance, and the first neural network is used for synchronously processing the target detection image and the target image features to obtain a second detection probability corresponding to the object to be detected.
Finally, whether the object to be detected in the first detection video is a legal living object is determined from the determined second detection probability, giving the living body detection result of the object. For example, the object may be determined to be a legal living object when the second detection probability is greater than a preset probability threshold, and an illegal object otherwise.
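The disclosure names a lightweight 3D convolutional network such as a Multi-Fiber network; the toy model below is only a stand-in sketch, assuming PyTorch, to show how a clip-level input yields second and third detection probabilities that sum to 1 and how thresholding decides the result. It does not reproduce the Multi-Fiber architecture.

```python
# Minimal stand-in for the first neural network: a tiny 3D CNN with a
# two-class softmax head over [not live, live].
import torch
import torch.nn as nn

class TinyLiveness3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1))
        self.head = nn.Linear(16, 2)

    def forward(self, clip):               # clip: (N, 3, depth, H, W)
        logits = self.head(self.features(clip).flatten(1))
        return torch.softmax(logits, dim=1)  # second + third probability sum to 1

probs = TinyLiveness3D()(torch.randn(1, 3, 5, 112, 112))
is_live = probs[0, 1].item() > 0.5         # illustrative probability threshold
```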
In one embodiment, the first detection video includes images of the object to be detected taken under a plurality of color light sources. Light sources of different colors are reflected differently when projected onto the object to be detected, so the image features of the first images to be detected captured for the same object differ between color light sources. From first images obtained under the different color light sources, the differences in action information between the objects in the images and the differences between image features can be determined accurately, and the living body detection result can then be determined accurately by comparing these differences with those a real object should exhibit.
In a specific implementation, while the object to be detected undergoes face authentication in front of the face recognition device, the recognition screen of the device is controlled to present different colors at certain time intervals, so that the camera device captures images of the object under multiple color light sources, yielding the first detection video. The first detection video obtained in this way is a color-flash ("dazzle") video and may contain a preset number of color light sources of different colors with at least one image to be detected for each, where a certain difference must exist between the colors of the light sources. For example, there may be 5 color light sources, say yellow, blue, green, red and black in sequence, each lit for 0.6 seconds, so that the first detection video lasts 3 seconds and contains 25 frames of images to be detected.
Then, for S102, at least one frame of image corresponding to each color light source of the multiple color light sources may be screened from the first detection video, so as to obtain multiple first images to be detected.
Here, at least one frame of image to be detected can be selected from the images to be detected corresponding to each of the multiple color light sources, yielding at least one first image to be detected per color light source. Continuing the above example, one frame of image to be detected may be selected from the images corresponding to each of the five colors, giving one first image to be detected per color light source.
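A minimal sketch of this screening step; the per-frame color labels are assumed to be recorded during capture, and the function name is hypothetical:

```python
def screen_first_images(frames, frame_colors, num_per_color=1):
    """Select at least one image to be detected for each light-source color.
    frames: decoded frames of the first detection video; frame_colors: the
    screen color active when each frame was captured."""
    by_color = {}
    for frame, color in zip(frames, frame_colors):
        by_color.setdefault(color, []).append(frame)
    first_images = []
    for imgs in by_color.values():
        first_images.extend(imgs[:num_per_color])
    return first_images

# With the 5-color, 25-frame example above, this yields 5 first images.
```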
In addition, in an embodiment, for S101, the first detection video may also be obtained according to the method shown in fig. 2, a flowchart of a method for obtaining a first detection video provided by an embodiment of the present disclosure, which may include the following steps:
s201: and acquiring a second detection video corresponding to the object to be detected.
Here, the second detection video may be a video including an object to be detected, which is acquired before the first detection video is acquired.
S202: and determining an initial detection result of the object to be detected based on the second detection video and the preset action sequence.
Here, the preset action sequence is a sequence corresponding to a series of preset actions used to authenticate the object to be detected. The initial detection result represents whether the object to be detected in the second detection video is a living object, and may specifically take two values: the object to be detected is a living object, or it is not.
In this step, the action to be detected performed by the object to be detected in the video may be determined from the acquired second detection video. The action to be detected is then compared with the preset actions of the preset action sequence to determine whether it matches any preset action of the sequence, that is, whether the action to be detected matches the preset action sequence, and the initial detection result of the object to be detected is determined based on the comparison result.
In specific implementation, the initial detection result of the object to be detected can be determined according to the following steps:
Step one, performing action recognition on the object to be detected in each frame of third image to be detected in the second detection video to obtain the action to be detected corresponding to the object to be detected in that third image to be detected.
Here, for each frame of third image to be detected in the second detection video, an action detection model may be used to perform action recognition and determine the action to be detected corresponding to the object to be detected in that frame.
Step two, determining the initial detection result of the object to be detected based on the action to be detected corresponding to each frame of third image to be detected and the preset action sequence.
In this step, for each frame of third image to be detected, the action to be detected corresponding to that frame is compared with the preset actions of the preset action sequence to determine whether it matches any of them, giving the action detection result corresponding to that third image to be detected.
The action detection results of all the third images to be detected can then be collected, and the initial detection result of the object to be detected determined from them. For example, the number of third images to be detected whose action to be detected matches the preset action sequence may be counted from the individual action detection results, and the proportion of such images computed from this count and the total number of third images to be detected. If the proportion exceeds a preset proportion value, the object to be detected is determined to be a legal object; otherwise, it is determined to be an illegal object. The initial detection result corresponding to the object to be detected is thereby determined.
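The proportion check can be sketched as follows; the preset proportion value is an illustrative assumption:

```python
def initial_detection_result(action_detection_results, preset_proportion=0.6):
    """action_detection_results: one bool per third image to be detected,
    True when its action to be detected matched a preset action of the
    preset action sequence."""
    proportion = sum(action_detection_results) / len(action_detection_results)
    return proportion > preset_proportion  # True: legal object, else illegal
```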
S203: and under the condition that the initial detection result indicates that the object to be detected is a living object, acquiring a first detection video corresponding to the object to be detected.
In this step, if the initial detection result indicates that the object to be detected is not a living object, the object does not belong to the legal living objects and may be an attempt to pass recognition through face forgery; the living body detection result corresponding to the object to be detected can therefore be determined directly as: the object to be detected is an illegal living object, and the object is not allowed to pass face recognition authentication.
If the initial detection result indicates that the object to be detected is a living object, the object may be a legal living object, and the final living body detection result can only be determined by further detection; the first detection video corresponding to the object to be detected is therefore acquired and used to further identify the object. In this way, the second detection video serves as a pre-detection of the object to be detected: when the object fails to correctly perform the preset actions of the preset action sequence, its authenticity can be determined directly, which improves the efficiency of determining the living body detection result.
In addition, in one embodiment, the second detection video and the first detection video may be the same video, that is, the same dazzling video. The dazzling video can then be used first for action detection to determine the initial detection result of the object to be detected and, when the initial detection result indicates a living object, used again for further detection to determine the final living body detection result; when the initial detection result indicates that the object is not a living object, the final living body detection result is determined directly. In this way, only one video of the object to be detected needs to be acquired to determine its final living body detection result, reducing the amount of video resources to be acquired.
In one embodiment, the preset action sequence comprises a first target action sequence; the action to be detected comprises a first action to be detected.
Here, the first target action sequence may be a sequence of actions that the object to be detected is instructed to perform and includes a plurality of first preset actions; the first action to be detected is the action performed by the object to be detected in response to the first preset actions of the first target action sequence.
For the second step above, determining the initial detection result of the object to be detected based on the action to be detected corresponding to each frame of third image to be detected and the preset action sequence may be performed as follows:
s1: and screening a fourth image to be detected with the first action to be detected matched with the first target action sequence from the third image to be detected based on the first action to be detected and the first target action sequence of the object to be detected in each frame of the third image to be detected.
Here, the first target action sequence may be one of the action sequences corresponding to a blinking action and a mouth opening action. In a specific implementation, an action detection SDK may be used to perform action detection on each frame of third image to be detected and determine the execution result of the first action to be detected of the object in each frame. Based on these execution results, the fourth images to be detected whose first action to be detected matches the first target action sequence, that is, matches a first preset action of the first target action sequence, are screened out from the third images to be detected.
S2: and determining an initial detection result of the object to be detected in the second detection video based on the number of the fourth images to be detected and the number of the third images to be detected.
Here, the ratio of the number of fourth images to be detected to the number of third images to be detected may be determined, and the object to be detected is determined to be a legal living object when this ratio is greater than a preset ratio.
For example, when the first target action sequence corresponds to blinking and mouth opening actions, the action detection SDK may be used to determine whether the object to be detected in each frame of third image to be detected performs these actions, count the third images to be detected in which they are performed, compute the ratio of this count to the total number of third images to be detected, and determine the initial detection result of the object to be detected based on that ratio.
In another embodiment, the preset action sequence comprises a second target action sequence; the actions to be detected comprise a second action to be detected.
For the second step above, determining the initial detection result of the object to be detected based on the action to be detected corresponding to each frame of third image to be detected and the preset action sequence may be performed as follows:
s3: and determining an angle numerical value of a second preset part of the object to be detected corresponding to the preset angle based on a second action to be detected of the object to be detected in each frame of the third image to be detected.
In this step, the second target action sequence may be at least one of the action sequences corresponding to the nodding action and the head shaking action, and includes a plurality of second preset actions; the second preset part may be the face of the object to be detected, and the preset angles may include the pitch angle (pitch), matched with the nodding action, and the yaw angle (yaw), matched with the head shaking action. Specifically, a face pose estimation SDK may be used to identify the object to be detected in the third images to be detected and to determine the angle values of the second preset part at each preset angle while the object to be detected performs the second action to be detected matching the second target action sequence.
S4: and determining the maximum angle numerical value and the minimum angle numerical value corresponding to the object to be detected based on the angle numerical value corresponding to the object to be detected in each frame of the third image to be detected.
Here, for the nodding action corresponding to the second target action sequence, a maximum angle value and a minimum angle value corresponding to the nodding action may be determined based on the determined angle value corresponding to the second preset portion of the object to be detected in each third image to be detected matching with the nodding action.
Similarly, for the head shaking motion corresponding to the second target motion sequence, the maximum angle value and the minimum angle value corresponding to the head shaking motion can be determined based on the determined angle value corresponding to the second preset part of the object to be detected in each third image to be detected matched with the head shaking motion.
S5: and determining an initial detection result of the object to be detected in the second detection video based on the maximum angle value, the minimum angle value and the angle difference value corresponding to the second target action sequence.
Here, the angle difference values corresponding to the second target action sequence may include a first angle difference value corresponding to the nodding action and a second angle difference value corresponding to the head shaking action. For example, the first angle difference value may be the angle difference between the first and the last second preset action corresponding to the nodding action in the second target action sequence, and the second angle difference value may be the angle difference between the first and the last second preset action corresponding to the head shaking action.
Then, for the nodding action corresponding to the second target action sequence, a first difference value between the maximum angle value and the minimum angle value corresponding to the nodding action may be determined. If the first difference value is greater than the first angle difference value, the object to be detected is determined to have performed the second preset action (nodding) of the second target action sequence; otherwise, it is determined not to have performed it.
Similarly, for the head shaking action corresponding to the second target action sequence, a second difference value between the maximum angle value and the minimum angle value corresponding to the head shaking action may be determined. If the second difference value is greater than the second angle difference value, the object to be detected is determined to have performed the second preset action (head shaking) of the second target action sequence; otherwise, it is determined not to have performed it.
Furthermore, when the second target action sequence includes only one of the action sequences corresponding to the nodding action or the head shaking action, the initial detection result is that the object to be detected is a legal living object if it performed the corresponding second preset action, and an illegal living object otherwise.
When the second target action sequence includes the action sequences corresponding to both the nodding action and the head shaking action, the initial detection result is that the object to be detected is a legal living object only when both actions are determined to have been performed, and an illegal living object otherwise.
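A minimal sketch of these angle-difference checks; the two thresholds are illustrative assumptions, and the pitch/yaw values are assumed to come from a face pose estimation SDK:

```python
def performed(angle_values, required_difference):
    """angle_values: pitch (nod) or yaw (head shake) values of the face,
    one per matching third image to be detected."""
    return max(angle_values) - min(angle_values) > required_difference

def second_sequence_result(pitch_values, yaw_values,
                           first_angle_difference=15.0,
                           second_angle_difference=20.0):
    # When the sequence contains both nodding and head shaking, both
    # actions must be performed for a legal living object.
    return (performed(pitch_values, first_angle_difference) and
            performed(yaw_values, second_angle_difference))
```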
In addition, when detecting the object to be detected with the second detection video, one or more first preset actions of the first target action sequence and one or more second preset actions of the second target action sequence may be used together to determine the initial detection result of the object to be detected, which is not limited here.
In one embodiment, before S104 is performed, the method further includes a step of determining a first detection probability for the object to be detected in at least one frame of second image to be detected in the first detection video:
the method comprises the steps of firstly, obtaining at least one frame of second image to be detected from a first detection video, respectively carrying out key point detection on each frame of second image to be detected, and respectively determining key points corresponding to a first preset part of an object to be detected in each frame of second image to be detected.
Here, the first preset part may be a face part of the object to be detected.
In this step, at least one frame of second image to be detected may be selected from the acquired first detection video; specifically, all the images to be detected in the first detection video may be used as the selected second images to be detected.
For each second image to be detected, a key point detection model may be used to perform face recognition on the image and determine the key points corresponding to the face of the object to be detected; in particular, the 106 key points corresponding to the face of the object to be detected may be determined.
Step two, selecting a preset number of target key points from the key points.
In a specific implementation, the key points corresponding to the five sense organs may be selected from the 106 key points as the target key points; alternatively, any 3 of the key points corresponding to the five sense organs may be selected as the target key points.
Alternatively, after the key points of the five sense organs are determined, the midpoint of the line connecting the left-eye and right-eye key points may be taken as one target key point and the midpoint of the line connecting the left and right mouth-corner key points as another; these two midpoints, together with the key point corresponding to the nose tip, are then used as the final target key points.
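A sketch of this target key point selection; the landmark names below are hypothetical stand-ins for the corresponding points of the 106-point model:

```python
import numpy as np

def select_target_keypoints(landmarks):
    """landmarks: named 2D points from a 106-point face landmark model.
    Returns the three target key points: the midpoint between the eyes,
    the midpoint between the mouth corners, and the nose tip."""
    eye_mid = (np.asarray(landmarks["left_eye"]) +
               np.asarray(landmarks["right_eye"])) / 2
    mouth_mid = (np.asarray(landmarks["left_mouth_corner"]) +
                 np.asarray(landmarks["right_mouth_corner"])) / 2
    return np.stack([eye_mid, mouth_mid, np.asarray(landmarks["nose_tip"])])
```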
Step three, determining the first detection probability corresponding to the object to be detected in each frame of second image to be detected, based on each frame of second image to be detected and the target key points corresponding to that frame.
Here, each frame of second image to be detected corresponds to one first detection probability, which represents the probability that the object to be detected in that second image to be detected is a legal living object.
For each frame of second image to be detected, the first detection probability corresponding to the object to be detected in the image may be determined from the image features of the second image to be detected and the image features of the image region corresponding to the target key points.
In an embodiment, the first detection probability corresponding to the object to be detected in the second image to be detected may be determined according to the method shown in fig. 3, a flowchart of a method for determining the first detection probability provided by an embodiment of the present disclosure, which may include the following steps:
s301: determining a preset coordinate corresponding to the first preset part; and determining the actual coordinates corresponding to the first preset part based on the coordinates of the pixel points corresponding to each target key point.
Here, the preset coordinates are the coordinates at which the target key points of the first preset part of the object to be detected are expected to lie in the second image to be detected; at these coordinates, the roll angle (roll) of the first preset part is 0. Specifically, the preset coordinates may be the pixel coordinates of the target key points when they lie in the central region of the second image to be detected, for example the coordinates the key points of the five sense organs would have in a standard identification photo.
From the determined target key points and the second image to be detected, the coordinates of the pixel points of each target key point in the second image to be detected can be determined and used as the actual coordinates corresponding to the first preset part.
S302: and determining a target conversion relation based on the preset coordinates and the actual coordinates.
Here, the target conversion relationship is a conversion relationship for converting the actual coordinates of the key points into the expected coordinates; specifically, it may be a conversion matrix.
In a specific implementation, the conversion matrix between the actual coordinates and the preset coordinates can be determined from the actual coordinates and the preset coordinates of each target key point of the first preset part, that is, the target conversion relationship is obtained.
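One plausible realization of the target conversion relationship is a similarity transform fitted between the actual and the preset coordinates; using OpenCV here is an implementation assumption, as the disclosure only requires some conversion matrix:

```python
import numpy as np
import cv2

def target_conversion(actual_coords, preset_coords):
    """Estimate the 2x3 conversion matrix mapping the actual coordinates of
    the target key points to the preset (roll = 0) coordinates. Both inputs
    are Nx2 arrays of point coordinates."""
    matrix, _ = cv2.estimateAffinePartial2D(np.float32(actual_coords),
                                            np.float32(preset_coords))
    return matrix
```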
S303: and based on the target conversion relationship, performing coordinate conversion on the pixel point corresponding to each key point.
In a specific implementation, the coordinates of the pixel points corresponding to each key point of the first preset part in the second image to be detected can be determined, and coordinate conversion can then be performed on the pixel points corresponding to each key point based on the target conversion relationship to obtain the converted coordinates.
S304: and determining the first detection probability of the object to be detected in the second image to be detected based on the second image to be detected and each key point after coordinate conversion.
In a specific implementation, S304 may be performed according to the following steps:
and step one, performing coordinate conversion on the second image to be detected based on the target conversion relation to obtain a coordinate conversion image.
Here, the coordinates of each pixel point in the second image to be detected may be converted into the expected coordinates using the target conversion relationship, and the image formed by the converted pixel points is taken as the coordinate conversion image obtained from the second image to be detected.
Step two, cropping the target image corresponding to the first preset part of the object to be detected from the coordinate conversion image, based on the coordinate conversion image and each key point after coordinate conversion.
In this step, the position of each converted key point in the coordinate conversion image can be determined from the coordinates of the pixel points in the coordinate conversion image and the converted key point coordinates. The image region corresponding to the first preset part is then determined from these positions, and the coordinate conversion image is cropped according to this region to obtain the target image corresponding to the first preset part of the object to be detected. In a specific implementation, the image region may be expanded to a certain degree while it is being determined, and the expanded region used as the final image region. In this way, part of the image background of the coordinate conversion image is retained when the target image is cropped, and the object to be detected is then identified partly on the basis of this background information, which enriches the identification information and can improve the accuracy of the determined first detection probability.
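A sketch of the coordinate conversion and cropping described above; the expansion factor is an assumed value:

```python
import numpy as np
import cv2

def crop_target_image(second_image, matrix, keypoints, expand=0.2):
    """Apply the target conversion to the whole second image to be detected,
    then crop the region bounded by the converted key points, expanded so
    that some image background is retained."""
    h, w = second_image.shape[:2]
    converted = cv2.warpAffine(second_image, matrix, (w, h))  # coordinate conversion image
    pts = cv2.transform(np.float32(keypoints).reshape(-1, 1, 2), matrix).reshape(-1, 2)
    (x0, y0), (x1, y1) = pts.min(axis=0), pts.max(axis=0)
    dx, dy = (x1 - x0) * expand, (y1 - y0) * expand
    x0, y0 = max(int(x0 - dx), 0), max(int(y0 - dy), 0)
    x1, y1 = min(int(x1 + dx), w), min(int(y1 + dy), h)
    return converted[y0:y1, x0:x1]
```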
Step three, determining the first detection probability of the object to be detected in the second image to be detected based on the target image and the second image to be detected.
Here, the image features of the target image in the basic dimensions may be determined, and the image features of the second image to be detected in the basic dimensions determined in the same way. The image features of the target image and of the second image to be detected may then be fused in the channel dimension to obtain fused image features, which are input to a lightweight classification network (such as a MobileNet-series network); the classification network processes them and outputs the first detection probability of the object to be detected in the second image to be detected.
In an embodiment, for the third step, first channel feature information of the target image and second channel feature information of the second image to be detected may be determined.
The channel feature information includes the image features of the image in the basic dimensions and the feature information of the image in multiple modalities, for example feature information in grayscale mode and feature information in RGB mode; the feature information may include color features, pixel structure features, and the like.
Furthermore, the first channel feature information and the second channel feature information can be spliced in the channel dimension to obtain third channel feature information. Here, the splicing in the channel dimension may splice only the first channel feature information with the second channel feature information, may splice only the target image with the second image to be detected, or may splice the two kinds of information in a mixed manner, which is not limited here.
The third channel feature information is then input to the lightweight classification network, which processes it and outputs the first detection probability corresponding to the object to be detected in the second image to be detected corresponding to the third channel feature information. In addition, the classification network may further output a fourth detection probability that the object to be detected is an illegal living object, where the sum of the first detection probability and the fourth detection probability is 1.
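A sketch of the channel-dimension splicing and lightweight classification, assuming a MobileNetV2 whose first convolution is widened to accept the 6-channel spliced input (target image plus second image to be detected, 3 RGB channels each); the network choice and class order are assumptions:

```python
import torch
import torchvision

net = torchvision.models.mobilenet_v2(num_classes=2)
# Replace the 3-channel stem with a 6-channel one for the spliced input.
net.features[0][0] = torch.nn.Conv2d(6, 32, kernel_size=3, stride=2,
                                     padding=1, bias=False)

def first_detection_probability(target_image, second_image):
    # Both tensors have shape (batch, 3, H, W) with identical H and W.
    spliced = torch.cat([target_image, second_image], dim=1)
    probs = torch.softmax(net(spliced), dim=1)
    return probs[:, 0]  # first detection probability; probs[:, 1] is the fourth
```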
Furthermore, based on the above steps, a first detection probability corresponding to the object to be detected in each frame of the second image to be detected can be determined.
And then, determining the living body detection result of the object to be detected in the first detection video according to the first detection probability and the target image characteristic corresponding to each frame of the second image to be detected.
Here, as described in the above embodiments, the second detection probability corresponding to the multiple first images to be detected can be determined based on the target image features, and the living body detection result of the object to be detected can then be determined based on the second detection probability and the first detection probability corresponding to each second image to be detected. The combined processing of each single-frame second image to be detected and the multi-frame first images to be detected thus yields richer image features and improves the accuracy of the determined living body detection result.
In an embodiment, after the second detection probability and the first detection probabilities are determined, a first preset weight corresponding to the first detection probabilities and a second preset weight corresponding to the second detection probability may be obtained.
The preset weights can be set according to the number of first images to be detected and the number of second images to be detected. For example, if the first detection video contains 25 frames of images to be detected, with 25 second images to be detected and 5 first images to be detected, the first preset weight may be set to 1/25 and the second preset weight to 5/25. The preset weights may also be set according to actual needs and are not limited here.
Then, the target probability may be determined based on the first detection probabilities, the first preset weight, the second detection probability, and the second preset weight. The detection probabilities may be weighted and summed according to the preset weights: each first detection probability is weighted by the first preset weight and the weighted results are summed to obtain a first target probability; the second detection probability is weighted by the second preset weight to obtain a second target probability; and the first target probability and the second target probability are summed to obtain the target probability.
Finally, the target probability is compared with a preset threshold, where the preset threshold is the minimum target probability at which the object to be detected is considered a legal living object. If the target probability is greater than the preset threshold, the living body detection result is that the object to be detected is a living object, that is, a legal living object; otherwise, the final living body detection result is that the object to be detected is an illegal living object.
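A sketch of this weighted fusion using the example weights above; the preset threshold is an assumed value:

```python
def target_probability(first_probs, second_prob, w1=1/25, w2=5/25):
    """Weight each of the first detection probabilities by the first preset
    weight and sum them (first target probability), weight the second
    detection probability by the second preset weight (second target
    probability), then sum the two."""
    return sum(w1 * p for p in first_probs) + w2 * second_prob

def living_body_result(first_probs, second_prob, preset_threshold=0.8):
    return target_probability(first_probs, second_prob) > preset_threshold
```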
Fig. 4 is a flowchart of a specific implementation of determining the living body detection result of an object to be detected according to an embodiment of the present disclosure. After the second detection video corresponding to the object to be detected is acquired, the object is verified against the preset actions of the preset action sequence: action recognition is performed on the object to be detected in each frame of third image to be detected in the second detection video to determine whether it performs an action to be detected consistent with the preset actions. If it does, the recognition screen of the face recognition device is controlled to present different colors at certain time intervals, the first detection video of the object to be detected under the different color light sources is obtained, each frame of second image to be detected in the first detection video is identified to determine the first detection probability of the object to be detected in that frame, multiple first images to be detected are selected from the first detection video, and the second detection probability is determined based on the target image features corresponding to those first images to be detected. The second detection probability and the first detection probabilities are then fused using the preset weights to determine the target probability, and the living body detection result is determined based on the target probability. If the object to be detected is determined not to perform an action to be detected consistent with the preset actions, the living body detection result is determined directly.
In an embodiment, after the living body detection result of the object to be detected in the first detection video is determined, if the result indicates that the object is a living object, the object has passed living body verification, and it can be determined that the object to be detected has a preset authority. The preset authority may include at least one of: entering an access control system, accessing first target content (e.g., a target document or data), editing second target content (e.g., modifying, deleting, or adding to encrypted data or an encrypted document), and performing a transaction operation (e.g., a payment, collection, or payment cancellation operation).
Here, the preset authority may include, but is not limited to, the above various authorities, and when implemented, the preset authority may be set according to a specific application scenario of the living body detection method, and is not specifically limited herein. For example, for a specific scene, different objects may have different permissions, and the different permissions may intersect with each other or be independent of each other (i.e., the permissions do not overlap with each other).
Further, a function matched with the preset authority can be triggered, that is, the object to be detected is allowed to use the function matched with the preset authority. For example, the user is allowed to access the target document.
It will be understood by those skilled in the art that, in the method of the present disclosure, the order in which the steps are written does not imply a strict execution order or any limitation on the implementation; the specific execution order of the steps should be determined by their functions and possible inherent logic.
Based on the same inventive concept, an embodiment of the present disclosure further provides a living body detection device corresponding to the living body detection method. Since the device solves the problem on a principle similar to that of the living body detection method of the embodiments of the present disclosure, its implementation can refer to the implementation of the method, and repeated details are not repeated.
As shown in fig. 5, a schematic diagram of a living body detection apparatus provided in an embodiment of the present disclosure includes:
an obtaining module 501, configured to obtain a first detection video corresponding to an object to be detected;
a first determining module 502, configured to obtain multiple first images to be detected from a first detection video, and determine an initial image feature of each first image to be detected; the initial image features comprise image features of the first image to be detected in a depth dimension;
the combining module 503 is configured to combine the initial image features of each first image to be detected according to the image sequence of the plurality of first images to be detected in the first detection video, so as to obtain target image features;
a second determining module 504, configured to determine a living body detection result of the object to be detected in the first detection video based on the target image feature.
In a possible embodiment, the first detection video comprises images of the object to be detected taken under a plurality of color light sources;
the first determining module 502 is configured to screen at least one frame of image corresponding to each color light source of the multiple color light sources from the first detection video, so as to obtain multiple first images to be detected.
In a possible implementation, the apparatus further includes a third determining module 505:
the third determining module 505 is configured to, before determining a living body detection result of the object to be detected in the first detection video based on the target image feature, obtain at least one frame of second image to be detected from the first detection video, perform key point detection on each frame of the second image to be detected, and determine key points corresponding to a first preset portion of the object to be detected in each frame of the second image to be detected;
selecting a preset number of target key points from the key points;
determining a first detection probability corresponding to the object to be detected in each frame of second image to be detected based on each frame of second image to be detected and the target key point corresponding to each frame of second image to be detected;
the second determining module 504 is configured to determine a living body detection result of the object to be detected in the first detection video based on the first detection probability corresponding to each frame of the second image to be detected and the target image feature.
In a possible implementation manner, the third determining module 505 is configured to determine a preset coordinate corresponding to the first preset portion; determining actual coordinates corresponding to the first preset part based on coordinates of pixel points corresponding to the target key points;
determining a target conversion relation based on the preset coordinates and the actual coordinates;
based on the target conversion relation, performing coordinate conversion on the pixel point corresponding to each key point;
and determining a first detection probability of the object to be detected in the second image to be detected based on the second image to be detected and each key point after coordinate conversion.
In a possible implementation manner, the third determining module 505 is configured to perform coordinate transformation on the second image to be detected based on the target transformation relationship to obtain a coordinate transformation image;
based on the coordinate conversion image and each key point after coordinate conversion, intercepting a target image corresponding to a first preset part of the object to be detected from the coordinate conversion image;
and determining a first detection probability of the object to be detected in the second image to be detected based on the target image and the second image to be detected.
In a possible implementation, the third determining module 505 is configured to determine first channel feature information of the target image and second channel feature information of the second image to be detected;
splicing the first channel characteristic information and the second channel characteristic information to obtain third channel characteristic information;
and determining a first detection probability of the object to be detected in the second image to be detected based on the third channel characteristic information.
In a possible implementation manner, the second determining module 504 is configured to determine, based on the target image feature, a second detection probability corresponding to the object to be detected;
and determining the living body detection result of the object to be detected based on the first detection probability and the second detection probability.
In a possible implementation manner, the second determining module 504 is configured to obtain a first preset weight corresponding to a first detection probability and a second preset weight corresponding to a second detection probability;
determining a target probability based on the first detection probability, the first preset weight, the second detection probability and the second preset weight;
and under the condition that the target probability is greater than a preset threshold value, determining that the living body detection result comprises that the object to be detected is a living body object.
In a possible implementation manner, the obtaining module 501 is configured to obtain a second detection video corresponding to the object to be detected;
determining an initial detection result of the object to be detected based on the second detection video and a preset action sequence;
and acquiring the first detection video corresponding to the object to be detected under the condition that the initial detection result indicates that the object to be detected is a living object.
In a possible implementation manner, the obtaining module 501 is configured to perform motion recognition on an object to be detected in each frame of a third image to be detected in the second detection video, so as to obtain a motion to be detected corresponding to the object to be detected in the third image to be detected;
and determining an initial detection result of the object to be detected based on the action to be detected corresponding to each frame of the third image to be detected and the preset action sequence.
In one possible embodiment, the preset action sequence comprises a first target action sequence; the action to be detected comprises a first action to be detected;
the obtaining module 501 is configured to screen a fourth image to be detected, where the first motion to be detected matches the first target motion sequence, from the third image to be detected based on the first motion to be detected and the first target motion sequence of the object to be detected in each frame of the third image to be detected;
and determining the initial detection result of the object to be detected in the second detection video based on the number of the fourth images to be detected and the number of the third images to be detected.
In a possible embodiment, the preset action sequence comprises a second target action sequence; the action to be detected comprises a second action to be detected;
the obtaining module 501 is configured to determine an angle value of a second preset portion of the object to be detected, which corresponds to a preset angle, based on a second action to be detected of the object to be detected in each frame of the third image to be detected;
determining a maximum angle numerical value and a minimum angle numerical value corresponding to the object to be detected based on the angle numerical value corresponding to the object to be detected in each frame of the third image to be detected;
and determining an initial detection result of the object to be detected in the second detection video based on the maximum angle value, the minimum angle value and the angle difference value corresponding to the second target action sequence.
In a possible embodiment, the apparatus further comprises:
a processing module 506, configured to determine that the object to be detected has a preset right and trigger a function matched with the preset right after determining the live detection result of the object to be detected in the first detection video and when the object to be detected is a live object, where the preset right includes at least one of entering an access control system, accessing a first target content, editing a second target content, and performing a transaction operation.
The description of the processing flow of each module in the device and the interaction flow between the modules may refer to the related description in the above method embodiments, and will not be described in detail here.
An embodiment of the present disclosure further provides a computer device, as shown in fig. 6, which is a schematic structural diagram of a computer device provided in an embodiment of the present disclosure, and includes:
a processor 61 and a memory 62; the memory 62 stores machine-readable instructions executable by the processor 61, and when the instructions are executed, the processor 61 performs the following steps: S101: acquiring a first detection video corresponding to an object to be detected; S102: acquiring a plurality of first images to be detected from the first detection video, and determining the initial image features of each first image to be detected; S103: combining the initial image features of each first image to be detected according to the image order of the plurality of first images to be detected in the first detection video to obtain the target image features; and S104: determining the living body detection result of the object to be detected in the first detection video based on the target image features.
The memory 62 includes a memory 621 and an external memory 622; the memory 621 is also referred to as an internal memory, and temporarily stores operation data in the processor 61 and data exchanged with the external memory 622 such as a hard disk, and the processor 61 exchanges data with the external memory 622 via the memory 621.
For the specific execution process of the above instruction, reference may be made to the steps of the living body detection method described in the embodiments of the present disclosure, and details are not described here.
The disclosed embodiments also provide a computer-readable storage medium having stored thereon a computer program, which, when executed by a processor, performs the steps of the living body detection method described in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The computer program product of the living body detection method provided in the embodiments of the present disclosure includes a computer-readable storage medium storing program code, where the instructions included in the program code may be used to execute the steps of the living body detection method described in the above method embodiments; reference may be made to the above method embodiments for details, which are not repeated here.
The computer program product may be embodied in hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium, and in another alternative embodiment, the computer program product is embodied in a Software product, such as a Software Development Kit (SDK), or the like.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method embodiment, and is not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implementing, and for example, a plurality of units or components may be combined, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above-mentioned embodiments are merely specific embodiments of the present disclosure, which are used for illustrating the technical solutions of the present disclosure and not for limiting the same, and the scope of the present disclosure is not limited thereto, and although the present disclosure is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive of the technical solutions described in the foregoing embodiments or equivalent technical features thereof within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present disclosure, and should be construed as being included therein. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.
Claims (16)
1. A living body detection method, comprising:
acquiring a first detection video corresponding to an object to be detected;
acquiring a plurality of first images to be detected from a first detection video, and determining the initial image characteristics of each first image to be detected; the initial image features comprise image features of the first image to be detected in a depth dimension;
combining the initial image features of each first image to be detected according to the image sequence of the plurality of first images to be detected in the first detection video to obtain target image features;
and determining the living body detection result of the object to be detected in the first detection video based on the target image characteristics.
2. The method according to claim 1, wherein the first detection video comprises images of the object to be detected taken under a plurality of color light sources;
the acquiring a plurality of first images to be detected from a first detection video includes:
and screening at least one frame of image corresponding to each color light source in the multiple color light sources from the first detection video to obtain multiple first images to be detected.
3. The method according to claim 1 or 2, wherein before the determining the live body detection result of the object to be detected in the first detection video based on the target image feature, the method further comprises:
acquiring at least one frame of second image to be detected from the first detection video, respectively detecting key points of each frame of second image to be detected, and respectively determining the key points corresponding to the first preset part of the object to be detected in each frame of second image to be detected;
selecting a preset number of target key points from the key points;
determining a first detection probability corresponding to the object to be detected in each frame of second image to be detected based on each frame of second image to be detected and the target key point corresponding to each frame of second image to be detected;
the determining a living body detection result of the object to be detected in the first detection video based on the target image feature includes:
and determining the living body detection result of the object to be detected in the first detection video based on the first detection probability corresponding to each frame of second image to be detected and the target image characteristics.
4. The method according to claim 3, wherein determining the first detection probability corresponding to the object to be detected in the second image to be detected based on the target key point and the second image to be detected comprises:
determining a preset coordinate corresponding to the first preset part; determining actual coordinates corresponding to the first preset part based on coordinates of pixel points corresponding to the target key points;
determining a target conversion relation based on the preset coordinates and the actual coordinates;
based on the target conversion relation, performing coordinate conversion on the pixel point corresponding to each key point;
and determining a first detection probability of the object to be detected in the second image to be detected based on the second image to be detected and each key point after coordinate conversion.
5. The method according to claim 4, wherein the determining a first detection probability of the object to be detected in the second image to be detected based on the second image to be detected and each of the key points after coordinate transformation comprises:
based on the target conversion relation, performing coordinate conversion on the second image to be detected to obtain a coordinate conversion image;
based on the coordinate conversion image and each key point after coordinate conversion, intercepting a target image corresponding to a first preset part of the object to be detected from the coordinate conversion image;
and determining a first detection probability of the object to be detected in the second image to be detected based on the target image and the second image to be detected.
6. The method according to claim 5, wherein the determining a first detection probability of the object to be detected in the second image to be detected based on the target image and the second image to be detected comprises:
determining first channel characteristic information of the target image and second channel characteristic information of the second image to be detected;
splicing the first channel characteristic information and the second channel characteristic information to obtain third channel characteristic information;
and determining a first detection probability of the object to be detected in the second image to be detected based on the third channel characteristic information.
7. The method according to any one of claims 3 to 5, wherein the determining the living body detection result of the object to be detected in the first detection video based on the first detection probability corresponding to each frame of the second image to be detected and the target image feature comprises:
determining a second detection probability corresponding to the object to be detected based on the target image characteristics;
and determining the living body detection result of the object to be detected based on the first detection probability and the second detection probability.
8. The method according to claim 7, wherein the determining the living body detection result of the object to be detected based on the first detection probability and the second detection probability comprises:
acquiring a first preset weight corresponding to the first detection probability and a second preset weight corresponding to the second detection probability;
determining a target probability based on the first detection probability, the first preset weight, the second detection probability and the second preset weight;
and under the condition that the target probability is greater than a preset threshold value, determining that the living body detection result indicates that the object to be detected is a living object.
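A sketch of the weighted fusion in claim 8; averaging the per-frame first detection probabilities, the equal weights, and the 0.5 threshold are all assumptions for illustration:

```python
def living_body_decision(first_probs, second_prob, w1=0.5, w2=0.5, threshold=0.5):
    """Weighted fusion of the per-frame first detection probabilities and the
    video-level second detection probability; weights, threshold, and the mean
    aggregation of per-frame scores are illustrative assumptions."""
    p1 = sum(first_probs) / len(first_probs)
    target_prob = w1 * p1 + w2 * second_prob
    return target_prob > threshold  # True -> the object is judged a living object
```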
9. The method according to any one of claims 2 to 8, wherein the obtaining of the first detection video corresponding to the object to be detected includes:
acquiring a second detection video corresponding to the object to be detected;
determining an initial detection result of the object to be detected based on the second detection video and a preset action sequence;
and acquiring the first detection video corresponding to the object to be detected under the condition that the initial detection result indicates that the object to be detected is a living object.
10. The method according to claim 9, wherein the determining an initial detection result of the object to be detected based on the second detection video and a preset action sequence comprises:
performing action recognition on the object to be detected in each frame of third image to be detected in the second detection video to obtain the action to be detected corresponding to the object to be detected in that third image to be detected;
and determining an initial detection result of the object to be detected based on the action to be detected corresponding to each frame of the third image to be detected and the preset action sequence.
11. The method according to claim 10, wherein the preset action sequence comprises a first target action sequence, and the action to be detected comprises a first action to be detected;
the determining the initial detection result of the object to be detected based on the action to be detected corresponding to each frame of the third image to be detected and the preset action sequence comprises the following steps:
screening, from the third images to be detected, fourth images to be detected in which the first action to be detected matches the first target action sequence, based on the first action to be detected of the object to be detected in each frame of the third image to be detected and the first target action sequence;
and determining the initial detection result of the object to be detected in the second detection video based on the number of the fourth images to be detected and the number of the third images to be detected.
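The screening step of claim 11 amounts to counting matching frames. A sketch, where matching by membership in the sequence and the pass ratio are assumed details:

```python
def initial_detection_result(frame_actions, first_target_sequence, min_ratio=0.8):
    """Screen the 'fourth' frames whose recognized action matches the first
    target action sequence, then pass when enough of the video matches.
    Membership matching and `min_ratio` are assumptions for illustration."""
    matched = [a for a in frame_actions if a in first_target_sequence]
    return len(matched) / len(frame_actions) >= min_ratio
```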
12. The method according to claim 10 or 11, wherein the preset action sequence comprises a second target action sequence; the action to be detected comprises a second action to be detected;
the determining the initial detection result of the object to be detected based on the action to be detected corresponding to each frame of the third image to be detected and the preset action sequence comprises the following steps:
determining an angle value of a second preset part of the object to be detected relative to a preset angle based on a second action to be detected of the object to be detected in each frame of the third image to be detected;
determining a maximum angle value and a minimum angle value corresponding to the object to be detected based on the angle value corresponding to the object to be detected in each frame of the third image to be detected;
and determining an initial detection result of the object to be detected in the second detection video based on the maximum angle value, the minimum angle value and the angle difference value corresponding to the second target action sequence.
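Claim 12 effectively checks that the second preset part (e.g. the head) swept a sufficient angle range. A sketch, with yaw angles and a 30-degree span as assumed stand-ins for the claimed angle difference:

```python
def angle_sweep_passed(frame_angles, required_span=30.0):
    """Pass when the second preset part swept at least the angle difference
    required by the second target action sequence; 30 degrees is an assumed
    stand-in for that difference."""
    return max(frame_angles) - min(frame_angles) >= required_span
```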
13. The method according to any one of claims 1 to 12, further comprising, after the determining the living body detection result of the object to be detected in the first detection video:
under the condition that the object to be detected is a living object, determining that the object to be detected has a preset authority, and triggering a function matched with the preset authority, wherein the preset authority comprises at least one of: accessing an access control system, accessing first target content, editing second target content, and executing a transaction operation.
14. A living body detection device, comprising:
an acquisition module, configured to acquire a first detection video corresponding to an object to be detected;
a first determining module, configured to acquire a plurality of first images to be detected from the first detection video and determine initial image features of each first image to be detected, wherein the initial image features comprise image features of the first image to be detected in a depth dimension;
a combination module, configured to combine the initial image features of each first image to be detected according to the image sequence of the first images to be detected in the first detection video to obtain target image features;
and a second determining module, configured to determine a living body detection result of the object to be detected in the first detection video based on the target image features.
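The four modules of claim 14 can be sketched as a small pipeline; `backbone` (per-frame feature extractor) and `classifier` (video-level scorer) are assumed models, and stacking along a new axis stands in for the depth-dimension combination:

```python
import numpy as np

class LivingBodyDetector:
    """Illustrative sketch of the four claimed modules, not the patented
    implementation; `backbone` and `classifier` are assumed callables."""
    def __init__(self, backbone, classifier):
        self.backbone = backbone
        self.classifier = classifier

    def detect(self, frames):
        # first determining module: initial image features per frame
        feats = [self.backbone(f) for f in frames]
        # combination module: stack in frame order -> target image features
        target = np.stack(feats, axis=0)
        # second determining module: living body detection result
        return self.classifier(target)
```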
15. A computer device, comprising a processor and a memory storing machine-readable instructions executable by the processor, wherein the processor is configured to execute the machine-readable instructions stored in the memory, and when the machine-readable instructions are executed by the processor, the processor performs the steps of the living body detection method according to any one of claims 1 to 13.
16. A computer-readable storage medium, having a computer program stored thereon, wherein the computer program, when executed by a computer device, performs the steps of the living body detection method according to any one of claims 1 to 13.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111123905.4A CN113792701B (en) | 2021-09-24 | 2021-09-24 | Living body detection method, living body detection device, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113792701A true CN113792701A (en) | 2021-12-14 |
CN113792701B CN113792701B (en) | 2024-08-13 |
Family
ID=78879281
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111123905.4A Active CN113792701B (en) | 2021-09-24 | 2021-09-24 | Living body detection method, living body detection device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113792701B (en) |
Patent Citations (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106960177A (en) * | 2015-02-15 | 2017-07-18 | 北京旷视科技有限公司 | Living body faces verification method and system, living body faces checking device |
CN107886070A (en) * | 2017-11-10 | 2018-04-06 | 北京小米移动软件有限公司 | Verification method, device and the equipment of facial image |
CN108510573A (en) * | 2018-04-03 | 2018-09-07 | 南京大学 | A method of the multiple views human face three-dimensional model based on deep learning is rebuild |
CN108596082A (en) * | 2018-04-20 | 2018-09-28 | 重庆邮电大学 | Human face in-vivo detection method based on image diffusion velocity model and color character |
WO2020159437A1 (en) * | 2019-01-29 | 2020-08-06 | Agency For Science, Technology And Research | Method and system for face liveness detection |
CN110119719A (en) * | 2019-05-15 | 2019-08-13 | 深圳前海微众银行股份有限公司 | Biopsy method, device, equipment and computer readable storage medium |
WO2020248780A1 (en) * | 2019-06-13 | 2020-12-17 | 北京迈格威科技有限公司 | Living body testing method and apparatus, electronic device and readable storage medium |
CN110348322A (en) * | 2019-06-19 | 2019-10-18 | 西华师范大学 | Human face in-vivo detection method and equipment based on multi-feature fusion |
CN112434546A (en) * | 2019-08-26 | 2021-03-02 | 杭州魔点科技有限公司 | Face living body detection method and device, equipment and storage medium |
CN110738116A (en) * | 2019-09-16 | 2020-01-31 | 阿里巴巴集团控股有限公司 | Living body detection method and device and electronic equipment |
CN111046742A (en) * | 2019-11-20 | 2020-04-21 | 腾讯科技(深圳)有限公司 | Eye behavior detection method and device and storage medium |
US20210201527A1 (en) * | 2019-12-30 | 2021-07-01 | Sensetime International Pte. Ltd. | Image processing method and apparatus, electronic device, and storage medium |
CN111325175A (en) * | 2020-03-03 | 2020-06-23 | 北京三快在线科技有限公司 | Living body detection method, living body detection device, electronic apparatus, and storage medium |
CN111339988A (en) * | 2020-03-11 | 2020-06-26 | 福州大学 | Video face recognition method based on dynamic interval loss function and probability characteristic |
CN111507288A (en) * | 2020-04-22 | 2020-08-07 | 上海眼控科技股份有限公司 | Image detection method, image detection device, computer equipment and storage medium |
CN111597938A (en) * | 2020-05-07 | 2020-08-28 | 马上消费金融股份有限公司 | Living body detection and model training method and device |
CN111666917A (en) * | 2020-06-19 | 2020-09-15 | 北京市商汤科技开发有限公司 | Attitude detection and video processing method and device, electronic equipment and storage medium |
CN112084917A (en) * | 2020-08-31 | 2020-12-15 | 腾讯科技(深圳)有限公司 | Living body detection method and device |
CN212847166U (en) * | 2020-09-15 | 2021-03-30 | 中国矿业大学 | Movable traffic guidance lamp based on big data analysis and gesture recognition |
CN112101289A (en) * | 2020-09-25 | 2020-12-18 | 北京市商汤科技开发有限公司 | Service providing method and device, computer equipment and storage medium |
CN112287765A (en) * | 2020-09-30 | 2021-01-29 | 新大陆数字技术股份有限公司 | Face living body detection method, device and equipment and readable storage medium |
CN112329612A (en) * | 2020-11-03 | 2021-02-05 | 北京百度网讯科技有限公司 | Living body detection method and device and electronic equipment |
CN112329730A (en) * | 2020-11-27 | 2021-02-05 | 上海商汤智能科技有限公司 | Video detection method, device, equipment and computer readable storage medium |
CN112434647A (en) * | 2020-12-09 | 2021-03-02 | 浙江光珀智能科技有限公司 | Human face living body detection method |
CN112507986A (en) * | 2021-02-03 | 2021-03-16 | 长沙小钴科技有限公司 | Multi-channel human face in-vivo detection method and device based on neural network |
CN113205057A (en) * | 2021-05-13 | 2021-08-03 | 北京百度网讯科技有限公司 | Face living body detection method, device, equipment and storage medium |
CN113221771A (en) * | 2021-05-18 | 2021-08-06 | 北京百度网讯科技有限公司 | Living body face recognition method, living body face recognition device, living body face recognition equipment, storage medium and program product |
CN113361349A (en) * | 2021-05-25 | 2021-09-07 | 北京百度网讯科技有限公司 | Face living body detection method and device, electronic equipment and storage medium |
CN113409056A (en) * | 2021-06-30 | 2021-09-17 | 深圳市商汤科技有限公司 | Payment method and device, local identification equipment, face payment system and equipment |
Non-Patent Citations (4)
Title |
---|
GUSI TE et al.: "Exploring Hypergraph Representation On Face Anti-Spoofing Beyond 2D Attacks", 2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 9 June 2020 (2020-06-09), pages 1-6 * |
LIFANG WU et al.: "Face liveness detection scheme with static and dynamic features", INTERNATIONAL JOURNAL OF WAVELETS, MULTIRESOLUTION AND INFORMATION PROCESSING, vol. 16, no. 2, 2 February 2018 (2018-02-02), pages 1-16 * |
LI PANPAN et al.: "Multi-Feature Fusion Face Liveness Detection Based on Attention Mechanism" (基于注意力机制的多特征融合人脸活体检测), Information and Control (信息与控制), 18 June 2021 (2021-06-18), pages 1-12 * |
YAN LONG: "Research on Liveness Detection in Face Recognition" (人脸识别中的活体检测研究), China Masters' Theses Full-text Database, Information Science and Technology (中国优秀硕士学位论文全文数据库 信息科技辑), vol. 2021, no. 7, 15 July 2021 (2021-07-15), pages 138-566 * |
Also Published As
Publication number | Publication date |
---|---|
CN113792701B (en) | 2024-08-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Hsu et al. | Camera response functions for image forensics: an automatic algorithm for splicing detection | |
US20190362171A1 (en) | Living body detection method, electronic device and computer readable medium | |
Raghavendra et al. | Exploring the usefulness of light field cameras for biometrics: An empirical study on face and iris recognition | |
CN110210276A (en) | A kind of motion track acquisition methods and its equipment, storage medium, terminal | |
CN112733802B (en) | Image occlusion detection method and device, electronic equipment and storage medium | |
CN110163053B (en) | Method and device for generating negative sample for face recognition and computer equipment | |
CN112818767B (en) | Data set generation and forgery detection methods and devices, electronic equipment and storage medium | |
CN106326832A (en) | Apparatus for and method of processing image based on object region | |
CN105684046B (en) | Generate image composition | |
CN112651333B (en) | Silence living body detection method, silence living body detection device, terminal equipment and storage medium | |
CN113642639B (en) | Living body detection method, living body detection device, living body detection equipment and storage medium | |
Benalcazar et al. | Synthetic ID card image generation for improving presentation attack detection | |
CN112802081B (en) | Depth detection method and device, electronic equipment and storage medium | |
CN111353404A (en) | Face recognition method, device and equipment | |
WO2019017178A1 (en) | Method and apparatus for dynamically identifying a user of an account for posting images | |
CN115082992A (en) | Face living body detection method and device, electronic equipment and readable storage medium | |
CN113869219A (en) | Face living body detection method, device, equipment and storage medium | |
CN116740261B (en) | Image reconstruction method and device and training method and device of image reconstruction model | |
Di Martino et al. | Rethinking shape from shading for spoofing detection | |
CN111259757A (en) | Image-based living body identification method, device and equipment | |
CN113888500A (en) | Dazzling degree detection method, device, equipment and medium based on face image | |
CN116524562A (en) | Living body detection model training and detecting method, electronic equipment and storage medium | |
CN111091089B (en) | Face image processing method and device, electronic equipment and storage medium | |
CN109886084B (en) | Face authentication method based on gyroscope, electronic equipment and storage medium | |
CN113792701B (en) | Living body detection method, living body detection device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||