CN114973426B - Living body detection method, device and equipment - Google Patents

Living body detection method, device and equipment

Info

Publication number
CN114973426B
CN114973426B (application CN202110619560.5A)
Authority
CN
China
Prior art keywords
video
living body
detection
processed
target
Prior art date
Legal status
Active
Application number
CN202110619560.5A
Other languages
Chinese (zh)
Other versions
CN114973426A (en)
Inventor
彭琨
丁小波
蔡茂贞
钟地秀
刘井安
李小青
Current Assignee
China Mobile Communications Group Co Ltd
China Mobile Internet Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Internet Co Ltd
Priority date: 2021-06-03
Filing date: 2021-06-03
Publication date: 2023-08-15
Application filed by China Mobile Communications Group Co Ltd, China Mobile Internet Co Ltd
Priority to CN202110619560.5A
Publication of CN114973426A
Application granted
Publication of CN114973426B
Status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a living body detection method, apparatus, and device. The living body detection method comprises the following steps: acquiring a video to be processed that contains a user-specified action, wherein the specified action is an action performed by the user according to an action prompt; determining a target video according to the contrast and brightness of the video to be processed; and performing static living body detection on the target video through a pre-trained living body recognition model to obtain a detection value output by the model, and determining whether the user in the target video is a living body according to whether the detection value belongs to a target detection value range. The target detection value range is obtained by adjusting a preset initial detection value range according to the contrast and brightness of the target video.

Description

Living body detection method, device and equipment
Technical Field
The present invention relates to the field of detection, and in particular, to a living body detection method, apparatus and device.
Background
In recent years, with the development of face recognition technology, face-scanning applications have become increasingly widespread, such as face-scan payment, face-scan check-in, face-scan unlocking of electronic devices, face-scan access control, and face-scan authentication. As a technology of vital importance within face recognition, living body detection plays an important role in distinguishing genuine images from fakes, resisting spoofing attacks, and protecting the security of the whole face recognition system.
In the related art, during living body detection, a video image is generally collected by a camera, and living body detection is then performed directly on the collected video image.
Because the related art relies too heavily on the video image fed back by the camera, once the camera is attacked, for example when an attacker replaces the captured video with a pre-recorded one, living body detection fails or produces wrong results, causing loss to the user; the security is therefore low. In addition, since living body detection is performed directly on the captured video image, which is easily affected by external factors, the accuracy and robustness of living body detection are also low.
Disclosure of Invention
The embodiments of the invention provide a living body detection method, apparatus, and device to solve the problems of low security, accuracy, and robustness in related living body detection techniques.
In order to solve the technical problems, the invention is realized as follows:
In a first aspect, a living body detection method is provided, the method comprising:
acquiring a video to be processed containing a user-specified action; wherein the specified action is an action made by the user according to the action prompt;
determining a target video according to the contrast and brightness of the video to be processed;
performing static living body detection on the target video through a pre-trained living body recognition model to obtain a detection value output by the living body recognition model, and determining whether a user in the target video is a living body according to whether the detection value belongs to a target detection value range; the target detection value range is obtained by adjusting a preset initial detection value range according to the contrast and brightness of the target video.
In a second aspect, there is provided a living body detection apparatus, the apparatus comprising:
the acquisition module is used for acquiring the video to be processed containing the action appointed by the user; wherein the specified action is an action made by the user according to the action prompt;
the determining module is used for determining a target video according to the contrast and brightness of the video to be processed;
the detection module is used for carrying out static living body detection on the target video through a pre-trained living body recognition model, obtaining a detection value output by the living body recognition model, and determining whether a user in the target video is a living body according to whether the detection value belongs to a target detection value range; the target detection value range is obtained by adjusting a preset initial detection value range according to the contrast and brightness of the target video.
In a third aspect, there is provided a device comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the method described in the first aspect.
In a fourth aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method as described in the first aspect above.
At least one of the above technical solutions provided by the embodiments of the invention can achieve the following technical effects:
During living body detection, the technical solution provided by the embodiments of the invention performs detection based on a video to be processed that contains a user-specified action. This effectively prevents the failure mode of the related art in which an attacker replaces the video image captured by the camera with a pre-recorded one, and thus improves security. In addition, the solution performs living body detection on the video to be processed in combination with its contrast and brightness; because external factors that may influence the detection result are taken into account, the low accuracy and robustness caused by ignoring such factors are effectively remedied.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:
FIG. 1 is a flow chart of a method for in-vivo detection according to an embodiment of the present invention;
fig. 2 is a schematic block diagram of a living body detection apparatus 200 according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a hardware configuration of a living body detection apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to specific embodiments of the present invention and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The following describes in detail the technical solutions provided by the embodiments of the present invention with reference to the accompanying drawings.
Referring to fig. 1, fig. 1 is a flow chart of a living body detection method according to an embodiment of the invention, as shown in fig. 1, the method includes the following steps:
step 102: acquiring a video to be processed containing a user-specified action; wherein the designated action is an action performed by the user according to the action prompt.
Step 104: and determining the target video according to the contrast and brightness of the video to be processed.
Step 106: performing static living body detection on the target video through a pre-trained living body recognition model to obtain a detection value output by the living body recognition model, and determining whether a user in the target video is a living body according to whether the detection value belongs to a target detection value range; the target detection value range is obtained by adjusting a preset initial detection value range according to the contrast and brightness of the target video.
In the embodiment of the invention, the video to be processed containing the user-specified action can be acquired first, wherein the specified action can be the action performed by the user according to the action prompt.
In one embodiment, the action prompt may be played to the user through the terminal, where the prompt may be one or more prompts selected randomly from a plurality of preset action prompts, such as a blink prompt, a head-shaking prompt, or a nodding prompt. After the action prompt is played, the action the user performs according to it can be captured by the terminal's camera, yielding the video to be processed containing the user-specified action.
It should be noted that the embodiments of the invention may be applied to a terminal, or to a device that establishes a communication connection with a terminal. When applied to a terminal, the terminal can play action prompts to the user through a speaker, earpiece, Bluetooth module, or another module capable of outputting sound, and capture the video to be processed containing the user-specified action through its camera. When applied to a device connected to a terminal, the device can play action prompts and capture the pending video by invoking the terminal's speaker, earpiece, camera, and so on.
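As an illustration only, the following Python sketch shows one way this prompt-and-capture flow could be wired together; the prompt list, recording duration, and camera index are assumptions, not values prescribed by the invention:

```python
# Illustrative sketch only: one way the prompt-and-capture flow could look.
# The prompt list, recording duration, and camera index are assumptions,
# not values prescribed by the invention.
import random
import cv2  # OpenCV, assumed available

ACTION_PROMPTS = ["please blink", "please shake your head", "please nod"]  # hypothetical

def record_pending_video(seconds: float = 3.0, camera_index: int = 0) -> list:
    """Pick a random action prompt and capture frames of the user performing it."""
    prompt = random.choice(ACTION_PROMPTS)
    print(f"Action prompt: {prompt}")  # a real terminal would play this as audio/text
    cap = cv2.VideoCapture(camera_index)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if the camera reports 0
    frames = []
    for _ in range(int(seconds * fps)):
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    return frames  # the "video to be processed" as a list of BGR frames
```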
After the video to be processed including the user-specified action is acquired, the target video may be determined according to the contrast and brightness of the video to be processed.
In one embodiment of the invention, a linear fitting relation among exposure degree value, contrast, and brightness may be obtained first. Then, the exposure degree value of the video to be processed is determined according to this linear fitting relation and the contrast and brightness of the video to be processed.
After the exposure degree value of the video to be processed is determined, it is checked whether it belongs to a preset standard degree value range. If it does, the video to be processed is determined as the target video. If it does not, the user is prompted to change the video shooting angle and/or shooting position, a new video to be processed containing the user-specified action is acquired, and the exposure determination is repeated. This cycle continues until the exposure degree value belongs to the preset standard degree value range or the number of repetitions reaches a preset threshold, at which point the current video to be processed is determined as the target video.
In the embodiment of the invention, the exposure degree value of the video to be processed can be determined, and it can be checked whether this value belongs to a preset standard degree value range, where the preset standard degree value range represents that the corresponding video or image is normally exposed. Therefore, when the exposure degree value of the video to be processed does not belong to this range, it can be determined that the video is overexposed or underexposed. The user may then be guided through the terminal to adjust the shooting angle and/or position, reducing the influence of overexposure or underexposure on living body detection.
In one example, when guiding the user to adjust the shooting angle and/or position through the terminal, the user may be prompted by voice through the terminal's speaker, earpiece, Bluetooth module, or another sound-output module. Of course, the user can also be prompted by text on the terminal's screen.
After the prompt, the terminal can re-shoot the video, obtaining the video to be processed containing the user-specified action after the user has changed the shooting angle and/or position. The exposure degree value of this newly shot video can then be computed, and it is again checked whether that value belongs to the preset standard degree value range.
If the exposure degree value of the newly shot video belongs to the preset standard degree value range, that video is determined as the target video. If it does not, i.e., the video is still underexposed or overexposed, the user is prompted once more to change the shooting angle and/or position, a new video is shot, and the check is repeated. This prompt-shoot-check cycle continues until the exposure degree value of the current video belongs to the preset standard degree value range, or until the number of repetitions of the prompting and exposure-determination operations reaches a preset threshold, at which point the current video to be processed is determined as the target video.
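To make the loop above concrete, here is a minimal sketch. It assumes brightness is the mean gray level of the frames and contrast is their standard deviation, and the fit coefficients, standard range, and retry limit shown are illustrative placeholders; none of these values are fixed by the invention:

```python
# A minimal sketch of the target-video selection loop. Assumptions: brightness
# is the mean gray level of the frames, contrast is their standard deviation,
# and the fit coefficients, standard range, and retry limit are illustrative
# placeholders; none of these values are fixed by the invention.
import numpy as np
import cv2

def contrast_brightness(frames):
    grays = np.stack([cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]).astype(np.float64)
    return float(grays.std()), float(grays.mean())  # (contrast, brightness)

def exposure_value(contrast, brightness, a=0.004, b=0.003, c=-0.9):
    return a * contrast + b * brightness + c  # Equation 1: y = a*x1 + b*x2 + c

def acquire_target_video(capture_fn, standard_range=(-0.2, 0.2), max_retries=3):
    """capture_fn() returns a list of frames, e.g. record_pending_video above."""
    video = capture_fn()
    for _ in range(max_retries):
        y = exposure_value(*contrast_brightness(video))
        if standard_range[0] <= y <= standard_range[1]:
            return video  # normally exposed: this is the target video
        print("Video over/under-exposed: please change shooting angle or position")
        video = capture_fn()  # re-shoot after prompting the user
    return video  # retry limit reached: use the current video anyway
```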
It should be noted that, if the exposure degree value of the shot video still does not belong to the preset standard degree value range after the user has been reminded multiple times to adjust the shooting angle and/or position, the preset standard degree value range can be adjusted according to the shooting environment of the video, and the preset initial detection value range used in the subsequent static living body detection can be adapted accordingly.
In one embodiment of the invention, when determining the exposure degree value of the video to be processed, the target skin tone type of the user in the video may be identified first. Then, according to the target skin tone type, the linear fitting relation among exposure degree value, contrast, and brightness corresponding to that skin tone type is selected from the predetermined per-skin-tone fitting relations. Finally, the exposure degree value of the video to be processed is computed from the selected fitting relation and the contrast and brightness of the video.
In one example, when determining the linear fitting relation among exposure degree value, contrast, and brightness for each skin tone type, a dataset of faces with different skin tone types can be obtained first, where the dataset may consist of images or of videos. The dataset is then divided by skin tone type into several sub-datasets; the contrast value range, brightness value range, and exposure degree values of the faces in each sub-dataset are determined; and these are linearly fitted to obtain the linear fitting relation among exposure degree value, contrast, and brightness for each skin tone type.
In the embodiment of the invention, the skin tone types may at least include black, yellow, and white. A dataset containing faces of the different skin tone types may be obtained, such as datasets of black faces, yellow faces, and white faces. These datasets may then be partitioned by skin tone type, and the contrast and brightness of each computed. When obtaining contrast and brightness, the values can be averaged within each skin tone type to derive the contrast and brightness distribution range corresponding to each type.
Each image or video in the dataset is then labeled with an exposure degree value indicating whether it is overexposed, normal, or underexposed. After labeling, the exposure degree value, contrast, and brightness of each image or video can be linearly fitted to obtain the linear fitting relation among exposure degree value, contrast, and brightness corresponding to each skin tone type.
In one example, the linear fitting relation among exposure degree value, contrast, and brightness may be:
y = a·x₁ + b·x₂ + c    (Equation 1)
where y is the exposure degree value, x₁ and x₂ are the contrast and brightness, respectively, and a, b, and c are coefficients of the linear fit.
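By way of illustration, fitting Equation 1 separately per skin tone type can be done with ordinary least squares; the (contrast, brightness, exposure degree value) samples and skin tone labels below are made up for demonstration:

```python
# Illustrative only: fitting Equation 1 separately per skin tone type with
# ordinary least squares. The (contrast, brightness, exposure degree value)
# samples and the skin tone labels below are made up for demonstration.
import numpy as np

def fit_exposure_model(samples):
    """samples: (n, 3) array with columns contrast, brightness, exposure value."""
    samples = np.asarray(samples, dtype=np.float64)
    X = np.column_stack([samples[:, 0], samples[:, 1], np.ones(len(samples))])
    coeffs, *_ = np.linalg.lstsq(X, samples[:, 2], rcond=None)
    return coeffs  # (a, b, c) of Equation 1

sub_datasets = {  # hypothetical per-skin-tone sub-datasets
    "black":  [(55.0, 80.0, -0.3), (60.0, 110.0, 0.0), (70.0, 190.0, 0.6)],
    "yellow": [(50.0, 90.0, -0.2), (58.0, 120.0, 0.0), (66.0, 200.0, 0.7)],
    "white":  [(45.0, 100.0, -0.2), (55.0, 130.0, 0.1), (65.0, 210.0, 0.8)],
}
fits = {tone: fit_exposure_model(data) for tone, data in sub_datasets.items()}
```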
In one embodiment of the invention, after obtaining a target video, static living body detection can be performed on the target video through a pre-trained living body recognition model to obtain a detection value output by the living body recognition model, and whether a user in the target video is a living body is determined according to whether the detection value belongs to a target detection value range; the target detection value range is obtained by adjusting a preset initial detection value range according to the contrast and brightness of the target video.
In one embodiment, the target detection value range may be acquired first. To acquire it, the trained living body recognition model is obtained first.
In one example, the trained living body recognition model may be trained from training samples, where the training samples include at least a training video and a detection value identifying whether the user in the training video is a living body. The detection value may be annotated by relevant personnel according to the content of the training video.
In one example, the living body recognition model may be an SVM (Support Vector Machine) model trained based on face features in the training videos.
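The invention does not fix the feature extractor or training pipeline. As a hedged sketch, such an SVM liveness classifier could be trained roughly as follows, with random placeholder vectors standing in for real face features and scikit-learn's SVC standing in for the living body recognition model:

```python
# Hedged sketch of training an SVM liveness classifier. Random placeholder
# vectors stand in for real face features, and scikit-learn's SVC stands in
# for the living body recognition model; the patent fixes neither.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
face_features = rng.normal(size=(200, 128))  # hypothetical 128-d face features
labels = rng.integers(0, 2, size=200)        # 1 = living body, 0 = attack sample

liveness_model = make_pipeline(StandardScaler(), SVC(probability=True))
liveness_model.fit(face_features, labels)

# The patent's "detection value" can then be read as the live-class score:
detection_value = liveness_model.predict_proba(face_features[:1])[0, 1]
```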
After the trained living body recognition model is obtained, specified videos with different exposure degree values may first be input into the model to obtain the corresponding detection values it outputs (to distinguish these from the detection values later output for the target video, the values obtained for the different exposure degrees are hereinafter called predicted values). All users in the specified videos are living bodies, or none of them are. Then, the exposure degree values of the specified videos and the corresponding predicted values can be linearly fitted to obtain the linear fitting relation between exposure degree value and predicted value.
In one example, the linear fit relationship of the exposure level value and the predicted value may be:
t = m·y + n    (Equation 2)
where t is the predicted value, y is the exposure degree value given by Equation 1, and m and n are coefficients of the linear fit.
After the linear fitting relation between the exposure degree value and the predicted value is obtained, the preset initial detection value range of the living body recognition model can be adjusted according to this relation and the exposure degree value of the target video, so as to obtain the target detection value range.
After the target detection value range is obtained, the target video may be input into the trained living body recognition model, and the detection value output by the living body recognition model is obtained, and then whether the user in the target video is a living body may be determined according to whether the detection value output by the living body recognition model belongs to the target detection value range.
In one example, when the detection value output by the living body recognition model does not belong to the target detection value range, it may be determined that the user in the target video is not a living body; when the detection value output by the living body recognition model belongs to the target detection value range, the user in the target video can be determined to be a living body.
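As an illustrative sketch covering this adjustment and the final decision, the coefficients m and n and the initial range below are assumptions; the patent only states that the shift of the range is derived from the exposure fit of Equation 2:

```python
# Illustrative sketch of the range adjustment and the final decision. The
# coefficients m and n and the initial range are assumptions; the patent only
# states that the shift of the range is derived from Equation 2 (t = m*y + n).
def adjust_detection_range(initial_range, exposure_y, m=0.05, n=0.0):
    shift = m * exposure_y + n  # predicted drift of the detection value
    low, high = initial_range
    return (low + shift, high + shift)

def is_living_body(detection_value, target_range):
    low, high = target_range
    return low <= detection_value <= high

target_range = adjust_detection_range((0.6, 1.0), exposure_y=-0.15)
print(is_living_body(0.62, target_range))  # True: score inside the adjusted range
```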
In the embodiment of the invention, after determining whether the user in the target video is a living body according to whether the detection value belongs to the target detection value range, it can further be verified that the entire living body detection process involved the same user; that is, after it is determined that the user is a real person, it is additionally determined whether that person is the same throughout.
In one embodiment, at least two frames of images containing faces can be obtained from the video to be processed and the target video, face features in the images containing faces of each frame are extracted, and then whether the faces in the images containing faces of the at least two frames are from the same user is determined according to the extracted face features.
In the embodiment of the invention, a face picture can be randomly captured from each video containing a user-specified action in the living body detection flow; face features are then extracted from these pictures and compared across pictures to ensure the captured faces belong to the same person. The face feature comparison can be implemented with a deep model such as FaceNet. This same-person check improves the security of the living body detection system.
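A hedged sketch of this same-person check is shown below; it assumes a FaceNet-style embedding function embed() is available, and the cosine-distance threshold of 0.4 is a common heuristic rather than a value from the patent:

```python
# Hedged sketch of the same-person check. embed() is assumed to be any
# FaceNet-style embedding function; the 0.4 cosine-distance threshold is a
# common heuristic, not a value from the patent.
import numpy as np

def cosine_distance(u, v):
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def same_user(face_images, embed, threshold=0.4):
    """Compare every captured face picture against the first one."""
    embeddings = [embed(img) for img in face_images]
    anchor = embeddings[0]
    return all(cosine_distance(anchor, e) <= threshold for e in embeddings[1:])
```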
In one embodiment of the invention, before the static living body detection is performed on the target video, dynamic living body detection may also be performed on it, where the dynamic living body detection may include at least one of the following: face key point detection, continuous frame action detection, head-shaking detection, mouth-opening detection, nodding detection, blink detection, and face position detection.
In the embodiment of the invention, dynamic living body detection requires the user to complete specific actions according to the action prompt, such as shaking the head or opening the mouth, and then judges whether the user is a real person according to the degree of completion of the actions.
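By way of example only, blink detection, one of the dynamic checks listed above, is often implemented with the eye aspect ratio (EAR) over six eye landmarks; this particular technique is an illustrative choice here, not the method prescribed by the invention:

```python
# Example only: blink detection via the eye aspect ratio (EAR) over six eye
# landmarks, a common technique chosen for illustration; the patent does not
# prescribe this method. Landmarks are assumed to come from any face-landmark model.
import numpy as np

def eye_aspect_ratio(eye):
    """eye: (6, 2) array of landmarks ordered around the eye contour."""
    v1 = np.linalg.norm(eye[1] - eye[5])  # vertical distances
    v2 = np.linalg.norm(eye[2] - eye[4])
    h = np.linalg.norm(eye[0] - eye[3])   # horizontal distance
    return (v1 + v2) / (2.0 * h)          # drops sharply when the eye closes

def blink_detected(ear_per_frame, closed_thresh=0.2, min_closed_frames=2):
    run = best = 0
    for ear in ear_per_frame:
        run = run + 1 if ear < closed_thresh else 0
        best = max(best, run)
    return best >= min_closed_frames  # a sustained dip counts as one blink
```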
During living body detection, the technical solution provided by the embodiments of the invention performs detection based on a video to be processed that contains a user-specified action. This effectively prevents the failure mode of the related art in which an attacker replaces the video image captured by the camera with a pre-recorded one, and thus improves security. In addition, the solution performs living body detection on the video to be processed in combination with its contrast and brightness; because external factors that may influence the detection result are taken into account, the low accuracy and robustness caused by ignoring such factors are effectively remedied.
Corresponding to the above living body detection method, an embodiment of the present invention further provides a living body detection apparatus. Fig. 2 is a schematic block diagram of a living body detection apparatus 200 according to an embodiment of the present invention; as shown in fig. 2, the living body detection apparatus 200 includes:
a first obtaining module 201, configured to obtain a video to be processed including a user-specified action; wherein the specified action is an action made by the user according to the action prompt;
a first determining module 202, configured to determine a target video according to the contrast and brightness of the video to be processed;
the detection module 203 is configured to perform static living body detection on the target video through a pre-trained living body recognition model, obtain a detection value output by the living body recognition model, and determine whether a user in the target video is a living body according to whether the detection value belongs to a target detection value range; the target detection value range is obtained by adjusting a preset initial detection value range according to the contrast and brightness of the target video.
Optionally, the first determining module 202 is configured to:
determining the exposure degree value of the video to be processed according to a linear fitting relation among a predetermined exposure degree value, contrast and brightness, and the contrast and brightness of the video to be processed;
Determining whether the exposure degree value of the video to be processed belongs to a preset standard degree value range;
if the exposure degree value of the video to be processed belongs to a preset standard degree value range, determining the video to be processed as a target video;
if the exposure degree value of the video to be processed does not belong to the preset standard degree value range, prompt the user to change the video shooting angle and/or shooting position, obtain the video to be processed containing the user-specified action after the change, return to the step of determining the exposure degree value of the video to be processed according to the predetermined linear fitting relation among exposure degree value, contrast and brightness and the contrast and brightness of the video, and repeat until the exposure degree value belongs to the preset standard degree value range or the number of repetitions reaches the preset threshold, at which point the current video to be processed is determined as the target video.
Optionally, the first determining module 202 is further configured to:
identifying the target skin color type of the user in the video to be processed;
according to the target skin tone type, determining a linear fitting relation among the exposure degree value, the contrast and the brightness corresponding to the target skin tone type from a linear fitting relation among the exposure degree value, the contrast and the brightness corresponding to each predetermined skin tone type;
And determining the exposure degree value of the video to be processed according to the linear fitting relation among the exposure degree value, the contrast and the brightness corresponding to the target skin tone type and the contrast and the brightness of the video to be processed.
Optionally, the apparatus 200 further comprises (not shown in fig. 2):
a second obtaining module 204, configured to obtain a dataset including faces of different skin tone types before determining, according to the target skin tone type, a linear fitting relationship among the exposure level value, the contrast and the brightness corresponding to the target skin tone type from a predetermined linear fitting relationship among the exposure level value, the contrast and the brightness corresponding to each skin tone type;
the dividing module 205 is configured to divide the data set according to skin color types to obtain multiple sub-data sets;
a second determining module 206, configured to determine a contrast value range, a brightness value range, and an exposure degree value of the face in each sub-dataset;
the third determining module 207 is configured to perform linear fitting on the contrast value range, the brightness value range, and the exposure level value of the face in each sub-data set, so as to obtain a linear fitting relationship among the exposure level value, the contrast, and the brightness corresponding to each skin color type.
Optionally, the apparatus 200 further comprises (not shown in fig. 2):
a third obtaining module 208, configured to input specified videos with different exposure degree values into the living body recognition model before determining whether the user in the target video is a living body according to whether the detection value belongs to a target detection value range, and obtain the predicted values corresponding to the different exposure degree values output by the living body recognition model; wherein all users in the specified videos are living bodies, or none of them are;
a fourth obtaining module 209, configured to perform linear fitting on the exposure level value of the specified video and a predicted value corresponding to the exposure level value of the specified video, to obtain a linear fitting relationship between the exposure level value and the predicted value;
a fifth obtaining module 210, configured to adjust a preset initial detection value range of the living body recognition model according to the linear fitting relation between the exposure degree value and the predicted value and the exposure degree value of the target video, so as to obtain the target detection value range; wherein the living body recognition model is trained according to training samples, and the training samples at least comprise training videos and detection values identifying whether the users in the training videos are living bodies.
Optionally, the apparatus 200 further comprises (not shown in fig. 2):
A sixth obtaining module 211, configured to obtain at least two frames of images including a face from the video to be processed and the target video after determining whether the user in the target video is a living body according to whether the detection value belongs to a target detection value range;
an extracting module 212, configured to extract a face feature in an image in which each frame includes a face;
a fourth determining module 213, configured to determine whether the faces in the at least two frames of images containing faces are from the same user according to the extracted face features.
Optionally, the detection module 203 is configured to:
performing dynamic living body detection on the target video; wherein the dynamic in-vivo detection comprises at least one of: face key point detection, continuous frame motion detection, shaking detection, mouth opening detection, nodding detection, blink detection and face position detection;
after the target video passes through the dynamic living body detection, the static living body detection is carried out on the target video through a pre-trained living body recognition model, and a detection value output by the living body recognition model is obtained.
During living body detection, the technical solution provided by the embodiments of the invention performs detection based on a video to be processed that contains a user-specified action. This effectively prevents the failure mode of the related art in which an attacker replaces the video image captured by the camera with a pre-recorded one, and thus improves security. In addition, the solution performs living body detection on the video to be processed in combination with its contrast and brightness; because external factors that may influence the detection result are taken into account, the low accuracy and robustness caused by ignoring such factors are effectively remedied.
Corresponding to the above living body detection method, the embodiment of the present invention further provides a living body detection device, and fig. 3 is a schematic hardware structure of the living body detection device according to one embodiment of the present invention.
The living body detection device may be the terminal device, server, or similar equipment for detecting a living body provided in the above embodiments.
The living body detection device may vary considerably depending on its configuration or performance. It may include one or more processors 301 and a memory 302, and one or more applications or data may be stored in the memory 302. The memory 302 may be transient or persistent storage. The application stored in the memory 302 may include one or more modules (not shown in the figures), each of which may include a series of computer-executable instructions for the living body detection device. Further, the processor 301 may be configured to communicate with the memory 302 and execute, on the device, the series of computer-executable instructions in the memory 302. The living body detection device may also include one or more power supplies 303, one or more wired or wireless network interfaces 304, one or more input/output interfaces 305, and one or more keyboards 306.
In particular, in this embodiment, the living body detection device includes a memory and one or more programs, where the one or more programs are stored in the memory and may include one or more modules, each module may include a series of computer-executable instructions for the living body detection device, and the one or more programs are configured to be executed by the one or more processors.
During living body detection, the technical solution provided by the embodiments of the invention performs detection based on a video to be processed that contains a user-specified action. This effectively prevents the failure mode of the related art in which an attacker replaces the video image captured by the camera with a pre-recorded one, and thus improves security. In addition, the solution performs living body detection on the video to be processed in combination with its contrast and brightness; because external factors that may influence the detection result are taken into account, the low accuracy and robustness caused by ignoring such factors are effectively remedied.
In the 1990s, an improvement to a technology could clearly be distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain a corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be realized by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD) (e.g., a field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a PLD without asking the chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually manufacturing integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compilers used in program development; the source code to be compiled must likewise be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing a logic method flow can easily be obtained by merely slightly programming the method flow into an integrated circuit using one of the above hardware description languages.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of such microcontrollers include, but are not limited to: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320; a memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, in addition to implementing the controller purely in computer readable program code, it is entirely possible to logically program the method steps so that the controller achieves the same functionality in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included within it for performing the various functions may also be regarded as structures within the hardware component. Or, the means for performing the various functions may even be regarded as both software modules implementing the method and structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in the same piece or pieces of software and/or hardware when implementing the present invention.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, each embodiment is described in a progressive manner; identical and similar parts of the embodiments may be referred to one another, and each embodiment mainly describes its differences from the others. In particular, the system embodiments are described relatively simply because they are substantially similar to the method embodiments; for relevant parts, refer to the description of the method embodiments.
The foregoing is merely exemplary of the present invention and is not intended to limit the present invention. Various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the invention are to be included in the scope of the claims of the present invention.

Claims (10)

1. A living body detection method, the method comprising:
acquiring a video to be processed containing a user-specified action; wherein the specified action is an action made by the user according to the action prompt;
determining a target video according to the contrast and brightness of the video to be processed;
performing static living body detection on the target video through a pre-trained living body recognition model to obtain a detection value output by the living body recognition model, and determining whether a user in the target video is a living body according to whether the detection value belongs to a target detection value range; the target detection value range is obtained by adjusting a preset initial detection value range according to the contrast and brightness of the target video.
2. The method of claim 1, wherein the determining the target video based on the contrast and brightness of the video to be processed comprises:
determining the exposure degree value of the video to be processed according to a linear fitting relation among a predetermined exposure degree value, contrast and brightness, and the contrast and brightness of the video to be processed;
determining whether the exposure degree value of the video to be processed belongs to a preset standard degree value range;
if the exposure degree value of the video to be processed belongs to a preset standard degree value range, determining the video to be processed as a target video;
if the exposure degree value of the video to be processed does not belong to the preset standard degree value range, prompting a user to change the video shooting angle and/or shooting position, obtaining the video to be processed containing the user-specified action after changing the video shooting angle and/or shooting position, returning to the linear fitting relation among the preset exposure degree value, contrast and brightness, and the contrast and brightness of the video to be processed, determining the exposure degree value of the video to be processed, and repeating the steps until the exposure degree value of the video to be processed belongs to the preset standard degree value range or the number of repeated steps reaches the preset number of times threshold, and determining the current video to be processed as the target video.
3. The method of claim 2, wherein determining the exposure level value of the video to be processed based on a linear fit relationship between the predetermined exposure level value, the contrast, and the brightness, and the contrast and the brightness of the video to be processed, comprises:
identifying the target skin color type of the user in the video to be processed;
according to the target skin tone type, determining a linear fitting relation among the exposure degree value, the contrast and the brightness corresponding to the target skin tone type from a linear fitting relation among the exposure degree value, the contrast and the brightness corresponding to each predetermined skin tone type;
and determining the exposure degree value of the video to be processed according to the linear fitting relation among the exposure degree value, the contrast and the brightness corresponding to the target skin tone type and the contrast and the brightness of the video to be processed.
4. The method of claim 3, wherein before said determining, according to the target skin tone type, a linear fitting relation among exposure degree value, contrast and brightness corresponding to the target skin tone type from a predetermined linear fitting relation among exposure degree value, contrast and brightness corresponding to each skin tone type, the method further comprises:
Acquiring a data set comprising faces of different skin color types;
dividing the data set according to skin color types to obtain a plurality of sub data sets;
determining a contrast value range, a brightness value range and an exposure degree value of a human face in each sub-data set;
and performing linear fitting on the contrast value range, the brightness value range and the exposure degree value of the human face in each sub-data set to obtain a linear fitting relation among the exposure degree value, the contrast and the brightness corresponding to each skin color type.
5. The method according to claim 2, wherein before said determining whether the user in the target video is a living body based on whether the detection value belongs to a target detection value range, the method further comprises:
inputting specified videos with different exposure degree values into the living body identification model, and obtaining predicted values corresponding to the different exposure degree values output by the living body identification model; wherein, the users in the appointed video are living bodies or are not living bodies;
performing linear fitting on the exposure degree value of the specified video and a predicted value corresponding to the exposure degree value of the specified video to obtain a linear fitting relation of the exposure degree value and the predicted value;
According to the linear fitting relation between the exposure degree value and the predicted value and the exposure degree value of the target video, adjusting a preset initial detection value range of the living body identification model to obtain a target detection value range; the living body recognition model is trained according to training samples, and the training samples at least comprise training videos and detection values for identifying whether users in the training videos are living bodies.
6. The method according to claim 1, wherein after determining whether the user in the target video is a living body according to whether the detection value belongs to the target detection value range, the method further comprises:
acquiring at least two frame images containing a human face from the video to be processed and the target video;
extracting the face features from each frame image containing a human face;
and determining, according to the extracted face features, whether the faces in the at least two frame images come from the same user.
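The cross-frame identity check of claim 6 is commonly implemented with embedding similarity. A sketch assuming precomputed face feature vectors and an illustrative cosine-similarity threshold:

```python
import numpy as np

def same_user(face_features: list, threshold: float = 0.8) -> bool:
    """True if every frame's face feature matches the first frame's closely enough."""
    def cosine(u, v):
        return float(u @ v) / (float(np.linalg.norm(u)) * float(np.linalg.norm(v)))
    reference = face_features[0]
    return all(cosine(reference, f) >= threshold for f in face_features[1:])
```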
7. The method according to any one of claims 1 to 6, wherein performing static living body detection on the target video through a pre-trained living body recognition model to obtain the detection value output by the living body recognition model comprises:
performing dynamic living body detection on the target video; wherein the dynamic living body detection comprises at least one of: face key point detection, continuous-frame motion detection, head-shake detection, mouth-opening detection, nod detection, blink detection and face position detection;
and after the target video passes the dynamic living body detection, performing static living body detection on the target video through the pre-trained living body recognition model to obtain the detection value output by the living body recognition model.
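A sketch of the dynamic-then-static gating in claim 7; all callables are assumed interfaces:

```python
def detect_liveness(video, dynamic_checks, static_model, target_range) -> bool:
    """Run the dynamic checks first; only a passing video reaches the static model."""
    if not all(check(video) for check in dynamic_checks):  # blink, nod, head shake, ...
        return False                       # failed dynamic living body detection
    detection_value = static_model(video)  # static living body detection
    low, high = target_range               # adjusted target detection value range
    return low <= detection_value <= high
```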
8. A living body detection device, characterized in that the device comprises:
the first acquisition module is used for acquiring the video to be processed containing the user-specified action; wherein the specified action is an action made by the user according to the action prompt;
the first determining module is used for determining a target video according to the contrast and brightness of the video to be processed;
the detection module is used for carrying out static living body detection on the target video through a pre-trained living body recognition model, obtaining a detection value output by the living body recognition model, and determining whether a user in the target video is a living body according to whether the detection value belongs to a target detection value range; the target detection value range is obtained by adjusting a preset initial detection value range according to the contrast and brightness of the target video.
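A minimal object layout mirroring the module structure of claim 8; the class and attribute names are illustrative, not from the patent:

```python
class LivingBodyDetectionDevice:
    """Wires the three claimed modules together (all callables are assumed)."""
    def __init__(self, first_acquisition, first_determining, detection):
        self.first_acquisition = first_acquisition  # captures the video to be processed
        self.first_determining = first_determining  # selects the target video
        self.detection = detection                  # static detection + range check

    def run(self) -> bool:
        video = self.first_acquisition()
        target = self.first_determining(video)
        return self.detection(target)
```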
9. Living body detection equipment, characterized in that it comprises: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
CN202110619560.5A 2021-06-03 2021-06-03 Living body detection method, device and equipment Active CN114973426B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110619560.5A CN114973426B (en) 2021-06-03 2021-06-03 Living body detection method, device and equipment


Publications (2)

Publication Number Publication Date
CN114973426A CN114973426A (en) 2022-08-30
CN114973426B (en) 2023-08-15

Family

ID=82974216

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110619560.5A Active CN114973426B (en) 2021-06-03 2021-06-03 Living body detection method, device and equipment

Country Status (1)

Country Link
CN (1) CN114973426B (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5760090B2 (en) * 2011-09-28 2015-08-05 本田技研工業株式会社 Biological recognition device
EP3270354B1 (en) * 2015-03-13 2022-10-26 Nec Corporation Living body detection device, living body detection method, and recording medium

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105718863A (en) * 2016-01-15 2016-06-29 北京海鑫科金高科技股份有限公司 Living-person face detection method, device and system
CN105912986A (en) * 2016-04-01 2016-08-31 北京旷视科技有限公司 In vivo detection method, in vivo detection system and computer program product
CN107423699A (en) * 2017-07-14 2017-12-01 广东欧珀移动通信有限公司 Biopsy method and Related product
CN111368601A (en) * 2018-12-26 2020-07-03 北京市商汤科技开发有限公司 Living body detection method and apparatus, electronic device, and computer-readable storage medium
CN110119719A (en) * 2019-05-15 2019-08-13 深圳前海微众银行股份有限公司 Biopsy method, device, equipment and computer readable storage medium
CN110516644A (en) * 2019-08-30 2019-11-29 深圳前海微众银行股份有限公司 A kind of biopsy method and device
CN110991249A (en) * 2019-11-04 2020-04-10 支付宝(杭州)信息技术有限公司 Face detection method, face detection device, electronic equipment and medium
CN111079688A (en) * 2019-12-27 2020-04-28 中国电子科技集团公司第十五研究所 Living body detection method based on infrared image in face recognition
CN111160235A (en) * 2019-12-27 2020-05-15 联想(北京)有限公司 Living body detection method and device and electronic equipment
CN111460970A (en) * 2020-03-27 2020-07-28 深圳市商汤科技有限公司 Living body detection method and device and face recognition equipment
CN111914626A (en) * 2020-06-18 2020-11-10 北京迈格威科技有限公司 Living body identification/threshold value adjustment method, living body identification/threshold value adjustment device, electronic device, and storage medium
CN111738161A (en) * 2020-06-23 2020-10-02 支付宝实验室(新加坡)有限公司 Living body detection method and device and electronic equipment
CN112836625A (en) * 2021-01-29 2021-05-25 汉王科技股份有限公司 Face living body detection method and device and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Face anti-spoofing algorithm combined with CNN and brightness equalization; CAI Pei et al.; Journal of Central South University (English Edition), Issue 1; full text *

Also Published As

Publication number Publication date
CN114973426A (en) 2022-08-30

Similar Documents

Publication Publication Date Title
CN108121952B (en) Face key point positioning method, device, equipment and storage medium
CN111553333B (en) Face image recognition model training method, recognition method, device and electronic equipment
WO2016003299A1 (en) Replay attack detection in automatic speaker verification systems
WO2015131713A1 (en) Image processing and access method and apparatus
WO2013138632A1 (en) System and method for dynamic adaption of media based on implicit user input and behavior
EP2998960A1 (en) Method and device for video browsing
CN111292734B (en) Voice interaction method and device
CN110072047B (en) Image deformation control method and device and hardware device
CN112417414A (en) Privacy protection method, device and equipment based on attribute desensitization
CN105430269B (en) A kind of photographic method and device applied to mobile terminal
WO2020018338A1 (en) Image acquisition method, apparatus, system, device and medium
CN111340014A (en) Living body detection method, living body detection device, living body detection apparatus, and storage medium
CN107977636B (en) Face detection method and device, terminal and storage medium
CN111738161B (en) Living body detection method and device and electronic equipment
CN111091112A (en) Living body detection method and device
CN111160251A (en) Living body identification method and device
CN113313026B (en) Face recognition interaction method, device and equipment based on privacy protection
CN114973426B (en) Living body detection method, device and equipment
CN117173002A (en) Model training, image generation and information extraction methods and devices and electronic equipment
CN112911373B (en) Video subtitle generating method, device, equipment and storage medium
CN114549823A (en) Image acquisition reminding processing method and device
EP3073747A1 (en) Method and device for adapting an audio level of a video
CN117523323B (en) Detection method and device for generated image
CN116844553B (en) Data processing method, device and equipment
US20170068848A1 (en) Display control apparatus, display control method, and computer program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant