CN115457665A - Living body detection method, living body detection apparatus, living body detection device, storage medium, and program product - Google Patents
- Publication number
- CN115457665A (application number CN202211164572.4A)
- Authority
- CN
- China
- Prior art keywords
- information
- living body
- body detection
- voice
- face
- Prior art date
- Legal status (an assumption, not a legal conclusion): Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/40—Spoof detection, e.g. liveness detection
- G06V40/45—Detection of the body part being alive
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The application provides a living body detection method, apparatus, device, storage medium, and program product, relating to the technical field of artificial intelligence recognition and classification. Text information is acquired when a face is detected in a designated area of the screen; a digit sequence is displayed in turn at different point positions on the screen within a specified time range, and first prompt information is generated to remind the user to read the digit sequence aloud; the user's operation behavior within the specified time range is captured to obtain a video to be detected; the voice information is recognized to obtain valid speech and valid time points, and target image frames matching the valid time points are extracted from the image information; living body detection is then performed on the consistency of the valid speech and the target image frames according to the text information. While the user is guided to perform the corresponding operations according to the instructions, a video to be detected containing the user's operation behavior is obtained, and living body detection is performed on the consistency of the voice information and the image information in that video. Because a forged video cannot follow prompts generated in real time, it cannot make the voice and image information consistent, so false identity authentication through counterfeit video is effectively avoided.
Description
Technical Field
The present application relates to artificial intelligence recognition and classification, and more particularly, to a method, an apparatus, a device, a storage medium, and a program product for detecting a living body.
Background
Face recognition has been widely applied in various authentication scenarios, and living body recognition is of central importance in these scenarios. At present, face recognition on terminals mainly relies on action-based liveness for anti-counterfeiting detection, such as nodding, shaking the head, or opening the mouth according to instructions; living body detection based on background color change is also available.

However, such living body detection can easily be bypassed by current fake-video techniques based on deepfake technology: an attacker can generate mouth-opening, nodding, and head-shaking videos from a single static image, intercept the camera call by technical means, and substitute a forged video stream to successfully bypass existing detection based on silent liveness and action liveness, which poses great challenges to numerous business transactions.
Disclosure of Invention
The application provides a living body detection method, a living body detection device, living body detection equipment, a storage medium and a program product, which are used for solving the technical problem of inaccurate living body detection in the prior art.
In a first aspect, the present application provides a living body detection method, comprising: acquiring text information when a face is detected in a designated area of a screen, wherein the text information comprises a number sequence;
displaying the digital sequence at different point positions of the screen in sequence within a specified time range, and generating first prompt information for prompting a user to read the digital sequence;
shooting the user operation behavior in the appointed time range to obtain a video to be detected, wherein the video to be detected comprises voice information and image information;
recognizing the voice information to obtain effective voice and effective time points, and extracting a target image frame matched with the effective time points from the image information;
and performing living body detection on the consistency of the effective voice and the target image frame according to the text information.
In a second aspect, the present application provides a living body detection apparatus comprising: the system comprises a text information acquisition module, a display module and a display module, wherein the text information acquisition module is used for acquiring text information when a human face is detected in a designated area of a screen, and the text information comprises a digital sequence;
the digital sequence display module is used for sequentially displaying the digital sequence at different point positions of the screen within a specified time range and generating first prompt information for reminding a user to read the digital sequence;
the to-be-detected video acquisition module is used for shooting the user operation behavior in the appointed time range to acquire a to-be-detected video, wherein the to-be-detected video comprises voice information and image information;
the effective voice and target image frame acquisition module is used for identifying the voice information to acquire effective voice and effective time points and extracting a target image frame matched with the effective time points from the image information;
and the living body detection module is used for carrying out living body detection on the consistency of the effective voice and the target image frame according to the text information.
In a third aspect, the present application provides an electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored by the memory to implement the methods described herein.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon computer-executable instructions, which when executed by a processor, are configured to implement the method described herein.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, performs the method described herein.
According to the method, apparatus, device, storage medium, and program product provided by the application, while the user is guided to perform the corresponding operations according to the instructions, a video to be detected containing the user's operation behavior is obtained, and living body detection is performed on the consistency of the voice information and the image information in that video. That is, living body detection succeeds only when the voice information is consistent with the corresponding user operation behavior in the image information. Because a forged video cannot respond to prompts generated in real time, it cannot achieve this consistency; therefore, when the voice information and the image information are inconsistent, it can be determined that a forged video may be in use and living body detection fails, which effectively prevents false identity authentication through counterfeit video.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
FIG. 1 is a first flowchart of a method for in-vivo detection provided in an embodiment of the present application;
FIG. 2 is a second flowchart of a method for detecting a living body according to an embodiment of the present application;
FIG. 3 is a third flowchart of a living body detection method according to an embodiment of the present application;
FIG. 4 is a fourth flowchart of a living body detection method according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a living body detection apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The following describes the technical solutions of the present application, and how they solve the above technical problems, with specific embodiments. The following specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings. In the technical solution of the present application, the acquisition, storage, use, and processing of data all comply with the relevant provisions of national laws and regulations.
Fig. 1 is a flowchart of a living body detection method according to an embodiment of the present application. As shown in fig. 1, the living body detecting method includes the steps of:
in step S101, text information is acquired when a face is detected in a screen-specified area.
Optionally, before acquiring the text information when the face is detected in the designated area of the screen, the method further includes: receiving a detection start instruction input by a user; and displaying a face detection frame on the screen according to the detection start instruction, taking the face detection frame as the designated area, and generating second prompt information for guiding the user to look straight at the camera and place the face region in the face detection frame.
Specifically, in this embodiment, when a user needs to use a specific application through a terminal and perform identity authentication, the user triggers a detection switch for that application; the terminal receives the detection start instruction input by the user, displays a face detection frame on the screen according to the detection start instruction, and uses the face detection frame as the designated area. The face detection frame may, for example, be a circular area centered on the center of the screen with a radius of 1/4 of the screen width.
It should be noted that, while the face detection frame is displayed, second prompt information is also generated to guide the user to look at the camera and place the face region in the face detection frame, for example, "please place the face region in the face detection frame"; the second prompt information may be played in voice form or displayed on the screen in text form.
Optionally, acquiring text information when a face is detected in the designated area of the screen includes: when a face is detected in a face detection frame of a screen, sending a text information calling instruction to a database; and receiving text information fed back by the database according to the calling instruction.
Specifically, in the process of displaying a face detection frame on a screen and guiding a user to look at the camera through second prompt information, shooting is carried out through the camera to obtain a camera video stream in real time, the obtained video stream is detected one by one image frame, when a face effective part, such as a complete face part, is detected in the face detection frame, a text information calling instruction is sent to a database, and text information fed back by the database is obtained, wherein the text information comprises a number sequence, such as 4 numbers in the text information, and a specific display position of each number on the screen.
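As an illustration of what the text information fed back by the database might contain, here is a minimal sketch that pairs a random digit sequence with a distinct display point position for each digit. The function name, the four corner labels, and the four-digit default are assumptions for illustration, not specifics from the application.

```python
import random

# Hypothetical labels for the screen point positions mentioned above.
POINT_POSITIONS = ["top_left", "top_right", "bottom_left", "bottom_right"]

def generate_text_information(num_digits=4):
    """Build text information: a random digit sequence plus a
    distinct display point position on the screen for each digit."""
    digits = [random.randint(0, 9) for _ in range(num_digits)]
    positions = random.sample(POINT_POSITIONS, k=num_digits)
    return {"digits": digits, "positions": positions}

info = generate_text_information()
```

In a real deployment the sequence would be generated server-side and fetched via the calling instruction described above, so that an attacker cannot predict it in advance.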
And S102, displaying the digital sequence at different point positions of a screen in sequence within a specified time range, and generating first prompt information for reminding a user to read the digital sequence.
Specifically, the terminal displays the number sequence in the acquired text information at different point locations of the screen in sequence within a specified time range, for example, the point locations specifically include an upper left corner, an upper right corner, a lower left corner, a lower right corner, and the like of the screen, for example, the number sequence includes 1, 2, 3, and 4 numbers, and is sequentially displayed at the upper left corner, the upper right corner, the lower left corner, and the lower right corner. In the process of displaying the digital sequence on the screen, first prompt information for reminding a user to read the digital sequence is also synchronously generated, for example, "please read out the appearing numbers in sequence", and the first prompt information can be played in a sound form or displayed on the screen in a text form.
And step S103, shooting the user operation behavior in the appointed time range to acquire the video to be detected.
Specifically, in the process of guiding the user to read the digital sequence displayed on the screen through the first prompt message in the embodiment, the terminal shoots the user operation behavior within the specified time range through the camera, so as to obtain the video to be detected including the voice information and the image information.
It should be noted that, after the terminal device acquires the video to be detected, in order to avoid the situation that the video to be detected is forged due to the camera being attacked, live body detection is performed on the video to be detected, so as to ensure that the video to be detected is real and is shot by a user according to an instruction.
Optionally, recognizing the voice information to obtain an effective voice and an effective time point, and extracting a target image frame matched with the effective time point from the image information, includes: recognizing the voice information, and taking the voice information containing the number sequence as the effective voice, wherein the number of the effective voice is the same as the number of the numbers contained in the number sequence; and acquiring an effective time point corresponding to the effective voice, and extracting a target image frame matched with the effective time point from the image information.
Specifically, in the embodiment, the voice information is subjected to anti-counterfeit detection, specifically, the voice information is identified, the voice information including the number sequence is used as effective voice, and the number of the effective voice is the same as the number of the numbers included in the number sequence. For example, when four digits are included in the digit sequence, the number of valid voices obtained is also four.
Since the voice information and the image information are on the same time axis, in order to check the consistency between the voice and the image, effective time points corresponding to the effective voices are acquired, and a target image frame matched with the effective time points is extracted from the image information. For example, if the valid time points corresponding to the valid voices are t1, t2, t3, and t4, respectively, the target image frame 1, the target image frame 2, the target image frame 3, and the target image frame 4 respectively matching the valid time points t1, t2, t3, and t4 are extracted from the image information.
And step S105, performing living body detection on the consistency of the effective voice and the target image frame according to the text information.
Optionally, performing living body detection on consistency between the effective voice and the target image frame according to the text information, including: and judging whether the effective voice is matched with the digital sequence, if so, determining that the voice information is verified to be passed, and performing living body detection on the target image frame, otherwise, determining that the living body detection is not passed.
Optionally, the performing living body detection on the target image frame includes: acquiring image information of a designated part of a human face in a target image frame, wherein the designated part comprises a mouth and eyes; judging whether the shape of the mouth in the image information of the mouth of the human face is in an appointed state, if so, determining that the image information of the mouth of the human face passes the verification, and carrying out the living body detection on the image information of the eyes of the human face, otherwise, determining that the living body detection fails.
Optionally, performing living body detection on the image information of the eyes of the human face includes: acquiring a first deviation direction of the image information of the eyes in each pair of adjacent target image frames; acquiring the adjacent valid time points corresponding to the adjacent target image frames; acquiring a second deviation direction of the digit point positions displayed on the screen at the adjacent valid time points; and performing living body detection according to the first deviation direction and the second deviation direction.
Optionally, performing the living body detection according to the first deviation direction and the second deviation direction, includes: and judging whether the first deviation direction is consistent with the second deviation direction, if so, determining that the living body detection passes, and otherwise, determining that the living body detection does not pass.
Specifically, when the living body detection is performed on the consistency of the effective voice and the target image frame according to the text information, the living body detection is determined to pass only if the content of the effective voice is the same as the content of the text information and the user behaviors in the effective voice and the target image frame are consistent, so that the accuracy of the living body detection is ensured.
In this embodiment, while the user is guided to perform the corresponding operations according to the instructions, a video to be detected containing the user's operation behavior is acquired, and living body detection is performed on the consistency of the voice information and the image information in that video; that is, the voice information must be consistent with the corresponding user operation behavior in the image information for living body detection to succeed. Since a forged video cannot follow prompts generated in real time, it cannot achieve this consistency, so false identity authentication through counterfeit video is effectively avoided.
Fig. 2 is a second flowchart of a living body detection method according to an embodiment of the present application. As shown in fig. 2, the living body detection method includes the following steps:
in step S201, text information is acquired when a face is detected in a screen designation area.
Step S202, the digital sequence is displayed at different point positions of the screen in sequence within a specified time range, and first prompt information for reminding a user to read the digital sequence is generated.
Step S203, shooting the user operation behavior in the appointed time range to acquire the video to be detected.
Step S204, identifying the voice information to obtain effective voice and effective time points, and extracting target image frames matched with the effective time points from the image information.
Step S205, determine whether the valid voice matches the number sequence, if yes, go to step S206, otherwise, go to step S211.
Specifically, in this embodiment, after the valid voice is obtained, the valid voice is compared with the number sequence, and whether the valid voice is matched with the number sequence is determined, for example, the valid voice includes numbers 1, 2, 6, and 4, and the numbers included in the number sequence are 1, 2, 3, and 4, respectively, so that it can be known that the numbers included in the valid voice are different from the numbers included in the number sequence, and thus it can be determined that the valid voice is not matched with the number sequence; if the valid speech contains the numbers 1, 2, 3 and 4 respectively, the numbers contained in the valid speech and the numbers contained in the number sequence can be determined to be the same, so that the valid speech and the number sequence can be determined to be matched, namely, from the aspect of speech, the user can be determined to read the number sequence shown on the screen according to the first prompt message.
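The matching step described above reduces to an ordered, element-wise comparison of the recognized digits against the displayed sequence; a minimal sketch (the function name is an assumption):

```python
def valid_speech_matches(spoken_digits, number_sequence):
    """Pass only if the user spoke the same digits, in the same
    order, as the sequence shown on screen."""
    return list(spoken_digits) == list(number_sequence)

# Examples from the text: spoken digits 1, 2, 6, 4 fail against the
# displayed sequence 1, 2, 3, 4, while 1, 2, 3, 4 matches.
```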
In step S206, the voice detection is passed, and the image information of the face designated portion in the target image frame is obtained.
When the valid voice is matched with the digital sequence, the voice detection can be determined to be passed, and at the moment, the image information of the designated part of the human face in the target image frame is acquired, wherein the designated part comprises a mouth and eyes, namely, the image information of the mouth and the image information of the eyes of the human face are acquired, so that the living body detection can be conveniently carried out according to the image information of the designated part.
Step S207, determining whether the shape of the mouth in the image information of the face mouth is a designated shape, if so, executing step S208, otherwise, executing step S211.
After the image information of the mouth of the human face in the target image frame is acquired, it is determined whether the shape of the mouth in that image information is the specified shape; for example, the specified shape includes the mouth being open. When the mouth is open, it indicates that the user read the number aloud according to the first prompt information while the number was displayed on the screen, so the image information of the face mouth passes the verification; when the mouth is closed, the user did not read according to the first prompt information while the numbers were displayed, so it can essentially be determined that the video to be detected is forged and living body detection fails.

Step S208, the image information of the face mouth passes the verification, and living body detection is performed on the image information of the eyes of the human face.
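The mouth-shape judgment in step S207 is commonly implemented as a mouth aspect ratio over facial landmarks. A minimal sketch; the four-point landmark layout and the 0.5 threshold are assumptions for illustration, not values from the application:

```python
def mouth_is_open(mouth_landmarks, threshold=0.5):
    """Mouth aspect ratio check: vertical opening divided by mouth
    width. Landmarks are (x, y) tuples for the left corner, right
    corner, top inner lip, and bottom inner lip of the mouth."""
    left, right, top, bottom = mouth_landmarks
    width = abs(right[0] - left[0])
    height = abs(bottom[1] - top[1])
    return width > 0 and (height / width) > threshold
```

Real landmark detectors produce richer point sets (e.g. 20 mouth points), but the open/closed decision is the same kind of ratio test.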
Step S209, determining whether the image information of the human face and the eyes passes the verification, if so, performing step S210, otherwise, performing step S211.
Specifically, when the images of the eyes of the human face are checked, it is determined whether the gaze direction of the eyes moves correspondingly as the mouth changes; that is, during the process of displaying the text information at different point positions on the screen, while the user reads the displayed numbers according to the prompt information, it is also determined whether the user's gaze direction changes accordingly.
In step S210, the living body detection is passed.
In step S211, the living body detection fails.
Fig. 3 is a third flowchart of a living body detection method provided in an embodiment of the present application. As shown in fig. 3, the living body detection method includes the following steps:
in step S301, text information is acquired when a face is detected in a screen designation area.
And step S302, displaying the digital sequence at different point positions of the screen in sequence within a specified time range, and generating first prompt information for reminding a user to read the digital sequence.
Step S303, shooting the user operation behavior in the appointed time range to acquire the video to be detected.
Step S304, identifying the voice information to obtain effective voice and effective time point, and extracting the target image frame matched with the effective time point from the image information.
In step S305, it is determined whether the valid speech matches the digit sequence, if yes, step S306 is executed, otherwise, step S312 is executed.
Step S306, the voice detection is passed, and the image information of the face designated part in the target image frame is obtained.
Step S307, determining whether the shape of the mouth in the image information of the face mouth is a designated shape, if so, performing step S308, otherwise, performing step S312.
Step S308, the image information of the human face mouth passes the verification, and a first deviation direction of the image information of the eyes in each pair of adjacent target image frames is obtained.
When the image information of the human face mouth passes the verification, a first deviation direction of the image information of the eyes in each pair of adjacent target image frames is obtained. For example, if the pupil in target image frame 1 is located at the center of the eye, and the pupil in the next adjacent target image frame 2 is located to the right of the eye center, the first deviation direction can be determined to be horizontal movement to the right; thus, the eye movement across the target image frames can be obtained from the first deviation direction.
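The first deviation direction can be derived from the pupil positions in two adjacent target frames. A sketch in image coordinates (x grows rightward, y grows downward); the function name and the small-motion threshold are assumptions:

```python
def deviation_direction(pupil_prev, pupil_next, eps=2.0):
    """Dominant direction of pupil movement between two adjacent
    target image frames; 'none' when both components of the
    movement are below eps pixels."""
    dx = pupil_next[0] - pupil_prev[0]
    dy = pupil_next[1] - pupil_prev[1]
    if abs(dx) < eps and abs(dy) < eps:
        return "none"
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"

# Pupil at the eye centre in frame 1, shifted right in frame 2:
# deviation_direction((100, 50), (110, 51)) -> "right"
```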
Step S309, acquiring adjacent effective time points corresponding to each adjacent target image frame, and acquiring a second deviation direction of digital point locations displayed on the screen in the adjacent effective time points.
Specifically, in this embodiment, adjacent valid time points corresponding to each adjacent target image frame are also obtained, for example, if the valid time point corresponding to the target image frame 1 is t1, and the valid time point corresponding to the target image frame 2 is t2, then the digital point locations displayed on the screens of the valid time points t1 and t2 are obtained, for example, the valid time point t1 shows the number 1 in the upper left corner of the screen, and the valid time point t2 shows the number 2 in the upper right corner of the screen, so that it can be obtained that the second deviation direction of the digital point locations displayed on the screens of the valid time points t1 and t2 is horizontal movement to the right.
Step S310, determining whether the first deviation direction and the second deviation direction are consistent, if yes, performing step S311, otherwise, performing step S312.
When the first deviation direction is consistent with the second deviation direction, it shows that while the numbers were displayed at different point positions on the screen, the user's eyes moved along with the displayed numbers; that is, while reading the numbers randomly displayed on the screen, the eyes also moved synchronously with the dynamically changing digit positions. It can thereby be determined that living body detection passes when the valid speech and the target image frames are consistent.
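The decision in steps S310-S311 is then an element-wise comparison of the two direction sequences over all adjacent frame pairs; a minimal sketch (the function name is an assumption):

```python
def gaze_consistent(first_directions, second_directions):
    """Liveness passes only when the eye-movement direction matches
    the on-screen digit-movement direction for every adjacent pair
    of target image frames."""
    return (len(first_directions) == len(second_directions)
            and all(a == b for a, b in zip(first_directions, second_directions)))

# Digits shown top-left -> top-right -> bottom-right give the second
# deviation directions ["right", "down"]; the eyes must follow suit.
```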
In step S311, the living body detection is passed.
In step S312, the living body detection fails.
Fig. 4 is a fourth flowchart of a living body detection method according to an embodiment of the present application. As shown in fig. 4, the living body detecting method includes the steps of:
in step S401, text information is acquired when a face is detected in a screen-specified area.
And S402, displaying the digital sequence at different point positions of the screen in sequence within a specified time range, and generating first prompt information for reminding a user to read the digital sequence.
And step S403, shooting the user operation behavior in the specified time range to acquire the video to be detected.
Step S404, identifying the voice information to obtain effective voice and effective time point, and extracting the target image frame matched with the effective time point from the image information.
Step S405, performing living body detection on the consistency of the effective voice and the target image frame according to the text information.
In step S406, the obtained result of the living body detection is displayed on the screen.
In the present embodiment, after the living body detection result is obtained, it is displayed on the screen. For example, when living body detection passes, "living body detection passed, the current environment is safe" is displayed on the screen; when living body detection fails, "living body detection failed, the current environment is at risk" can be displayed on the screen.
FIG. 5 is a schematic structural diagram of a living body detection apparatus according to an embodiment of the present application. As shown in fig. 5, the living body detection apparatus includes:
the text information obtaining module 510 is configured to obtain text information when a human face is detected in a designated area of a screen, where the text information includes a number sequence;
the digital sequence display module 520 is configured to sequentially display the digital sequence at different point locations of the screen within a specified time range, and generate first prompt information for prompting a user to read the digital sequence;
the to-be-detected video acquisition module 530 is configured to capture a user operation behavior within a specified time range to acquire a to-be-detected video, where the to-be-detected video includes voice information and image information;
an effective speech and target image frame acquiring module 540, configured to recognize speech information to acquire effective speech and effective time points, and extract a target image frame matched with the effective time points from the image information;
and a living body detection module 550, configured to perform living body detection on the consistency of the effective voice and the target image frame according to the text information.
Optionally, the apparatus further includes a second prompt information generating module, configured to receive a detection start instruction input by the user;
and displaying the face detection frame on the screen according to the detection start instruction, taking the face detection frame as the designated area, and generating second prompt information for guiding the user to look straight at the camera and place the face region in the face detection frame.
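The designated-area condition above amounts to checking that a detected face bounding box lies inside the on-screen face detection frame. A minimal sketch, assuming an (x, y, w, h) box format that is not specified in the source:

```python
# Check whether a detected face bounding box lies entirely inside the
# face detection frame shown on screen (illustrative sketch; the
# (x, y, w, h) box convention is an assumption).
def face_in_detection_frame(face_box, detection_frame):
    fx, fy, fw, fh = face_box
    dx, dy, dw, dh = detection_frame
    return (fx >= dx and fy >= dy and
            fx + fw <= dx + dw and fy + fh <= dy + dh)
```

Only once this condition holds would the text information be requested from the database.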
Optionally, the text information obtaining module is configured to send a text information calling instruction to the database when a face is detected in a face detection box of the screen;
and receiving the text information fed back by the database according to the calling instruction.
Optionally, the valid voice and target image frame acquiring module is configured to recognize the voice information and take the voice information containing the number sequence as the valid voice, where the number of valid voice segments is the same as the number of digits contained in the number sequence;
and acquiring effective time points corresponding to the effective voices, and extracting target image frames matched with the effective time points from the image information.
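Modules 540 and 550 can be sketched together: speech recognition yields (digit, time) pairs, the utterances reproducing the displayed sequence become the valid voice, and for each effective time point the closest-in-time image frame becomes a target frame. All data shapes and names below are illustrative assumptions, not the patented implementation:

```python
# Filter recognized utterances down to the "valid voice" and return the
# effective time points; None signals a failed voice check.
def effective_voice_times(digit_sequence, recognized):
    """recognized: list of (digit, time) pairs from speech recognition."""
    effective = [(d, t) for d, t in recognized if d in digit_sequence]
    # The number of valid utterances must equal the sequence length
    if len(effective) != len(digit_sequence):
        return None
    # The spoken digits must reproduce the displayed sequence in order
    if [d for d, _ in effective] != list(digit_sequence):
        return None
    return [t for _, t in effective]

def extract_target_frames(image_frames, effective_times):
    """image_frames: list of (timestamp, frame) pairs in capture order;
    pick the frame whose timestamp is closest to each effective time."""
    return [min(image_frames, key=lambda p: abs(p[0] - t))[1]
            for t in effective_times]
```

A failed voice check would terminate the living body detection immediately, as described for the living body detection module below.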
Optionally, the living body detection module is configured to determine whether the valid voice matches the number sequence; if so, determine that the voice information check passes and perform living body detection on the target image frame; otherwise, determine that the living body detection fails.
Optionally, the living body detection module is configured to acquire image information of a designated part of a human face in the target image frame, where the designated part includes a mouth and eyes;
and judging whether the mouth shape in the mouth image information is in a designated state; if so, determining that the mouth image information passes verification and performing living body detection on the eye image information of the human face; otherwise, determining that the living body detection fails.
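The designated mouth state could be approximated with a mouth aspect ratio computed from facial landmarks: an open (speaking) mouth is taller relative to its width than a closed one. The landmark layout and the 0.35 threshold below are assumptions for illustration, not values from the source:

```python
# Decide whether the mouth is open (the designated speaking state) from
# four lip-contour landmarks (illustrative sketch).
def mouth_is_open(mouth_landmarks, threshold=0.35):
    """mouth_landmarks: dict with 'top', 'bottom', 'left', 'right'
    keys mapping to (x, y) points on the lip contour."""
    top, bottom = mouth_landmarks["top"], mouth_landmarks["bottom"]
    left, right = mouth_landmarks["left"], mouth_landmarks["right"]
    height = ((top[0] - bottom[0]) ** 2 + (top[1] - bottom[1]) ** 2) ** 0.5
    width = ((left[0] - right[0]) ** 2 + (left[1] - right[1]) ** 2) ** 0.5
    # Aspect ratio above the threshold indicates an open mouth
    return width > 0 and height / width > threshold
```

In practice the landmarks would come from a face landmark detector applied to each target image frame.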
Optionally, the living body detection module is configured to acquire a first deviation direction of the eye image information of the human face between each pair of adjacent target image frames;
acquiring adjacent effective time points corresponding to adjacent target image frames;
acquiring a second deviation direction of digital point positions displayed on a screen in adjacent effective time points;
and performing living body detection according to the first deviation direction and the second deviation direction.
Optionally, the living body detection module is configured to determine whether the first deviation direction and the second deviation direction are consistent; if so, determine that the living body detection passes; otherwise, determine that the living body detection fails.
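The consistency check between the first deviation direction (eye movement between adjacent target frames) and the second deviation direction (movement of the displayed digit point between the adjacent effective time points) can be sketched by reducing each displacement to its dominant-axis sign. This reduction is an illustrative assumption; the source does not specify how directions are quantized:

```python
# Reduce a displacement between two positions to one of four directions
# (illustrative quantization).
def deviation_direction(p_prev, p_next):
    dx, dy = p_next[0] - p_prev[0], p_next[1] - p_prev[1]
    if abs(dx) >= abs(dy):
        return "right" if dx >= 0 else "left"
    return "down" if dy >= 0 else "up"

def directions_consistent(eye_positions, digit_points):
    """eye_positions: eye centers in consecutive target frames;
    digit_points: screen points of the digits at the adjacent effective
    time points. Liveness passes only when every pairwise deviation
    direction agrees."""
    first = [deviation_direction(a, b)
             for a, b in zip(eye_positions, eye_positions[1:])]
    second = [deviation_direction(a, b)
              for a, b in zip(digit_points, digit_points[1:])]
    return first == second
```

A replayed or synthesized video whose gaze does not track the randomly placed digits would fail this check even if the audio matches.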
Optionally, the apparatus further includes a display module, configured to display the obtained living body detection result on the screen.
The living body detection apparatus provided by the embodiments of the present application can execute the technical solution of the living body detection method in the foregoing embodiments; the implementation principles and technical effects are similar and are not repeated here.
It should be noted that the division of the above apparatus into modules is merely a logical division; in an actual implementation, the modules may be wholly or partially integrated into one physical entity, or may be physically separate. These modules may all be implemented in the form of software invoked by a processing element, or entirely in hardware, or some modules in the form of software invoked by a processing element and the others in hardware. In addition, all or some of the modules may be integrated together or implemented independently. The processing element here may be an integrated circuit with signal processing capability. In implementation, each step of the above method, or each of the above modules, may be completed by an integrated logic circuit in hardware in a processor element or by instructions in the form of software.
FIG. 6 illustrates a schematic structural diagram of an electronic device 10 that may be used to implement an embodiment of the present invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 6, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from a storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data necessary for the operation of the electronic apparatus 10 can also be stored. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The processor 11 performs the various methods and processes described above, such as the liveness detection method.
In some embodiments, the liveness detection method may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the above-described liveness detection method may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the liveness detection method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Computer programs for implementing the methods of the present invention can be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on a machine, as a stand-alone software package, partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user may provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that overcomes the defects of high management difficulty and weak service scalability in traditional physical hosts and VPS (Virtual Private Server) services.
Embodiments of the present invention further provide a computer program product, including a computer program, which, when executed by a processor, implements the living body detection method as provided in any of the embodiments of the present application.
The computer program code for carrying out operations of the present invention in the computer program product may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.
Claims (15)
1. A living body detection method, comprising:
acquiring text information when a face is detected in a designated area of a screen, wherein the text information comprises a number sequence;
displaying the digital sequence at different point positions of the screen in sequence within a specified time range, and generating first prompt information for prompting a user to read the digital sequence;
shooting the user operation behavior in the appointed time range to obtain a video to be detected, wherein the video to be detected comprises voice information and image information;
identifying the voice information to obtain effective voice and effective time points, and extracting a target image frame matched with the effective time points from the image information;
and performing living body detection on the consistency of the effective voice and the target image frame according to the text information.
2. The method according to claim 1, wherein before the acquiring the text information when the face is detected in the screen specified area, further comprising:
receiving a detection starting instruction input by a user;
and displaying a face detection frame on the screen according to the detection start instruction, taking the face detection frame as the designated area, and generating second prompt information for guiding a user to look straight at the camera and place the face region in the face detection frame.
3. The method according to claim 2, wherein the acquiring text information when a face is detected in a designated area of a screen comprises:
when a face is detected in the face detection frame of the screen, sending a text information calling instruction to a database;
and receiving the text information fed back by the database according to the calling instruction.
4. The method according to claim 1, wherein the recognizing the voice information to obtain effective voice and effective time point and extracting a target image frame matching the effective time point from the image information comprises:
recognizing the voice information, and taking the voice information containing the number sequence as the effective voice, wherein the number of effective voice segments is the same as the number of digits contained in the number sequence;
and acquiring an effective time point corresponding to the effective voice, and extracting a target image frame matched with the effective time point from the image information.
5. The method of claim 1, wherein the live body detection of the coincidence of the valid speech and the target image frame according to the text information comprises:
and judging whether the effective voice is matched with the digital sequence, if so, determining that the voice information is verified to be passed, and performing living body detection on the target image frame, otherwise, determining that the living body detection is not passed.
6. The method of claim 5, wherein the live body detection of the target image frame comprises:
acquiring image information of a designated part of a human face in the target image frame, wherein the designated part comprises a mouth and eyes;
and judging whether the mouth shape in the image information of the human face mouth is in a designated state; if so, determining that the image information of the human face mouth passes verification and performing living body detection on the image information of the human face eyes; otherwise, determining that the living body detection fails.
7. The method according to claim 6, wherein the live body detection of the image information of the human face and eyes comprises:
acquiring a first deviation direction of the image information of the human face eyes between each pair of adjacent target image frames;
acquiring adjacent effective time points corresponding to adjacent target image frames;
acquiring a second deviation direction of the digital point location displayed on the screen in the adjacent effective time points;
and performing living body detection according to the first deviation direction and the second deviation direction.
8. The method of claim 7, wherein the performing living body detection according to the first deviation direction and the second deviation direction comprises:
and judging whether the first deviation direction and the second deviation direction are consistent, if so, determining that the living body detection is passed, and otherwise, determining that the living body detection is not passed.
9. The method of claim 1, wherein after the live body detection of the coincidence of the valid speech and the target image frame according to the text information, further comprising:
and displaying the obtained living body detection result on the screen.
10. A living body detection device, comprising:
the system comprises a text information acquisition module, a display module and a display module, wherein the text information acquisition module is used for acquiring text information when a human face is detected in a designated area of a screen, and the text information comprises a digital sequence;
the digital sequence display module is used for sequentially displaying the digital sequence at different point positions of the screen within a specified time range and generating first prompt information for reminding a user to read the digital sequence;
the video to be detected acquisition module is used for shooting the user operation behavior in the appointed time range to acquire a video to be detected, wherein the video to be detected comprises voice information and image information;
the effective voice and target image frame acquisition module is used for identifying the voice information to acquire effective voice and effective time points and extracting a target image frame matched with the effective time points from the image information;
and the living body detection module is used for carrying out living body detection on the consistency of the effective voice and the target image frame according to the text information.
11. The apparatus according to claim 10, further comprising a second prompt message generation module for receiving a detection start instruction input by a user;
and displaying a face detection frame on the screen according to the detection start instruction, taking the face detection frame as the designated area, and generating second prompt information for guiding a user to look straight at the camera and place the face region in the face detection frame.
12. The apparatus according to claim 11, wherein the text information obtaining module is configured to send a text information retrieving instruction to a database when a face is detected in the face detection box of the screen;
and receiving the text information fed back by the database according to the calling instruction.
13. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored by the memory to implement the method of any of claims 1-9.
14. A computer-readable storage medium having computer-executable instructions stored therein, which when executed by a processor, are configured to implement the method of any one of claims 1-9.
15. A computer program product, characterized in that it comprises a computer program which, when being executed by a processor, carries out the method of any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211164572.4A CN115457665A (en) | 2022-09-23 | 2022-09-23 | Living body detection method, living body detection apparatus, living body detection device, storage medium, and program product |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115457665A true CN115457665A (en) | 2022-12-09 |
Family
ID=84306024
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211164572.4A Pending CN115457665A (en) | 2022-09-23 | 2022-09-23 | Living body detection method, living body detection apparatus, living body detection device, storage medium, and program product |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115457665A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||